Article

Authentication markers for five major Panax species developed via comparative analysis of complete chloroplast genome sequences
Van Binh Nguyen, Hyun-Seung Park, Sang-Choon Lee, Junki Lee, Jee Young Park, and Tae-Jin Yang
J. Agric. Food Chem., Just Accepted Manuscript • Publication Date (Web): 22 May 2017

Journal of Agricultural and Food Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Authentication markers for five major Panax species developed via comparative analysis of complete chloroplast genome sequences

Van Binh Nguyen (a), Hyun-Seung Park (a), Sang-Choon Lee (a), Junki Lee (a), Jee Young Park (a), Tae-Jin Yang (a, b)*

(a) Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute of Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea.

(b) Crop Biotechnology Institute/GreenBio Science and Technology, Seoul National University, Pyeongchang 232-916, Republic of Korea.

*Corresponding author: Tae-Jin Yang. E-mail: [email protected]. Tel: +82-2-880-4547, Fax: +82-2-877-4550

Abstract

Ginseng represents a set of high-value medicinal plants of different species: Panax ginseng (Asian ginseng), P. quinquefolius (American ginseng), P. notoginseng (Chinese ginseng), P. japonicus (Bamboo ginseng), and P. vietnamensis (Vietnamese ginseng). Each species is pharmacologically and economically important, with differences in efficacy and price. Accordingly, an authentication system is needed to combat economically motivated adulteration of Panax products. We conducted a comparative analysis of the chloroplast genome sequences of these five species, identifying 34–124 InDels and 141–560 SNPs between species pairs. Fourteen InDel markers were developed to authenticate the Panax species. Among these, eight were species-unique markers that successfully differentiated one species from the others. We generated at least one species-unique marker for each of the five species, and any of the species can be authenticated by selection among these markers. The markers are reliable, easily detectable, and valuable for applications in the ginseng industry as well as in related research.

Keywords: Panax species, Chloroplast genome, Molecular markers, Ginseng authentication

Introduction

The Panax genus belongs to the Araliaceae family and contains many important medicinal species, collectively called ‘ginseng’. Of the 14 known species in the Panax genus, five species, Panax ginseng (Asian ginseng), P. quinquefolius (American ginseng), P. notoginseng (Sanchi ginseng; Chinese ginseng), P. japonicus (Bamboo ginseng), and P. vietnamensis (Vietnamese ginseng), are broadly utilized in Korea, the USA, China, Japan, and Vietnam. Each species is well known as a traditional medicinal plant in oriental countries, and species such as P. ginseng, P. quinquefolius, and P. notoginseng contain protopanaxadiol-type and protopanaxatriol-type saponins (1), while other species like P. japonicus and P. vietnamensis contain high quantities of oleanolic acid-type and ocotillol-type saponins, respectively (1, 2).

The high pharmacological and economic value of ginseng has led to many economically motivated adulterations (EMAs) of ginseng products, made by substituting morphologically similar plant roots or by mixing different species. Traditionally, the authentication of herbal plants was based on morphological and histological inspection. However, these traditional methods are unable to authenticate some Panax species because of their very similar morphological appearances, especially in terms of root shape. For example, P. ginseng cannot easily be distinguished from P. quinquefolius, nor P. japonicus from P. vietnamensis. Moreover, many commercial ginseng products are sold in a processed form, such as red ginseng, ginseng powder, liquid extracts, pellets, shredded slices, or even tea, which cannot be authenticated by traditional methods. Ginsenoside profiling methods have been developed to authenticate ginseng samples (3). However, the effects of factors such as growth conditions, developmental stage, internal metabolism, storage conditions, and manufacturing processes on secondary metabolite accumulation in ginseng limit the application of such chemical analyses (4). Chemical methods are also expensive and difficult to utilize in high-throughput analysis. Therefore, reliable and practical methods to authenticate ginseng are in high demand.

The chloroplast is a plant-specific organelle containing the entire machinery required for photosynthesis and carbon fixation. Chloroplast genomes are generally highly conserved across land plants at the gene level, with a quadripartite structure comprising two inverted repeat blocks (IRs), one large single copy (LSC) region, and one small single copy (SSC) region. As a result of interspecies sequence divergence and intraspecies sequence conservation, the chloroplast genome is valuable for taxonomic classification and phylogeny reconstruction (5, 6). Lack of recombination, low nucleotide substitution rates, and usually uniparental inheritance make chloroplast genomes valuable sources of genetic markers for phylogenetic analysis and species identification (7, 8). Chloroplast sequences such as atpF-atpH, matK, psbK-psbI, rbcL, rpoC1, rpoB, and trnH-psbA are commonly used as DNA barcodes for plants (9-11). In some cases, these sequences were highly efficient for species identification and phylogenetic studies, but they showed low variation in closely related species (10).

Recently, DNA markers have been developed to authenticate ginseng, including random amplified polymorphic DNA (RAPD) (12), microsatellite (13), and expressed sequence tag-simple sequence repeat (EST-SSR) (14, 15) markers, as well as single nucleotide polymorphism (SNP) and insertion and deletion (InDel) markers derived from chloroplast sequences (6, 16, 17). However, thus far, these markers have been used to identify only P. ginseng cultivars at the intraspecies level, and just a few Panax species at the interspecies level.

Recently, we developed an efficient method to obtain complete chloroplast genome and nuclear ribosomal DNA (nrDNA) sequences by de novo assembly using low-coverage whole-genome shotgun next-generation sequencing (dnaLCW) (18). Using this method, we obtained complete chloroplast genomes and nrDNA for the five Panax species: P. ginseng, P. quinquefolius, P. notoginseng, P. japonicus, and P. vietnamensis (6, 19). In the present study, we comparatively analyzed these five chloroplast genome sequences and developed credible chloroplast genome-derived InDel markers to authenticate these Panax species. These markers are valuable tools for the further study of genetic diversity in the Panax genus. They may also be used to support the ginseng industry, which depends on a number of Panax species, for example, P. ginseng in Korea, China, and Japan, P. quinquefolius in the USA, Canada, and China, P. notoginseng in China, and P. vietnamensis in Vietnam.

Materials and Methods

Plant materials and genomic DNA extraction

P. ginseng (cultivars ‘Chunpoong’ (CP) and ‘Yunpoong’ (YP)) and P. quinquefolius plants were collected from the ginseng farm at Seoul National University in Suwon, Korea. P. notoginseng and P. japonicus plants were collected from Dafang County, Guizhou Province, and Enshi County, Hubei Province, China, respectively. P. vietnamensis plants were collected from Ngoc Linh Mountain, Kon Tum Province, Vietnam. Individual leaves and roots of plants from each species were harvested and stored at −70°C until use. Total genomic DNA was isolated using a modified cetyltrimethylammonium bromide (CTAB) method (20). The quality and quantity of extracted genomic DNA were measured using a UV-spectrophotometer.

Comparative analysis and characterization of SSRs and large sequence repeats

The chloroplast genome sequences of P. ginseng cv. CP (KM088019), P. ginseng cv. YP (KM088020), P. quinquefolius (KM088018), P. notoginseng (KP036468), P. japonicus (KP036469), and P. vietnamensis (KP036471) were obtained from our previous studies (6, 19). The MAFFT (http://mafft.cbrc.jp/alignment/server/), MEGA 6 (21), and mVISTA (http://genome.lbl.gov/vista/mvista/submit.shtml) programs were used to compare these sequences.

SSRs were predicted using MISA (http://pgrc.ipk-gatersleben.de/misa/misa.html) (22) with the parameters set to ≥10 repeat units for mononucleotide SSRs, ≥5 repeat units for dinucleotide SSRs, ≥4 repeat units for trinucleotide SSRs, and ≥3 repeat units for tetranucleotide, pentanucleotide, and hexanucleotide SSRs.
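
The thresholds above can be illustrated with a minimal Python sketch. This is not MISA itself: it is a simplified regular-expression scan for perfect SSRs only, and the function name `find_ssrs` and the dictionary `MIN_REPEATS` are hypothetical names introduced here for illustration.

```python
import re

# Minimum number of repeat units per motif length, taken from the text:
# mono >= 10, di >= 5, tri >= 4, tetra/penta/hexa >= 3.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Simplified illustration only; it does not reproduce MISA's handling
    of compound or interrupted repeats.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{n,} demands n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), which belong to the shorter class.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Positions are 0-based. A run such as twelve A residues is reported once, as a mononucleotide SSR, rather than again as an "AA" or "AAA" repeat.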

The program REPuter (23) was used to identify the number and location of repeat sequences, including tandem, dispersed, reverse, and palindromic repeats within chloroplast genomes. For all repeat types, the constraints were a minimum repeat size of 15 bp and a minimum identity of 90% between the two copies. Overlapping repeats were merged into one repeat motif whenever possible.
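
As a rough illustration of the two constraints (minimum size 15 bp, minimum identity 90%), the sketch below filters a candidate pair of repeat copies. It is not REPuter's algorithm: it assumes, for simplicity, that the two copies have equal length and are compared position by position, and the function names are hypothetical.

```python
def percent_identity(copy_a, copy_b):
    """Percentage of identical positions between two equal-length copies."""
    if len(copy_a) != len(copy_b):
        raise ValueError("copies must have equal length in this simplified sketch")
    if not copy_a:
        raise ValueError("copies must not be empty")
    matches = sum(x == y for x, y in zip(copy_a.upper(), copy_b.upper()))
    return 100.0 * matches / len(copy_a)


def passes_repeat_filter(copy_a, copy_b, min_size=15, min_identity=90.0):
    """Apply the size and identity cut-offs stated in the text."""
    if len(copy_a) != len(copy_b) or len(copy_a) < min_size:
        return False
    return percent_identity(copy_a, copy_b) >= min_identity


if __name__ == "__main__":
    a = "ACGTACGTACGTACGTACGT"   # 20 bp
    b = "ACGTACGTACGTACGTACAA"   # 2 mismatches -> 90% identity
    print(passes_repeat_filter(a, b))
```

Under these assumptions, a 20-bp pair tolerates at most two mismatches, and any pair shorter than 15 bp is rejected regardless of identity.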

Development and validation of InDel markers

To validate intraspecies and interspecies polymorphism in the chloroplast genomes, and to develop DNA markers to authenticate the five major Panax species, specific primers were designed based on InDel polymorphic sites found in the Panax chloroplast genomes, with P. ginseng cv. YP (KM088020) included for the intraspecies-level comparison. Primer pairs were designed using the Primer3 program (http://bioinfo.ut.ee/primer3-0.4.0/).

The polymerase chain reaction (PCR) was performed in a 25 µl reaction mixture containing 20 ng of DNA template, 5 pmol of each primer, 1.25 mM deoxynucleotide triphosphate (dNTP), 1.25 units of Taq DNA polymerase (Inclone, Korea), and 2.5 µl of 10× reaction buffer. Reactions were run in thermocyclers using the following cycling parameters: 94°C (5 min); 35 cycles of 94°C (30 s), 54–58°C (30 s), and 72°C (30 s); then 72°C (7 min). PCR products were visualized on agarose gels (1.5–3.0%) after staining with ethidium bromide, and/or analyzed by capillary electrophoresis using a fragment analyzer (Advanced Analytical Technologies Inc., USA).
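
The reaction set-up implies some simple arithmetic that can be checked with a short Python sketch. The helper names below are hypothetical, and the program duration counts hold times only; ramp times between temperatures depend on the instrument and are ignored.

```python
def primer_final_conc_um(primer_pmol, reaction_volume_ul):
    """Final primer concentration in µM (pmol per µl equals µmol per litre)."""
    return primer_pmol / reaction_volume_ul


def buffer_final_strength(stock_strength_x, stock_volume_ul, reaction_volume_ul):
    """Final buffer strength (in 'x') after dilution into the reaction."""
    return stock_strength_x * stock_volume_ul / reaction_volume_ul


def program_hold_seconds(initial_s, cycles, step_seconds, final_s):
    """Total hold time of a cycling program, excluding ramp times."""
    return initial_s + cycles * sum(step_seconds) + final_s


if __name__ == "__main__":
    # Values from the text: 5 pmol primer and 2.5 µl of 10x buffer in 25 µl.
    print(primer_final_conc_um(5, 25))        # µM per primer
    print(buffer_final_strength(10, 2.5, 25)) # x
    # 94°C 5 min; 35 cycles of 30 s + 30 s + 30 s; 72°C 7 min.
    print(program_hold_seconds(5 * 60, 35, [30, 30, 30], 7 * 60) / 60)  # minutes
```

With the stated values, each primer is at 0.2 µM, the buffer is at 1×, and the program holds for 64.5 minutes in total.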

Results

Structure of complete chloroplast genomes of five Panax species

Complete chloroplast genomes of five Panax species were obtained by dnaLCW and reported in our previous studies (6, 19). Lengths of the chloroplast genomes ranged from 155,993 bp (P. vietnamensis) to 156,466 bp (P. notoginseng). The order, content, and orientation of genes were highly conserved among these five chloroplast genomes, comprising 79 protein-coding genes, 30 tRNA genes, and 4 rRNA genes in common. Each chloroplast genome had the same quadripartite structure with an LSC, an SSC, and two IR regions (Fig. 1).

Variations in the copy numbers of SSRs were identified among the five Panax chloroplast genomes. The longest SSRs were 18 nucleotides in length, and the most abundant nucleotides in SSRs were A and T (Table 1). P. vietnamensis and P. notoginseng both contained 42 SSRs, fewer than P. japonicus (45) and more than P. ginseng cv. CP (38) and P. quinquefolius (40) (Table 1). The P. japonicus chloroplast genome had the highest number of homopolymers (24), but no hexapolymers. P. vietnamensis, P. notoginseng, and P. ginseng cv. CP all had 6 dipolymers, fewer than P. japonicus and P. quinquefolius (7 each). P. notoginseng had 4 tripolymers, more than each of the other Panax species (3). P. vietnamensis had 10 tetrapolymers, more than each of the other Panax species (8). P. japonicus and P. notoginseng had 3 pentapolymers each, while the other Panax species contained 2 (Table 1).

For comparative analysis, repetitive sequences were grouped into four categories: tandem, dispersed, palindromic, and reverse. To avoid redundancy, repeat sequence analysis of each chloroplast genome was carried out with a single IR region. A total of 45–50 repetitive sequences were identified in each of the five Panax chloroplast genomes (Fig. 2B), including dispersed repeats (44.26%), palindromic repeats (29.79%), tandem repeats (22.13%), and reverse repeats (3.83%). Most repeats were located in intergenic spaces (IGS) (53.83%) and coding sequence (CDS) regions (38.94%); a small number of repeats (7.23%) were found in intron regions (Fig. 2C). Across the five chloroplast genomes, repeat lengths ranged from 17 bp to 67 bp (Fig. 2A). The 67-bp tandem repeat found in P. japonicus was the longest repeat (Fig. 2A). Palindromic and reverse repeats ranged from 17 to 35 bp and from 17 to 28 bp, respectively (Fig. 2A). Comparing the length and location, 32 repeats were shared by all five Panax species, and 5 repeats were present in four chloroplast genomes. P. notoginseng had the most unique repeats (9), followed by P. ginseng and P. vietnamensis (6 each), P. japonicus (3), and P. quinquefolius (1) (Fig. 2D).

Divergence of chloroplast genomes among Panax species

To investigate chloroplast genome divergence among the five Panax species, multiple alignments of six chloroplast genomes were performed with a single IR region. The sequence identity of the six chloroplast genomes was plotted using the mVISTA program, with the Panax ginseng cv. CP annotation as the reference (Fig. 3). The two P. ginseng cultivars, CP and YP, were almost identical, differing by only 3 InDels and 1 SNP overall (Table 2). The five Panax species also showed high sequence similarity with each other (98.9%) (Fig. 3), suggesting that Panax chloroplast genomes are well conserved. Our results are consistent with earlier research on Araliaceae chloroplast genomes (8). As expected, coding regions are more conserved than non-coding regions. The most highly conserved chloroplast genome coding regions are the four rRNA and 30 tRNA genes (Fig. 3). The most divergent coding region is the ycf1 gene, which has low sequence identity because of various InDels and repeat sequences (Fig. 3). This has been reported in previous research on other chloroplast genomes (8, 24).

Although chloroplast genome diversity at the intraspecies level is low, abundant polymorphism was identified between species. At the interspecies level, the number of SNPs ranged between 141 (P. ginseng versus P. quinquefolius) and 560 (P. ginseng versus P. vietnamensis), and the number of InDels ranged between 34 (P. ginseng versus P. quinquefolius) and 124 (P. notoginseng versus P. vietnamensis) (Table 2). A/T SNP substitutions were more frequent than other types, in agreement with previous studies (5, 25). Fewer substitutions and InDels were found between P. ginseng and P. quinquefolius than in any other pairwise comparison between species. The ratios of nucleotide substitution events to InDel events (S/I) for different pairwise comparisons between species ranged between 4.06 (P. ginseng versus P. quinquefolius) and 5.64 (P. quinquefolius versus P. japonicus) (Table 2). S/I ratios are thought to increase with divergence time between genomes (26). Our results therefore suggest that P. ginseng and P. quinquefolius are the most closely related pair (S/I = 4.06), and that P. quinquefolius and P. japonicus are the most divergent pair (S/I = 5.64).
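
To make the S/I ratio concrete, the following Python sketch counts substitutions and InDel events in a pairwise alignment and divides one by the other. It is an illustration under stated assumptions, not the procedure used to produce Table 2: the function names are hypothetical, each contiguous run of gaps in one sequence is treated as a single InDel event, and the toy sequences are invented.

```python
def count_substitutions_and_indels(aligned_a, aligned_b, gap="-"):
    """Count substitutions and InDel events between two aligned sequences.

    A contiguous run of gap characters in one sequence counts as one event.
    Columns gapped in both sequences (possible when a pair is extracted
    from a multiple alignment) are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    substitutions = 0
    indel_events = 0
    gap_side = None  # which sequence carries the gap run currently open
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        side = "a" if x == gap else "b" if y == gap else None
        if side is None:
            if x != y:
                substitutions += 1
        elif side != gap_side:
            indel_events += 1
        gap_side = side
    return substitutions, indel_events


def s_i_ratio(substitutions, indel_events):
    """Ratio of substitution events to InDel events."""
    if indel_events == 0:
        raise ValueError("S/I is undefined when there are no InDel events")
    return substitutions / indel_events


if __name__ == "__main__":
    # Invented toy alignment: 1 substitution and 2 InDel events.
    subs, indels = count_substitutions_and_indels("ACGT--ACGTAC", "ACCTGGAC--AC")
    print(subs, indels, s_i_ratio(subs, indels))
```

How events are counted (for example, whether SSR length variants are included) affects the ratio, so values from such a sketch need not match those reported in Table 2.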

Development of InDel markers to identify Panax species

Based on chloroplast genome sequence alignment, the 14 most InDel-variable loci were selected to develop 14 potentially discriminating markers (Table 3; Fig. 1). Each of these 14 markers was successfully amplified by PCR, and each showed the expected polymorphic band sizes. Eight of these 14 markers had unique amplicon sizes specific to different Panax species (Table 4).

The marker gcpm2 was specific to both P. ginseng cultivars (CP and YP), and was derived from a 33-bp tandem repeat (TR) in the rps16–trnQ-UUG region (Table 4; Fig. 4B). The P. notoginseng-specific markers gcpm3, gcpm8, and gcpm10 were derived from a 25-bp TR in the atpH–atpI region, a 38-bp TR in the petA–psbJ region, and a 25-bp TR in the rpl14–rpl16 region, respectively (Table 4; Fig. 4C, H, K). The gcpm4 marker was derived from a 23-bp TR in the rps2–rpoC2 region, and was specific to P. quinquefolius (Table 4; Fig. 4D). The marker gcpm6 was derived from a 67-bp TR and a 19-bp InDel in the trnE–trnT region, and was specific to P. japonicus (Table 4; Fig. 4F). Finally, the markers gcpm9 and gcpm14 were derived from a 25-bp TR and a 6-bp SSR in the clpP–psbB region, and a 30-bp InDel in ycf1, respectively, and were specific to P. vietnamensis (Table 4; Fig. 4I, O).
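
The marker-to-species assignments listed above can be expressed as a simple lookup, sketched below in Python. The mapping is taken from the text; the function `call_species`, its return strings, and the input format (a list of marker names for which a sample showed the species-unique amplicon) are hypothetical choices made for illustration. Actual amplicon sizes are given in Table 4 and are not reproduced here.

```python
# Species-unique markers as described in the text.
SPECIES_UNIQUE_MARKERS = {
    "gcpm2": "P. ginseng",
    "gcpm3": "P. notoginseng",
    "gcpm8": "P. notoginseng",
    "gcpm10": "P. notoginseng",
    "gcpm4": "P. quinquefolius",
    "gcpm6": "P. japonicus",
    "gcpm9": "P. vietnamensis",
    "gcpm14": "P. vietnamensis",
}


def call_species(unique_allele_markers):
    """Infer a species from markers that showed their species-unique amplicon.

    Returns the species name, "undetermined" when no species-unique allele
    was observed, or a "conflict: ..." string when markers point to more
    than one species (for example, in a mixed sample).
    """
    species = {
        SPECIES_UNIQUE_MARKERS[m]
        for m in unique_allele_markers
        if m in SPECIES_UNIQUE_MARKERS
    }
    if not species:
        return "undetermined"
    if len(species) == 1:
        return next(iter(species))
    return "conflict: " + ", ".join(sorted(species))


if __name__ == "__main__":
    print(call_species(["gcpm3", "gcpm8"]))
    print(call_species(["gcpm2", "gcpm4"]))
```

Requiring agreement among several markers, rather than relying on one, is in keeping with the recommendation to combine markers for credible authentication.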

Validation results revealed that six markers, gcpm1, gcpm5, gcpm7, gcpm11, gcpm12, and gcpm13, were able to identify more than two species, of which gcpm12 (derived from a 57-bp TR in the ycf1 gene) was the most variable (Table 4; Fig. 4M). In addition, gcpm12 also distinguished between the two P. ginseng cultivars (Table 4; Fig. 4M).

Discussion

Repetitive sequences in Panax chloroplast genomes

Repeat structure plays an important role in genomic rearrangement and divergence in chloroplast genomes via illegitimate recombination and slipped-strand mispairing (27, 28). SSRs are direct tandem repeated DNA sequences consisting of short (1–10 bp) nucleotide motifs. Because chloroplast SSRs are highly polymorphic, non-recombinant, and uniparentally inherited, they are often used as molecular markers in population genetics (29, 30) and in the study of ecology and evolution. In this study, we found 207 SSRs in total (Table 1), varying in number and type among the five major Panax species.

It is generally considered that species divergence in chloroplast genomes is mostly caused by repetitive sequences. Indeed, we found many repetitive sequences associated with polymorphic chloroplast sites among the five major Panax species studied. Among the many repeats in the ycf1 gene, the most divergent target region was a 57-bp TR. TRs such as this can be used to develop molecular markers for the identification and authentication of Panax species (6).

Comparative analysis of Panax chloroplast genomes

Polymorphism at the intraspecies level is very low compared to that at the interspecies level. In P. ginseng, 12 chloroplast genomes derived from different cultivars revealed over 99.9% sequence similarity, and only 6 InDels and 6 SNPs (6). Meanwhile, of a total of 201 InDels and 962 SNPs identified among the five Panax chloroplast genome sequences, 34–124 InDels and 141–560 SNPs distinguished any given pair of Panax species (Table 2). Nevertheless, the chloroplast genome is highly conserved among Panax species, with high similarity (≥98.9%) at the nucleotide sequence level.

In chloroplast genomes, nucleotide substitution has been used to examine plant evolution and genome differentiation between species (7, 17). InDel events are mainly attributed to the repetition of an adjacent sequence, probably resulting from slipped-strand mispairing in DNA replication (31). InDels play a major role in genome size evolution and are increasingly used in phylogenetic studies (32, 33). S/I ratios have been reported to increase with divergence time between genomes (26). In this study, we identified many SNPs and InDels from the complete chloroplast genomes of five major Panax species; S/I ratios ranged between 4.06 (P. ginseng versus P. quinquefolius) and 5.64 (P. quinquefolius versus P. japonicus) (Table 2). Along with their similar morphologies and distributions, this indicates that P. ginseng is closely related to P. quinquefolius (S/I = 4.06), which is consistent with previous reports (34, 35). P. quinquefolius is highly divergent from P. japonicus compared to the other species (S/I = 5.64).

Use of molecular markers to authenticate ginseng species

Biological diversity is a valuable and vulnerable natural resource. The first steps towards protecting and benefiting from national biodiversity are to sample, identify, and study biological specimens. The use of molecular markers (i.e., DNA barcoding) has a powerful role to play in attaining many of the Millennium Development Goals, and reaching the objectives of the Convention on Biological Diversity, by sustaining natural resources, protecting endangered species, and identifying pests and pathogens at any life stage so as to more easily control them. DNA barcoding can also help to monitor food, water and environment quality by studying contaminating organisms.

Although relatively new, the use of molecular markers is well on its way to being accepted as a global standard for species identification, and will play a major role in the future of taxonomy (36). DNA-based species identification offers enormous potential for the biological scientific community, educators, and the interested public. The complete chloroplast genome has a conserved sequence of 110–160 kbp, which far exceeds the length of commonly used molecular markers and provides greater variation to discriminate between closely related species (37). Compared with the nuclear genome, the chloroplast genome is smaller and is present at copy numbers hundreds of times higher per cell; it also shows solid interspecific divergence with lower intraspecific variation, which makes chloroplast genome-based markers suitable DNA barcoding targets (37).

It is important to be able to authenticate natural health products (NHPs) from legal, economic, health and conservation viewpoints. NHPs are often trusted by the public to be safe; however, adulterated, counterfeit and low quality products can seriously threaten consumer safety (38, 39). Ginseng plants have high economic and medicinal value, and thus there are many EMA ginseng products. The use of molecular methods to accurately identify the origins of ginseng products is important to the development of the ginseng industry in those countries where ginseng is cultivated, such as Korea, the USA, China, Japan, and Vietnam. Recently, the chloroplast genome sequences rbcL, matK, trnH-psbA, and trnL-trnF, and the internal transcribed spacer (ITS) of nuclear 45S ribosomal RNA genes, have successfully been used as molecular markers for several plant species (9, 40), but these loci have little variability in Panax (36). Although many studies have sought to authenticate ginseng (13-15, 41), application remains limited because of a lack of genome information and of comprehensive comparative genomic analysis against related Panax species.

In this study, we developed chloroplast-derived, species-specific markers for each of five major Panax species. Among 14 molecular markers, three markers were specific to P. notoginseng, and two were specific to P. vietnamensis. P. ginseng, P. quinquefolius and P. japonicus could also each be identified by species-specific markers (Table 4; Fig. 4). We discovered different 57-bp tandem repeats in the ycf1 gene of different Panax chloroplast genomes, and this polymorphism proved powerful in the identification of Panax species (Table 4; Fig. 4M). However, the gcpm12 marker, derived from the ycf1 gene, unexpectedly produced two amplified bands from P. notoginseng, whereas all other species gave rise to a single band (Table 4; Fig. 4A, M). Of the two bands, allele A was the expected allele of the P. ginseng type; allele D was not expected, but was the same as the allele of P. vietnamensis. We assume that these unexpected bands are derived from heteroplasmy of the target sequences caused by different IRA and IRB regions related to ycf1 gene sequences; alternatively, they could be caused by chloroplast DNAs transferred into the nuclear or mitochondrial genomes (42-45). In our previous study (6), we also found intra-species polymorphism in P. ginseng at this position, indicating that, although highly informative, the marker designed from this region may be confusing for species authentication. Overall, we suggest that intra-species polymorphism and a combination of several markers should be considered for credible authentication between different species.

Abbreviations Used

InDel, insertion or deletion; SNP, single nucleotide polymorphism; DNA, deoxyribonucleic acid; bp, base pair.

Funding

This work was supported by: the Cooperative Research Program for Agriculture Science & Technology Development of the Rural Development Administration (grant number PJ01100801); the Ministry of Food and Drug Safety (grant number 16172MFDS229, awarded in 2016); and the Bio & Medical Technology Development Program of the NRF, MSIP (grant number NRF-2015M3A9A5030733), Republic of Korea.

References

1. Zhu, S.; Zou, K.; Fushimi, H.; Cai, S.; Komatsu, K. Comparative study on triterpene saponins of Ginseng drugs. Planta Med. 2004, 70, 666-677.

2. Yamasaki, K. Bioactive saponins in Vietnamese ginseng, Panax vietnamensis. Pharm Biol. 2000, 38, 16-24.

3. Chan, T.; But, P.; Cheng, S.; Kwok, I.; Lau, F.; Xu, H. Differentiation and authentication of Panax ginseng, Panax quinquefolius, and ginseng products by using HPLC/MS. Anal Chem. 2000, 72, 1281-1287.

4. Ngan, F.; Shaw, P.; But, P.; Wang, J. Molecular authentication of Panax species. Phytochemistry. 1999, 50, 787-791.

5. Huang, H.; Shi, C.; Liu, Y.; Mao, S. Y.; Gao, L. Z. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships. BMC Evol Biol. 2014, 14, 151.

6. Kim, K.; Lee, S. C.; Lee, J.; Lee, H. O.; Joh, H. J.; Kim, N. H.; Park, H. S.; Yang, T. J. Comprehensive survey of genetic diversity in chloroplast genomes and 45S nrDNAs within Panax ginseng species. PloS one. 2015, 10.

7. Wolfe, K. H.; Li, W. H.; Sharp, P. M. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc Natl Acad Sci. 1987, 84, 9054-9058.

8. Li, R.; Ma, P. F.; Wen, J.; Yi, T. S. Complete sequencing of five Araliaceae chloroplast genomes and the phylogenetic implications. PloS one. 2013, 8, e78568.

9. Hollingsworth, P. M.; Forrest, L. L.; Spouge, J. L.; Hajibabaei, M.; Ratnasingham, S.; van der Bank, M.; Chase, M. W.; Cowan, R. S.; Erickson, D. L.; Fazekas, A. J. A DNA barcode for land plants. Proc Natl Acad Sci. 2009, 106, 12794-12797.

10. Dong, W.; Liu, J.; Yu, J.; Wang, L.; Zhou, S. Highly variable chloroplast markers for evaluating plant phylogeny at low taxonomic levels and for DNA barcoding. PloS one. 2012, 7, e35071.

11. Elansary, H. O.; Ashfaq, M.; Ali, H. M.; Yessoufou, K. The first initiative of DNA barcoding of ornamental plants from Egypt and potential applications in horticulture industry. PloS one. 2017, 12, e0172170.

12. Artyukova, E.; Kozyrenko, M.; Reunova, G.; Muzarok, T.; Zhuravlev, Y. N. RAPD analysis of genome variability of planted ginseng, Panax ginseng. Mol Biol. 2000, 34, 297-302.

13. Ma, K. H.; Dixit, A.; Kim, Y. C.; Lee, D. Y.; Kim, T. S.; Cho, E. G.; Park, Y. J. Development and characterization of new microsatellite markers for ginseng (Panax ginseng CA Meyer). Conservation Genetics. 2007, 8, 1507-1509.

14. Kim, N. H.; Choi, H. I.; Ahn, I. O.; Yang, T. J. EST-SSR marker sets for practical authentication of all nine registered ginseng cultivars in Korea. J Ginseng Res. 2012, 36, 298.

15. Choi, H. I.; Kim, N. H.; Kim, J. H.; Choi, B. S.; Ahn, I. O.; Lee, J. S.; Yang, T. J. Development of reproducible EST-derived SSR markers and assessment of genetic diversity in Panax ginseng cultivars and related species. J Ginseng Res. 2011, 35, 399.

16. Jung, J.; Kim, K. H.; Yang, K.; Bang, K. H.; Yang, T. J. Practical application of DNA markers for high-throughput authentication of Panax ginseng and Panax quinquefolius from commercial ginseng products. J Ginseng Res. 2014, 38, 123-129.

17. Kim, J. H.; Jung, J. Y.; Choi, H. I.; Kim, N. H.; Park, J. Y.; Lee, Y.; Yang, T. J. Diversity and evolution of major Panax species revealed by scanning the entire chloroplast intergenic spacer sequences. Genet Resour Crop Ev. 2013, 60, 413-425.


18. Kim, K.; Lee, S. C.; Lee, J.; Yu, Y.; Yang, K.; Choi, B. S.; Koh, H. J.; Waminal, N. E.; Choi, H. I.; Kim, N. H. et al. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Sci Rep. 2015, 5, 15655.

19. Kim, K.; Nguyen, V. B.; Dong, J. Z.; Wang, Y.; Park, J. Y.; Lee, S. C.; Yang, T. J. Evolution of the Araliaceae family inferred from complete chloroplast genomes and 45S nrDNAs of 10 Panax-related species. Sci Rep. 2016 (In press).

20. Allen, G.; Flores-Vergara, M.; Krasynanski, S.; Kumar, S.; Thompson, W. A modified protocol for rapid DNA isolation from plant tissues using cetyltrimethylammonium bromide. Nat Protoc. 2006, 1, 2320-2325.

21. Tamura, K.; Stecher, G.; Peterson, D.; Filipski, A.; Kumar, S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013, 30, 2725-2729.

22. Thiel, T.; Michalek, W.; Varshney, R.; Graner, A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet. 2003, 106, 411-422.

23. Kurtz, S.; Choudhuri, J. V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633-4642.

24. Nie, X.; Lv, S.; Zhang, Y.; Du, X.; Wang, L.; Biradar, S. S.; Tan, X.; Wan, F.; Weining, S. Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora). PloS one. 2012, 7, e36869.

25. Xu, D.; Abe, J.; Gai, J.; Shimamoto, Y. Diversity of chloroplast DNA SSRs in wild and cultivated soybeans: evidence for multiple origins of cultivated soybean. Theor Appl Genet. 2002, 105, 645-653.


26. Chen, J. Q.; Wu, Y.; Yang, H.; Bergelson, J.; Kreitman, M.; Tian, D. Variation in the ratio of nucleotide substitution and indel rates across genomes in mammals and bacteria. Mol Biol Evol. 2009, 26, 1523-1531.

27. Kim, K. J.; Lee, H. L. Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Res. 2004, 11, 247-261.

28. Asano, T.; Tsudzuki, T.; Takahashi, S.; Shimada, H.; Kadowaki, K. I. Complete nucleotide sequence of the sugarcane (Saccharum officinarum) chloroplast genome: a comparative analysis of four monocot chloroplast genomes. DNA Res. 2004, 11, 93-99.

29. Doorduin, L.; Gravendeel, B.; Lammers, Y.; Ariyurek, Y.; Chin-A-Woeng, T.; Vrieling, K. The complete chloroplast genome of 17 individuals of pest species Jacobaea vulgaris: SNPs, microsatellites and barcoding markers for population and phylogenetic studies. DNA Res. 2011, dsr002.

30. He, S.; Wang, Y.; Volis, S.; Li, D.; Yi, T. Genetic diversity and population structure: implications for conservation of wild soybean (Glycine soja Sieb. et Zucc) based on nuclear and chloroplast microsatellite variation. Int J Mol Sci. 2012, 13, 12608-12628.

31. Leseberg, C. H.; Duvall, M. R. The complete chloroplast genome of Coix lacryma-jobi and a comparative molecular evolutionary analysis of plastomes in cereals. J Mol Evol. 2009, 69, 311-318.

32. Grover, C. E.; Yu, Y.; Wing, R. A.; Paterson, A. H.; Wendel, J. F. A phylogenetic analysis of indel dynamics in the cotton genus. Mol Biol Evol. 2008, 25, 1415-1428.

33. Britten, R. J.; Rowen, L.; Williams, J.; Cameron, R. A. Majority of divergence between closely related DNA samples is due to indels. Proc Natl Acad Sci. 2003, 100, 4661-4665.


34. Nguyen, B.; Kim, K.; Kim, Y. C.; Lee, S. C.; Shin, J. E.; Lee, J.; Kim, N. H.; Jang, W.; Choi, H. I.; Yang, T. J. The complete chloroplast genome sequence of Panax vietnamensis Ha et Grushv (Araliaceae). Mitochondrial DNA. 2015, 1-2.

35. Zhu, S.; Fushimi, H.; Cai, S.; Chen, H.; Komatsu, K. A new variety of the genus Panax from southern Yunnan, China and its nucleotide sequences of 18S ribosomal RNA gene and matK gene. J Jap Bot. 2003, 78, 86-94.

36. Zuo, Y.; Chen, Z.; Kondo, K.; Funamoto, T.; Wen, J.; Zhou, S. DNA barcoding of Panax species. Planta Med. 2011, 77, 182.

37. Li, X.; Yang, Y.; Henry, R. J.; Rossetto, M.; Wang, Y.; Chen, S. Plant DNA barcoding: from gene to genome. Biol Rev. 2015, 90, 157-166.

38. Mine, Y.; Young, D. Regulation of natural health products in Canada. Food Sci Technol Res. 2009, 15, 459-468.

39. Wallace, L. J.; Boilard, S. M.; Eagle, S. H.; Spall, J. L.; Shokralla, S.; Hajibabaei, M. DNA barcodes for everyday life: Routine authentication of Natural Health Products. Food Res Int. 2012, 49, 446-452.

40. Li, D. Z.; Gao, L. M.; Li, H. T.; Wang, H.; Ge, X. J.; Liu, J. Q.; Chen, Z. D.; Zhou, S. L.; Chen, S. L.; Yang, J. B. Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. Proc Natl Acad Sci. 2011, 108, 19641-19646.

41. Chen, X.; Liao, B.; Song, J.; Pang, X.; Han, J.; Chen, S. A fast SNP identification and analysis of intraspecific variation in the medicinal Panax species based on DNA barcoding. Gene. 2013, 530, 39-43.

42. Massouh, A.; Schubert, J.; Yaneva-Roder, L.; Ulbricht-Jones, E. S.; Zupok, A.; Johnson, M. T.; Wright, S.; Pellizzer, T.; Sobanski, J.; Bock, R. Spontaneous chloroplast mutants mostly occur by replication slippage and show a biased pattern in the plastome of Oenothera. The Plant Cell. 2016, 28, 911-929.

43. Park, S.; Ruhlman, T. A.; Sabir, J. S.; Mutwakil, M. H.; Baeshen, M. N.; Sabir, M. J.; Baeshen, N. A.; Jansen, R. K. Complete sequences of organelle genomes from the medicinal plant Rhazya stricta (Apocynaceae) and contrasting patterns of mitochondrial genome evolution across asterids. BMC genomics. 2014, 15, 405.

44. Goremykin, V. V.; Salamini, F.; Velasco, R.; Viola, R. Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer. Mol Biol Evol. 2009, 26, 99-110.

45. Matsuo, M.; Ito, Y.; Yamauchi, R.; Obokata, J. The rice nuclear genome continuously integrates, shuffles, and eliminates the chloroplast genome to cause chloroplast–nuclear DNA flux. The Plant Cell. 2005, 17, 665-675.


FIGURE CAPTIONS

Fig. 1. Complete chloroplast genomes of five Panax species.
Colored boxes show conserved chloroplast genes, classified based on product function. Genes shown inside the circle are transcribed clockwise, and those outside the circle are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. Dashed area in the inner circle indicates the GC content of the chloroplast genome; blue triangles indicate the positions of 14 markers (gcpm1–gcpm14) on the chloroplast genomes.

Fig. 2. Repeat structure analysis of the five Panax species chloroplast genomes.
(A) Histogram showing the frequency of repeats by length in the five Panax chloroplast genomes.
(B) Histogram showing the number of four repeat types in each Panax chloroplast genome.
(C) Location of the 235 repeats on the five Panax species chloroplast genomes.
(D) Venn diagram showing the repeats shared among the five Panax species.

Fig. 3. Comparison of chloroplast genome sequences of five Panax species.
Pair-wise comparison of chloroplast genomes between Panax species using the mVISTA program with P. ginseng cv. CP as the reference. Genome regions are color-coded as protein coding (purple), rRNA or tRNA coding genes (blue), and noncoding sequences (pink).

Fig. 4. Validation of 14 molecular markers derived from InDel regions of five Panax chloroplast genomes.
Schematic diagrams indicate InDel regions between Panax species. Tandem repeats and inserted sequences are designated by pentagons and diamonds, respectively. Dotted and solid lines indicate deleted and conserved sequences; left and right black arrows indicate forward and reverse primers, respectively. The 14 InDel markers are denoted gcpm1 to gcpm14. Different alleles are shown via capillary electrophoresis (A–F, I, N) and agarose gel electrophoresis (G, H, K–M, O). Abbreviated species names shown on schematic diagrams and amplicons: PgCP, P. ginseng cv. CP; PgYP, P. ginseng cv. YP; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis; M, 100-bp DNA ladder.


Table 1. SSRs in the five Panax chloroplast genomes.

                                   Number of SSRs
Repeat units      Motif            PgCP   Pq   Pn   Pj   Pv
Mononucleotide    A/T                14   16   19   17   17
                  C/G                 4    3    1    7    3
Dinucleotide      AT/AT               6    7    6    7    6
Trinucleotide     AAG/CTT             1    1    1    1    1
                  AAT/ATT             1    1    3    2    2
                  AGC/CTG             1    1    -    -    -
Tetranucleotide   AAAG/CTTT           3    3    3    3    3
                  AAAT/ATTT           2    2    2    2    4
                  AATT/AATT           1    1    1    1    1
                  ACCT/AGGT           2    2    2    2    2
Pentanucleotide   AATCT/AGATT         2    2    2    2    2
                  AAAAT/ATTTT         -    -    1    1    -
Hexanucleotide    AGATAT/ATATCT       -    -    -    -    1
                  ACTATG/AGTCAT       1    1    1    -    -
Total SSRs                           38   40   42   45   42

Abbreviations: PgCP, P. ginseng cv. CP; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis; SSR, simple sequence repeat.
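As a quick consistency check on Table 1, the per-motif counts can be summed by species and by repeat-unit length. The following is a minimal sketch, not part of the original analysis: the names `SSR_COUNTS`, `column_totals`, and `totals_by_unit_length` are hypothetical, and a dash in the table is entered as 0.

```python
# SSR counts per motif, transcribed from Table 1 ("-" entered as 0).
# Column order: PgCP, Pq, Pn, Pj, Pv.
SPECIES = ("PgCP", "Pq", "Pn", "Pj", "Pv")
SSR_COUNTS = {
    "A/T": (14, 16, 19, 17, 17),
    "C/G": (4, 3, 1, 7, 3),
    "AT/AT": (6, 7, 6, 7, 6),
    "AAG/CTT": (1, 1, 1, 1, 1),
    "AAT/ATT": (1, 1, 3, 2, 2),
    "AGC/CTG": (1, 1, 0, 0, 0),
    "AAAG/CTTT": (3, 3, 3, 3, 3),
    "AAAT/ATTT": (2, 2, 2, 2, 4),
    "AATT/AATT": (1, 1, 1, 1, 1),
    "ACCT/AGGT": (2, 2, 2, 2, 2),
    "AATCT/AGATT": (2, 2, 2, 2, 2),
    "AAAAT/ATTTT": (0, 0, 1, 1, 0),
    "AGATAT/ATATCT": (0, 0, 0, 0, 1),
    "ACTATG/AGTCAT": (1, 1, 1, 0, 0),
}
# "Total SSRs" row as printed in Table 1.
REPORTED_TOTALS = (38, 40, 42, 45, 42)


def column_totals(counts):
    """Total number of SSRs per species (column sums)."""
    return tuple(sum(col) for col in zip(*counts.values()))


def totals_by_unit_length(counts):
    """Sum counts per repeat-unit length (1 = mono, ..., 6 = hexanucleotide)."""
    out = {}
    for motif, row in counts.items():
        unit_len = len(motif.split("/")[0])
        prev = out.get(unit_len, (0,) * len(row))
        out[unit_len] = tuple(p + r for p, r in zip(prev, row))
    return out


if __name__ == "__main__":
    print(dict(zip(SPECIES, column_totals(SSR_COUNTS))))
    print(totals_by_unit_length(SSR_COUNTS))
```

The column sums reproduce the "Total SSRs" row, and the per-length totals make it easy to see that mononucleotide repeats dominate in every species.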


Table 2. Numbers and ratios of nucleotide substitutions and InDels among chloroplast genomes of five Panax species.

        PgCP         PgYP         Pq           Pn           Pj          Pv
PgCP    /            1            141          480          511         559
PgYP    3 (0.33)     /            142          481          512         560
Pq      34 (4.15)    35 (4.06)    /            509          508         542
Pn      101 (4.75)   98 (4.91)    110 (4.63)   /            484         548
Pj      92 (5.55)    93 (5.51)    90 (5.64)    111 (4.36)   /           331
Pv      102 (5.48)   103 (5.44)   104 (5.21)   124 (4.42)   74 (4.47)   /

Note: The upper triangle shows the total number of nucleotide substitutions, and the lower triangle shows the number of InDels. Ratios of nucleotide substitutions to InDels (S/I) are given in parentheses. Abbreviations: PgCP, Panax ginseng cv. CP; PgYP, P. ginseng cv. YP; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis.
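The S/I ratios in Table 2 follow directly from the two triangles: each ratio is the substitution count for a pair divided by the InDel count for the same pair. The sketch below recomputes them from the tabulated counts; the names `PAIRS` and `si_ratio` are hypothetical, and rounding to two decimals is assumed to match the table.

```python
# (sample A, sample B, substitutions, InDels, S/I ratio as printed in Table 2)
PAIRS = [
    ("PgCP", "PgYP", 1, 3, 0.33),
    ("PgCP", "Pq", 141, 34, 4.15),
    ("PgYP", "Pq", 142, 35, 4.06),
    ("PgCP", "Pn", 480, 101, 4.75),
    ("PgYP", "Pn", 481, 98, 4.91),
    ("Pq", "Pn", 509, 110, 4.63),
    ("PgCP", "Pj", 511, 92, 5.55),
    ("PgYP", "Pj", 512, 93, 5.51),
    ("Pq", "Pj", 508, 90, 5.64),
    ("Pn", "Pj", 484, 111, 4.36),
    ("PgCP", "Pv", 559, 102, 5.48),
    ("PgYP", "Pv", 560, 103, 5.44),
    ("Pq", "Pv", 542, 104, 5.21),
    ("Pn", "Pv", 548, 124, 4.42),
    ("Pj", "Pv", 331, 74, 4.47),
]


def si_ratio(substitutions, indels):
    """Ratio of nucleotide substitutions to InDels, rounded to two decimals."""
    return round(substitutions / indels, 2)


if __name__ == "__main__":
    for a, b, subs, indels, reported in PAIRS:
        computed = si_ratio(subs, indels)
        flag = "ok" if computed == reported else "MISMATCH"
        print(f"{a}-{b}: {subs}/{indels} = {computed} (reported {reported}) {flag}")
```

For example, PgCP versus Pq gives 141/34, which rounds to the tabulated 4.15; the within-species comparison (PgCP versus PgYP) is the only pair in which InDels outnumber substitutions.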


Table 3. Molecular markers developed to authenticate Panax species.

                                                   Melting        Expected PCR product size (bp)
Marker   Forward (F)/reverse (R) primers     Note  temp. (°C)     PgCP  PgYP  Pq    Pn    Pj    Pv
gcpm1    F: TGGGGGAATAAATAAGGAAGAA           a     54.7           392   392   393   391   416   430
         R: TTAAAGAAGGCGGAGGTTTTT            a     54.0
gcpm2    F: TGGAAAGGCTGTTGTCACTG             b     53.2           194   194   161   160   161   161
         R: TGCCAAGATTGCAGAAAGATT            b     53.8
gcpm3    F: TGGTCAATCGCTGAAGAAAA             a     53.2           449   449   449   474   449   449
         R: CCGCGGCTTATATAGGTGAA             a     57.3
gcpm4    F: CTCTTCCAAATTGATGTTCCAA           a     56.6           295   295   318   295   295   295
         R: TCCATGATACACCAGAACAATCA          a     59.3
gcpm5    F: TCCTGAACCACTAGACGATGG            a     53.3           457   457   457   465   469   444
         R: TTTCGATAACTTCTTGATCCCTCT         a     54.3
gcpm6    F: CTCGCACTAAGCTCGGAAAT             a     57.3           475   475   475   475   561   477
         R: ACCATGGCGTTACTCTACCG             a     53.9
gcpm7    F: TGGTGTGTTGAATCCACAAT             a     53.2           297   297   289   322   322   322
         R: CCCCCTTTCCAATAATATCCA            a     55.9
gcpm8    F: TCCAAAGAGGAATCCTTCCA             a     53.3           237   237   237   313   237   237
         R: CCAGCCTCTACTGGGGGTTA             a     55.3
gcpm9    F: TCCAGGACTTCGAAAGGGTA             a     53.4           492   492   492   492   486   511
         R: ACACGATACCAAGGCAAACC             a     53.7
gcpm10   F: CCGCTGTTATCCGCTACATT             a     54.1           227   227   228   257   225   228
         R: TCGTCTAAAATGCCTATACGAACTC        a     54.9
gcpm11   F: CTGGACGATCCTATTTGGTCA            b     53.7           220   220   220   238   220   202
         R: TCCAGCTCCGTATGAAGGTC             b     53.8
gcpm12   F: GCAGAATACCGTCACCCATT             b     53.6           316   373   259   373   259   202
         R: ATTTTGTCCGGATCCTCCTT             b     53.8
gcpm13   F: AAGATTTTTATGAAGATACCGAAAAT       a     52.5           484   484   466   461   456   465
         R: CGGTCAAATTCGAGGAAAGA             a     55.3
gcpm14   F: TGGTTAGTTTCACCGGATTCA            b     54.2           208   208   208   208   208   178
         R: TTTTTGAGCCCATTTTTAAGGA           b     54.6

Abbreviations: PCR, polymerase chain reaction; PgCP, Panax ginseng cv. CP; PgYP, P. ginseng cv. YP; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis.
Note: a, primers developed in our previous study (17); b, primers developed in this study.
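The expected product sizes in Table 3 can serve as a lookup when reading a gel or capillary trace: an observed amplicon size narrows down which samples are compatible with it. The sketch below is illustrative only. The function name `candidates` and the default 2-bp tolerance are assumptions, not values from the study, and a real tolerance would depend on the resolution of the electrophoresis system.

```python
# Expected PCR product sizes (bp) from Table 3.
# Column order: PgCP, PgYP, Pq, Pn, Pj, Pv.
SAMPLES = ("PgCP", "PgYP", "Pq", "Pn", "Pj", "Pv")
EXPECTED_SIZES = {
    "gcpm1": (392, 392, 393, 391, 416, 430),
    "gcpm2": (194, 194, 161, 160, 161, 161),
    "gcpm3": (449, 449, 449, 474, 449, 449),
    "gcpm4": (295, 295, 318, 295, 295, 295),
    "gcpm5": (457, 457, 457, 465, 469, 444),
    "gcpm6": (475, 475, 475, 475, 561, 477),
    "gcpm7": (297, 297, 289, 322, 322, 322),
    "gcpm8": (237, 237, 237, 313, 237, 237),
    "gcpm9": (492, 492, 492, 492, 486, 511),
    "gcpm10": (227, 227, 228, 257, 225, 228),
    "gcpm11": (220, 220, 220, 238, 220, 202),
    "gcpm12": (316, 373, 259, 373, 259, 202),
    "gcpm13": (484, 484, 466, 461, 456, 465),
    "gcpm14": (208, 208, 208, 208, 208, 178),
}


def candidates(marker, observed_bp, tolerance_bp=2):
    """Samples whose expected size for `marker` lies within the tolerance.

    tolerance_bp is a hypothetical sizing error, not a value from the paper.
    """
    sizes = EXPECTED_SIZES[marker]
    return [
        sample
        for sample, size in zip(SAMPLES, sizes)
        if abs(size - observed_bp) <= tolerance_bp
    ]


if __name__ == "__main__":
    print(candidates("gcpm14", 178))  # large deletion allele
    print(candidates("gcpm1", 392))   # several samples differ by only 1 bp
```

With this tolerance, a 178-bp gcpm14 product is compatible only with Pv, whereas a 392-bp gcpm1 product cannot separate PgCP, PgYP, Pq, and Pn, which differ by a single base pair at that marker.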


Table 4. Marker combinations for each Panax species.

         Positions                      Allele types for each species
Marker   Regions           CDS/IGS     PgCP  PgYP  Pq    Pn    Pj    Pv
gcpm1    trnK-UUU–rps16    IGS         C     C     C     D     B     A
gcpm2    rps16–trnQ-UUG    IGS         A     A     B     B     B     B
gcpm3    atpH–atpI         IGS         B     B     B     A     B     B
gcpm4    atpH–atpI         IGS         B     B     A     B     B     B
gcpm5    trnE–trnT         IGS         B     B     B     B     A     C
gcpm6    trnE–trnT         IGS         B     B     B     B     A     B
gcpm7    rbcL–accD         IGS         B     B     C     A     A     A
gcpm8    petA–psbJ         IGS         B     B     B     A     B     B
gcpm9    clpP–psbB         IGS         B     B     B     B     B     A
gcpm10   rpl14–rpl16       IGS         B     B     B     A     B     B
gcpm11   ycf2              CDS         B     B     B     A     B     C
gcpm12   ycf1              CDS         B     A     C     AD    C     D
gcpm13   ndhF–rpl32        IGS         A     A     B     C     D     C
gcpm14   ycf1              CDS         A     A     A     A     A     B

Abbreviations: CDS, coding sequences; IGS, intergenic spacers; PgCP, P. ginseng cv. CP; PgYP, P. ginseng cv. YP; Pq, P. quinquefolius; Pn, P. notoginseng; Pj, P. japonicus; Pv, P. vietnamensis. A, B, C, and D refer to different allele types.
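One way to read Table 4 is to ask, for each species, which single markers carry an allele type found in no other species. The sketch below does this under stated assumptions: the two P. ginseng cultivars are pooled as one species ("Pg"), the gcpm12 entry "AD" is read as "allele A or D", and the names `allele_sets` and `diagnostic_markers` are hypothetical. It is an illustration of how the table can be queried, not the authors' procedure.

```python
# Allele types from Table 4. Column order: PgCP, PgYP, Pq, Pn, Pj, Pv.
SAMPLES = ("PgCP", "PgYP", "Pq", "Pn", "Pj", "Pv")
# Assumption: both cultivars are pooled under the species label "Pg".
SPECIES_OF = {"PgCP": "Pg", "PgYP": "Pg", "Pq": "Pq",
              "Pn": "Pn", "Pj": "Pj", "Pv": "Pv"}
ALLELES = {
    "gcpm1": "C C C D B A",
    "gcpm2": "A A B B B B",
    "gcpm3": "B B B A B B",
    "gcpm4": "B B A B B B",
    "gcpm5": "B B B B A C",
    "gcpm6": "B B B B A B",
    "gcpm7": "B B C A A A",
    "gcpm8": "B B B A B B",
    "gcpm9": "B B B B B A",
    "gcpm10": "B B B A B B",
    "gcpm11": "B B B A B C",
    "gcpm12": "B A C AD C D",  # "AD" is treated as {A, D} (assumption)
    "gcpm13": "A A B C D C",
    "gcpm14": "A A A A A B",
}


def allele_sets(marker):
    """Map each species to the set of allele letters seen in its samples."""
    sets = {}
    for sample, entry in zip(SAMPLES, ALLELES[marker].split()):
        sets.setdefault(SPECIES_OF[sample], set()).update(entry)
    return sets


def diagnostic_markers():
    """For each species, list markers whose alleles occur in no other species."""
    result = {species: [] for species in dict.fromkeys(SPECIES_OF.values())}
    for marker in ALLELES:
        sets = allele_sets(marker)
        for species, own in sets.items():
            others = set().union(*(s for o, s in sets.items() if o != species))
            if own.isdisjoint(others):
                result[species].append(marker)
    return result


if __name__ == "__main__":
    for species, markers in diagnostic_markers().items():
        print(species, markers)
```

Under these assumptions every species has at least one single-marker diagnostic allele, for example gcpm2 for P. ginseng and gcpm4 for P. quinquefolius, while gcpm12 is the only marker that separates the two P. ginseng cultivars from each other.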

Figure 1

Figure 2

Figure 3

Figure 4

Table of Contents Graphic