Omics Technologies Applied to Agriculture and Food
1H-NMR-Spectroscopy for Determination of the Geographical Origin of Hazelnuts
René Bachmann, Sven Klockmann, Johanna Härdter, Markus Fischer, and Thomas Hackl J. Agric. Food Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jafc.8b03724 • Publication Date (Web): 12 Oct 2018 Downloaded from http://pubs.acs.org on October 14, 2018
Journal of Agricultural and Food Chemistry
1H-NMR-Spectroscopy for Determination of the Geographical Origin of Hazelnuts
René Bachmann1, Sven Klockmann2, Johanna Haerdter1, Markus Fischer2 and Thomas Hackl1,2*
1 Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany
2 HAMBURG SCHOOL OF FOOD SCIENCE - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
*Corresponding author: Tel.: +49-40-428382804; E-Mail: [email protected]
ABSTRACT
A total of 262 authentic samples were analyzed by 1H-NMR spectroscopy for the geographical discrimination of hazelnuts (Corylus avellana L.), covering samples from five countries (Germany, France, Georgia, Italy and Turkey) and the harvest years 2013-2016. This publication describes the method development, starting with an extraction protocol suitable for separating polar and non-polar metabolites while reducing macromolecular components. Principal component analysis (PCA) was applied to the polar fraction and used to monitor sample preparation and measurement. Several machine learning algorithms were tested to build a classification model. The best results were obtained by linear discriminant analysis combined with a random subspace algorithm. Dividing the samples into a training set and a test set yielded a cross-validation accuracy of 91% for the training set and an accuracy of 96% for the test set. Key features were identified by Kruskal-Wallis test and t-test. A feature assigned to betaine differs significantly for all five countries and is considered a possible candidate for the development of targeted approaches. Further, the results were compared to a previously published study based on LC-MS analysis of non-polar metabolites. In summary, this study shows the robustness and high accuracy of a discrimination model based on NMR analysis of polar metabolites.
KEYWORDS
Metabolomics, 1H-NMR, Hazelnut, Corylus avellana, Geographical origin, Multivariate statistics
INTRODUCTION
Hazelnut (Corylus avellana) is an ancient crop that has been known to humankind since at least the Mesolithic.1 Even today, hazelnuts are an important commodity in the chocolate, confectionery and bakery industries in their shelled and roasted form. Hazelnut is the third most commonly grown nut after almond and walnut, with Turkey as the leading producer (64%), followed by Italy (13%), the United States, Georgia, Azerbaijan, Spain, France, Iran and China (less than 5%).2 Hazelnuts currently available on the market exhibit different qualities attributed to cultivation conditions in their countries of origin. Hence, determination of geographical origin is relevant for consumers and processing industries alike. In 2014 the highest selling price was obtained for Italian hazelnuts at 5,207 USD/t, while Turkish hazelnuts fetched an 18% lower price, followed by the USA (24%), Georgia (31%) and Azerbaijan (49%).2 This price discrepancy and the willingness of consumers to consume more regional food are also reflected in the increasing number of products registered under the EU schemes Protected Designation of Origin (PDO) or Protected Geographical Indication (PGI), as well as other protected designations for products made and sold outside of the EU.3-5 There are currently three PGIs (two for Italy, one for France) and two PDOs (Italy and Spain) registered. Another Spanish PDO has been applied for.6 More than twenty-five economically important varieties of C. avellana exist, each with its own characteristics and geographical distribution according to climatic environment, geographical characteristics and manufacturing conditions.7 Relevant varieties include Tonda Gentile Trilobata (Piedmont, Italy), Tombul (Giresun, Turkey), Ata Baba (Zaqatala, Azerbaijan) and Barcelona (Oregon, USA; France).7-9 Differentiation of the geographical origin or the variety may be based on morphological characteristics or targeted chemical analysis.10-13 Various analytical methods have been applied for non-targeted chemical profiling, including NMR, ICP-MS, NIR, GC-MS and LC-MS.8, 10, 14-19 In most cases the subject of these studies was either a small or restricted geographical region and/or a typical variety of a geographical region. For example, 1H NMR was applied in a metabolomics-based approach for the differentiation of Italian hazelnut varieties.20, 21 However, only the geographic origin is a market-relevant parameter, because varieties are usually not declared in final products. Approximately 90% of the worldwide annual yield of hazelnuts is used by processing industries, which depend on a steady supply of high-quality commodity. Supply shortfalls caused, e.g., by crop failure have a high impact on market prices, as occurred in 2014 in Turkey after an onset of winter weather in March. While most traditional cultivation countries are located around the 45th degree of latitude, other cultivation countries such as Chile, South Africa and Australia are becoming more and more important in response to changing climatic conditions and the demand for a stable annual yield.22, 23-25 These changes in the market situation may increase future demand for comprehensive protocols for the geographic authentication of hazelnuts in quality control that include the major crop countries.
The basic concept of a metabolomics-based approach for authenticity control of food is the assumption that raw materials with different location factors and cultivation conditions differ significantly and reproducibly in the concentration (quantity) or presence (quality) of certain metabolites and metabolite patterns.26 This work addresses for the first time the non-targeted NMR spectroscopic analysis of a large number of hazelnut samples from five different Eurasian countries. The data analysis generates a model that predicts the origin of samples from a test set with high accuracy.
MATERIALS AND METHODS
Reagents and chemicals
Deuterated chloroform (99.8%), methanol (99.8%) and deuterium oxide (99.9%) were purchased from Eurisotop (Saint-Aubin Cedex, France). Sodium azide (99.5%), potassium phosphate monobasic anhydrous (>99%) and potassium phosphate dibasic anhydrous (>98%) were purchased from Sigma Aldrich (Merck KGaA, Darmstadt, Germany).
Hazelnut samples
Overall, 262 authentic raw hazelnut samples of different varieties, origins and producers from the harvest years 2013 (5), 2014 (76), 2015 (105) and 2016 (76) were used for analysis. Figure S1 (supporting information, p S XVIII) gives a graphical overview of the sample distribution. The samples were harvested in the commercially relevant regions of each country and comprise 134 French (mainly Midi-Pyrénées and Aquitaine), 28 German (mainly Bavaria), 44 Italian (Piedmont, Campania and Lazio), 41 Turkish (Ordu, Akçakoca and Samsun) and 15 Georgian (Guria, Samegrelo and Imereti) samples. Due to the large sample set we depended on collaborators for sample acquisition, and most samples were provided by importers and distributors. The authenticity of all samples was declared by our collaborators. For detailed information about the origin, variety and supplier of all samples see the supporting information (p SII, table S1). Each sample comprised either 1000 g of hazelnut kernels with skin (testa) or 1500 g of unshelled hazelnuts.
Sample treatment
All hazelnut samples were handled in accordance with Klockmann et al.19 100 g of the grist was stored for one week in a freezer (-20 °C) to allow the dry ice to evaporate. Two protocols were developed for sample extraction. Each sample was extracted three times to limit the influence of outliers.
Extraction protocol A: For extraction of the polar metabolites, 500 mg of the resulting lyophilisate was mixed with 1.5 mL extraction solvent (chloroform-d/methanol-d4/200 mM deuterated phosphate buffer, 2/1/2) and two steel balls, followed by ball milling for 3 min at 3.1 m/s using a Bead Ruptor 24 equipped with a 1.5 mL microtube carriage kit (Biolabproducts, Bebensee, Germany). The resulting suspension was centrifuged for 15 min at 14,000 x g and 4 °C. 600 µL of the supernatant was transferred into a 5 mm NMR tube (Deutero, Kastellaun, Germany).
Extraction protocol B: The two-phase extraction was performed as in protocol A. After centrifugation, 200 µL of the supernatant was taken and 400 µL methanol-d4 was added. The mixture was vortexed for 2 seconds and centrifuged again for 2 min at 14,000 x g and 4 °C. 200 µL of the resulting supernatant was mixed with 500 µL potassium phosphate buffer (200 mM) and transferred into a 5 mm NMR tube (Deutero, Kastellaun, Germany).
NMR data acquisition
All spectra were acquired on a Bruker Avance III 400 MHz spectrometer (Bruker Biospin, Rheinstetten, Germany) operating at 400.13 MHz. The noesygppr1d pulse sequence was used for acquisition of water-suppressed 1H NMR spectra, applying the digitization mode baseopt. Each spectrum was recorded at 300 K with 64 scans, 65536 complex data points and a spectral width of 8417.5 Hz. The receiver gain (RG) was set to 64 and the transmitter frequency offset was set to 1924.6 Hz. For data processing the FIDs were Fourier transformed with a line broadening factor of 0.3 Hz, baseline corrected and phased with Topspin 3.2 (Bruker Biospin, Rheinstetten, Germany).
2D TOCSY spectra were acquired using the sequence dipsi2esgpph at 300 K, applying 32 scans, 256 (F1) and 2048 (F2) complex data points, a spectral width of 4085.0 Hz and States-TPPI as FnMODE. The RG was set to 256 and the transmitter frequency offset was set to 1881.8 Hz. The spectra were processed with a sine bell shift (SSB) of 2 in F1 and F2, applying 32 linear prediction coefficients for a complex linear forward prediction in F1. 2D JRES spectra were acquired with the jresgpprqf pulse sequence at 300 K, using 1 scan, 40 (F1) and 8192 (F2) complex data points and a spectral width of 6684.5 Hz. The RG was set to 114 and the transmitter frequency offset was set to 1882.0 Hz. The spectra were processed with a sine bell shift of 0 and magnitude calculation in F1, followed by the tilt and symj commands for spectrum symmetrization. A two-dimensional baseline correction was applied to all 2D spectra.
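The acquisition parameters above are given in Hz; converting them to ppm makes them easier to relate to the chemical shift regions discussed later. The following is a minimal sketch of that arithmetic, assuming only the 400.13 MHz operating frequency and the 1D values quoted in the text. The function and variable names are illustrative and not part of the original workflow.

```python
# Convert 1D acquisition parameters from Hz to ppm.
# Assumption: the 1H operating frequency of 400.13 MHz quoted in the text.
SPECTROMETER_MHZ = 400.13


def hz_to_ppm(value_hz: float, spectrometer_mhz: float = SPECTROMETER_MHZ) -> float:
    """Return a frequency span or offset in ppm (Hz divided by MHz)."""
    return value_hz / spectrometer_mhz


# Values quoted for the noesygppr1d experiment
spectral_width_ppm = hz_to_ppm(8417.5)
transmitter_offset_ppm = hz_to_ppm(1924.6)

print(f"spectral width:     {spectral_width_ppm:.2f} ppm")
print(f"transmitter offset: {transmitter_offset_ppm:.2f} ppm")
```

Under these assumptions the spectral width corresponds to roughly 21 ppm and the transmitter offset to about 4.8 ppm, which lies inside the 4.645-4.971 ppm region that is excluded from the bucket table because of water suppression.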
Statistical analysis
The spectra were transferred to AMIX 3.9.14 (Bruker Biospin, Rheinstetten, Germany) and calibrated to the TMSP signal. Various bucketing methods, including automated bucketing with variable bucket size and manual bucketing, were compared with each other. The residual solvent signal of methanol-d4 (3.303-3.338 ppm) and the region used for water suppression (4.645-4.971 ppm) were excluded from the bucket table. For automated bucketing the best results were obtained for a bucket size of 0.03 ppm. The best results regarding principal component analysis (PCA) were obtained for manually defined buckets with variable size. 222 buckets were manually defined as listed in the supporting information (p. SXIV, S2). Each spectrum (row) was scaled to its total intensity. This data was used to carry out a principal component analysis (PCA).132 Columns were scaled to unit variance and the number of principal components was chosen to reach a minimum explained variance of 95%. For each sample group a confidence level of 95% was defined. The analysis of significant buckets was carried out using a Kruskal-Wallis test and a confidence level of 95% (supporting information, p. SXIV, S3). The p-value was Bonferroni corrected. A classification analysis was set up using Classification Learner (Matlab R2015b, MathWorks, Inc.). The 262 samples were split into a training set consisting of 172 samples (Germany 19, France 88, Georgia 10, Italy 28 and Turkey 27) and a test set of 90 samples (Germany 9, France 46, Georgia 5, Italy 16 and Turkey 14). For statistical analysis the normalized bucket table was exported from AMIX to Matlab. Manual classifier training with fivefold cross validation was used to compare different classifiers, including decision trees, support vector machines and nearest neighbor classifiers (supporting information, p. SXVIII, table S4). All features were retained because restricting the features to buckets that were significant in the Kruskal-Wallis test did not improve the accuracy of the classifier. The best results were obtained for a subspace discriminant classifier with 14 subspace dimensions and 30 learners.
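The two scaling steps described above (row-wise scaling to total intensity, then column-wise scaling to unit variance before PCA) can be sketched as follows. This is only an illustration in plain Python, not the AMIX implementation; the small bucket table is hypothetical, and centering each column before scaling is an assumption that follows common PCA practice.

```python
from statistics import mean, stdev


def normalize_rows(table):
    """Scale each spectrum (row) so that its buckets sum to 1 (total intensity)."""
    result = []
    for row in table:
        total = sum(row)
        result.append([value / total for value in row])
    return result


def autoscale_columns(table):
    """Center each bucket (column) and scale it to unit variance."""
    columns = list(zip(*table))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # sample standard deviation
    return [
        [(value - m) / s for value, m, s in zip(row, means, sds)]
        for row in table
    ]


# Hypothetical bucket table: 4 spectra x 3 buckets (arbitrary integrals)
buckets = [
    [10.0, 30.0, 60.0],
    [20.0, 20.0, 60.0],
    [5.0, 15.0, 30.0],
    [8.0, 12.0, 20.0],
]
normalized = normalize_rows(buckets)
scaled = autoscale_columns(normalized)
```

Row normalization removes differences in overall signal intensity between extracts, while unit-variance scaling gives low-intensity buckets the same weight as intense ones in the subsequent PCA.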
RESULTS AND DISCUSSION
Dried hazelnuts contain about 60% lipids and less than 5% water. Nevertheless, we focused on the polar fraction because of the larger dispersion of its NMR signals. Previous work showed that 1H NMR spectra of non-polar hazelnut extracts contain less information than the spectra of polar extracts.20,28 Indeed, the non-polar extract of hazelnuts shows only small chemical variation among the most abundant metabolites (supporting information, p. SXIV, figure S2). The spectrum of the polar fraction of a French hazelnut sample is shown in figure 1. In total, 16 metabolites belonging to the classes of amino acids, carbohydrates and organic acids were identified and are annotated in the spectrum. The metabolites were identified by evaluation of chemical shifts and coupling constants, database searches (HMDB, BMRB) and acquisition of 2D TOCSY and JRES spectra. The identity of the metabolites was further tested by spike-in experiments with reference samples. Table 1 summarizes the signals that were used for assignment.
The first analyses were carried out with a standard two-phase extraction protocol using chloroform-d1/methanol-d4/deuterated phosphate buffer 2/1/2.29-31 Two-phase extraction was necessary to remove lipids effectively from the polar extract. After phase separation the polar extracts were transferred directly into NMR tubes and 1D NOESY spectra were acquired (extraction protocol A). In addition to the expected signals of small-molecule metabolites, these spectra showed very broad signals with their highest intensities from 4.7 to 0.8 ppm. These signals were assigned to non-precipitated proteins and exhibited a high sample-to-sample variance in intensity, so that reasonable integration of metabolite signals was not possible. Therefore, the extraction protocol was optimized with regard to quantitative protein precipitation. For protein precipitation, 400 µL methanol-d4 was added to 200 µL of the previously isolated polar fraction (methanol-d4/deuterated phosphate buffer, 1/2) from the two-phase extraction. The solution was vortexed and centrifuged, and 500 µL of deuterated phosphate buffer was added. 600 µL of the resulting solution was used for analysis (extraction protocol B). This additional protein precipitation step significantly reduced the protein content of the NMR samples and was selected for measurement of all samples (figure 2).
Statistical Analysis
Each hazelnut sample was extracted three times and each extract was measured by 1H NMR using the 1D NOESY sequence for water suppression. PCA was used to visualize the data variance and to identify outliers quickly and effectively. Single outliers of the triplicate measurements were removed. The result of the PCA is shown in Figure 3 as a 3D plot of PC1 vs. PC2 vs. PC3 (3A).19, 27 The first ten principal components account for 80% of the total variance. The PCA shows clustering for all sample groups, as well as a separation of German hazelnuts from Georgian, Italian and Turkish ones. However, the variance within a single sample group is larger than the distance between sample groups. In particular, the overlap between French and Italian samples is high. Hence, there is no clear-cut separation in the PCA, as indicated by the overlapping confidence intervals (α = 0.05).
Further statistical methods, based on supervised machine learning algorithms, were evaluated for an efficient classification of the experimental data. The normalized bucket table was exported from AMIX to Matlab. Using the Classification Learner App, different classifiers were compared, including decision trees, support vector machines and nearest neighbor classifiers, but the best results were achieved with the subspace discriminant classifier using a random subspace algorithm (Table S4, supporting information). The subspace discriminant classifier trains each learner on a randomly selected subset of features using the base algorithm (here linear discriminant analysis, LDA) and then combines the results of the individual models. Two thirds of all samples (172 in total) were used for training the classifier, using all 222 buckets in the calculation. This classifier was applied to the training set and validated by fivefold cross validation. The resulting confusion matrix is shown in figure 4. The cross validation yielded an accuracy of 91%, indicating a strong and robust model. This is also supported by the high accuracy (> 80%) achieved by other classification algorithms, in particular decision tree and support vector machine (supporting information, p. SXVIII, S4). The best result in terms of country classification was obtained for Georgian samples, which showed no misclassification. The samples from Turkey and France show comparable results, with true positive rates (TPR) of 96% and 94%, respectively. The lowest accuracy (78%) was obtained for the assignment of Italian samples. The highest false positive rate (FPR) occurs for Italian samples that were misclassified as Georgian and for German samples that were misclassified as French. Because the overall performance of the classifier in internal cross validation was promising, a test set was built from the remaining third of all samples (90 in total) for external validation. 96% of the samples from the test set were predicted correctly. The corresponding confusion matrix is shown in figure 5. Most misclassifications were again obtained for Italian samples, two of which were classified as French. Again, no misclassification was obtained for Georgian samples, and the overall assignment accuracy is comparable to the accuracy obtained from the cross validation. The robust accuracies for the test and training sets are an excellent basis for the distinction of hazelnuts in terms of food authentication.
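The random subspace idea described above can be illustrated with a short sketch: each learner is an LDA model trained on a random subset of features, and the ensemble combines the individual models. The code below is a minimal NumPy illustration under stated assumptions and is not the Matlab Classification Learner implementation. The defaults of 30 learners and 14 subspace dimensions follow the text; combining the learners by averaging their posterior probabilities, the small ridge term, all function names and the synthetic data are assumptions made for this example.

```python
import numpy as np


def fit_lda(X, y):
    """Fit linear discriminant analysis with a pooled covariance matrix."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (len(X) - len(classes))
    cov += 1e-6 * np.eye(X.shape[1])  # small ridge for stability (assumption)
    weights = np.linalg.solve(cov, means.T).T  # one weight vector per class
    bias = -0.5 * np.sum(weights * means, axis=1) + np.log(priors)
    return classes, weights, bias


def lda_posterior(model, X):
    """Class posterior probabilities from the linear discriminant scores."""
    _, weights, bias = model
    scores = X @ weights.T + bias
    scores -= scores.max(axis=1, keepdims=True)  # avoid overflow in exp
    probs = np.exp(scores)
    return probs / probs.sum(axis=1, keepdims=True)


def fit_random_subspace(X, y, n_learners=30, subspace_dim=14, seed=0):
    """Train one LDA model per randomly drawn feature subset."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_learners):
        features = rng.choice(X.shape[1], size=subspace_dim, replace=False)
        ensemble.append((features, fit_lda(X[:, features], y)))
    return ensemble


def predict_random_subspace(ensemble, X):
    """Average the posteriors of all learners and pick the most likely class."""
    classes = ensemble[0][1][0]
    total = np.zeros((len(X), len(classes)))
    for features, model in ensemble:
        total += lda_posterior(model, X[:, features])
    return classes[np.argmax(total, axis=1)]


# Synthetic stand-in for a bucket table: 150 samples, 40 features, 3 classes
rng = np.random.default_rng(1)
y = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 40))
X[y == 1, :20] += 2.0  # class 1 differs in the first 20 features
X[y == 2, 20:] += 2.0  # class 2 differs in the last 20 features

ensemble = fit_random_subspace(X, y)
accuracy = np.mean(predict_random_subspace(ensemble, X) == y)
```

The accuracy computed here is a training accuracy on synthetic data and only demonstrates that the sketch runs; it says nothing about the hazelnut model, which was assessed by cross validation and an external test set.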
Further validation of the NMR classification model is possible by comparison with a previously published LC-MS study on a subset of the sample set presented here (196/262) from the same five countries. The LC-MS analysis achieved an accuracy of 100% for the training set and 80% for the prediction set, applying a support vector machine/SIMCA classifier in a non-targeted approach.19 The non-targeted MS analysis was further developed into a targeted LC-QqQ-MS/MS method that yielded an accuracy above 98%. In the LC-MS study not only was a different analytical technique applied, but the non-polar fraction was investigated. Compounds identified there include di- and triacylglycerols, phosphatidylcholines and phosphatidylethanolamines. Remarkably, both approaches, NMR on polar and MS on non-polar metabolites, exhibit a similar clustering with a comparable sample distribution along the principal components (figure 3A and B). The internal variance of the clusters from the NMR measurements is slightly higher than in the MS analysis. The largest difference in the data is found for the separation of German samples from Georgian, Italian and Turkish samples. For both data sets this separation is along PC1. On the other hand, the best separation between the Georgian, Italian and Turkish samples is along PC2 and PC3 in both studies. The data exhibit a remarkable comparability although two different sets of metabolites, polar and non-polar, were analyzed. The internal variance in metabolite levels is similar for both fractions (polarities) of natural products. The variance within neither the polar nor the non-polar metabolites is sufficient for unequivocal sample assignment by PCA.
Metabolites are the end products of cellular regulatory processes, and their concentrations can be regarded as the ultimate response of biological systems to genetic or environmental changes. The metabolome is influenced both by genomic differences attributed to varieties and by exogenous factors attributed to local growth conditions. Since identical varieties are only cultivated in limited regions, it is not possible to conclude which of these factors has the largest impact on the individual composition of metabolites. However, in this study heterogeneous sample groups composed of many varieties show similarities that allow the classification of samples according to their geographical origin with a minimal influence of varieties in this model.
Relevant Metabolites
The identification of metabolites permits the evaluation of key metabolites and of the metabolic pathways involved, and/or the development of a strategy for targeted analysis. Buckets that are significant for each sample group have to be identified, followed by a spectroscopic analysis for structure determination and identification of the corresponding metabolites. In a first step the significance of each bucket was analyzed by Kruskal-Wallis test, comparing the median of each bucket for a single sample group with the median of the remaining samples. A bucket is considered significant if its p-value is below the Bonferroni-corrected threshold of 0.000225. The Bonferroni correction is used to avoid the problem of multiple comparisons: the significance level (p < 0.05) is divided by the number of tests (here 222 buckets). The full bucket list with the corresponding p-values is shown in the supporting information (p. XVIII, S3). Overall, 196 of 222 buckets exhibit a significant p-value for at least one sample group. No improvement in the classifier model was achieved when the remaining 26 non-relevant buckets were removed from the model. The variability in the concentration of the individual metabolites is large, and some signals, such as those of lactic acid, do not appear in every spectrum. Metabolites were only classified as significant for a sample group/country if all their signals show a significant p-value. Betaine, in contrast to the other metabolites identified so far, exhibits significant differences in medians with relevant p-values for all sample groups. Thus, betaine could be considered a candidate for the development of rapid tests based on targeted approaches. In the classification of Turkish samples, nearly all small organic acids, e.g. acetate, malate, citrate or fumarate, exhibited a significant p-value in the Kruskal-Wallis test. However, there is no clear correlation in their concentrations. While malate has a rather low concentration, the levels of citrate, formate and fumarate are higher in Turkish samples than in samples from other countries (except for Georgia). Boxplots and spider charts for selected metabolites are shown in the accompanying figure. Most of the relevant metabolites not identified so far are minor components whose signals often overlap with signals from major components. More effort will be necessary for a reliable identification of these in future studies.
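The significance test described above can be sketched in a few lines: the Bonferroni threshold is the significance level divided by the number of buckets, and each bucket is tested by comparing one sample group against the remaining samples. The code below is an illustrative plain-Python version, not the software used in the study. The two-group form of the Kruskal-Wallis test (one degree of freedom, so the chi-square tail probability reduces to a complementary error function) matches the one-versus-rest comparison in the text; the function names and example values are hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def kruskal_wallis_two_groups(group_a, group_b):
    """Kruskal-Wallis H and p-value for two groups, with tie correction.

    Assumes that not all pooled values are identical.
    """
    pooled = sorted(
        (value, g) for g, grp in enumerate((group_a, group_b)) for value in grp
    )
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    rank_sums = [0.0, 0.0]
    for (_, g), rank in zip(pooled, ranks):
        rank_sums[g] += rank
    sizes = (len(group_a), len(group_b))
    h = 12 / (n * (n + 1)) * sum(r**2 / m for r, m in zip(rank_sums, sizes))
    h -= 3 * (n + 1)
    h /= 1 - tie_term / (n**3 - n)
    h = max(h, 0.0)  # guard against tiny negative rounding errors
    p = math.erfc(math.sqrt(h / 2))  # chi-square survival function, df = 1
    return h, p


threshold = bonferroni_threshold(0.05, 222)

# Hypothetical normalized intensities of one bucket: one country vs. the rest
one_country = [1.0, 2.0, 3.0, 4.0, 5.0]
all_others = [6.0, 7.0, 8.0, 9.0, 10.0]
h_stat, p_value = kruskal_wallis_two_groups(one_country, all_others)
is_significant = p_value < threshold
```

In this toy example the two groups are completely separated, yet with only five values per group the p-value stays above the corrected threshold, which shows how strict the Bonferroni criterion is when 222 buckets are tested.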
Our work demonstrates that the geographical origin of hazelnuts can be determined by NMR with an accuracy comparable to that of LC-MS methodology, although it focuses on a different part of the metabolome. Compared to the results of preceding studies, this study exhibits an equal accuracy for the determination of the geographical origin, indicating a very robust and applicable model.19, 27 The combination of two different analytical techniques and matrices (polar and non-polar) leads to the identification of more relevant metabolites for an even more accurate distinction. It becomes apparent that the developed model reliably recognizes a large number of important countries of origin with a minimized influence of their varieties and is suitable for determining the geographical origin.
ABBREVIATIONS USED
DE, Germany; EU, European Union; FID, free induction decay; FNR, false negative rate; FP, false positive; FR, France; GE, Georgia; IT, Italy; JRES, J-resolved; NMR, nuclear magnetic resonance; LC-MS, liquid chromatography/mass spectrometry; PC, principal component; PCA, principal component analysis; PDO, protected designation of origin; PGI, protected geographical indication; RG, receiver gain; TOCSY, total correlation spectroscopy; TPR, true positive rate; TR, Turkey
ACKNOWLEDGEMENT
The authors are very grateful to SCA Unicoque, Erzeugerorganisation Deutscher Haselnussanbauer UG, Schlüter&Maack GmbH, Amt für Ernährung, Landwirtschaft und Forsten Fürth/Sortenversuchsanstalt Gonnersdorf, Stelma SRL Unipersonale, AgroTeamConsulting, Institute of Biotechnology and Microbiology, University of Hamburg, August Storck KG, Seeberger GmbH, Crisol de Frutos Secos, Azienda Agricola Cascina Valcrosa, Basaran Entegre Gıda san. ve Tic. A.Ş, Alta Langa Azienda Agricola, Corilu Societa Cooperativa Agricola, Coselva SCCL, Eganut LLC and Franken Genuss UG & Co.KG, Ferrero OHG mbH, Heinrich Brüning GmbH, August Töpfer & Co. (GmbH & Co.) KG, Lübecker Marzipan-Fabrik v. Minden & Bruhns GmbH & Co. KG, Carl Wilhelm Clasen GmbH, Horst Walberg Trockenfrucht Import GmbH, Fratelli Caffa s.a.s. and Rapunzel Naturkost for providing us with authentic hazelnut samples. The authors thank Vera Priegnitz and Claudia Wontorra for their support in sample measurement.
SUPPORTING INFORMATION
A detailed list of all hazelnut samples used, with suppliers, provenance and cultivar information, and the bucket list with the exact limits of the buckets are given in the supporting information.
REFERENCES
1. Blinkhorn, E.; Milner, N., Developing a Mesolithic Research and Conservation Framework, RESOURCE ASSESSMENT, 2013. http://archaeologydataservice.ac.uk/archives/view/meso_framework/downloads.cfm
2. United Nations, Food and Agriculture Organization, FAOSTAT Database: http://www.fao.org/faostat/en/#data/PP. (2017.11.28)
3. Connor, O., Geographical Indications and the challenges for ACP, 2005, http://agritrade.cta.int/en/Resources/Agritrade-documents/Discussion-papers/Geographical-Indications-and-the-challenges-for-ACP-countries. (2017.11.29)
4. European COMMISSION COUNCIL DECISION: on the conclusion of the Comprehensive Economic and Trade Agreement between Canada of the one part, and the European Union and its Member States, of the other part. In Strasbourg, 2016.
5. States, O. o. A., North American Free Trade Agreement. In System, F. T. I., Ed. 1999.
6. European Commission DOOR Database, http://ec.europa.eu/agriculture/quality/door/list.html?locale=en&recordSelection=all&recordStart=0&filter.dossierNumber=&filter.comboName=l%C3%ADskov%C3%BD%20o%C5%99ech&filterMin.milestone__mask=&filterMin.milestone=&filterMax.milestone__mask=&filterMax.milestone=&filter.country=&filter.category=PDOPGI_CLASS_16&filter.type=&filter.status=. (2017.11.29)
7. Jules Janick, R. E. P., The Encyclopedia of Fruit and Nuts. In Jules Janick, R. E. P., Ed. CAB International: Cambridge, 2008; pp 161-172.
8. Caligiani, A.; Coisson, J. D.; Travaglia, F.; Acquotti, D.; Palla, G.; Palla, L.; Arlorio, M., Application of (1)H NMR for the characterisation and authentication of "Tonda Gentile Trilobata" hazelnuts from Piedmont (Italy). Food Chem 2014, 148, 77-85.
9. Alasalvar, C.; Amaral, J. S.; Satır, G.; Shahidi, F., Lipid characteristics and essential minerals of native Turkish hazelnut varieties (Corylus avellana L.). Food Chem 2009, 113, 919-925.
10. Cordero, C.; Liberto, E.; Bicchi, C.; Rubiolo, P.; Schieberle, P.; Reichenbach, S. E.; Tao, Q., Profiling food volatiles by comprehensive two-dimensional gas chromatography coupled with mass spectrometry: advanced fingerprinting approaches for comparative analysis of the volatile fraction of roasted hazelnuts (Corylus avellana L.) from different origins. J Chromatogr A 2010, 1217, 5848-58.
11. Açkurt, F.; Özdemir, M.; Biringen, G.; Löker, M., Effects of geographical origin and variety on vitamin and mineral composition of hazelnut (Corylus avellana L.) varieties cultivated in Turkey. Food Chem 1999, 65, 309-313.
12. Parcerisa, J.; Rafecas, M.; Castellote, A. I.; Codony, R.; Farràn, A.; Garcia, J.; López, A.; Romero, A.; Boatella, J., Influence of variety and geographical origin on the lipid fraction of hazelnuts (Corylus avellana L.) from Spain: (II). Triglyceride composition. Food Chem 1994, 50, 245-249.
13. Ciarmiello, L. F.; Mazzeo, M. F.; Minasi, P.; Peluso, A.; De Luca, A.; Piccirillo, P.; Siciliano, R. A.; Carbone, V., Analysis of different European hazelnut (Corylus avellana L.) cultivars: authentication, phenotypic features and phenolic profiles. J Agric Food Chem 2014.
14. Locatelli, M.; Coïsson, J. D.; Travaglia, F.; Cereti, E.; Garino, C.; D’Andrea, M.; Martelli, A.; Arlorio, M., Chemotype and genotype chemometrical evaluation applied to authentication and traceability of “Tonda Gentile Trilobata” hazelnuts from Piedmont (Italy). Food Chem 2011, 129, 1865-1873.
15. Moscetti, R.; Radicetti, E.; Monarca, D.; Cecchini, M.; Massantini, R., Near infrared spectroscopy is suitable for the classification of hazelnuts according to Protected Designation of Origin. J Sci Food Agric 2014.
16. Oddone, M.; Aceto, M.; Baldizzone, M.; Musso, D.; Osella, D., Authentication and traceability study of hazelnuts from Piedmont, Italy. J Agric Food Chem 2009, 57, 3404-8.
17. Parcerisa, J.; Rafecas, M.; Castellote, A.; Codony, R.; Farran, A.; Garcia, J.; Gonzalez, C.; Lopez, A.; Romero, A.; Boatella, J., Influence of variety and geographical origin on the lipid fraction of hazelnuts (Corylus avellana L.) from Spain: (III) oil stability, tocopherol content and some mineral contents (Mn, Fe, Cu). Food Chem 1995, 53, 71-74.
18. Sciubba, F.; Di Cocco, M. E.; Gianferri, R.; Impellizzeri, D.; Mannina, L.; De Salvador, F. R.; Venditti, A.; Delfini, M., Metabolic profile of different Italian cultivars of hazelnut (Corylus avellana) by nuclear magnetic resonance spectroscopy. Nat Prod Res 2014, 1-7.
19. Klockmann, S.; Reiner, E.; Bachmann, R.; Hackl, T.; Fischer, M., Food Fingerprinting: Metabolomic Approaches for Geographical Origin Discrimination of Hazelnuts (Corylus avellana) by UPLC-QTOF-MS. J Agric Food Chem 2016, 64, 9253-9262.
20. Caligiani, A.; Coisson, J. D.; Travaglia, F.; Acquotti, D.; Palla, G.; Palla, L.; Arlorio, M., Application of (1)H NMR for the characterisation and authentication of "Tonda Gentile Trilobata" hazelnuts from Piedmont (Italy). Food Chem 2014, 148, 77-85.
21. Sciubba, F.; Di Cocco, M. E.; Gianferri, R.; Impellizzeri, D.; Mannina, L.; De Salvador, F. R.; Venditti, A.; Delfini, M., Metabolic profile of different Italian cultivars of hazelnut (Corylus avellana) by nuclear magnetic resonance spectroscopy. Nat Prod Res 2014, 28, 1075-81.
22. Cumo, C., Encyclopedia of Cultivated Plants: From Acacia to Zinnia [3 Volumes]: From Acacia to Zinnia. ABC-CLIO: 2013.
23. Gateway to South America’s, Real Estate News Service, 2016, https://www.gatewaytosouthamerica-newsblog.com/nutella-hazelnuts-orchards-of-chile/ (2018.01.17)
24. Baldwin, B., The potential for hazelnut production in Australia. AFBM Journal 2004, 1, 84-92.
25. just-food.com, Best, D.; 24.11.2017, https://www.just-food.com/news/alfred-ritter-drawing-up-plans-for-hazelnut-cultivation_id138238.aspx (2018.02.18)
26. Esslinger, S.; Riedl, J.; Fauhl-Hassek, C., Potential and limitations of non-targeted fingerprinting for authentication of food in official control. Food Research International 2014, 60, 189-204.
27. Klockmann, S.; Reiner, E.; Cain, N.; Fischer, M., Food Targeting: Geographical Origin Determination of Hazelnuts (Corylus avellana) by LC-QqQ-MS/MS-Based Targeted Metabolomics Application. J Agric Food Chem 2017, 65, 1456-1465.
28. Cevallos-Cevallos, J. M.; Reyes-De-Corcuera, J. I.; Etxeberria, E.; Danyluk, M. D.; Rodrick, G. E., Metabolomic analysis in food science: a review. Trends in Food Science & Technology 2009, 20, 557-566.
29. Jung, Y.; Lee, J.; Kwon, J.; Lee, K. S.; Ryu, D. H.; Hwang, G. S., Discrimination of the geographical origin of beef by (1)H NMR-based metabolomics. J Agric Food Chem 2010, 58, 10458-66.
30. Pimenta, L. P.; Kim, H. K.; Verpoorte, R.; Choi, Y. H., NMR-based metabolomics: a probe to utilize biodiversity. Methods Mol Biol 2013, 1055, 117-27.
31. Piccioni, F.; Capitani, D.; Zolla, L.; Mannina, L., NMR metabolic profiling of transgenic maize with the Cry1Ab gene. J Agric Food Chem 2009, 57, 6041-9.
FIGURE CAPTIONS
Figure 1: 400 MHz 1H-NMR spectrum of a polar extract from a French hazelnut sample. Peaks that were used for metabolite identification are marked in the spectrum.
Figure 2: Comparison of the 400 MHz 1H-NMR spectra of polar hazelnut extracts obtained with extraction protocols A and B. In protocol B, an additional precipitation step by addition of methanol-d4 was used for more effective protein removal. Because the sample was further diluted, the total concentration of metabolites was reduced. In particular, broad lines are significantly reduced in the range of 4.7 to 0.8 ppm.
Figure 3: A) PCA scores plot of PC1 vs. PC2 vs. PC3 of the NMR analysis (multiple measurements). Explained variance: PC1 = 37.0%; PC2 = 11.1%; PC3 = 9.7%. B) PCA scores plot of the LC-MS analysis (single measurements) from Klockmann et al. Both PCA plots show a similar distribution of the sample groups although the samples were extracted and analyzed by different techniques.
Figure 4: Confusion matrix of the training set. 0: Germany; 1: France; 2: Georgia; 3: Italy; 4: Turkey. Two thirds of all samples (179 in total) were used as the training set. Applying a random subspace algorithm, the classifier achieved a fivefold cross-validated accuracy of 91%.
Figure 5: Confusion matrix of the test set. 0: Germany; 1: France; 2: Georgia; 3: Italy; 4: Turkey. The test set comprised the remaining third of all samples (90 in total). 96% of the samples from the test set were predicted correctly.
Figure 6: A: Box-and-whisker plots showing the centered and scaled values for selected (identified) metabolites. DE: Germany; FR: France; GE: Georgia; IT: Italy; TR: Turkey. B: Absolute median values for the selected metabolites. Alanine, betaine and malate show the biggest difference in concentration between the sample groups. C: Normalized (to mean value) change of the medians of selected metabolites. This plot illustrates differences between the individual sample groups. Despite these different profiles, almost all buckets are significant in the t-test for the classification of at least one country.
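The sample split and the accuracy values quoted in the captions of Figures 4 and 5 can be illustrated with a short Python sketch. It is not the authors' workflow: the random subspace classifier itself is not reproduced, the helper names (`split_indices`, `confusion_matrix`, `accuracy`) are hypothetical, and the toy labels are invented only to show how an accuracy is read off a confusion matrix. Only the counts 179 and 90 and the class codes 0 to 4 are taken from the captions.

```python
import random

# Class codes as given in the captions of Figures 4 and 5.
COUNTRIES = ["Germany", "France", "Georgia", "Italy", "Turkey"]


def split_indices(n_samples, train_fraction=2 / 3, seed=0):
    """Shuffle sample indices and split them into training and test parts."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly predicted)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# 179 training + 90 test samples, as stated in the captions.
train_idx, test_idx = split_indices(179 + 90)

# Invented toy labels (NOT the study's data): one Italian sample
# (class 3) is misassigned to Turkey (class 4).
toy_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
toy_pred = [0, 0, 1, 1, 2, 2, 3, 4, 4, 4]
toy_matrix = confusion_matrix(toy_true, toy_pred, len(COUNTRIES))

print(len(train_idx), len(test_idx))
print(accuracy(toy_matrix))
```

Off-diagonal entries of such a matrix show which countries are confused with each other, which a single accuracy value does not reveal.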
TABLES
Table 1: Identified metabolites of a polar hazelnut extract with their impact on origin discrimination

Chemical shift [ppm]  Multiplicity   Metabolite   J [Hz]       Significant for¹
0.998                 d              Valine       7.1          France, Turkey
1.049                 d              Valine       7.0          France, Turkey
1.032                 d              Isoleucine   7.1          -
1.326                 d              Threonine    6.9          -
1.327                 d              Lactate      6.7          -
1.482                 d              Alanine      7.2          Germany, Georgia
1.909                 s              Acetate      -            Turkey
2.020-2.188           m              Glutamate    -            Germany, Italy, Turkey
2.296-2.242           m              Valine       -            France, Turkey
2.350                 dd             Malate       15.3, 10.1   Turkey
2.518                 d              Citrate      15.2         Georgia, Turkey
2.658                 d              Citrate      15.1         Georgia, Turkey
2.659                 dd             Malate       15.0, 3.2    Turkey
3.209                 s              Choline      -            Germany
3.271                 s              Betaine      -            Germany, France, Georgia, Italy, Turkey
3.456                 t              Sucrose      9.4          Italy
3.537                 dd             Sucrose      10.0, 3.0    Italy
3.666                 bs             Sucrose      -            Italy
3.757                 t              Sucrose      9.6          Italy
3.791-3.806           m              Sucrose      -            Italy
4.041                 t              Sucrose      8.3          Italy
4.199                 d              Sucrose      8.4          Italy
4.278                 dd             Malate       10.1, 3.0    Turkey
4.622                 d              Glucose      8.0          Turkey
5.215                 d              Glucose      3.9          Turkey
5.409                 d              Sucrose      3.9          Italy
6.513                 s              Fumarate     -            Georgia, Turkey
6.885                 pseudo-d       Tyrosine     8.5          Germany
7.188                 pseudo-d       Tyrosine     8.4          Germany
8.461                 s              Formate      -            Turkey

¹ According to the Kruskal-Wallis test
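
The footnote of Table 1 attributes the "Significant for" column to a Kruskal-Wallis test. As a rough illustration of what that test computes, the following Python sketch evaluates the H statistic (with the usual tie correction) for several groups of values. The function names are hypothetical and the numbers are invented toy intensities, not data from this study; how significance was assigned to individual countries is not specified here and is not reproduced.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    ties = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in ties) / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Invented toy intensities for one spectral bucket in three groups.
group_a = [0.8, 1.1, 0.9]
group_b = [1.4, 1.6, 1.5]
group_c = [2.1, 2.4, 2.2]
h_toy = kruskal_wallis_h([group_a, group_b, group_c])

# H is compared against a chi-square distribution with (k - 1) degrees
# of freedom. For exactly three groups (2 degrees of freedom) the upper
# tail probability has the closed form exp(-H / 2).
p_toy = math.exp(-h_toy / 2)
print(h_toy, p_toy)
```

Because the test works on ranks rather than raw intensities, it does not assume normally distributed bucket values.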
FIGURES
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
TABLE OF CONTENT GRAPHIC