Article
Differentiation of organically and conventionally grown tomatoes by chemometric analysis of combined data from 1H NMR- and MIR-spectroscopy and stable isotope analysis Monika Hohmann, Yulia Monakhova, Sarah Erich, Norbert Christoph, Helmut Wachter, and Ulrike Holzgrabe J. Agric. Food Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jafc.5b03853 • Publication Date (Web): 12 Oct 2015 Downloaded from http://pubs.acs.org on October 18, 2015
Journal of Agricultural and Food Chemistry
Differentiation of organically and conventionally grown tomatoes by chemometric analysis of combined data from 1H NMR- and MIR-spectroscopy and stable isotope analysis

Monika Hohmann1,2, Yulia Monakhova3,4, Sarah Erich5, Norbert Christoph2, Helmut Wachter2*, Ulrike Holzgrabe1

1 Institute of Pharmacy and Food Chemistry, University of Würzburg, Am Hubland, 97074 Würzburg, Germany
2 Bavarian Health and Food Safety Authority, Luitpoldstraße 1, 97082 Würzburg, Germany
3 Spectral Service, Emil-Hoffmann-Straße 33, 50996 Cologne, Germany
4 Department of Chemistry, Saratov State University, Astrakhanskaya Street 83, 410012 Saratov, Russia
5 Chemical and Veterinary Investigation Laboratory, Bissierstraße 5, 79114 Freiburg, Germany

* Corresponding author: Helmut Wachter
Phone: +49 9131 68087151
Fax: +49 9131 68087210
Email: [email protected]
Abstract
Since the basic suitability of proton nuclear magnetic resonance spectroscopy (1H NMR) to differentiate organic vs. conventional tomatoes was recently demonstrated, the approach of optimizing 1H NMR classification models (comprising 205 authentic tomato samples overall) by including additional data from isotope ratio mass spectrometry (IRMS; δ13C, δ15N, δ18O) and mid-infrared spectroscopy (MIR) was assessed. Both individual and combined analytical methods (1H NMR + MIR, 1H NMR + IRMS, MIR + IRMS, 1H NMR + MIR + IRMS) were examined using Principal Component Analysis (PCA), Partial Least Squares – Discriminant Analysis (PLS-DA), Linear Discriminant Analysis (LDA) and Common Components and Specific Weight Analysis (ComDim). Regarding classification abilities, fused data of 1H NMR + MIR + IRMS yielded better validation results (ranging between 95.0% and 100.0%) than individual methods (1H NMR: 91.3% - 100%, MIR: 75.6% - 91.7%), suggesting that the combined examination of analytical profiles enhances authentication of organically produced tomatoes.
Keywords: organic tomatoes, 1H NMR, MIR, IRMS, chemometrics, data fusion
Introduction
The Committee on the Environment, Public Health and Food Safety of the European Parliament has recently published a draft report “on the food crisis, fraud in the food chain and the control thereof” in which organic food is listed third among the ten products with a particularly high risk of adulteration.1 This certainly derives from the increasing demand for organic food2 and the consumer’s willingness to pay higher prices for organically produced food than for comparable conventionally produced food. Thus, verifying the authenticity of organic products is of decisive importance to protect consumers against adulteration and to support the trustworthiness of organic labelling.
This study discusses the use of sophisticated chemometric methods for the differentiation of organic and conventional food, exemplified for tomatoes. Tomatoes and tomato products are consumed on a large scale in Europe3 and are at present the most popular vegetable in Germany, with an average annual consumption of 20.6 kg per person.4 Reliable markers to analytically verify the cultivation method of tomatoes are hardly available, although numerous attempts are described in the literature.5-11 Up to now, the nitrogen isotope composition (δ15N, expressed as the relative difference to the standard of atmospheric nitrogen) has been the most important marker to distinguish organically and conventionally produced tomatoes, but because the results overlap, the cultivation method cannot be assigned in every case.12
We have recently described the approach of proton nuclear magnetic resonance (1H NMR) profiling for the authentication of organically produced tomatoes, and the results confirmed its suitability, provided that an appropriate database of authentic tomatoes is available.13 When developing new methods for the authentication of organically produced tomatoes, the currently available analytical methods should not be disregarded. Rather, the potential of different techniques should be combined to achieve synergies. This approach has proven to be highly useful for the differentiation of organically and conventionally produced milk by combining data of 1H NMR and 13C NMR spectra with stable isotope ratios and fatty acid composition,14 for the verification of variety and origin of wines by combining data of 1H NMR and stable isotope ratios15 and for the determination of Sudan dyes in spices by combining 1H NMR and UV/vis data.16 Combined multivariate examination of individual results from different analytical methods can be performed by simply concatenating data matrices or by use of multiblock methods,17 which facilitates the interpretation of models and their reliability concerning the targeted goal.18
Hence, with the aim of developing an optimized analytical approach to verify the authenticity of organically produced tomatoes, several analytical methods were combined: isotope ratio mass spectrometry (IRMS, determining δ13C, δ15N and δ18O) with δ15N as currently the most reliable marker, 1H NMR spectroscopy, which we recently proved to be useful,13 and additionally mid-infrared spectroscopy (MIR), which turned out to be helpful in differentiating organically and conventionally produced wines.19 The individual suitability of each analytical method for the differentiation between organically and conventionally grown tomatoes was analyzed by use of Principal Component Analysis (PCA), Partial Least Squares – Discriminant Analysis (PLS-DA) and Linear Discriminant Analysis (LDA). Furthermore, LDA and PLS-DA (using concatenated data after variable selection for spectroscopic data) and Common Components and Specific Weight Analysis (ComDim)20,21 were performed for combined data of 1H NMR spectroscopy, MIR spectroscopy and IRMS.
However, organic and conventional farming cannot be seen as black-and-white definitions, given the various possible implementations of each cultivation method.22 Accordingly, establishing a database of authentic tomato samples that reflects all conceivable ways of farming is challenging and almost impossible. Yet analysing test sets of authentically grown tomatoes provides an estimate of the classification power of individual analytical methods to differentiate organically and conventionally grown tomatoes. Subsequently, the future applicability of classification models can be assessed by validation studies. Therefore, in this study, authentic tomatoes were grown conventionally using hydroponic culture and mineral fertilizer and organically using soil and different organic fertilizers, both in greenhouses, to keep influences of the weather to a minimum. These cultivation trials by no means represent all variations in farming conditions, but serve as a starting point to generally verify the capabilities of analytical methods to differentiate tomatoes regarding their cultivation method.
Materials and Methods
Chemicals
NaOH pellets (for 1 M NaOH) were purchased from VWR (Leuven, Belgium) and HCl (37%, for 1 M HCl) from Sigma Aldrich (Saint Louis, USA). TSPd4 (3-(trimethylsilyl)propionic acid-d4 sodium salt, 98 atom% D), D2O (99.9 atom% D), EDTA (ethylenediaminetetraacetic acid) and NaN3 were purchased from Merck (Darmstadt, Germany).
Sample collection of authentic tomato samples
The normally fruited (average weight of 100 g/fruit) tomato varieties Bocati, Hamlet, Mecano, Savantas, Seviocard and Tica and the small fruited (average weight of 20 g/fruit) tomato varieties Sakura, Sunstream and Tastery were grown in a total of seven greenhouses in Germany:
- two greenhouses of the Bavarian State Research Institute of Viticulture and Horticulture in Bamberg; organically and conventionally, referred to as ‘BA organic’ and ‘BA conv.’ in the following text
- two greenhouses of the State Horticultural College and Research Institute Heidelberg; organically and conventionally, referred to as ‘HD organic’ and ‘HD conv.’ in the following text
- three greenhouses of trading farms in the growing region ‘Knoblauchsland’ near Nuremberg; one organically and two conventionally, referred to as ‘N organic’, ‘N conv. 1’ and ‘N conv. 2’ in the following text
Conventional cultivation was in each case carried out as hydroponic culture using perlite substrate and mineral fertilizer, while organic cultivation was carried out using soil and clover-grass silage, horn shavings, vinasse, Patentkali, sheep wool or winter rye (previous culture for green manure) as organic fertilizers.
Sampling was performed by harvesting tomatoes systematically from different plants in the greenhouses at regular intervals of circa 4 weeks between April and October 2013 and between May and October 2014. In the 2013 harvesting period, the greenhouses BA organic, BA conv., N organic, N conv. 1 and N conv. 2 cultivated only the varieties Mecano and Tastery (with the exception of one Tica tomato sample of BA organic), while in 2014 the cultivation was complemented with the varieties Bocati, Hamlet, Savantas, Seviocard, Tica, Sunstream and Sakura in individual greenhouses, including two further greenhouses (HD organic and HD conv.). This yielded 205 tomato samples overall, of which 66 were harvested in 2013 and 139 in 2014. The composition of samples for individual measurements with respect to cultivars and greenhouses is given in Table 1 (samples available for 1H NMR/MIR/IRMS/data fusions are listed from left to right in each cell). For subsequent analysis, at least 250 g of tomatoes were pooled, pureed and homogenized, and the puree was stored at -18 °C until measurement.
Isotope ratio mass spectrometry (IRMS)
One part of the pureed tomato sample was freeze-dried using a freeze dryer (Alpha 1-4 LSC, Christ, Osterode, Germany), pulverized using a ball mill MM 301 (Retsch GmbH, Haan, Germany) and used for measurement of the 13C/12C and 15N/14N isotope ratios. The other part was centrifuged for 10 min (2700 rcf); sodium azide was added to the supernatant, which was then used for measurement of the 18O/16O isotope ratio.
2.2 mg of the pulverized sample dry mass was weighed into tin capsules, combusted using an elemental analyzer (Euro EA 3000, Euro Vectors SpA, Milano, Italy) and analyzed with an isotope ratio mass spectrometer (ΔPlus XP, Thermo Finnigan, Bremen, Germany) equipped with a ConFlow IV Interface (ThermoFisher Scientific, Bremen, Germany) and an autosampler (Zero Blank Revolver Autosampler, Blisotec GmbH, Jülich, Germany) controlled by Isodat 3.0 software (Thermo Finnigan, Bremen, Germany). The resulting gases, CO2 and N2, were separated by a GC column and isotope ratios were determined simultaneously. The 18O/16O ratio in “tomato water” was measured in 200 µL after equilibration with CO2 using a MultiFlow 07/003 (Elementar, Manchester, England) with a Gilson 222XL Sampler (Gilson, Villiers Le Bel, France) interfaced to an IRMS (IsoPrime™, Manchester, England). The 13C/12C, 15N/14N and 18O/16O isotope ratios are given in ‰ on a δ-scale. The values refer to the international reference standards VPDB (Vienna Pee Dee Belemnite) for δ13C, atmospheric nitrogen for δ15N and Vienna-Standard Mean Ocean Water (V-SMOW2) for δ18O.
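As a hedged illustration of the δ-scale, the following Python sketch converts an isotope ratio into a δ-value in ‰ relative to a reference standard, using the conventional definition δ = (R_sample − R_standard) / R_standard × 1000. The function name and the numerical ratios in the usage lines are hypothetical and were chosen only for illustration; they are not measurement data from this study.

```python
def delta_permil(r_sample: float, r_standard: float) -> float:
    """Return the delta value (in permil) of a sample ratio relative to a standard ratio."""
    if r_standard <= 0:
        raise ValueError("r_standard must be positive")
    return (r_sample - r_standard) / r_standard * 1000.0


# Hypothetical heavy/light isotope ratios, for illustration only.
r_std = 0.0100
print(delta_permil(0.0097, r_std))  # sample depleted in the heavy isotope: negative delta
print(delta_permil(0.0100, r_std))  # identical to the standard: 0.0
```

A sample that is depleted in the heavy isotope relative to the standard thus receives a negative δ-value, which is the situation described later for δ13C of greenhouse tomatoes.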
\[ \delta\,[\text{‰}] = \frac{R_{\text{sample}} - R_{\text{standard}}}{R_{\text{standard}}} \times 1000 \]
Acetanilide, casein, glutamic acid and water were calibrated as working standards using the international standards (IAEA-CH6, IAEA-CH7, NBS 22, USGS 40 for 13C/12C, 15N/14N and V-SMOW2, SLAP2 and GISP for 18O/16O). Samples were analyzed twice and working standards were measured four times to control the stability of the measurement series. The standard deviation for IRMS analysis was ≤ 0.2 ‰.
1H NMR spectroscopy
The aqueous tomato phase was analysed after centrifugation of the puree at 3528 g for 5 min. 900 µL of the clear liquid tomato phase was mixed with 100 µL of a solution of 7 mM TSPd4, 10 mM EDTA and 2 mM NaN3 in D2O, and the pH was adjusted to 4.00 ± 0.03 using 1 M NaOH or 1 M HCl. Finally, 600 µL of the pH-adjusted solution was filled into 5 mm NMR tubes for measurement using a 400 MHz 1H NMR spectrometer. Acquisition and processing parameters of the 1H NMR measurement were set as previously described.13 For examination, the spectral range from 0-10 ppm was used, excluding the regions of the residual water signal from 4.67-4.85 ppm and of residual ethanol (NMR tubes were reused and washed with ethanol) from 3.60-3.70 and 1.14-1.22 ppm.
MIR spectroscopy
Tomato puree was filtered through a folded filter (4-7 µm). The clear filtrate was used for MIR measurement with a WineScan FT120 instrument (Foss GmbH, Rellingen, Germany) for tomato samples of 2013 and a WineScan FT2 Flex instrument (Foss GmbH, Rellingen, Germany) for tomato samples of 2014. For examination, the wavenumber range from 964 cm-1 to 2998 cm-1 (528 acquired data points) was used, excluding the range from 1547 cm-1 to 1716 cm-1 in order to eliminate water absorption. For each sample, the average of the spectra of two successive measurements was used.
Multivariate statistics
Data pre-processing comprised two steps. The dimension of the 1H NMR data was reduced by bundling spectral regions of 0.02 ppm width into buckets using Amix 3.9.12 software (Bruker Biospin GmbH, Rheinstetten, Germany; each bucket represents the signal intensity related to the spectral region), and MIR transmission spectra were converted into the respective absorption spectra. Buckets of 1H NMR spectra, wavenumbers of MIR spectra and IRMS data (δ13C, δ15N and δ18O, given in ‰ on a δ-scale) served as variables for multivariate data analysis.
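The two pre-processing steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Amix or WineScan implementation: buckets are formed here by simply summing the intensities of all data points within each 0.02 ppm region, the transmission values are assumed to be fractional (0 < T ≤ 1) and are converted with A = −log10(T), and the function names and example numbers are hypothetical.

```python
import math


def bucket_spectrum(ppm, intensity, width=0.02, lo=0.0, hi=10.0, exclude=()):
    """Sum intensities into consecutive buckets of `width` ppm between lo and hi.

    Points inside an excluded (start, end) ppm region are skipped.
    Returns the bucket sums ordered from lo to hi.
    """
    n_buckets = int(round((hi - lo) / width))
    buckets = [0.0] * n_buckets
    for x, y in zip(ppm, intensity):
        if not (lo <= x < hi):
            continue
        if any(start <= x <= end for start, end in exclude):
            continue
        idx = min(int((x - lo) / width), n_buckets - 1)
        buckets[idx] += y
    return buckets


def transmittance_to_absorbance(transmittance):
    """Convert fractional transmittance values (0 < T <= 1) to absorbance A = -log10(T)."""
    return [-math.log10(t) for t in transmittance]


# Excluded regions as stated in the text (residual water and ethanol signals).
EXCLUDED = [(4.67, 4.85), (3.60, 3.70), (1.14, 1.22)]

# Hypothetical data points, for illustration only.
example = bucket_spectrum([0.005, 0.015, 0.03, 4.70, 9.99],
                          [1.0, 2.0, 3.0, 100.0, 5.0],
                          exclude=EXCLUDED)
print(len(example), example[0], example[1], example[-1])
```

With 0.02 ppm buckets over 0-10 ppm, each spectrum is reduced to 500 bucket positions, of which those falling entirely into the excluded regions remain zero and would be dropped before modelling.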
Multivariate data analysis was performed on the assumption of normally distributed data. For individual analysis of analytical data, PCA and LDA were carried out with SPSS Statistics 21 (IBM Corporation, Armonk, USA). LDA was performed with equal a priori probabilities for all groups and a stepwise selection procedure23 (chosen method: minimization of Wilks’ Lambda24; selection criterion: F-statistics with p < 0.005 for inclusion and p > 0.010 for exclusion). During validation of LDA, only the variables selected for the LDA of all samples, rather than all variables, were taken into account. PLS-DA was performed with Unscrambler X version 10.0.1 (Camo Software AS, Oslo, Norway) using the Non-linear Iterative PArtial Least Squares algorithm (NIPALS).
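To illustrate the selection criterion, the sketch below computes Wilks' Lambda for single variables, where it reduces to the ratio of the within-group sum of squares to the total sum of squares; small values indicate variables that separate the groups well. This is only the univariate starting point of a stepwise procedure, offered as an assumption-laden illustration: the partial F-tests and the inclusion/exclusion thresholds applied by SPSS are not reproduced, and the function names, variable names and numbers are hypothetical.

```python
def wilks_lambda_univariate(values, labels):
    """Wilks' Lambda of one variable: within-group sum of squares / total sum of squares."""
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("variable is constant")
    ss_within = 0.0
    for group in set(labels):
        members = [v for v, g in zip(values, labels) if g == group]
        group_mean = sum(members) / len(members)
        ss_within += sum((v - group_mean) ** 2 for v in members)
    return ss_within / ss_total


def rank_variables(columns, labels):
    """Return variable names ordered from the smallest (most discriminating) Lambda upward."""
    lambdas = {name: wilks_lambda_univariate(col, labels) for name, col in columns.items()}
    return sorted(lambdas, key=lambdas.get)


# Hypothetical bucket intensities for two organic and two conventional samples.
labels = ["organic", "organic", "conventional", "conventional"]
columns = {
    "bucket_a": [1.0, 1.1, 3.0, 3.1],  # differs clearly between the groups
    "bucket_b": [1.0, 3.0, 1.1, 3.1],  # varies mainly within the groups
}
print(rank_variables(columns, labels))
```

In a full stepwise procedure, the variable with the smallest Lambda would be entered first, and further variables would be added or removed according to their contribution given the variables already in the model.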
For combined analysis of several analytical methods, MATLAB 2015a (The MathWorks, Natick, MA, USA) and the SAISIR package for MATLAB25 were used. For variable selection of 1H NMR and MIR data, clustering of latent variables (CLV) was used.26,27 LDA and PLS-DA were applied to the concatenated spectroscopic data (1H NMR and MIR after CLV variable selection) and IRMS data (δ13C, δ15N, δ18O). In this study, LDA was applied to the PCA scores, since the number of variables should not be too large.28 The best classification models were obtained when the inverse of the sum of squares (square of the Euclidean distance) was used as block scaling factor (i.e. after applying the block scaling factor, the total variance of each block equals 1). Furthermore, the multiblock method ComDim21,22 was performed on spectroscopic data (1H NMR and MIR after CLV variable selection) and IRMS data (δ13C, δ15N, δ18O).
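The block scaling and concatenation step can be sketched as follows. The sketch assumes one common reading of the scaling rule: each block is column-centred and then multiplied by the inverse square root of its total sum of squares, so that every block contributes a total sum of squares of 1 to the fused matrix. The exact SAISIR implementation is not reproduced, and the function names and the small example matrices are hypothetical.

```python
import math


def centre_columns(block):
    """Subtract each column mean from a block given as a list of sample rows."""
    n = len(block)
    means = [sum(row[j] for row in block) / n for j in range(len(block[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in block]


def scale_block(block):
    """Centre a block and scale it so that its total sum of squares equals 1."""
    centred = centre_columns(block)
    total_ss = sum(v * v for row in centred for v in row)
    factor = 1.0 / math.sqrt(total_ss)
    return [[v * factor for v in row] for row in centred]


def fuse_blocks(*blocks):
    """Scale every block and concatenate them sample by sample."""
    scaled = [scale_block(b) for b in blocks]
    return [sum((b[i] for b in scaled), []) for i in range(len(blocks[0]))]


# Hypothetical blocks for three samples: three NMR buckets and two IRMS values.
nmr = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 0.0, 2.0]]
irms = [[-30.0, 5.0], [-28.0, 1.0], [-29.0, 3.0]]
fused = fuse_blocks(nmr, irms)
print(len(fused), len(fused[0]))
```

Without such scaling, a block with many variables or large numerical values (for instance several hundred NMR buckets) would dominate a block with only three IRMS variables in any variance-based model.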
For model evaluation, a test devised by Tóth et al.29 was performed, which provides information on the prediction performance of classification models by comparing the variance of classification models to the variance of their leave-one-out counterparts using F-statistics.
Results and Discussion
Overall, 205 tomato samples of nine different varieties were analyzed. Whereas 1H NMR spectra were recorded for every tomato sample (n = 205), IRMS (n = 114) and MIR spectroscopy measurements (n = 199) were performed for a selection of tomato samples. In the following, the capabilities of the individual methods and of combinations of these methods for the differentiation of organically and conventionally grown tomatoes are described.
In order to get an overview of the data structure, PCA was performed using individual data of 1H NMR and MIR spectroscopy, and ComDim was applied to combined data of IRMS, 1H NMR and MIR spectroscopy.
Furthermore, for data of both individual methods (1H NMR, MIR) and combined methods (1H NMR + MIR, 1H NMR + IRMS, MIR + IRMS, 1H NMR + MIR + IRMS), LDA and PLS-DA were tested for their ability to classify the cultivation method of tomatoes. Here, LDA classification models yielded validation results equivalent or superior to those of PLS-DA regarding the percentage of correct classifications, and LDA consistently achieved better comparability of results among the different validation steps. Thus, for reasons of simplicity, only the outcomes of LDA are presented in the following.
Validation of classification models
The use of supervised classification methods such as LDA or PLS-DA always entails the risk of creating overfitted models, since aiming at optimized classification can inadvertently force the inclusion of insignificant variables. Such models do show good classification abilities for model samples, but fail in the classification of further samples. Hence, suitable validation studies of classification models are highly important30 to assess both the suitability of the multivariate analysis and the representative nature of the model samples.
Regarding the approach to differentiate organically and conventionally grown tomatoes, several critical influencing factors have to be considered during validation. As tomatoes are a natural product, their composition is subject to unavoidable natural variations, which complicate the assignment of their cultivation method. In order to evaluate the influence of natural compositional fluctuations, one third of the samples were randomly excluded from the creation of LDA and PLS-DA models and used as an independent test set for validation (this validation procedure will be referred to as “random validation” in the following text).
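A random validation split of this kind can be sketched in a few lines of Python. The function name, the fixed seed and the use of integer sample identifiers are assumptions made for illustration; the software actually used for the split is not specified in the text.

```python
import random


def random_validation_split(sample_ids, test_fraction=1 / 3, seed=0):
    """Randomly assign about one third of the samples to an independent test set."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled samples form the test set, the rest the calibration set.
    return shuffled[n_test:], shuffled[:n_test]


# 205 hypothetical sample identifiers, matching the number of 1H NMR samples.
calibration, test = random_validation_split(range(205))
print(len(calibration), len(test))
```

The calibration set is used to build the model and the held-out third is classified afterwards, so the reported percentage of correct classifications is not influenced by samples the model has already seen.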
However, since tomato samples were collected repeatedly at different harvesting times, randomly selected validation test sets comprise tomato samples of the same cultivars grown in the same greenhouses, but simply harvested at another point in time than samples of the calibration set. Thus, although good results for random validation indicate differentiability of organically and conventionally grown tomatoes, the practicality of these classification models for further tomato samples of another cultivar or from another greenhouse (with specific implementations of cultivation) is questionable. For instance, the previous results on the differentiation of organically produced tomatoes using 1H NMR showed that the differentiation between two greenhouses with different growing conditions works well, but does not prove to be useful for the classification of tomato samples from different greenhouses despite basically comparable growing conditions, since a model taking into account only two greenhouses is overfitted.13 Hence, further validation steps were performed. Complete sets of individual cultivars and individual greenhouses were specifically excluded from model calibration and used as test sets, in order to assess the quality of classifications for tomatoes whose cultivar or specific cultivation method was not taken into account for calibration. Validation test sets consisted of each group of individual cultivars/greenhouses, consecutively, and calibration samples were formed by tomato samples of the remaining cultivars/greenhouses. In this way, each tomato sample was excluded once with its cultivar group and once with its greenhouse group, and the average result over all tomato samples yielded the figures referred to as cultivar and greenhouse validation.
Table 2 shows the respective number of calibration and validation samples for classification models of 1H NMR, MIR and data fusions (indicated from left to right) during the steps of cultivar and greenhouse validation. As the tomato samples are not evenly distributed over all cultivars/greenhouses, and especially Tastery/Mecano are over-represented as cultivars for small/normally fruited tomato samples, the use of these cultivars as validation test sets would lead to a relatively small number of remaining calibration samples. Hence, for cultivar validation of fused data, Tastery/Mecano were not used as cultivar validation test sets, because the remaining calibration sets would provide only 8/11 tomato samples for calibration, which is not appropriate. For all remaining validation steps, at least 42 tomato samples were available for calibration.
Generally, this validation concept presents a stepwise approach. Firstly, results of random validation indicate the basic ability to differentiate organically and conventionally grown tomatoes; subsequently, cultivar and greenhouse validation verify whether the classification ability is adequately robust to compositional variations specific to cultivars or greenhouses. Thus, good results for random validation coinciding with worse outcomes for cultivar/greenhouse validation indicate overfitted classification models, while good and comparable results among all validation steps confirm suitability of the classification models.
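The cultivar and greenhouse validation described above corresponds to a leave-one-group-out scheme, which can be sketched as follows. The generator name and the dictionary layout of the samples are assumptions for illustration; the cultivar and greenhouse names are taken from the text, but the sample list itself is hypothetical.

```python
def leave_one_group_out(samples, group_key):
    """Yield (held_out_group, calibration, test) for each group in turn."""
    groups = sorted({s[group_key] for s in samples})
    for g in groups:
        test = [s for s in samples if s[group_key] == g]
        calibration = [s for s in samples if s[group_key] != g]
        yield g, calibration, test


# Hypothetical sample records; only the names of cultivars and greenhouses come from the text.
samples = [
    {"id": 1, "cultivar": "Mecano", "greenhouse": "BA organic"},
    {"id": 2, "cultivar": "Mecano", "greenhouse": "BA conv."},
    {"id": 3, "cultivar": "Tica", "greenhouse": "BA organic"},
    {"id": 4, "cultivar": "Hamlet", "greenhouse": "HD organic"},
    {"id": 5, "cultivar": "Hamlet", "greenhouse": "HD conv."},
]

for cultivar, calibration, test in leave_one_group_out(samples, "cultivar"):
    print(cultivar, len(calibration), len(test))
```

Calling the same generator with `"greenhouse"` as the key gives the greenhouse validation, so every sample is held out exactly once per grouping, as described in the text.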
Furthermore, in order to verify whether validations yield significant results that are not based on random events of meaningless data, a randomization test was performed: variables (1H NMR data and MIR data, respectively) were replaced by random vectors and the validation results for the resulting classification models were analyzed. Since the random probability of each tomato sample being organic or conventional is 50%, an objective validation process of classification models based on random data is expected to achieve circa 50% correct predictions. In accordance with this, on average 52±7% and 46±7% correct classifications (average of random/cultivar/greenhouse validations) were achieved in the randomization test of LDA classification models for 1H NMR and MIR data, respectively, confirming the informative value of the validation approach.
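The idea of the randomization test can be reproduced with a small simulation. Note the assumptions: a nearest-centroid classifier is used here as a simple stand-in for LDA, the random vectors are drawn from a standard normal distribution, and all names, sample sizes and the seed are hypothetical. The sketch therefore only illustrates why roughly 50% correct predictions are expected for meaningless data; it does not reproduce the reported values.

```python
import random


def nearest_centroid_fit(X, y):
    """Return the mean vector of each class (a simple stand-in for LDA)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def randomization_accuracy(n_cal=200, n_test=200, n_vars=10, seed=1):
    """Calibrate and validate on pure noise; return the fraction of correct predictions."""
    rng = random.Random(seed)

    def noise(n):
        return [[rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n)]

    def labels(n):
        return ["organic" if i % 2 == 0 else "conventional" for i in range(n)]

    centroids = nearest_centroid_fit(noise(n_cal), labels(n_cal))
    X_test, y_test = noise(n_test), labels(n_test)
    hits = sum(nearest_centroid_predict(centroids, x) == lab
               for x, lab in zip(X_test, y_test))
    return hits / n_test


print(randomization_accuracy())
```

A validation scheme that returned clearly more than 50% on such data would indicate information leaking from the test set into the model, which is what the randomization test is meant to rule out.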
Isotope ratio mass spectrometry (IRMS)
Overall, 114 tomato samples were analysed by IRMS regarding δ15N values of the dry residues of tomatoes. The isotope composition of nitrogen in the applied fertilizers determines the isotope composition of nitrogen in the fertilized tomatoes; consequently, the higher δ15N values of organic fertilizers12 lead to higher δ15N values of organically produced tomatoes.11 One exception to be mentioned is the use of legumes, which are legitimate as organic fertilizer. Legumes can metabolize atmospheric nitrogen (δ15N value around 0‰) into plant-accessible nitrogenous molecules, which leads to noticeably low δ15N values and thus hampers differentiation from conventionally grown crops.12
Our IRMS results fully agree with these findings (Figure 1). On average, δ15N values were significantly higher for organically than for conventionally grown tomatoes, yet an overlapping region existed in the range from 2‰ to 4‰. This overlap is mainly due to the use of green manures in the greenhouses N organic and BA organic, while HD organic only applied horn shavings and vinasse as fertilizers and accordingly yielded high δ15N values that were clearly separated from the δ15N range of conventionally grown tomatoes.
Besides δ15N values, IRMS included the determination of δ13C (of the dry residue of tomatoes) and δ18O (of the aqueous tomato phase), but these are less relevant with regard to the growing regime. δ13C (averaging -30.2 ± 3.5‰ vs. VPDB) indicates greenhouse cultivation, as the strikingly negative δ13C values are caused by CO2 supplementation from heating systems fired with CH4, and δ18O (averaging -4.4 ± 1.5‰ vs. V-SMOW) depends on the source of water.31
1H NMR spectroscopy
For each tomato sample (n = 205), a 1H NMR spectrum of the aqueous phase was acquired. 1H NMR spectra simultaneously provide broad information about sugars, organic acids, amino acids and further minor components,13 and hence 1H NMR is a useful source of data for tomato profiling. To reduce the dimension of the data, 1H NMR spectra were transformed into buckets by bundling spectral regions of 0.02 ppm.
PCA
PCA of buckets was performed to get an overview of the data clustering. Mean-centred and standardized buckets were used for analysis, because varying concentrations of ingredients resulted in signal intensities that differed highly in scale. The scatter plot of PC1 vs. PC2 (Figure 2A) demonstrates that the data clustered mainly according to the cultivar type; in particular, data clouds of normally fruited varieties were separated from those of small fruited varieties (Figure 2B). In fact, the values of PC1 seem to be determined by the dry mass of tomatoes, as PC1 correlated highly with the total spectral intensity (R = 0.967; total spectral intensity was calculated as the sum of all buckets from 0 - 10 ppm, excluding ranges of water and ethanol resonance signals). A trend towards separation of organically and conventionally grown tomatoes was nevertheless observed along PC5, with significantly higher values for the group of conventionally grown tomatoes (t-test: p < 0.001; Figure 2C), but the overlapping data clouds did not allow a clear differentiation.
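The mean-centring and standardization applied before PCA can be illustrated with the following sketch. It assumes scaling by the sample standard deviation (n − 1 in the denominator); the convention used by the original software is not stated in the text, and the function name, bucket names and intensities are hypothetical.

```python
import statistics


def autoscale(columns):
    """Mean-centre each variable and divide it by its sample standard deviation."""
    scaled = {}
    for name, values in columns.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        scaled[name] = [(v - mean) / sd for v in values]
    return scaled


# Two hypothetical buckets whose raw intensities differ by three orders of magnitude.
buckets = {
    "sugar_bucket": [100.0, 200.0, 300.0],
    "minor_bucket": [0.1, 0.2, 0.3],
}
scaled = autoscale(buckets)
print(scaled["sugar_bucket"])
print(scaled["minor_bucket"])
```

After scaling, both variables have the same spread, so a weak signal of a minor component can influence the principal components as much as an intense sugar signal.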
LDA
Hence, the supervised classification algorithm LDA was used for further examination. PCA showed that the main variance in the NMR data is given by the total spectral intensity, which is probably due to varying dry masses of tomatoes. To reduce this effect, buckets were transformed into relative values with respect to the total spectral intensity (sum of all buckets from 0 - 10 ppm, excluding ranges of water and ethanol resonance signals) prior to LDA.
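The normalization to the total spectral intensity can be sketched as follows. The function name and the example bucket values are hypothetical, and the input is assumed to contain only the retained buckets (water and ethanol regions already removed).

```python
def normalise_to_total_intensity(buckets):
    """Divide every bucket by the sum of all retained buckets of the same spectrum."""
    total = sum(buckets)
    if total == 0:
        raise ValueError("total spectral intensity is zero")
    return [b / total for b in buckets]


# Two hypothetical spectra with identical composition but different overall concentration.
dilute = normalise_to_total_intensity([1.0, 3.0, 6.0])
concentrated = normalise_to_total_intensity([2.0, 6.0, 12.0])
print(dilute)
print(concentrated)
```

Both spectra yield the same relative profile, which shows how this transformation removes differences that stem only from the overall dry mass rather than from the composition.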
Moreover, as PCA revealed wide differences between normally and small fruited tomatoes, classification models were built for all tomatoes as well as for normally and small fruited tomatoes individually. The Tóth test29 suggested better prediction performance for separate classification models for the groups of normally and small fruited tomatoes (p = 0.48 and p = 0.20) than for one overall classification model including all tomato samples (p < 0.05). Thus, separate classification models were used for further examination.
For both classification models of normally and small fruited tomato samples, the individual validation steps showed comparable outcomes and thus gave no indication of overfitted models (Table 3). Comparing the LDA classification results for normally and small fruited tomatoes, normally fruited ones yielded better validation results, with 100% correct classifications for cultivar and 99.1% for greenhouse validation, compared to 91.3% for cultivar and 95.7% for greenhouse validation of small fruited tomatoes. Regarding cultivar validation, the model for normally fruited samples is possibly more representative, as six different cultivars were included compared to only three different varieties of small fruited tomatoes.
MIR spectroscopy
MIR spectroscopy is a powerful tool for food analysis, offering simple sample preparation and rapid analysis.32 It can be used for authenticity analysis33 as well as for quantification purposes after adequate calibration with samples of known composition.34,35 To test the suitability of MIR for differentiating tomatoes of different cultivation methods, the aqueous phase of tomatoes was analyzed by means of MIR spectroscopy. Overall, 199 tomato samples were measured using MIR, and spectra were analyzed using PCA and LDA.
PCA
Mean-centered MIR absorption spectra were used for examination. As for 1H NMR, PCA of the MIR spectra mainly revealed the separation of cultivars (Figure 3A), especially of normally and small fruited tomato samples along PC1 (Figures 3B and 3C). Within each of these groups, however, conventionally produced tomatoes yielded significantly higher PC1 values than organically produced tomatoes (t-test for each group: p < 0.001), although the two regions overlap (Figures 3B and 3C).
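As a minimal sketch of the preprocessing and projection named above, the following code mean-centers a small data matrix and extracts the first principal component by power iteration. Power iteration and the toy values are assumptions chosen for a dependency-free illustration; the manuscript does not state which PCA routine was used.

```python
import math


def mean_center(rows):
    """Subtract the column mean from every variable."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 for a samples x variables matrix."""
    centered = mean_center(rows)
    n_vars = len(centered[0])
    # Sample covariance matrix (variables x variables)
    cov = [[sum(r[i] * r[j] for r in centered) / (len(centered) - 1)
            for j in range(n_vars)] for i in range(n_vars)]
    vec = [1.0] * n_vars  # starting vector; assumed not orthogonal to PC1
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_vars))
               for i in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    scores = [sum(x * w for x, w in zip(row, vec)) for row in centered]
    return vec, scores


# Hypothetical two-variable "spectra" lying exactly on the line y = 2x
spectra = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, scores = first_principal_component(spectra)
```

The PC1 scores obtained this way are the quantities that would then be compared between organic and conventional samples, for example with a t-test.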
LDA
Classification models were created for all tomato samples as well as for the groups of normally and small fruited tomato samples individually. The Tóth-test29 indicated adequate prediction performance for each classification model (p = 0.33, p = 0.24, and p = 0.38 for all, normally fruited, and small fruited tomato samples, respectively). For comparability with the classification results of the 1H NMR data, individual classification models for normally and small fruited tomatoes were used for further examinations.
LDA of MIR data showed comparable results for random validation and cultivar/greenhouse validation; the largest difference between classification results was 8.2 percentage points (91.7% for random and 83.5% for greenhouse validation of normally fruited samples; Table 3). Hence, there is no evidence of overfitting of the LDA classification models. Comparing the LDA classification findings for normally and small fruited samples, the results for normally fruited tomato samples (83.5%-91.7%) were always better than those for small fruited ones (75.6%-82.2%). Overall, LDA classification results for MIR ranged between 75.6% and 91.7% and are thus inferior to the 1H NMR results (ranging from 91.3% to 100%; Table 3).
Data fusion of 1H NMR, MIR, and IRMS
Finally, the differentiation of organically and conventionally grown tomatoes was analyzed by fusing the data of the individual methods (1H NMR, MIR, and IRMS). For this purpose, normalized 1H NMR data were used, scaled to the total spectral intensity (calculated as the sum of all buckets from 0 to 10 ppm, excluding the ranges of the water and ethanol resonance signals). Since spectroscopic data naturally comprise a high number of variables, in contrast to IRMS with only three variables (δ13C, δ15N, and δ18O), the number of 1H NMR and MIR variables was reduced prior to data fusion by clustering of latent variables (CLV). The CLV method involves two stages, namely a hierarchical clustering analysis followed by a partitioning algorithm. Partitioning is determined by the value of a quality criterion (T), the sum of the first eigenvalues of the data matrices of each cluster.26,27
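Read literally, the quality criterion described above can be written compactly as below. The notation (K clusters, X_k for the mean-centered sub-matrix of the variables assigned to cluster k, n samples) is introduced here for illustration only, and the scaling by 1/n is an assumption; a different constant factor would not change which partition maximizes T.

```latex
T \;=\; \sum_{k=1}^{K} \lambda_1^{(k)},
\qquad
\lambda_1^{(k)} \;=\; \text{largest eigenvalue of } \tfrac{1}{n}\, X_k^{\top} X_k
```

A partition with a larger T groups variables that are well summarized by a single latent component per cluster.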
ComDim analysis
To get an overview of the sample grouping across the data of all analytical methods, ComDim analysis was performed on the 1H NMR + MIR + IRMS data, separately for the groups of normally and small fruited tomatoes. The basic idea of ComDim is to create one common space of common components from several variable blocks available for the same samples;21,22 in this case, the blocks are the variables of the individual analytical methods. Figure 4 illustrates the results of ComDim analysis (Figure 4A) compared to the respective individual PCA results for 1H NMR (Figure 4B) and MIR (Figure 4C) data as well as the range of δ15N (Figure 4D), separately for normally and small fruited tomato samples. Compared to PCA of the individual methods, ComDim shows a clearly increased separation trend of data points according to the cultivation method.
A major advantage of ComDim analysis over PCA on concatenated data is that ComDim provides information about the relationship between the individual variable blocks and their contribution to the total variance of the common components.36 Figure 5 shows the specific weight (or salience) of 1H NMR, MIR, and IRMS associated with the first three common dimensions (D1, D2, D3) of ComDim analysis. For both normally and small fruited tomatoes, MIR data are dominant for D1, IRMS data for D2, and 1H NMR data for D3 (Figure 5); thus, each analytical method considerably influenced the results of ComDim.
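For orientation, the role of the saliences can be summarized with the model usually given for ComDim (common components and specific weights analysis). The symbols below are introduced here and do not appear in the manuscript: X_k is the scaled data block of method k, q_d is the d-th common component shared by all blocks, and λ_d^{(k)} is the salience of block k on dimension d.

```latex
W_k \;=\; X_k X_k^{\top},
\qquad
W_k \;=\; \sum_{d=1}^{D} \lambda_d^{(k)} \, q_d \, q_d^{\top} \;+\; E_k
```

Under this reading, a block with a large λ_d^{(k)} contributes strongly to common dimension d, which is what Figure 5 reports for D1 to D3.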
LDA of concatenated data
Concatenation of several data matrices is the simplest form of data fusion and was applied to combine the data of 1H NMR + MIR, 1H NMR + IRMS, MIR + IRMS, and 1H NMR + MIR + IRMS. The quality of each data combination for classifying the cultivation method (using LDA and PLS-DA) was again assessed by test set validation (Table 3). For the sake of comparability, data of the same tomato samples (n = 112) were used for all data combinations, even though more samples were generally available for individual combinations.
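A minimal sketch of concatenation restricted to the samples common to all blocks is given below. The function name, sample identifiers, and numeric values are hypothetical; variable reduction (CLV in the text) is assumed to have been applied to the spectroscopic blocks beforehand.

```python
def concatenate_blocks(*blocks):
    """Row-wise fusion of several {sample_id: feature_list} blocks.

    Only samples present in every block are kept, so that all fused
    models are built on the same sample set.
    """
    common = set(blocks[0])
    for block in blocks[1:]:
        common &= set(block)
    fused = {}
    for sample_id in sorted(common):
        row = []
        for block in blocks:
            row.extend(block[sample_id])
        fused[sample_id] = row
    return fused


# Hypothetical sample IDs and values, for illustration only
nmr = {"s1": [0.2, 0.8], "s2": [0.5, 0.5], "s3": [0.1, 0.9]}
mir = {"s1": [1.1], "s2": [1.3]}
irms = {"s1": [-27.0, 4.1, 1.2], "s2": [-26.5, 0.3, 0.9], "s3": [-28.0, 5.0, 1.0]}
fused = concatenate_blocks(nmr, mir, irms)
```

In this toy case sample s3 is dropped because it lacks an MIR measurement, mirroring the restriction to the samples available for every method.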
Comparing the LDA classification quality of the different data fusion models, the best results were achieved for 1H NMR + MIR + IRMS, with 100% correct classifications for random, cultivar, and greenhouse validation of small fruited tomatoes and 95.0% for random, 100% for cultivar, and 98.3% for greenhouse validation of normally fruited tomatoes. The second-best LDA validation results were obtained for the combination of 1H NMR + IRMS (94.9%-100.0%), while the results for 1H NMR + MIR (72.9%-100%) and MIR + IRMS (83.3%-100%) differ in individual validations but are comparable in view of the average quality of all results.
Comparison of results for fused data and individual analysis
Comparing the classification results of concatenated data to the findings of the individual methods, LDA validation of 1H NMR + MIR + IRMS (95.0%-100%) yielded better results than LDA validation of the individual methods (1H NMR: 91.3%-100% and MIR: 75.6%-91.7%; Table 3). This supports the approach of combining these methods to achieve synergies for an optimized differentiation of organically and conventionally grown tomatoes.
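The ranges compared in this section can be recomputed directly from Table 3. The sketch below transcribes the relevant columns of Table 3 (percent correct classifications; order within each list: random, cultivar, greenhouse validation); the dictionary layout and function name are illustrative choices.

```python
# Percent correct classifications, transcribed from Table 3
table3 = {
    "1H NMR": {"small": [93.5, 91.3, 95.7], "normal": [100.0, 100.0, 99.1]},
    "MIR": {"small": [80.0, 82.2, 75.6], "normal": [91.7, 85.3, 83.5]},
    "1H NMR + MIR + IRMS": {"small": [100.0, 100.0, 100.0],
                            "normal": [95.0, 100.0, 98.3]},
}


def value_range(method):
    """Return (minimum, maximum) over all six validation results of a method."""
    values = table3[method]["small"] + table3[method]["normal"]
    return min(values), max(values)


ranges = {method: value_range(method) for method in table3}
```

The fused model has the highest lower bound of the three, which is the basis for the comparison made above.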
Regarding single analytical methods, the classification results of the 1H NMR models are especially promising. However, the quality of 1H NMR models depends crucially on how representative the model samples are, in order to avoid overfitting; in general, more tomato samples differing in cultivar and specific growing conditions need to be measured to further strengthen the significance of the results. Within the limits of the currently available sample set, test set validation was performed as the best way to obtain a realistic estimate of the quality of the results. In the future, if the database of authentic tomato samples is sufficiently widened, enhanced chemometric classification models can be used as a helpful screening tool to investigate the authenticity of tomatoes. Moreover, additional MIR and IRMS measurements can improve the classification results of 1H NMR analysis alone.
Acknowledgement
Special thanks are due to colleagues from the Bavarian State Research Institute of Viticulture and Horticulture (LWG, Bamberg, Germany) and the State Horticultural College and Research Institute Heidelberg (LVG, Heidelberg, Germany), and to producers of the region “Knoblauchsland” for providing authentic tomato samples.
References
1. European Parliament, Committee on the Environment, Public Health and Food Safety. Draft report on the food crisis, fraud in the food chain and the control thereof (2013/2091(INI)). http://www.europarl.europa.eu/sides/getDoc.do?pubRef=-//EP//NONSGML+COMPARL+PE-519.759+02+DOC+PDF+V0//EN&language=EN (as from 05.08.2015).
2. Sahota, A. The world of Organic Agriculture, Statistics and Emerging Trends 2013. The Global Market for Organic Food & Drink. https://www.fibl.org/fileadmin/documents/shop/1606-organic-world-2013.pdf (as from 05.08.2015).
3. Caris-Veyrat, C.; Amiot, M. J.; Tyssandier, V.; Grasselly, D.; Buret, M.; Mikolajczak, M.; Guilland, J. C.; Bouteloup-Demange, C.; Borel, P. Influence of organic versus conventional agricultural practice on the antioxidant microconstituent content of tomatoes and derived purees; Consequences on antioxidant plasma status in humans. J. Agric. Food Chem. 2004, 52, 6503-6509.
4. Presseinformation der Bundesanstalt für Landwirtschaft und Ernährung vom 09.07.2013 - 20,6 kg pro Kopf verzehrt: Tomaten sind der Deutschen liebstes Gemüse. http://www.ble.de/SharedDocs/Downloads/08_Service/04_Pressemitteilungen/Archiv2013/130709_Tomaten.pdf;jsessionid=F8552452F0D99F07C45DD6E21B128375.1_cid335?__blob=publicationFile (as from 05.08.2015).
5. Mitchell, A. E.; Hong, Y. J.; Koh, E.; Barrett, D. M.; Bryant, D. E.; Denison, R. F.; Kaffka, S. Ten-year comparison of the influence of organic and conventional crop management practices on the content of flavonoids in tomatoes. J. Agric. Food Chem. 2007, 55, 6154-6159.
6. Vallverdú-Queralt, A.; Medina-Remón, A.; Casals-Ribes, I.; Amat, M.; Lamuela-Raventós, R. M. A Metabolomic Approach Differentiates between Conventional and Organic Ketchups. J. Agric. Food Chem. 2011, 59, 11703-11710.
7. Vallverdú-Queralt, A.; Medina-Remón, A.; Casals-Ribes, I.; Amat, M.; Lamuela-Raventós, R. M. Is there any difference between the phenolic content of organic and conventional tomato juice? Food Chem. 2012, 130, 222-227.
8. Kelly, S. D.; Bateman, A. S. Comparison of mineral concentrations in commercially grown organic and conventional crops - Tomatoes (Lycopersicon esculentum) and lettuces (Lactuca sativa). Food Chem. 2010, 119, 738-745.
9. Gosling, P.; Hodge, A.; Goodlass, G.; Bending, G. D. Arbuscular mycorrhizal fungi and organic farming. Agric. Ecosyst. Environ. 2006, 113, 17-35.
10. Bateman, A. S.; Kelly, S. D.; Jickells, T. D. Nitrogen isotope relationships between crops and fertilizer: implications for using nitrogen isotope analysis as an indicator of agricultural regime. J. Agric. Food Chem. 2005, 53, 5760-5765.
11. Bateman, A. S.; Kelly, S. D.; Woolfe, M. Nitrogen isotope composition of organically and conventionally grown crops. J. Agric. Food Chem. 2007, 55, 2664-2670.
12. Rogers, K. M. Nitrogen isotopes as a screening tool to determine the growing regimen of some organic and nonorganic supermarket produce from New Zealand. J. Agric. Food Chem. 2008, 56, 4078-4083.
13. Hohmann, M.; Christoph, N.; Wachter, H.; Holzgrabe, U. 1H NMR profiling as an approach to differentiate conventionally and organically grown tomatoes. J. Agric. Food Chem. 2014, 62, 8530-8540.
14. Erich, S.; Schill, S.; Annweiler, E.; Waiblinger, H. U.; Kuballa, T.; Lachenmeier, D. W.; Monakhova, Y. B. Combined chemometric analysis of 1H NMR, 13C NMR and stable isotope data to differentiate organic and conventional milk. Food Chem. 2015, 188, 1-7.
15. Monakhova, Y. B.; Godelmann, R.; Hermann, A.; Kuballa, T.; Cannet, C.; Schäfer, H.; Spraul, M.; Rutledge, D. N. Synergistic effect of the simultaneous chemometric analysis of 1H NMR spectroscopic and stable isotope (SNIF-NMR, 18O, 13C) data: Application to wine analysis. Anal. Chim. Acta 2014, 833, 29-39.
16. Di Anibal, C. V.; Callao, M. P.; Ruisánchez, I. 1H NMR and UV-visible data fusion for determining Sudan dyes in culinary spices. Talanta 2011, 84, 829-833.
17. MacGregor, J. F.; Jaeckle, C.; Kiparissides, C.; Koutoudi, M. Process monitoring and diagnosis by multiblock PLS methods. AIChE J. 1994, 40, 826-838.
18. Westerhuis, J. A.; Smilde, A. K. Deflation in multiblock PLS. J. Chemometr. 2001, 15, 485-493.
19. Cozzolino, D.; Holdstock, M.; Dambergs, R. G.; Cynkar, W. U.; Smith, P. A. Mid infrared spectroscopy and multivariate analysis: A tool to discriminate between organic and non-organic wines grown in Australia. Food Chem. 2009, 116, 761-765.
20. Qannari, E. M.; Wakeling, I.; Courcoux, P.; MacFie, H. J. H. Defining the underlying sensory dimensions. Food Qual. Prefer. 2000, 11, 151-154.
21. Qannari, E. M.; Wakeling, I.; MacFie, H. J. H. A hierarchy of models for analysing sensory data. Food Qual. Prefer. 1995, 6, 309-314.
22. Drinkwater, L. E.; Letourneau, D. K.; Workneh, F.; van Bruggen, A. H. C.; Shennan, C. Fundamental differences between conventional and organic tomato agroecosystems in California. Ecol. Appl. 1995, 5, 1098-1112.
23. Flury, B.; Riedwyl, H. Angewandte multivariate Statistik, 1st ed.; Gustav Fischer Verlag: Stuttgart, Germany, 1983.
24. Marini, F.; Magrí, A. L.; Balestrieri, F.; Fabretti, F.; Marini, D. Supervised pattern recognition applied to the discrimination of the floral origin of six types of Italian honey samples. Anal. Chim. Acta 2004, 515, 117-125.
25. Cordella, C.; Bertrand, D. SAISIR: A new general chemometric toolbox. Trac – Trend Anal. Chem. 2014, 54, 75-82.
26. Vigneau, E.; Qannari, E. M. Clustering of Variables Around Latent Components. Commun. Stat. – Simul. C. 2003, 32, 1131-1150.
27. Cuny, M.; Vigneau, E.; Le Gall, G.; Colquhoun, I.; Lees, M.; Rutledge, D. N. Fruit juice authentication by 1H NMR spectroscopy in combination with different chemometrics tools. Anal. Bioanal. Chem. 2008, 390, 419-427.
28. Monakhova, Y. B.; Godelmann, R.; Kuballa, T.; Mushtakova, S. P.; Rutledge, D. N. Independent components analysis to increase efficiency of discriminant analysis methods (FDA and LDA): Application to NMR fingerprinting of wine. Talanta 2015, 141, 60-65.
29. Tóth, G.; Bodai, Z.; Heberger, K. Estimation of influential points in any data set from coefficient of determination and its leave-one-out cross-validated counterpart. J. Comput. Aid. Mol. Des. 2013, 27, 837-844.
30. Riedl, J.; Esslinger, S.; Fauhl-Hassek, C. Review of validation and reporting of non-targeted fingerprinting approaches for food authentication. Anal. Chim. Acta 2015, 885, 17-32.
31. Schmidt, H. L.; Roßmann, A.; Voerkelius, S.; Schnitzler, W. H.; Georgi, M.; Grassmann, J.; Zimmermann, G.; Winkler, R. Isotope characteristics of vegetables and wheat from conventional and organic production. Isot. Environ. Healt. S. 2005, 41, 223-238.
32. Vandevoort, F. R. Fourier transform infrared spectroscopy applied to food analysis. Food Res. Int. 1992, 25, 397-403.
33. Cozzolino, D.; Smyth, H. E.; Gishen, M. Feasibility study on the use of visible and near-infrared spectroscopy together with chemometrics to discriminate between commercial white wines of different varietal origins. J. Agric. Food Chem. 2003, 51, 7703-7708.
34. Bauer, R.; Nieuwoudt, H.; Bauer, F. F.; Kossmann, J.; Koch, K. R.; Esbensen, K. H. FTIR spectroscopy for grape and wine analysis. Anal. Chem. 2008, 80, 1371-1379.
35. Lachenmeier, D. W. Rapid quality control of spirit drinks and beer using multivariate data analysis of Fourier transform infrared spectra. Food Chem. 2007, 101, 825-832.
36. Mazerolles, G.; Hanafi, M.; Dufour, E.; Bertrand, D.; Qannari, E. M. Common components and specific weights analysis: a chemometric method for dealing with complexity of food products. Chemometr. Intell. Lab. 2006, 81, 41-49.
Notes
This research project was funded by the Bavarian State Ministry of the Environment and Consumer Protection. Y. Monakhova acknowledges funding in the framework of state contract 4.1708.2014K of the Russian Ministry of Education.
Figure Captions
Figure 1: Box plot of δ15N values of the aqueous tomato phase (expressed as ‰ vs. atmospheric nitrogen) with regard to the cultivation method (organic: light grey; conventional: dark grey) for all tomato samples (left) and for tomato samples of individual greenhouses (right); each box is bounded by the 25th and 75th percentiles, and the whiskers by the 5th and 95th percentiles.
Figure 2: PCA of NMR data; A: scatter plot of PC1 vs. PC2 with square symbols for normally fruited (Bocati blue, Hamlet red, Mecano yellow, Savantas cyan, Seviocard purple, and Tica pink) and triangular symbols for small fruited tomato samples (Sakura yellow, Sunstream light green, Tastery purple); B: scatter plot of PC1 vs. PC2 with yellow square symbols for normally fruited and blue triangular symbols for small fruited tomato samples; C: scatter plot of PC1 vs. PC5 with square symbols for normally fruited and triangular symbols for small fruited tomato samples, colored light grey for organic and dark grey for conventional cultivation methods.
Figure 3: Scatter plot of PC1 vs. PC2 for PCA of MIR data; A: square symbols for normally fruited (Bocati blue, Hamlet red, Mecano yellow, Savantas cyan, Seviocard purple, and Tica pink) and triangular symbols for small fruited tomato samples (Sakura yellow, Sunstream light green, Tastery purple); B: square symbols for normally fruited (colored light grey for organic and dark grey for conventional cultivation methods) and colorless triangular symbols for small fruited tomato samples; C: colorless square symbols for normally fruited and triangular symbols (colored light grey for organic and dark grey for conventional cultivation methods) for small fruited tomato samples.
Figure 4: Panels A-D are shown for normally fruited (left) and small fruited tomatoes (right), with square symbols for normally fruited and triangular symbols for small fruited tomato samples, each colored light grey for organic and dark grey for conventional cultivation methods. A: three-dimensional plot of the first three dimensions of ComDim analysis for 1H NMR + MIR + IRMS data; B: PCA scatter plot (PC1 vs. PC5) for 1H NMR data; C: PCA scatter plot (PC1 vs. PC2) for MIR data; D: box plot of δ15N of the aqueous tomato phase (expressed as ‰ vs. atmospheric nitrogen).
Figure 5: Salience of 1H NMR, MIR, and IRMS data on the first three dimensions of ComDim analysis for normally fruited (left) and small fruited tomatoes (right).
Table 1: Number of tomato samples for measurements of 1H NMR/MIR/IRMS/data fusion analysis (from left to right in each cell) with respect to harvesting period, cultivar, and greenhouse.
2013 2014
Mecano Tica Tastery Bocati Hamlet Mecano Savantas Seviocard Tica Sakura Sunstream Tastery
BA org. 6/6/5/5 1/0/0/0 6/6/5/5 6/6/0/0 6/6/6/6
6/6/0/0 6/6/0/0 6/5/0/0 6/6/6/6
N org. 6/6/6/6
HD org.
6/5/6/5 4/4/4/4 4/4/4/4 6/6/0/0 4/4/4/4 3/3/0/0 3/3/3/3
4/4/4/4 3/3/3/3
BA conv. 6/6/5/5
N conv.1 8/7/6/6
N conv.2 8/7/6/6
6/6/5/5
7/6/6/5
6/6/6/6
6/6/0/0 6/6/6/6
6/6/0/0
6/6/0/0 1/1/0/0
6/6/0/0 6/6/0/0 7/7/1/1 6/6/6/6
6/6/0/0
5/5/0/0
HD conv.
4/4/4/4
3/3/3/3 4/4/4/4
Table 2: Number of calibration and validation samples for each validation step for classification models of 1H NMR, MIR, and data fusions (listed from left to right in each cell as 1H NMR/MIR/data fusion); numbers in brackets are given for information only, as no validation was performed with these test sets.
Validation set      Calibration      Validation
Small fruited tomatoes, cultivar validation
Sakura              81/78/53         12/12/0
Sunstream           73/71/45         20/19/8
Tastery             32/31/(8)        61/59/(45)
Small fruited tomatoes, greenhouse validation
BA organic          69/67/42         24/23/11
N organic           87/85/48         6/5/5
HD organic          86/83/46         7/7/7
BA conv.            68/65/41         25/25/12
N conv. 1           80/78/48         13/12/5
N conv. 2           82/79/47         11/11/6
HD conv.            86/83/46         7/7/7
Normally fruited tomatoes, cultivar validation
Bocati              108/105/55       4/4/4
Hamlet              96/93/55         16/16/4
Mecano              40/39/(11)       72/70/(48)
Savantas            108/105/59       4/4/0
Seviocard           109/106/56       3/3/3
Tica                99/97/59         13/12/0
Normally fruited tomatoes, greenhouse validation
BA organic          87/85/48         25/24/11
N organic           97/94/53         15/15/6
HD organic          97/94/44         15/15/15
BA conv.            88/85/48         24/24/11
N conv. 1           98/96/53         14/13/6
N conv. 2           97/95/53         15/14/6
HD conv.            108/105/55       4/4/4
Table 3: Test set validation for LDA (correct classifications in %) using data of 1H NMR, MIR, 1H NMR + MIR, 1H NMR + IRMS, MIR + IRMS, and 1H NMR + MIR + IRMS, separately for small and normally fruited tomatoes, each with random validation and validation of individual cultivars and greenhouses.
Validation step   1H NMR   MIR    1H NMR + MIR   1H NMR + IRMS   MIR + IRMS   1H NMR + MIR + IRMS
Small fruited tomatoes
random            93.5     80.0   96.7           100.0           100.0        100.0
cultivar          91.3     82.2   100.0          100.0           83.3         100.0*
greenhouse        95.7     75.6   73.6           96.2            84.9         100.0
Normally fruited tomatoes
random            100.0    91.7   94.4           95.0            100.0        95.0
cultivar          100.0    85.3   90.9           100.0           90.9         100.0*
greenhouse        99.1     83.5   72.9           94.9            88.1         98.3
* Cultivar validation for Mecano and Tastery was omitted because of an inappropriate ratio between the number of validation and calibration samples; only the remaining cultivars served as test sets for cultivar validation.
Figure 1.
Figure 2.
Figure 3.
Figure 4.
Figure 5.
TOC