Volatile-Compound Fingerprinting by Headspace-Gas


Article

Volatile Compound Fingerprinting by Headspace Gas Chromatography-Ion Mobility Spectrometry (HS-GC-IMS) for the Authenticity Assessment of Honey as Benchtop Alternative to 1H-NMR Profiling
Natalie Gerhardt, Markus Birkenmeier, Sebastian Schwolow, Sascha Rohn, and Philipp Weller
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b03748 • Publication Date (Web): 03 Jan 2018


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Natalie Gerhardt*, Markus Birkenmeier*, Sebastian Schwolow*, Sascha Rohn‡, Philipp Weller*

* Institute for Instrumental Analytics and Bioanalysis, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
‡ Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, 20146 Hamburg, Germany

ABSTRACT: This work describes a simple approach to the untargeted profiling of volatile compounds for the authentication of the botanical origin of honey based on resolution-optimized HS-GC-IMS, combined with optimized chemometric techniques, namely PCA, LDA and kNN. A direct comparison of the PCA-LDA models between HS-GC-IMS and 1H-NMR data demonstrated that HS-GC-IMS profiling could be used as a complementary tool to the NMR-based profiling of honey samples. While NMR profiling still requires comparatively precise sample preparation, in particular pH adjustment, HS-GC-IMS fingerprinting may be considered as an alternative approach for a fully automatable, cost-efficient and, in particular, highly sensitive method. It was demonstrated that all tested honey samples could be distinguished based on their botanical origin. The loadings plots revealed the volatile compounds responsible for the differences among different monofloral honeys. The HS-GC-IMS-based PCA-LDA model was composed of two linear functions of discrimination and 10 selected PCs that discriminated canola, acacia and honeydew honeys with a predictive accuracy of 98.6 %. Application of the LDA model to an external test set of ten authentic honeys clearly proved the high predictive ability of the model by correctly classifying them into three variety groups with a correct classification rate of 100 %. The constructed model presents a simple and efficient method of analysis and may serve as a basis for the authentication of other food types.

By definition, honey is a natural and almost non-treated food, which is highly valued by consumers as an authentic and high-quality product. As a result, one of the major concerns of the value-added food chain is to ensure and enforce both authenticity and quality. With the growing globalization of the honey market, the identification of the botanical origin of honey as well as the proof of its authenticity in terms of composition and geography has become an increasingly challenging task. In particular, high-priced monofloral honeys of rare botanical origin are a potential target for food fraud. According to the EU Commission’s publicly available reports,1 honey is one of the most frequently adulterated food products, which is regularly found to be non-compliant with the quality specifications defined by the EU Commission’s Directive 2001/110/EC. In most cases, this referred to a faulty declaration of the botanical source. In recent years, numerous studies have been carried out to identify specific marker compounds for a particular type of honey.2-8 However, it is difficult to identify and detect characteristic floral markers in honey of various botanical origins, as possible indicators that might distinguish between the different types of honey depend not only on floral source, but also on geography and ecophysiological factors such as climate, soil, and season, and also on storage conditions.9,10 Furthermore, only a few compounds seem to be truly specific for certain monofloral honeys, and many of them can also be found in variable concentrations in various honey types. Consequently, there are no identified chemical markers or sets of markers for authenticity that are accessible by conventional, target-based analyses. Hence, an analytical approach is required that covers a multitude of parameters in parallel on the one hand and offers strong discrimination power on the other hand. This is reflected by the plethora of studies published over the last years, which cover a broad scope of analytical techniques described as

ACS Paragon Plus Environment


potentially suitable tools. While the most commonly used methods for quality and authenticity in many laboratories are still sensory analysis and physicochemical methods, which rely on very traditional markers, such as 5-hydroxymethylfurfural, enzyme activity, moisture and mono- and disaccharides, the botanical and geographical origin of honey is typically determined by melissopalynological analysis based on a tedious and time-consuming identification of the pollen using microscopy.11 However, analysis and data interpretation are very time-consuming and do not always lead to correct assignments, mostly due to the univariate approach and its inherent low selectivity. Alternative methods are mostly based on targeted as well as non-targeted spectroscopic and spectrometric techniques and typically involve the use of chromatographic methods such as HPLC-UV,12,13 headspace solid-phase microextraction gas chromatography coupled with mass spectrometry (HS-SPME-GC/MS),14-21 HS-GC-QTOF-MS,22 HPLC-MS/MS,23-25 UPLC-QTOF-MS,26,27 IR-based spectroscopic techniques (FT-MIR/NIR)28-30 or Raman spectroscopy.31-33 A different approach is followed by isotope ratio mass spectrometry (IRMS) based techniques, where either the major constituting sugars are analyzed for their 13C pattern via HPLC-C-IRMS or the isolated protein fraction is analyzed as a bulk fraction.34,35 While these techniques overcame a number of the limitations of traditional, wet-chemistry based assays, the major issue of a complex and time-consuming sample preparation remains for many of the proposed techniques, in particular for GC-based techniques, as well as for IRMS methods. This obviously contradicts the idea of a rapid characterization and botanical classification.
Furthermore, IR techniques, either based on ATR-FT-MIR or NIR, typically suffer from a high dependence on water content, ambient temperature, particle size, and other parameters that decrease reproducibility, which is a crucial factor for chemometric approaches. Thus, there is an urgent need for non-target based techniques that require little or ideally, no sample preparation on the one hand, but deliver high selectivity on the other hand. This is reflected by a number of studies published over the last five years on a non-targeted profiling of honey using high-resolution 1H-NMR spectroscopy in combination with multivariate statistical analysis.36-41 All of these studies, together with our previous feasibility study42 proved NMR-based screening to be a suitable tool for the rapid authenticity analysis of honey with a high discriminative power. However, costs for such equipment are massive: high costs of ownership, maintenance and the requirement for expert knowledge limit the use to few, specialized laboratories. While NMR measurements per se commonly require little or no sample preparation, this is not the case for chemometric approaches. For such analyses, the samples require highly precise pH adjustment, as otherwise non-linear effects due to shifting protons, e.g., from hydrated aldehydes lead to misinterpretation by multivariate analysis. Moreover, in comparison to hyphenated MS techniques, NMR significantly lacks the sensitivity to


capture a wide range of minor honey components (i.e. amino acids, polyphenols, organic acids and their esters) at very low concentration levels, in part below 0.1 ppm. This is a significant drawback, as the potentially decisive information and thus the distinguishing power are often determined by minor compounds, which may contribute to honey aroma if their concentration exceeds the odor threshold. Consequently, an efficient complementary method to the NMR-based profiling of honey samples is required, which should be implementable in routine applications with a rugged and fast setup featuring a low cost of ownership on the one hand, but with a high sensitivity at the sub-ppm level on the other hand. In the present study, an innovative analytical approach for the detection of the botanical origin of honey was developed, based on the analysis of the volatile organic compound (VOC) fraction using a resolution-optimized headspace gas chromatography-ion mobility spectrometry (HS-GC-IMS) setup. It consists of a drift time ion mobility spectrometer (DTIMS) coupled to a headspace capillary gas chromatograph (HS-GC), as described recently in a proof-of-concept study by our research group.43 HS-GC-IMS has been demonstrated to be an effective separation technique that offers a comparatively simple system setup, benchtop size, robustness, and a low price.44-48 As the method is headspace-based, time-consuming sample pretreatment steps are usually not required, which means that the analysis is carried out using an almost untreated sample. GC-IMS is in principle a 2D approach, combining GC retention with drift time. In the first dimension, the analytes are separated by gas chromatography based on their retention in a capillary column. The second dimension is defined by the drift times of ions formed in the ionization chamber of the IMS cell. The charged molecules are separated under the influence of an electrical field depending on their drift behavior in a buffer gas.
Drift times of such ions depend on the collision cross section (CCS) of the ion, which is directly connected to the structural parameters of size, shape, and charge location or distribution, and can be calculated by the Mason-Schamp equation. This second dimension is a truly orthogonal separation and makes it possible to separate isomeric compounds that may coelute in the GC separation.49,50 Honey features a distinct aroma profile depending on its biological origin, which is also the basis for sensory evaluations. In particular, monofloral honeys at least partially show highly characteristic aromas, among other things due to the presence of specific VOCs that originate from the nectar of specific plants. The analysis by HS-GC-IMS could generate highly resolved aroma fingerprints, which can be evaluated by chemometric techniques and subsequently allow the discrimination of the botanical origin of honeys. Headspace-based measurements commonly require little or no sample preparation, which would significantly simplify analysis. Consequently, the aim of this work was to demonstrate the potential of HS-GC-IMS as an effective complementary tool to the NMR-based screening methods currently applied in the discrimination between various monofloral botanical species of honeys.
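For orientation, the relation between ion mobility and collision cross section mentioned above is commonly written in the following textbook form. This is not reproduced from the manuscript, and all symbols are introduced here for illustration only: K is the ion mobility, q the ion charge, N the number density of the buffer gas, μ the reduced mass of the ion and buffer gas molecule, k_B the Boltzmann constant, T the gas temperature, Ω the collision cross section, t_d the drift time, L the drift length and E the electric field strength.

```latex
K = \frac{3}{16}\,\frac{q}{N}\,\sqrt{\frac{2\pi}{\mu\,k_{\mathrm{B}}\,T}}\;\frac{1}{\Omega},
\qquad
t_d = \frac{L}{K\,E}
```

Read together, the two expressions indicate that, under otherwise fixed conditions, an ion with a larger collision cross section has a lower mobility and therefore a longer drift time.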

EXPERIMENTAL SECTION

Table 1. Botanical variety and geographical origin of honey samples used as training set

Botanical origin | No. | Geographical origin
Acacia | 25 | Germany (6), France (3), Croatia (1), Hungary (4), Italy (1), Moldavia (1), EU / non-EU (8), Germany/Romania (1)
Canola | 23 | Germany (15), Chile (1), Poland (2), Romania (2), Bulgaria/Romania (1), EU / non-EU (2)
Honeydew | 24 | Turkey (2), Germany (9), Italy (3), EU / non-EU (9), Italy/Czechia (1), Italy/Germany (1), Switzerland (1)

Reagents and Honey Samples. All reagents and solvents were purchased at the highest available quality (≥98 %), with HPLC grade as the minimum. A total of 72 reference samples of monofloral honeys from various European countries were analyzed using HS-GC-IMS. The samples were acacia honey (Robinia pseudoacacia), canola honey (Brassica napus) and honeydew honey (forest flower honeys) from various countries, as shown in Table 1. In addition, ten further authentic honey samples were analyzed to provide external test set results. The samples were obtained from governmental food inspectors in Baden-Wuerttemberg, Germany, from supermarkets or directly from beekeepers. The botanical origin of the honey samples was confirmed by microscopic pollen analysis in the framework of the official control of foodstuffs. All samples were stored in the dark at room temperature (18 - 23 °C) in screw-cap jars before analysis. Anhydrous sodium chloride was obtained from VWR International GmbH (Darmstadt, Germany). Ultra-pure water was purified in-house using a Milli-Q water purification system (Millipore, Bedford, MA, USA). The internal standard 2-acetylpyridine was purchased from Sigma-Aldrich Chemie GmbH (Taufkirchen, Germany). Saturated sodium chloride solution was freshly prepared and used as a blank in each analytical run. A total of 41 analytical standards including aldehydes (2-pentenal, 2-methylpropanal, 2-methylbutanal, hexanal, trans-2-hexenal, trans,trans-2,4-hexadienal, trans-2-heptenal, pentanal, heptanal, octanal, benzaldehyde, acetaldehyde, phenylacetaldehyde, decanal, furfural, trans-2-nonenal), ketones (2-butanone, 3-hydroxy-2-butanone, acetophenone), alcohols (cis-3-hexen-1-ol, cis-2-penten-1-ol, 3-methylbutan-1-ol, 2-methyl-3-buten-2-ol, 1-octen-3-ol, 2-methylpropan-1-ol, 2-phenylethanol, hexan-1-ol, penten-3-ol, ethanol), esters (cis-3-hexenyl acetate, ethyl butanoate, hexyl acetate, butyl acetate, ethyl propionate and ethyl acetate), organic acids (2-methylheptanoic acid, benzoic acid, trans-cinnamic acid) and monoterpenes (D,L-menthol, cis-linalool oxide, D-limonene) were used for the identification of characteristic volatile compounds from honey samples. The analytical standards were purchased at ≥98 % purity from Sigma-Aldrich (Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany). Stock solutions (1000 mg/L) were prepared by dissolving each compound in Millipore water. All stock and standard solutions were stored at 4 °C prior to use.

Isolation of Volatile Organic Compounds in Honey. For the analysis of volatile compounds, 2 g of honey sample was introduced into a 20 mL headspace vial and subsequently incubated at 45 °C for 15 minutes after spiking with 18 µL of 2-acetylpyridine stock solution (1008 mg/L) as internal standard and mixing with 2 mL of saturated sodium chloride solution. Finally, 700 µL of sample headspace was automatically injected by means of a heated syringe (80 °C) into the heated injector of the GC-IMS equipment under the conditions reported below. The determination of headspace volatiles was performed in duplicate for each honey sample.
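As a quick plausibility check, the amount of internal standard per vial follows directly from the spiking volume and stock concentration stated above. The short sketch below only restates that arithmetic; the variable names are chosen here for illustration.

```python
# Values taken from the sample preparation paragraph above.
spike_volume_uL = 18          # volume of 2-acetylpyridine stock added per vial
stock_conc_mg_per_L = 1008    # concentration of the stock solution
honey_mass_g = 2              # honey weighed into the headspace vial

# 1 mg/L is the same as 1 ng/µL, so µL x mg/L gives nanograms directly.
internal_standard_ng = spike_volume_uL * stock_conc_mg_per_L
internal_standard_ug = internal_standard_ng / 1000
ug_per_g_honey = internal_standard_ug / honey_mass_g

print(f"{internal_standard_ug:.3f} µg per vial, {ug_per_g_honey:.3f} µg per g honey")
```

Under these figures each vial carries roughly 18 µg of 2-acetylpyridine, or about 9 µg per gram of honey.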

HS-GC-IMS Apparatus. Analysis was performed on an advanced ion mobility spectrometer (IMS) manufactured by Gesellschaft für Analytische Sensorsysteme mbH (Dortmund, Germany) coupled with an Agilent 6890N gas chromatograph (Agilent Technologies, Palo Alto, CA, USA). The system was equipped with a CombiPal GC autosampler (CTC Analytics AG, Switzerland) with a headspace sampling unit and a 2.5 mL gas-tight heatable syringe (Gerstel GmbH, Mühlheim, Germany). The injector port was equipped with a headspace glass liner (1.2 mm i.d.; Agilent, Waldbronn, Germany) to minimize peak broadening. The chromatographic separation was performed on a DB-225 capillary column (25 % phenyl, 25 % cyanopropyl methyl siloxane), 25 m x 0.32 mm x 0.25 µm film thickness (Agilent Technologies, Santa Clara, CA). Nitrogen of 99.99 % purity was used as the carrier gas at a constant flow of 1.5 mL/min. A gas purifier cartridge was used (Restek GmbH, Bad Homburg, Germany). The IMS drift time cell was mounted on the top side of the GC. The transfer line to the IMS cell was set to 120 °C. Following the gas-chromatographic separation, the analytes were ionized in the IMS ionization chamber by a 3H ionization source (300 MBq activity). The drift tube had a length of 10 cm and was operated at a constant voltage of 5 kV and a temperature of 90 °C with a nitrogen flow of 150 mL/min. The gas flow was controlled by a mass flow controller (Voegtlin Instruments AG, Aesch, Switzerland). The IMS cell was operated in positive ion mode. Each spectrum was the average of 6 scans obtained by using an injection pulse width of 150 µs, a sampling frequency of 150 kHz, a repetition rate of 21 ms and a blocking and injection voltage of 70 mV and 2500 mV, respectively. The data were collected using the LAV software version 2.2.1 from Gesellschaft für Analytische Sensorsysteme mbH (Dortmund, Germany).
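The acquisition settings above imply a few derived quantities that help when reading the spectra. The sketch below works them out under simplifying assumptions that are not stated in the text: the whole 21 ms repetition interval is digitized, the 6 scans are acquired back to back without overhead, and the full 5 kV drops uniformly across the 10 cm drift tube.

```python
# Instrument settings quoted in the apparatus description.
sampling_frequency_kHz = 150
repetition_interval_ms = 21
scans_averaged = 6
drift_length_cm = 10
drift_voltage_V = 5000

# kHz x ms = number of samples (assumes the full interval is digitized).
points_per_scan = sampling_frequency_kHz * repetition_interval_ms

# Assumes the averaged scans follow each other without dead time.
time_per_averaged_spectrum_ms = scans_averaged * repetition_interval_ms

# Assumes a uniform field along the whole drift tube.
field_V_per_cm = drift_voltage_V / drift_length_cm

print(points_per_scan, time_per_averaged_spectrum_ms, field_V_per_cm)
```

With these assumptions, one scan holds 3150 points, one averaged spectrum takes about 126 ms, and the nominal drift field is 500 V/cm.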

Chromatographic Conditions. For analysis, a headspace volume of 700 µL was sampled at a speed of 250 µL/s and a syringe temperature of 80 °C to avoid condensation effects. Before each analysis, the syringe was automatically flushed with a stream of nitrogen for 2 minutes to avoid cross-contamination. Injection was performed into a split/splitless injector, operated at 150 °C in split mode (split 1:10). The GC oven temperature was programmed as follows: initial temperature 40 °C, held for 2 min, followed by a temperature ramp to 120 °C at 8 °C/min, held for 10 min.
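The oven program above can be turned into a total run time and a temperature-versus-time function with a few lines of arithmetic. This is an illustrative sketch only; the function name `oven_temperature` is introduced here and is not part of any instrument software.

```python
initial_temp_C = 40
final_temp_C = 120
ramp_C_per_min = 8
initial_hold_min = 2
final_hold_min = 10

ramp_min = (final_temp_C - initial_temp_C) / ramp_C_per_min
total_run_min = initial_hold_min + ramp_min + final_hold_min
total_run_s = total_run_min * 60


def oven_temperature(t_min):
    """Programmed oven temperature in °C at t_min minutes after injection."""
    if t_min <= initial_hold_min:
        return float(initial_temp_C)
    if t_min <= initial_hold_min + ramp_min:
        return initial_temp_C + ramp_C_per_min * (t_min - initial_hold_min)
    return float(final_temp_C)


print(total_run_min, total_run_s, oven_temperature(7))
```

The ramp takes 10 min, so one chromatographic run lasts 22 min (1320 s), and the oven reaches 80 °C seven minutes after injection.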

Data Preprocessing. Due to the very large datasets produced by HS-GC-IMS, an appropriate pretreatment procedure of the three-dimensional data is necessary in order to avoid significant errors in the multivariate statistical analysis. The preprocessing of the VOC spectra was performed using the MATLAB® software (version R2016a, Mathworks, Natick, MA). First, a baseline correction and a second-order Savitzky-Golay smoothing filter were applied to improve the signal-to-noise ratio of all spectra of honey samples. In the next step, the spectra were normalized relative to the expected reaction ion peak (RIP) position, followed by spline interpolation to create a common set of points on the drift time axis (x-axis) of the GC-IMS spectra (see Figure 1). Subsequently, the duplicate measurements were averaged for each sample. Additionally, in order to correct small random deviations in retention time, all spectra were aligned by a shift along the y-axis (retention time axis) based on a linear function fitted to a reference peak (2-acetylpyridine) using a specially designed MATLAB algorithm. Finally, a dataset restricted to the spectral regions in which signals appear was built for the statistical and pattern recognition analysis (Figure 2). The retention time and drift time ranges used for chemometric analysis were 100 to 900 s (4525 variables) and 7.04 to 13 ms (941 variables), respectively. The aligned and mean-centered dataset comprising 4525 x 941 variables and 72 samples (averaged spectra) constituted the starting point for the pattern recognition analysis.
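To make the preprocessing steps more concrete, the following sketch applies them to a single drift-time spectrum in plain Python. It is not the authors' MATLAB routine. Several choices are assumptions made for illustration: a 5-point window for the second-order Savitzky-Golay filter, taking the tallest peak as the RIP, and linear interpolation standing in for the spline interpolation described above.

```python
def savgol5_quadratic(y):
    """5-point, second-order Savitzky-Golay smoothing.

    The two points at each edge are returned unchanged.
    """
    weights = (-3, 12, 17, 12, -3)  # standard 5-point quadratic weights (sum = 35)
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(w * y[i + k - 2] for k, w in enumerate(weights)) / 35.0
    return out


def rip_relative_axis(drift_ms, intensity):
    """Rescale drift times so that the RIP (assumed: tallest peak) sits at 1.0."""
    rip_index = max(range(len(intensity)), key=intensity.__getitem__)
    rip_time = drift_ms[rip_index]
    return [t / rip_time for t in drift_ms]


def interp_linear(x, y, grid):
    """Resample (x, y) onto an increasing grid by linear interpolation."""
    out = []
    j = 0
    for g in grid:
        while j < len(x) - 2 and x[j + 1] < g:
            j += 1
        frac = (g - x[j]) / (x[j + 1] - x[j])
        out.append(y[j] + frac * (y[j + 1] - y[j]))
    return out


# Toy spectrum: five drift-time points with the tallest signal at 7.5 ms.
drift_ms = [7.0, 7.5, 8.0, 8.5, 9.0]
intensity = [1.0, 9.0, 2.0, 1.0, 1.0]
relative_axis = rip_relative_axis(drift_ms, intensity)
common_grid = [0.95, 1.0, 1.1]
resampled = interp_linear(relative_axis, intensity, common_grid)
```

Resampling every spectrum onto the same relative drift-time grid is what allows spectra from different runs to be stacked column by column before averaging duplicates and aligning retention times.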

1H-NMR Measurements at 400 MHz

All 1H-NMR data discussed in this work were already described in our previous study.42 Detailed sample preparation and instrumental parameters are given there.

Chemometric Data Analysis and Software. As a first step, before performing the two-way multivariate analysis, the original preprocessed three-way ion mobility data arrays were unfolded to matrices, resulting in a data matrix consisting of 72 rows (72 honey samples) and 4258025 variables (4525 x 941: retention time x drift time). The techniques applied to HS-GC-IMS and 1H-NMR data for the multivariate data assessment of the honey samples were principal component analysis (PCA) for dimensionality reduction, followed by linear discriminant analysis (LDA) and k-nearest neighbors (kNN) for the subsequent classification. The LDA model was built with the training data set only. Subsequently, the test set samples were projected into this LDA scores space and plotted. As an additional method, PLS discriminant analysis (PLS-DA) was also evaluated for classification. The kNN classifier was applied to generate non-linear classifications by finding the k training examples closest to the unknown sample using Euclidean distance and assigning the predominant class among them. In this work, a value of k = 5 was used. Finally, the predictive ability of the PCA-LDA, PLS-DA and kNN models was evaluated by k-fold cross-validation (k = 10), for which a random stratified k-fold partition was created. K-fold cross-validation is a widely used approach for estimating the test error by leaving out one of the k parts, fitting the model to the other k-1 parts, and then obtaining predictions for the left-out part. Each subsample has roughly equal size and roughly the same class proportions as the complete data set. For better comparability, equal random numbers were used in cross-validation for both HS-GC-IMS and 1H-NMR-based models. All calculations and preprocessing steps were carried out using in-house MATLAB® routines (version R2016a, Mathworks, Natick, MA, USA). The PCA, kNN, PCA-LDA and PLS-DA models were built by applying the MATLAB® Statistical Toolbox (Version 9.1). Before importing HS-GC-IMS data into MATLAB, all raw data files were first converted to csv text files using the csv export tool implemented in LAV software version 2.2.1 from Gesellschaft für Analytische Sensorsysteme mbH (Dortmund, Germany).
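The validation scheme described above, kNN with k = 5 and Euclidean distance evaluated by stratified 10-fold cross-validation, can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' MATLAB implementation: the two-dimensional toy points stand in for PCA scores, only the class sizes are taken from Table 1, and the PCA, LDA and PLS-DA steps are left out.

```python
import math
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def stratified_folds(labels, n_folds=10, seed=0):
    """Split sample indices into folds that keep roughly the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for index in members:  # deal the samples out round-robin
            folds[position % n_folds].append(index)
            position += 1
    return folds


def cross_validated_accuracy(X, y, n_folds=10, k=5, seed=0):
    """Fraction of samples classified correctly while held out."""
    correct = 0
    for test_idx in stratified_folds(y, n_folds, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return correct / len(X)


# Toy stand-in for PCA scores: three well-separated clusters (hypothetical data).
rng = random.Random(1)
centers = {"acacia": (-5.0, -5.0), "canola": (0.0, 5.0), "honeydew": (5.0, -5.0)}
sizes = {"acacia": 25, "canola": 23, "honeydew": 24}
X, y = [], []
for name, (cx, cy) in centers.items():
    for _ in range(sizes[name]):
        X.append((cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)))
        y.append(name)

accuracy = cross_validated_accuracy(X, y)
```

Fixing the seed mirrors the use of equal random numbers for both data sets: the same fold assignment can then be reused when a second model is evaluated on the same samples.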

RESULTS AND DISCUSSION

Volatile Fraction Profiling by HS-GC-IMS. In this study, HS-GC-IMS was used to compare various monofloral honeys based on their VOC profiles in order to establish a fast and reliable method for authenticity screening. The resulting three-dimensional VOC profiles are complex and feature more than 100 individual signals. Consequently, the discrimination between floral origins was based on a non-targeted profiling approach as opposed to the classical targeted approaches of selecting one or more chemical compounds as markers. This approach of using the complete spectral information requires chemometric techniques that use signals and signal intensities as variables without identifying them or establishing calibration curves. While the identification and quantification of individual substances may be relevant for subsequent specification analyses (e.g. HMF as a heat treatment indicator or ethanol as a spoilage marker), this study focused on the chemometric analysis of fingerprint information without prior identification and calibration of all measured substances. However, for a better understanding of the chemical identity of the volatiles responsible for class separation, 18 exemplary target compounds were identified and analyzed by HS-GC-IMS. This is shown in Figure 1 in the form of “feature maps”, which overlay the 2D spectra. By using these individual identifiers (features) obtained from the retention time index and the normalized drift time, the targets may be calibrated and quantitated by use of reference materials. In our previous study,43 a good stability and reproducibility of the fingerprint analysis by HS-GC-IMS was demonstrated, showing relative standard deviations of the retention time and drift time values of the significant peaks in the spectra of less than 1 %. Representative GC-IMS spectra corresponding to a canola, acacia or honeydew honey sample are shown in Figure 1. The VOC profiles of the analyzed monofloral honeys clearly vary in terms of their nature and signal intensities, depending on the botanical source of the studied honeys. The most prominent differences in the VOC pattern of different honeys were observed in the fingerprint region at retention times ranging from 200 – 600 s. It can be seen that low molecular weight VOCs are present at the highest concentration levels and elute in a time range between 200 – 300 s. These signals, for example acetaldehyde, ethanol, 2-methylpropanal, ethyl acetate, 2-butanone and 2-methylbutanal, are typically found in all honeys analyzed, and only minor variations in intensity were observed. The significant differences in the VOC profiles, which characterize different honey types, were observed at later retention times.
As an example, abundant VOCs detected at GC retention times between 350 – 500 s were highly specific for canola honey (see highlighted area in Figure 1), while these compounds were not found in acacia and honeydew honey samples or were only detected at low abundances. Further characteristic signals, which can be considered as potential markers, were observed solely or at least predominantly in honeydew honey at GC retention times in the range of 730 – 810 s (see highlighted area in Figure 1). Figure 1 shows the application of the feature map for the 18 selected compounds. The compounds were verified by comparison of the corresponding drift time and retention time with those of authentic reference compounds and, additionally, by standard addition. In general, linear and branched aldehydes, ketones and short-chain alcohols were found in most or all of the honeys analyzed. Some compounds were more prevalent in particular types of honey, e.g., acacia honey featured higher levels of hexanal and cis-linalool oxide, while canola honeys showed higher concentrations of benzaldehyde. The aroma profile of honeydew honey showed a characteristic pattern, where terpenes such as cis-linalool oxide and D-limonene were dominant. In particular, 3-hydroxy-2-butanone (acetoin), trans-2-pentenal and 3-methylbutanol were found to be characteristic for honeydew honeys. As is to be expected from the more intense aroma of honeydew honeys, it was found that these generally presented a richer volatile chromatographic profile at relatively low signal intensities, while in contrast, acacia honeys featured the least intensive profile of all three honeys analyzed. These results are closely associated with sensory perceptions, which confirm the aroma of acacia honey to be of lower intensity as compared to honeydew and canola honeys. According to previous publications,51,52 floral honeys are characterized by low free acidity, polyphenol content, and lactone quantity, whereas honeydew honeys are characterized by high free acidity, polyphenol content and lactone quantity. These compounds are at least partially accessible via headspace sampling and clearly contribute to the characteristic aroma profile of a specific honey type.


Figure 1. Representative HS-GC-IMS chromatogram-mobility plot of volatiles fingerprints of canola, acacia and honeydew honey samples. The signals characteristic for a specific honey type are highlighted. Identified signals: (1) acetaldehyde, (2) ethanol, (3) 2-methylpropanal, (4) ethyl acetate, (5) 2-butanone, (6) 2-methylbutanal, (7) 2-methylpropanol, (8) pentanal, (9) butyl acetate, (10) 3-methyl-1-butanol, (11) hexanal, (12) trans-2-pentenal, (13) 3-hydroxy-2-butanone (Acetoin), (14) heptanal, (15) furfural, (16) cis-linalool-oxide, (17) benzaldehyde, (18) D-limonene # = corresponding dimers formed in IMS drift tube

Honey Discrimination by PCA. In the scope of this study, the primary goal was the discrimination of honey samples from different floral origins rather than the determination of individual components of samples. The obtained HS-GC-IMS profiles are highly complex, featuring numerous peaks of different intensities and the presence of characteristic compounds mixed with non-characteristic ones. As a mandatory step to discriminate between different classes of honey samples, it is necessary to extract the complete profile of chemical differences from this complex data set. For this purpose, a mean-centered PCA was carried out over the 72 x 4525 x 941 data matrix (72 samples and 4525 x 941 chromatographic variables) as a first-pass unsupervised method to identify chemical differences between high-dimensional HS-GC-IMS spectra and to detect possible clusters within samples. In this way, a first verification of the discriminatory efficiency of the variables considered is provided. Figure 2a shows the scatter plot for the first three principal components (PCs) determined by PCA, through which a visualization of the data structure in a reduced dimension is obtained. The first three PCs explained 67 % of the total variance in the original information. In practice, it is more common to plot 2D plots of any two of the PCs, but these projections of multidimensional data onto a 2D space often lead to overlapping clusters in one plot, so that other combinations of PCs in other plots need to be checked to observe cluster separations. The PCA results from this first study section show a significant differentiation of all investigated groups in a single 3D plot. As can be seen from the PCA score plot in Figure 2a, honey samples of different botanical origin were well separated and grouped according to their floral source, with only a slight overlap between the two floral honeys, acacia and canola. PC1 and PC2 differentiate between acacia and honeydew, while PC3 separates these classes from the third one, canola. The PCA results clearly show that honeydew honeys feature significant differences from nectar honeys with respect to their VOC profile. The honeydew honeys were well discriminated by positive score values in PC1, while the cluster of acacia honey was well defined by negative scores in PC1 and PC2, and canola honey was well separated by positive score values in PC2. When regarding the score plot, it should be noted that the variability within each honey cluster is high, thus reflecting a wide geographical spread of all honey samples investigated and also a natural variability in the pollen profile. While scores plots show trends and groupings of data, loadings reveal the chemical basis of variation. The corresponding PCA loadings plot (Figure 2b) for the first three PCs confirmed that the VOCs in the fingerprint region (GC retention time: 200-600 s) of the spectra had the strongest influence on class separation.


Figure 2. (a): PCA 3D scatter plot of pre-processed HS-GC-IMS spectra of three monofloral honey types; (b): corresponding PCA loadings plot for the first three PCs describing 67 % of the total variance. Different honey groups show significant differences in spectral characteristics.

The PCA results demonstrate that the honey samples of all categories have different characteristics and feature different VOC profiles in multidimensional space. Thus, HS-GC-IMS spectra appear to contain useful information for building classification models that assign an unknown honey sample to the class to which it belongs.
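The unfolding and mean-centered PCA described in this section can be sketched as follows. This is a minimal NumPy sketch on synthetic data, not the authors' own workflow: the toy array sizes, the simulated class-specific peaks and the helper name `pca_scores` are assumptions made purely for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Mean-centered PCA via SVD.

    Returns scores, loadings and the explained-variance ratio of each kept PC.
    """
    Xc = X - X.mean(axis=0)                      # mean-center every variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained[:n_components]

rng = np.random.default_rng(0)
n_per_class, n_rt, n_dt = 24, 20, 15             # toy sizes (paper: 72 x 4525 x 941)
labels = np.repeat([0, 1, 2], n_per_class)

# Synthetic three-way array: samples x retention time x drift time
cube = rng.normal(size=(3 * n_per_class, n_rt, n_dt))
for k in range(3):
    cube[labels == k, 5 + 4 * k, 3 + 3 * k] += 8.0   # one class-specific peak

# Unfold each 2D spectrum into one row of variables, then run PCA
X = cube.reshape(cube.shape[0], -1)
scores, loadings, explained = pca_scores(X, n_components=3)

# A loading vector can be folded back into a retention-time x drift-time map
pc1_map = loadings[:, 0].reshape(n_rt, n_dt)
print("explained variance of PC1-PC3:", np.round(explained, 3))
```

Folding a loading vector back into the retention-time by drift-time grid, as in `pc1_map`, is one way to see which region of the spectrum drives a given PC, which is the role the loadings plot plays in Figure 2b.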

Honey Classification: HS-GC-IMS versus 1H-NMR Data

Each HS-GC-IMS spectrum used in this study comprises many thousands of variables. Using such high-dimensional data directly as input for classification algorithms such as LDA is not advisable, as the model would almost certainly be overfitted. Overfitting refers to the overweighting of residual noise relative to real information, which results in an apparent class separation where there is none. This can be overcome by reducing dimensionality using PCA. The eigenvalues of the principal components (which represent the share of variance retained by each component) showed that the first ten PCs account for more than 90 % of the total data variance. The optimal number of PCs for the PCA-LDA model was determined from the lowest cross-validated classification error rate, calculated for 2 to 20 PCs. The model was most robust when using 6 to 12 PCs. To illustrate the effect of too many PCs, we also used a full PC decomposition with all 71 PCs, resulting in a 72 x 71 data matrix. In this case, the overall accuracy of the LDA and kNN models decreased to 60 % and 38 %, respectively, which underlines that nearly all algorithms are prone to overfitting when too many PCs are used for model building. Table 2 shows the results obtained from a reduced PC set instead of the full PC decomposition. Hence, the PCA scores on the first ten extracted PCs were used for the calculations instead of the original variables: a new data matrix X (72 x 10) containing the scores of the 72 honey samples on the first ten PCs served as input for the LDA and kNN classification models.

In the next step, we compared HS-GC-IMS profiling with 1H-NMR profiling of the same samples to evaluate the discrimination quality of the 3D-VOC profiles. For this purpose, both the 1H-NMR fingerprints generated in a previously published study (38) for the same set of honey samples and the HS-GC-IMS data from this study were analyzed separately by PCA-LDA following the same procedure and were then used for direct comparison. The predictive accuracy of both LDA models was estimated by k-fold cross validation (k = 10). To enable a direct comparison between the two LDA models, we ensured that the same data set splits were employed for the HS-GC-IMS and the 1H-NMR data. Finally, after the model had been validated by the full cross-validation procedure, the predictive power of the resulting LDA model was shown by classifying new samples. For this, the PCA-LDA models were applied to an external test set of ten authentic honey samples (3x canola, 4x acacia and 3x honeydew honeys) in order to evaluate the prediction of class membership and to compare the classification ability of the HS-GC-IMS and 1H-NMR spectra. For both methods, the same sets of samples were used to train and test the model. These independent test set samples were recorded under the same analytical conditions, including microscopic pollen analysis.

Figure 3 shows the LDA scores plot of the first two discriminant functions for the classification of both the training and test sets after analysis of the honey samples on both the HS-GC-IMS and 1H-NMR platforms. In Figure 3a, it can be seen that the HS-GC-IMS fingerprints allow a good separation of acacia, canola and honeydew honeys based on the first and second discriminant functions. The separation between the three classes is quite clear, with only a slight overlap between the acacia and canola samples. The first discriminant function separates canola from acacia honeys, while the second discriminant function separates canola from the honeydew honeys. After cross-validation, the PCA-LDA and kNN models classified the samples into their respective botanical origin groups with overall accuracies of 98.6 % and 86.1 %, respectively. The best result was obtained by the PCA-LDA model. For comparison, PLS-DA was performed based on the first five PLS components. This approach delivered an overall accuracy of 97.0 %, comparable to that of PCA-LDA, with the same variables distinguishing the classes from one another. Depending on the separation problem, the PCA-LDA approach may have an advantage over PLS-DA, since it does not assume multicollinearity between the independent variables. Figure 3b shows that the discrimination quality of the VOC fingerprints matches the classification results obtained with the 1H-NMR fingerprints, which showed overall accuracies between 95.8 and 100 %. The classification results for HS-GC-IMS and 1H-NMR fingerprints by different chemometric methods are summarized in Table 2. Thus, our results demonstrate that VOC profiling by HS-GC-IMS in combination with multivariate statistics is an efficient tool for the rapid and cost-effective classification of honey samples by botanical origin.
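The procedure described above (PCA for dimensionality reduction, LDA on the scores, 10-fold cross validation over a range of PC counts, and identical splits for both platforms) can be sketched as follows. This is a hedged NumPy illustration on synthetic data, not the authors' implementation: all function names, the simulated data and the choice to refit the PCA inside each training fold are assumptions of this sketch.

```python
import numpy as np

def stratified_folds(y, k, seed=0):
    """Assign every sample to one of k folds, spreading each class over the folds."""
    rng = np.random.default_rng(seed)
    fold = np.empty(len(y), dtype=int)
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        fold[idx] = np.arange(len(idx)) % k
    return fold

def fit_pca(X, n_pc):
    """Return the training mean and the first n_pc loading vectors."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_pc].T

def fit_lda(T, y):
    """LDA with a pooled within-class covariance and empirical priors."""
    classes = np.unique(y)
    means = np.array([T[y == c].mean(axis=0) for c in classes])
    resid = T - means[np.searchsorted(classes, y)]
    cov = resid.T @ resid / (len(y) - len(classes))
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, np.linalg.pinv(cov), priors

def predict_lda(model, T):
    classes, means, icov, priors = model
    # Linear discriminant score: x'S^-1 m_k - 0.5 m_k'S^-1 m_k + log(prior_k)
    lin = T @ icov @ means.T
    const = -0.5 * np.sum(means @ icov * means, axis=1) + np.log(priors)
    return classes[np.argmax(lin + const, axis=1)]

def cv_accuracy(X, y, fold, n_pc):
    """Cross-validated accuracy of PCA-LDA; PCA is refit on each training fold."""
    correct = 0
    for f in np.unique(fold):
        train, test = fold != f, fold == f
        mean, P = fit_pca(X[train], n_pc)
        model = fit_lda((X[train] - mean) @ P, y[train])
        correct += np.sum(predict_lda(model, (X[test] - mean) @ P) == y[test])
    return correct / len(y)

# Synthetic stand-ins for two platforms measured on the same 72 samples
rng = np.random.default_rng(1)
y = np.repeat([0, 1, 2], 24)
shift = np.zeros((3, 150))
for c in range(3):
    shift[c, 5 * c:5 * c + 5] = 4.0              # class-specific signal
X_a = rng.normal(size=(72, 150)) + shift[y]
X_b = rng.normal(size=(72, 150)) + 1.5 * shift[y]

fold = stratified_folds(y, k=10)                 # one split, reused for both data sets
results = {n_pc: (cv_accuracy(X_a, y, fold, n_pc), cv_accuracy(X_b, y, fold, n_pc))
           for n_pc in (2, 6, 10, 20)}
for n_pc, (acc_a, acc_b) in results.items():
    print(f"{n_pc:2d} PCs: platform A {acc_a:.3f}, platform B {acc_b:.3f}")
```

Reusing the single `fold` vector for both data sets is what makes the two accuracies directly comparable, since every model is trained and tested on exactly the same samples in each fold.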

Figure 3. Linear discriminant analysis (LDA): (a) HS-GC-IMS data, (b) 1H-NMR data. The scores plot for classification includes the training set (72 honey samples) and the external test set (black symbols: A = acacia, C = canola, H = honeydew).

Table 2. Classification results for HS-GC-IMS and 1H-NMR fingerprints by different chemometric methods based on the first ten PCs after employing 10-fold cross validation

Three-class model(*)   Method         Overall Accuracy (%)
HS-GC-IMS              PCA-LDA(a)     98.6(c)
                       kNN(b)         86.1(c)
                       PLS-DA(a,#)    97.0(c)
1H-NMR                 PCA-LDA(a)     100(c)
                       kNN(b)         95.8(c)
                       PLS-DA(a,#)    100(c)

(a) Calculated considering the samples from all three groups
(b) kNN, k-nearest neighbor classification (k = 5)
(c) k-fold cross validation
(*) Acacia honey; Canola honey; Honeydew honey (n = 72)
(#) Based on the first five PLS components
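As a reading aid for Table 2, the snippet below converts counts of correctly classified samples into percentages for n = 72. The counts are an assumption of this sketch (the table reports percentages only); they are simply whole numbers out of 72 that round to three of the tabulated values and to 100 %.

```python
def accuracy_percent(n_correct, n_total=72):
    """Overall accuracy in percent, formatted to one decimal place."""
    return f"{100 * n_correct / n_total:.1f}"

# Hypothetical counts of correctly classified samples (not reported in the paper)
for n_correct in (62, 69, 71, 72):
    print(f"{n_correct}/72 correct -> {accuracy_percent(n_correct)} %")
```

Under this reading, the 98.6 % obtained for PCA-LDA on HS-GC-IMS data would correspond to a single misclassified sample during cross validation.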

After the external validation of the models, all samples from the test set were correctly recognized by the PCA-LDA model for both HS-GC-IMS and 1H-NMR data at the significance level described by the posterior probability. Thus, the predicted label of every test sample matched its true label. The results of the external validation are summarized in Table 3. The estimated posterior probability of each class for the test data gives information about the expected probability that a future sample is correctly classified when performing class prediction. The estimates were obtained using the kernel density function in MATLAB 9.2. The posterior probabilities were between 93.8 and 100 %, except for two acacia honey samples (A1 and A2, see also Table 3) in the HS-GC-IMS-based model, with probability estimates of 62.2 % and 76.9 %. A closer look at these two acacia test set samples in the HS-GC-IMS-based model (Figure 3a) shows that they lie closer to the cluster of canola samples, which is not the case for the 1H-NMR data (Figure 3b). Microscopic analysis showed that the acacia pollen profile of these two honeys was less typical owing to the presence of other dominant pollen types, particularly canola and sunflower, which could be the reason for the shift towards the canola cluster. Consequently, while the relevant discriminant information in the 1H-NMR spectra is generally determined by the major sugar signals rather than the minor components, the 3D-VOC profiles obtained by HS-GC-IMS seem to better reflect the pollen composition of a honey sample. This means that, at least for the honey samples analyzed here, 1H-NMR analysis tends towards an overoptimistic separation between classes in comparison to the more conservative separation based on HS-GC-IMS data. However, as the study is intended as a proof of concept, this finding should be considered tentative; a larger number of test set samples, including additional honey varieties, is required to verify this trend.

In summary, the results of the present study demonstrate that our setup, consisting of a resolution-optimized HS-GC-IMS combined with chemometric protocols, is suitable for a fast and cost-efficient discrimination of the botanical provenance of honeys. A direct comparison of the discriminant analysis between HS-GC-IMS and 1H-NMR data showed that the discrimination ability of the model using information extracted from the VOC spectra is comparable to that obtained from 1H-NMR data. These findings also suggest that HS-GC-IMS-based VOC profiling can serve as an important complementary and alternative tool for the reliable identification and quality control of honey, one that is less complex, faster, more cost-efficient, and also better associated with sensory perception. In particular, the HS-GC-IMS screening approach could be a powerful supplement to sensory evaluation.
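The posterior-probability estimates discussed above follow from Bayes' rule applied to class-conditional densities. The sketch below illustrates the idea with a simple Gaussian kernel density estimate on synthetic two-dimensional discriminant scores. It is not the MATLAB routine used in the study: the function names, the fixed bandwidth and the simulated clusters are assumptions made for illustration.

```python
import numpy as np

def gaussian_kde_density(train, query, bandwidth):
    """Gaussian kernel density estimate at `query`, built from the `train` points."""
    d = train.shape[1]
    diff = query[:, None, :] - train[None, :, :]          # shape (q, n, d)
    sq_dist = np.sum(diff**2, axis=2) / bandwidth**2
    norm = (2 * np.pi) ** (d / 2) * bandwidth**d
    return np.exp(-0.5 * sq_dist).mean(axis=1) / norm

def class_posteriors(scores, labels, query, bandwidth=1.0):
    """Posterior P(class | x) = prior * density / sum over classes (Bayes' rule)."""
    classes = np.unique(labels)
    priors = np.array([np.mean(labels == c) for c in classes])
    dens = np.column_stack([
        gaussian_kde_density(scores[labels == c], query, bandwidth)
        for c in classes
    ])
    joint = dens * priors
    return classes, joint / joint.sum(axis=1, keepdims=True)

# Synthetic training scores: three well-separated clusters of 24 samples each
rng = np.random.default_rng(2)
centers = np.array([[-4.0, 0.0], [0.0, 4.0], [4.0, 0.0]])
labels = np.repeat([0, 1, 2], 24)
scores = centers[labels] + rng.normal(size=(72, 2))

# Two hypothetical test samples: one at the center of class 0,
# one halfway between the centers of class 0 and class 1
query = np.array([[-4.0, 0.0], [-2.0, 2.0]])
classes, post = class_posteriors(scores, labels, query, bandwidth=1.0)
print(np.round(100 * post, 1))
```

A sample that falls between two clusters receives a visibly lower posterior for its own class than one at the cluster center, which is the qualitative pattern reported for test samples A1 and A2 in the HS-GC-IMS-based model.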

Table 3. Validation results using external test sets of honey samples with corresponding posterior-probability estimates

No.   True label   Predicted label   Posterior probability (%)
                                     GC-IMS    1H-NMR
1     A1           A                 62.2      100
2     A2           A                 76.9      99.9
3     A3           A                 97.2      100
4     A4           A                 93.8      100
5     C1           C                 99.9      100
6     C2           C                 99.5      100
7     C3           C                 99.9      100
8     H1           H                 99.9      100
9     H2           H                 99.9      100
10    H3           H                 100       100

A = Acacia honey; C = Canola honey; H = Honeydew honey

CONCLUSION

The results of this work demonstrate that a non-targeted VOC profiling approach using a resolution-optimized HS-GC-IMS setup provides a fast, cost-efficient and robust tool for the reliable classification of the botanical origin of honeys. This was shown by the high predictive ability of the PCA-LDA model, with which all external test set samples were correctly classified into the three variety groups. The potential of HS-GC-IMS profiling as an alternative to currently used NMR-based screening methods was demonstrated by a direct and objective comparison of the performance of the discriminant analysis between HS-GC-IMS and 1H-NMR fingerprints. The experimental results show that the discrimination of the botanical origin of honeys based on spectral variation in the VOC profiles matches closely with that based on the variations in the saccharide content of 1H-NMR spectra. Moreover, we tentatively observed that the class prediction of the HS-GC-IMS-based model seems to reflect differences in the pollen composition of a honey sample better than the one based on 1H-NMR data. Thus, the HS-GC-IMS-based VOC profiling approach could help to reduce potential risks of misinterpretation of such data by preventing overoptimistic class separation. Consequently, our findings indicate that the HS-GC-IMS-based screening method for honey analysis can either complement the traditional analysis of commercial honey samples or even be superior to currently used screening methods such as 1H-NMR, because it eliminates the need for highly expensive equipment or expert knowledge.

AUTHOR INFORMATION

Corresponding Author

Phone: +49-(0)621 292 6484. Fax: +49-(0)621 292 6420. Email: [email protected]

Notes

The authors declare no competing financial interest.

ACKNOWLEDGEMENTS

The authors would like to acknowledge the Chemisches und Veterinaeruntersuchungsamt (CVUA) Karlsruhe and Freiburg for supplying honey samples and supporting pollen analysis data. We gratefully acknowledge the Center for Applied Research in Biomedical Mass Spectrometry (ABIMAS) for financial support.

REFERENCES

(1) European Parliament, Committee on the Environment, Public Health and Food Safety report 2013, (2013/2091(INI)).
(2) Rowland, C. Y.; Blackman, A. J.; D'Arcy, B. R.; Rintoul, G. B. J. Agric. Food Chem. 1995, 43, 753–763.
(3) Guyot, C.; Scheirman, V.; Collin, S. Food Chem. 1999, 64, 3–11.
(4) Alissandrakis, E.; Tarantilis, P. A.; Harizanis, P. C.; Polissiou, M. J. Agric. Food Chem. 2007, 55, 8152–8157.
(5) Castro-Vázquez, L.; Díaz-Maroto, M. C.; Pérez-Coello, M. S. Food Chem. 2007, 103, 601–606.
(6) Odeh, I.; Abulafi, S.; Dewik, H.; Alnajjar, I.; Iman, A.; Dembitsky, V.; Hanus, L. Food Chem. 2007, 101, 1393–1397.
(7) Jerković, I.; Mastelić, J.; Marijanović, Z. Chem. Biodivers. 2006, 3, 1307–1316.
(8) Kaškonienė, V.; Venskutonis, P. R.; Čeksterytė, V. LWT-Food Sci. Technol. 2010, 43, 801–807.
(9) Persano Oddo, L.; Piro, R. Apidologie 2004, 35, S38–S81.
(10) Babarinde, G. O.; Babarinde, S. A.; Adegbola, D. O.; Ajayeoba, S. I. J. Food Sci. Technol. 2011, 48, 628–634.
(11) Louveaux, J.; Maurizio, A.; Vorwohl, G. Bee World 1978, 59, 139–157.
(12) Cavazza, A.; Corradini, C.; Musci, M.; Salvadeo, P. J. Sci. Food Agric. 2013, 93, 1169–1175.
(13) Zhou, J.; Yao, L.; Li, Y.; Chen, L.; Wu, L.; Zhao, J. Food Chem. 2014, 145, 941–949.
(14) Karabagias, I. K.; Badeka, A.; Kontakos, S.; Karabournioti, S.; Kontominas, M. G. Food Res. Int. 2014, 55, 363–372.
(15) Radovic, B. S.; Careri, M.; Mangia, A.; Musci, M.; Gerboles, M.; Anklam, E. Food Chem. 2001, 72, 511–520.


(16) Baroni, M. V.; Nores, M. L.; Díaz, M. d. P.; Chiabrando, G. A.; Fassano, J. P.; Costa, C.; Wunderlin, D. A. J. Agric. Food Chem. 2006, 54, 7235–7241.
(17) Cuevas-Glory, L. F.; Pino, J. A.; Santiago, L. S.; Sauri-Duch, E. Food Chem. 2007, 103, 1032–1043.
(18) Jerković, I.; Marijanović, Z. Chem. Biodivers. 2009, 6 (3), 421–430.
(19) Castro-Vázquez, L.; Díaz-Maroto, M. C.; Guchu, E.; Pérez-Coello, M. S. Eur. Food Res. Technol. 2006, 224 (1), 27–31.
(20) Piasenzotto, L.; Gracco, L.; Conte, L. J. Sci. Food Agric. 2003, 83 (10), 1037–1044.
(21) Robotti, E.; Campo, F.; Riviello, M.; Bobba, M.; Manfredi, M.; Mazzucco, E.; Gosetti, F.; Calabrese, G.; Sangiorgi, E.; Marengo, E. J. Chem. 2017, 2017 (4), 1–14.
(22) Moniruzzaman, M.; Rodríguez, I.; Ramil, M.; Cela, R.; Sulaiman, S. A.; Gan, S. H. Talanta 2014, 129, 505–515.
(23) Pulcini, P.; Allegrini, F.; Festuccia, N. Apiacta 2006, 21–27.
(24) Oelschlaegel, S.; Gruner, M.; Wang, P. N.; Boettcher, A.; Koelling-Speer, I.; Speer, K. J. Agric. Food Chem. 2012, 60, 7229–7237.
(25) Tette, P. A. S.; Guidi, L. R.; Bastos, E. M.; Fernandes, C.; Gloria, M. B. A. Food Chem. 2017, 229, 527–533.
(26) Trautvetter, S.; Koelling-Speer, I.; Speer, K. Apidologie 2009, 40, 140–150.
(27) Jandrić, Z.; Haughey, S. A.; Frew, R. D.; McComb, K.; Galvin-King, P.; Elliott, C. T.; Cannavan, A. Food Chem. 2015, 189, 52–59.
(28) Herrero Latorre, C.; Peña Crecente, R. M.; García Martín, S.; Barciela García, J. Food Chem. 2013, 141, 3559–3565.
(29) Gok, S.; Severcan, M.; Goormaghtigh, E.; Kandemir, I.; Severcan, F. Food Chem. 2015, 170, 234–240.
(30) Woodcock, T.; Downey, G.; O'Donnell, C. Food Chem. 2009, 114, 742–746.
(31) Goodacre, R.; Radovic, B. S.; Anklam, E. Appl. Spectrosc. 2002, 56, 521–527.
(32) Fernández Pierna, J. A.; Abbas, O.; Dardenne, P.; Baeten, V. Biotechnol. Agron. Soc. Environ. 2011, 15, 75–84.
(33) Corvucci, F.; Nobili, L.; Melucci, D.; Grillenzoni, F. V. Food Chem. 2015, 169, 297–304.
(34) Schellenberg, A.; Chmielus, S.; Schlicht, C.; Camin, F.; Perini, M.; Bontempo, L.; Heinrich, K.; Kelly, S. D.; Rossmann, A.; Thomas, F.; Jamin, E.; Horacek, M. Food Chem. 2010, 121, 770–777.
(35) Dinca, O. R.; Ionete, R. E.; Popescu, R.; Costinel, D.; Radu, G. L. Food Anal. Methods 2015, 8, 401–412.
(36) Boffo, E. F.; Tavares, L. A.; Tobias, A. C.; Ferreira, M. M.; Ferreira, A. G. LWT-Food Sci. Technol. 2012, 49, 55–63.
(37) Schievano, E.; Finotello, C.; Uddin, J.; Mammi, S.; Piana, L. J. Agric. Food Chem. 2016, 64, 3645–3652.
(38) Ohmenhaeuser, M.; Monakhova, Y. B.; Kuballa, T.; Lachenmeier, D. W. ISRN Anal. Chem. 2013, 2013, 1–9.
(39) Zheng, X.; Zhao, Y.; Wu, H.; Dong, J.; Feng, J. Food Anal. Methods 2016, 9, 1470–1479.
(40) Spiteri, M.; Jamin, E.; Thomas, F.; Rebours, A.; Lees, M.; Rogers, K. M.; Rutledge, D. N. Food Chem. 2015, 189, 60–66.
(41) Consonni, R.; Cagliani, L. R.; Cogliati, C. Food Control 2013, 32, 543–548.
(42) Gerhardt, N.; Birkenmeier, M.; Kuballa, T.; Ohmenhaeuser, M.; Rohn, S.; Weller, P. Proceedings of the XIII International Conference on the Applications of Magnetic Resonance in Food Science; IM Publications, 2016, 33–37.
(43) Gerhardt, N.; Birkenmeier, M.; Sanders, D.; Rohn, S.; Weller, P. Anal. Bioanal. Chem. 2017, 409, 3933–3942.
(44) Márquez-Sillero, I.; Cárdenas, S.; Sielemann, S.; Valcárcel, M. J. Chromatogr. A 2014, 1333, 99–105.
(45) Krisilova, E. V.; Levina, A. M.; Makarenko, V. A. J. Anal. Chem. 2014, 69, 371–376.
(46) Shuai, Q.; Zhang, L.; Li, P.; Zhang, Q.; Wang, X.; Ding, X.; Zhang, W. Anal. Methods 2014, 6, 9575–9580.
(47) Garrido-Delgado, R.; Dobao-Prieto, M. M.; Arce, L.; Aguilar, J.; Cumplido, J. L.; Valcárcel, M. J. Agric. Food Chem. 2015, 63, 2179–2188.


(48) Denawaka, C. J.; Fowlis, I. A.; Dean, J. R. J. Chromatogr. A 2014, 1338, 136–148.
(49) Johnson, P. V.; Beegle, L. W.; Kim, H. I.; Eiceman, G. A.; Kanik, I. Int. J. Mass Spectrom. 2007, 262, 1–15.
(50) Kiss, A.; Heeren, R. M. A. Anal. Bioanal. Chem. 2011, 399, 2623–2634.
(51) Sanz, M. L.; Gonzalez, M.; Lorenzo, C. de; Sanz, J.; Martínez-Castro, I. Food Chem. 2005, 91, 313–317.
(52) Soria, A. C.; González, M.; Lorenzo, C. de; Martínez-Castro, I.; Sanz, J. J. Sci. Food Agric. 2005, 85, 817–824.
