Article Cite This: Anal. Chem. XXXX, XXX, XXX−XXX

pubs.acs.org/ac

Volatile-Compound Fingerprinting by Headspace-Gas-Chromatography Ion-Mobility Spectrometry (HS-GC-IMS) as a Benchtop Alternative to 1H NMR Profiling for Assessment of the Authenticity of Honey

Natalie Gerhardt,† Markus Birkenmeier,† Sebastian Schwolow,† Sascha Rohn,‡ and Philipp Weller*,†

† Institute for Instrumental Analytics and Bioanalysis, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
‡ Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, 20146 Hamburg, Germany



ABSTRACT: This work describes a simple approach for the untargeted profiling of volatile compounds for the authentication of the botanical origins of honey based on resolution-optimized HS-GC-IMS combined with optimized chemometric techniques, namely PCA, LDA, and kNN. A direct comparison of the PCA−LDA models between the HS-GC-IMS and 1H NMR data demonstrated that HS-GC-IMS profiling could be used as a complementary tool to NMR-based profiling of honey samples. Whereas NMR profiling still requires comparatively precise sample preparation, pH adjustment in particular, HS-GC-IMS fingerprinting may be considered an alternative approach for a fully automatable, cost-efficient, and in particular highly sensitive method. It was demonstrated that all tested honey samples could be distinguished on the basis of their botanical origins. Loading plots revealed the volatile compounds responsible for the differences among the monofloral honeys. The HS-GC-IMS-based PCA−LDA model was composed of two linear functions of discrimination and 10 selected PCs that discriminated canola, acacia, and honeydew honeys with a predictive accuracy of 98.6%. Application of the LDA model to an external test set of 10 authentic honeys confirmed the high predictive ability of the model: all 10 samples were assigned to the correct one of the three variety groups. The constructed model presents a simple and efficient method of analysis and may serve as a basis for the authentication of other food types.

Received: September 13, 2017
Accepted: January 3, 2018
Published: January 3, 2018

DOI: 10.1021/acs.analchem.7b03748
© XXXX American Chemical Society

By definition, honey is a natural and almost untreated food, which is highly valued by consumers as an authentic and high-quality product. As a result, one of the major concerns of the value-added food chain is to ensure and enforce both authenticity and quality. With a growing globalization of the honey market, the identification of the botanical origin of honey as well as the proof of its authenticity in terms of its composition and geography has become an increasingly challenging task. In particular, high-priced monofloral honeys of rare botanical origin are a potential target for food fraud. According to the EU Commission’s publicly available reports,1 honey is one of the most frequently adulterated food products and regularly found to be noncompliant with the quality specifications defined by the EU Commission’s Directive 2001/110/EC. In most cases, this fraud was due to a faulty declaration of the botanical source. In recent years, numerous studies have been carried out to identify specific marker compounds for particular types of honey.2−8 However, it is difficult to identify and detect characteristic floral markers in honey of various botanical origins, as possible indicators that might distinguish the different types of honey depend not only on the floral sources but also on geography and ecophysiological factors such as the climate, soil, season, and storage conditions.9,10 Furthermore, only a few compounds seem to be truly specific for certain monofloral honeys, and many of them can be also found in variable concentrations in various honey types. Consequently, there are no identified chemical markers or sets of markers for authenticity that are accessible by conventional, target-based analyses. Hence, an analytical approach covering a multitude of parameters in parallel paired with strong discrimination power is required here. This is reflected by the plethora of studies published over the last few years that cover a broad scope of analytical techniques described as potentially suitable tools. Although the most commonly used methods to determine quality and authenticity in many laboratories are still sensory analyses and physicochemical methods, which rely on very traditional markers such as enzyme activity and the levels of 5-hydroxymethylfurfural, moisture, and mono- and disaccharides, the botanical and geographical origins of honey are typically determined by melissopalynological analysis based on a tedious and time-consuming identification of the pollen using microscopy.11 However, analysis and data interpretation are very time-consuming and do not always lead to correct assignments, mostly because of the univariate approach and its inherently low selectivity.
Alternative methods are mostly based on targeted as well as nontargeted spectroscopic and spectrometric techniques and typically involve the use of chromatographic methods such as HPLC-UV,12,13 headspace solid-phase-microextraction gas chromatography coupled with mass spectrometry (HS-SPME-GC/MS),14−21 HS-GC-QTOFMS,22 HPLC-MS/MS,23−25 UPLC-QTOF-MS,26,27 IR-based spectroscopic techniques (FT-MIR/NIR),28−30 or Raman spectroscopy.31−33 A different approach is that of isotope-ratio-mass-spectrometry (IRMS)-based techniques, in which either the major constituting sugars are analyzed for their 13C patterns via HPLC-C-IRMS or the isolated protein fraction is analyzed as a bulk fraction.34,35 Although these techniques overcame a number of the limitations of traditional, wet-chemistry-based assays, the major issue of complex and time-consuming sample preparation remains for many of the proposed techniques, in particular for GC-based techniques as well as for IRMS methods. This obviously contradicts the idea of rapid characterization and botanical classification. Furthermore, IR techniques, either based on ATR-FT-MIR or NIR, typically suffer from a high dependence on water content, ambient temperature, particle size, and other parameters that decrease reproducibility, which is a crucial factor for chemometric approaches. Thus, there is an urgent need for non-target-based techniques that require little or ideally no sample preparation but deliver high selectivity. This is reflected by a number of studies published over the last five years on the nontargeted profiling of honey using high-resolution 1H NMR spectroscopy in combination with multivariate statistical analysis.36−41 All of these studies, together with our previous feasibility study,42 proved NMR-based screening to be a suitable tool for rapid authenticity analyses of honey with a high discriminative power.
However, such equipment is expensive: the high costs of ownership and maintenance and the requirement for expert knowledge limit its use to a few specialized laboratories. Although NMR measurements per se commonly require little or no sample preparation, this is not the case for chemometric approaches. For such analyses, the samples require highly precise pH adjustments, as otherwise nonlinear effects, for example due to shifting proton signals of hydrated aldehydes, lead to misinterpretations by multivariate analyses. Moreover, in comparison with hyphenated MS techniques, NMR significantly lacks the sensitivity to capture a wide range of minor honey components (i.e., amino acids, polyphenols, and organic acids and their esters) at very low concentration levels of less than 0.1 ppm. This is a significant drawback, as the potentially decisive information and thus distinguishing power are often determined by minor compounds, which may contribute to the honey’s aroma depending on whether their concentrations exceed the odor threshold. Consequently, an efficient method complementary to the NMR-based profiling of honey samples is required here, and it should be implementable in routine applications, have a rugged and fast setup, feature a low cost of ownership, and have high sensitivity at the sub-parts-per-million level.

In the present study, an innovative analytical approach for the detection of the botanical origins of honey was developed, which was based on the analysis of the volatile fraction (volatile organic compounds, VOCs) using a resolution-optimized headspace-gas-chromatography-ion-mobility (HS-GC-IMS) setup. It consists of drift-time ion-mobility spectrometry (DTIMS) coupled to headspace capillary gas chromatography (HS-GC), as described recently in a proof-of-concept study by our research group.43 HS-GC-IMS has been demonstrated to be an effective separation technique because of its comparatively simple system setup, benchtop size, robustness, and price.44−48 Because the method is headspace-based, time-consuming sample-pretreatment steps are usually not required, which means that the analysis is carried out using an almost untreated sample. GC-IMS is principally a 2D approach, combining GC retention with drift time.
In the first dimension, the analytes are separated by gas chromatography on the basis of their retention in a capillary column. The second dimension is defined by the drift times of ions formed in the ionization chamber of the IMS cell. The charged molecules are separated under the influence of an electrical field depending on their drift behavior in a buffer gas. The drift times of such ions depend on the collision cross section (CCS) of the ion, which is directly connected to the structural parameters of size and shape and the charge location or distribution and can be calculated by the Mason−Schamp equation. This second dimension is a truly orthogonal separation and allows one to separate isomeric compounds possibly coeluting in GC separation.49,50

Honey features a distinct aroma profile depending on its biological origin, which is also the basis for sensory evaluations. In particular, monofloral honeys at least partially show highly characteristic aromas, among other features, due to the presence of specific VOCs that originate from the nectar of specific plants. The analysis by HS-GC-IMS could generate highly resolved aroma fingerprints, which could be evaluated by chemometric techniques and subsequently allow the discrimination of the botanical origins of honeys. Headspace-based measurements commonly require little or no sample preparation, which significantly simplifies analysis. Consequently, the aim of this work was to demonstrate the potential of HS-GC-IMS as an effective complementary tool to the NMR-based screening methods currently applied in the discrimination of various monofloral botanical species of honeys.
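For reference, the relation between ion mobility and collision cross section cited above for the second separation dimension is commonly written in the following low-field form. This is a sketch for orientation only; the symbols are assumptions introduced here and are not defined in the original text: K is the ion mobility, q the ion charge, N the number density of the buffer gas, μ the reduced mass of the ion and buffer-gas molecule, k_B the Boltzmann constant, T the gas temperature, Ω the collision cross section, t_d the drift time, L the drift length, E the field strength, and U the drift voltage.

```latex
% Low-field mobility K as a function of the collision cross section \Omega
K = \frac{3}{16}\,\frac{q}{N}\,\sqrt{\frac{2\pi}{\mu\,k_{B}\,T}}\;\frac{1}{\Omega}

% Drift time for a uniform field E = U / L
t_{d} = \frac{L}{K\,E} = \frac{L^{2}}{K\,U}
```

Read this way, a larger collision cross section gives a lower mobility and therefore a longer drift time at fixed field strength. With the drift length of 10 cm and the drift voltage of 5 kV stated in the Experimental Section, the nominal field strength would be E = 500 V/cm.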




EXPERIMENTAL SECTION

Reagents and Honey Samples. All reagents and solvents were purchased at the highest available quality (≥98%) and were at least HPLC grade. A total of 72 reference samples of monofloral honeys from various European countries were analyzed using HS-GC-IMS. The samples were acacia honey (Robinia pseudoacacia), canola honey (Brassica napus), and honeydew honey (forest flower honeys) from various countries, as shown in Table 1. In addition, 10 further authentic honey samples were analyzed to provide external-test-set results. The samples were obtained by governmental food inspectors from Baden-Wuerttemberg, Germany; from supermarkets; or directly from beekeepers. The botanical origins of the honey samples were confirmed by microscopic pollen analysis in the framework of the official control of foodstuffs. All samples were stored in the dark at room temperature (18−23 °C) in screw-cap jars before analysis. Anhydrous sodium chloride was obtained from VWR International GmbH (Darmstadt, Germany). Ultrapure water was purified in-house using a Milli-Q water-purification system (Millipore, Bedford, MA). Internal standard 2-acetylpyridine was purchased from Sigma-Aldrich Chemie GmbH (Taufkirchen, Germany). A saturated sodium chloride solution was prepared freshly and used as a blank in each analytical run. A total of 41 analytical standards, including aldehydes (2-pentenal, 2-methylpropanal, 2-methylbutanal, hexanal, trans-2-hexenal, trans,trans-2,4-hexadienal, trans-2-heptenal, pentanal, heptanal, octanal, benzaldehyde, acetaldehyde, phenylacetaldehyde, decanal, furfural, and trans-2-nonenal), ketones (2-butanone, 3-hydroxy-2-butanone, and acetophenone), alcohols (cis-3-hexen-1-ol, cis-2-penten-1-ol, 3-methylbutan-1-ol, 2-methyl-3-buten-2-ol, 1-octen-3-ol, 2-methylpropan-1-ol, 2-phenylethanol, hexan-1-ol, penten-3-ol, and ethanol), esters (cis-3-hexenyl acetate, ethyl butanoate, hexyl acetate, butyl acetate, ethyl propionate, and ethyl acetate), organic acids (2-methylheptanoic acid, benzoic acid, and trans-cinnamic acid), and monoterpenes (D,L-menthol, cis-linalool oxide, and D-limonene), were used for the identification of characteristic volatile compounds from honey samples. The analytical standards were purchased at ≥98% purity from Sigma-Aldrich (Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany). Stock solutions (1000 mg/L) were prepared by dissolving each compound in Millipore water. All stock and standard solutions were stored at 4 °C prior to use.

Table 1. Botanical Varieties and Geographical Origins of the Honey Samples Used as the Training Set

botanical origin | no. | geographical origin
acacia | 25 | Germany (6), France (3), Croatia (1), Hungary (4), Italy (1), Moldavia (1), EU/non-EU (8), Germany/Romania (1)
canola | 23 | Germany (15), Chile (1), Poland (2), Romania (2), Bulgaria/Romania (1), EU/non-EU (2)
honeydew | 24 | Turkey (2), Germany (9), Italy (3), EU/non-EU (9), Italy/Czechia (1), Italy/Germany (1), Switzerland (1)

Isolation of Volatile Organic Compounds in Honey. For the analysis of volatile compounds, 2 g of a honey sample was introduced into a 20 mL headspace vial, spiked with 18 μL of a 2-acetylpyridine stock solution (1008 mg/L) as the internal standard, mixed with 2 mL of a saturated sodium chloride solution, and subsequently incubated at 45 °C for 15 min. Finally, 700 μL of the sample headspace was automatically injected by means of a heated syringe (80 °C) into the heated injector of the GC-IMS equipment under the conditions reported below. The determination of the headspace volatiles was performed in duplicate for each honey sample.

HS-GC-IMS Apparatus. Analyses were performed on an advanced Ion-Mobility Spectrometer (IMS) manufactured by Gesellschaft für Analytische Sensorsysteme mbH (Dortmund, Germany) coupled with an Agilent 6890N gas chromatograph (Agilent Technologies, Palo Alto, CA). The system was equipped with a CombiPal GC autosampler (CTC Analytics AG, Zwingen, Switzerland) with a headspace sampling unit and a 2.5 mL gas-tight heatable syringe (Gerstel GmbH, Mühlheim, Germany). The injector port was equipped with a headspace glass liner (1.2 mm i.d., Agilent, Waldbronn, Germany) to minimize peak broadening. Chromatographic separation was performed on a DB-225 capillary column (25% phenyl, 25% cyanopropyl methyl siloxane; 25 m × 0.32 mm i.d., 0.25 μm film thickness; Agilent Technologies, Santa Clara, CA). Nitrogen of 99.99% purity was used as the carrier gas at a constant flow of 1.5 mL/min. A gas-purifier cartridge was used (Restek GmbH, Bad Homburg, Germany). The IMS drift-time cell was mounted on the top of the GC. The transfer line to the IMS cell was set to 120 °C. Following gas-chromatographic separation, the analytes were ionized in the IMS ionization chamber by a 3H ionization source (300 MBq activity). The drift tube had a length of 10 cm and was operated at a constant voltage of 5 kV and a temperature of 90 °C with a nitrogen flow of 150 mL/min. The gas flow was controlled by a mass-flow controller (Voegtlin Instruments AG, Aesch, Switzerland). The IMS cell was operated in positive-ion mode. Each spectrum was the average of six scans obtained by using injection pulse widths of 150 μs; sampling frequencies of 150 kHz; repetition rates of 21 ms; and blocking and injection voltages of 70 mV and 2500 mV, respectively. The data were collected using LAV software version 2.2.1 from Gesellschaft für Analytische Sensorsysteme mbH (Dortmund, Germany).

Chromatographic Conditions. For the analysis, a headspace volume of 700 μL was sampled at a speed of 250 μL/s and a syringe temperature of 80 °C to avoid condensation effects. Before each analysis, the syringe was automatically flushed with a stream of nitrogen for 2 min to avoid cross contamination. Injection was performed into a split/splitless injector, operated at 150 °C in split mode (split 1:10). The GC oven temperature was programmed as follows: the initial temperature was 40 °C, which was held for 2 min, and then the temperature was ramped up to 120 °C at 8 °C/min and held at 120 °C for 10 min.

Data Preprocessing. Because of the very large data sets produced by HS-GC-IMS, an appropriate pretreatment procedure of the three-dimensional data is necessary in order to avoid significant errors in the multivariate statistical analysis. The preprocessing procedure of the VOC spectra was performed using the MATLAB software (version R2016a, Mathworks, Natick, MA). First, a second-order, baseline-correction and smoothing Savitzky−Golay filter was used to improve the signal-to-noise ratios of all spectra of honey samples. In the next step, the spectra were normalized relative to the expected reaction-ion-peak (RIP) position, which was followed by spline interpolation to create a common set of points on the drift-time axis (x axis) of the GC-IMS spectra.
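The drift-axis normalization and resampling steps described under Data Preprocessing can be sketched as follows. This is a minimal illustration and not the authors' MATLAB routine: the function names are hypothetical, the RIP is assumed to be the most intense point of a spectrum, normalization is assumed to mean dividing the drift times by the RIP drift time, and plain linear interpolation stands in for the spline interpolation named in the text. The Savitzky−Golay smoothing step is omitted.

```python
from bisect import bisect_right


def normalize_drift_axis(drift_ms, intensities):
    """Rescale drift times so that the RIP sits at 1.0.

    Assumption: the RIP is the most intense point of the spectrum.
    """
    rip_index = max(range(len(intensities)), key=intensities.__getitem__)
    rip_time = drift_ms[rip_index]
    return [t / rip_time for t in drift_ms]


def resample(x, y, grid):
    """Linearly interpolate (x, y) onto grid; x must be strictly increasing.

    Grid points outside the range of x are clamped to the edge values.
    """
    out = []
    for g in grid:
        if g <= x[0]:
            out.append(y[0])
        elif g >= x[-1]:
            out.append(y[-1])
        else:
            j = bisect_right(x, g)  # x[j - 1] <= g < x[j]
            w = (g - x[j - 1]) / (x[j] - x[j - 1])
            out.append(y[j - 1] + w * (y[j] - y[j - 1]))
    return out


# Two toy spectra with the same shape whose RIP moved from 7.0 ms to 7.7 ms.
drift_a = [6.0, 7.0, 8.0, 9.0, 10.0]
spec_a = [0.0, 10.0, 1.0, 4.0, 0.0]
drift_b = [6.6, 7.7, 8.8, 9.9, 11.0]
spec_b = [0.0, 10.0, 1.0, 4.0, 0.0]

# Common grid on the RIP-relative drift axis (values chosen for illustration).
common_grid = [0.9, 1.0, 1.1, 1.2, 1.3, 1.4]
aligned_a = resample(normalize_drift_axis(drift_a, spec_a), spec_a, common_grid)
aligned_b = resample(normalize_drift_axis(drift_b, spec_b), spec_b, common_grid)
```

After both steps the two toy spectra share one set of drift-axis points, so they can be stacked as rows of a data matrix for multivariate analysis.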

DOI: 10.1021/acs.analchem.7b03748 Anal. Chem. XXXX, XXX, XXX−XXX

Article

Analytical Chemistry

Figure 1. Representative HS-GC-IMS chromatogram−mobility plot of the volatile fingerprints of canola-, acacia-, and honeydew-honey samples. The characteristic signals for the specific honey types are highlighted. Identified signals: (1) acetaldehyde, (2) ethanol, (3) 2-methylpropanal, (4) ethyl acetate, (5) 2-butanone, (6) 2-methylbutanal, (7) 2-methylpropanol, (8) pentanal, (9) butyl acetate, (10) 3-methyl-1-butanol, (11) hexanal, (12) trans-2-pentenal, (13) 3-hydroxy-2-butanone (acetoin), (14) heptanal, (15) furfural, (16) cis-linalool-oxide, (17) benzaldehyde, and (18) Dlimonene. Corresponding dimers formed in the IMS drift tube are indicated with #.

Figure 2. (a) PCA 3D scatter plot of preprocessed HS-GC-IMS spectra of the three monofloral honey types and (b) corresponding PCA loadings plot for the first three PCs, which describe 67% of the total variance. Different honey groups show significant differences in spectral characteristics.

study.42 Detailed sample-preparation and instrumental parameters are given here. Chemometric-Data Analysis and Software. As a first step, before performing the two-way multivariate analysis, the original, preprocessed three-way ion-mobility data arrays were unfolded to matrices, resulting in a data matrix consisting of 72 rows (72 honey samples) and 4 258 025 variables (4525 × 941, retention time × drift time). The techniques applied to the HSGC-IMS and 1H NMR data for the multivariate data assessment of the honey samples were principal-component analysis (PCA) for dimensionality reduction, which was followed by linear-discriminant analysis (LDA), and k-nearest neighbors (kNN) for the subsequent classification. The LDA model was formed with the training data set only. Subsequently, the test set samples were projected into this LDA-scores space and plotted. As an additional method, PLS

(see Figure 1). Subsequently, the duplicate measurements were averaged for each sample. Additionally, in order to correct small, random deviations in retention times, all spectra were aligned with a shift in the y axis (retention-time axis) based on a linear function fitted to a reference peak (2-acetylpyridine) by using a specially designed MATLAB algorithm. Finally, a data set with only the spectra in which signals appear was built for statistical and pattern-recognition analysis (Figure 2). The retention times and drift times used for chemometric analysis were 100 to 900 s (4525 variables) and 7.04 to 13 ms (941 variables), respectively. The aligned and mean-centered data set comprising 4525 × 941 variables and 72 samples (averaged spectra) constituted the starting point for the patternrecognition analysis. 1 H NMR Measurements at 400 MHz. All 1H NMR data discussed in this work were already described in our previous D

DOI: 10.1021/acs.analchem.7b03748 Anal. Chem. XXXX, XXX, XXX−XXX

Article

Analytical Chemistry

depending on the botanical source of the studied honey. The most prominent differences in the VOC patterns of the different honeys were observed in the fingerprint regions at retention times ranging from 200 to 600 s. It can be seen that low-molecular-weight VOCs are present at the highest concentration levels and elute in a time range between 200 and 300 s. These signals generally are typically found in all honeys analyzed; for example, acetaldehyde, ethanol, 2methylpropanal, ethyl acetate, 2-butanone, and 2-methylbutanal, with only minor variations in their intensities, were observed. The significant differences in the VOC profiles, which characterize different honey types, were observed at later retention times. As an example, abundant VOCs detected at a GC retention time between 350 and 500 s were highly specific for canola honey (see the highlighted area in Figure 1), whereas these compounds were not found in the acacia- or honeydewhoney samples or were only detected with low abundances. Further characteristic signals, which can be considered as potential markers, were observed solely or at least predominantly in honeydew honey at GC retention times in the range of 730−810 s (see the highlighted area in Figure 1). Figure 1 shows the application of the feature map for the 18 selected compounds. The compounds were verified by comparison of the corresponding drift times and retention times to those of the authentic reference compounds and additionally by standard addition. In general, linear and branched aldehydes, ketones, and short-chain alcohols were found in most or all of the honeys analyzed. Some compounds were more prevalent in particular types of honey; for example, acacia honey featured higher levels of hexanal and cis-linalool oxide, whereas canola honey showed higher concentrations of benzaldehyde. The aroma profile of honeydew honey showed a characteristic pattern, in which terpenes, such as cis-linalool oxide and Dlimonene, were dominant. 
In particular, 3-hydroxy-2-butanone (acetoin), trans-2-pentenal, and 3-methylbutanol were found to be characteristic for honeydew honeys. As was expected from their more intense aromas, it was found that honeydew honeys generally presented richer volatile chromatographic profiles at relatively low signal intensities, whereas in contrast, acacia honeys featured the least intensive profiles of all three honeys analyzed. These results are closely associated with sensory perceptions, which confirm the aroma in acacia honey to be of lower intensity as compared with those of honeydew and canola honeys. According to previous publications,51,52 floral honeys are characterized more by low free acidities, polyphenol contents, and lactone quantities, whereas honeydew honeys are better featured by high free acidities, polyphenol contents, and lactone quantities. These compounds are at least partially accessible via headspace sampling and clearly contribute to the characteristic aroma profiles of specific honey types. Honey Discrimination by PCA. In the scope of this study, the primary goal was the discrimination of honey samples from different floral origins rather than the determination of individual components of samples. The obtained HS-GC-IMS profiles are highly complex, featuring numerous peaks of different intensities and the presence of characteristic compounds mixed with noncharacteristic ones. As a mandatory step to discriminate different classes of a given honey sample, it is necessary to extract the complete profile of chemical differences from this complex data set. For this purpose, a mean-centered PCA was carried out over the 72 × 4525 × 941 data matrix (72 samples and 4525 × 941 chromatographic variables) as a first-pass, unsupervised method to identify

discriminant analysis (PLS-DA) was also evaluated for its use in classification. The kNN classifier was applied to generate nonlinear classifications by finding the closest k examples in the data set to the unknown class using Euclidean distance and selecting the predominant class for it. In this work, a value of k equal to 5 was used. Finally, the predictive abilities of the PCA−LDA, PLS-DA, and kNN models were evaluated by k-fold cross validation (k = 10), in which a random partition for a stratified k-fold cross was created. k-fold cross validation is a widely used approach for estimating the test error by leaving out part k, fitting the model to the other k − 1 parts, and then obtaining predictions for the left-out kth part. Each subsample has roughly an equal size and the same class proportions as others in the group. For better comparability, equal random numbers were used in cross validation for both HS-GC-IMS and 1H NMR based models. All the calculations and preprocessing were assessed by using in-house MATLAB routines (version R2016a, Mathworks, Natick, MA, USA). The PCA, kNN, PCA−LDA, and PLS-DA models were built by applying the MATLAB Statistical Toolbox (version 9.1). Before importing the HS-GC-IMS data into MATLAB, all raw data files were first converted to csv text files using the csv export tool implemented in the LAV software version 2.2.1 from Gesellschaft für Analytische Sensorsysteme mbH (Dortmund, Germany).



RESULTS AND DISCUSSION Volatile-Fraction Profiling by HS-GC-IMS. In this study, HS-GC-IMS was used to compare various monofloral honeys on the basis of their VOC profiles in order to establish a fast and reliable method for authenticity screenings. The resulting three-dimensional VOC profiles are complex and feature more than 100 individual signals. Consequently, the discrimination of floral origins was based on a nontargeted profiling approach as opposed to the classical targeted approaches of selecting one or more chemical compounds as markers. This approach of using the complete spectral data requires chemometric techniques that use signals and signal intensities as variables without identifying them or establishing calibration curves. Although the identification and quantification of individual substances might be relevant for subsequent specification analyses (e.g., HMF as a heat-treatment indicator or ethanol as a spoilage marker), this study focused on the chemometric analysis of fingerprint information without prior identification and calibration of all measured substances. However, for a better understanding of the chemical identities of the volatiles responsible for class separation, 18 exemplary target compounds were identified and analyzed by HS-GC-IMS. This is shown in Figure 1 in the form of “feature maps”, which overlay the 2D spectra. By using these individual identifiers (features) obtained from the retention-time index and the normalized drift time, the targets may be calibrated and quantitated by use of reference materials. In our previous study,43 the good stability and reproducibility of the fingerprint analysis by HS-GC-IMS were demonstrated, with relative standard deviations of the retention-time and drifttime values of the significant peaks in the spectra of less than 1%. Representative GC-IMS spectra corresponding to a canola-, acacia-, or honeydew-honey sample are shown in Figure 1. 
As is obvious, the VOC profiles of the analyzed monofloral honeys vary significantly in terms of their nature and signal intensities, E

DOI: 10.1021/acs.analchem.7b03748 Anal. Chem. XXXX, XXX, XXX−XXX

Article

Analytical Chemistry

model decreased to 60% and 38%, respectively, which underlines the fact that nearly all algorithms are prone to overfitting when too many PCs are used for model building. Table 2 shows the results obtained from a reduced PC set

chemical differences among high-dimensional HS-GC-MS spectra and to detect possible clusters within samples. In this way, the first verification of the discriminatory efficiency of the variables considered was provided. Figure 2a shows the scatter plot for the first three principal components (PCs) determined by PCA, through which a visualization of the data structure in a reduced dimension is obtained. The first three PCs explained 67% of the total variance in the original information. Although it is more common in practice to plot 2D plots of any two of the PCs, these projections of multidimensional data onto a 2D space often lead to overlapping clusters in one plot, and other combinations of PCs in other plots need to be checked to observe cluster separations. The PCA results from this first study section show a significant differentiation of all the investigated groups in a single 3D plot. As can be seen from the PCA score plot in Figure 2a, honey samples of different botanical origins were well separated and grouped according to their floral sources, with only a slight overlap between the acacia- and canola-originated samples. PC1 and PC2 differentiate between acacia and honeydew, and PC3 separates these classes from the third one, canola. The PCA results clearly show that honeydew honeys feature significant differences from nectar honeys with respect to their VOC profiles. The honeydew honeys were well discriminated to positive score values in PC1, whereas the cluster of acacia honeys was welldefined by negative scores of PC1 and PC2, and canola honeys were well separated by the positive score values in PC2. Regarding the score plot, it should be noted that the variability within each honey cluster is high, thus reflecting the wide geographical spreads of all the honey samples investigated and also the natural variabilities in the pollen profiles. Although the score plots show trends and groupings of data, the loadings reveal the chemical basis of variation. 
The corresponding PCA loadings plot (Figure 2b) for the first three PCs confirmed that the VOCs in the fingerprint region of the spectra (GC retention time of 200−600 s) had the strongest influence on class separation. The results of the PCA analysis demonstrate that the honey samples for all the categories have different characteristics and feature different VOC profiles in a multidimensional space. Thus, HS-GC-IMS spectra seem to contain useful information to perform classification models in order to assign an unknown honey sample to the class to which it belongs. Honey Classification: HS-GC-IMS versus 1H NMR Data. Each of the HS-GC-IMS spectra used in this study is a set of many thousands of variables. The use of such high-dimensional data as input information for classification algorithms, such as LDA is not advisable, as one would run into certain overfitting of the model. This term refers to the overweighting of residual noise over real information, which results in apparent class separation where there is none. This can be overcome by reducing dimensionality using PCA. The eigenvalues of the principal components (which represent the percentage of the variance retained for the principal components) showed that the first 10 principal components represent more than 90% of the total data variance. The ideal number of PCs used for a PCA−LDA model was determined on the basis of the lowest classification error rate by cross validation of those calculated for PCs 2 to 20. The model was most robust when using 6 to 12 PCs. To illustrate this, we used a full PC decomposition with all 71 PCs resulting in a 72 × 71 data matrix. Here, it could be demonstrated that the overall accuracy of the LDA and kNN

Table 2. Classification Results for the HS-GC-IMS and 1H NMR Fingerprints by Different Chemometric Methods Based on the First 10 PCs after Employing 10-Fold Cross Validation

three-class model (a)

data         method            overall accuracy (%) (b)
HS-GC-IMS    PCA−LDA (c)       98.6
HS-GC-IMS    kNN (d)           86.1
HS-GC-IMS    PLS-DA (c,e)      97.0
1H NMR       PCA−LDA (c)       100
1H NMR       kNN (d)           95.8
1H NMR       PLS-DA (c,e)      100

(a) Acacia honey, canola honey, or honeydew honey (n = 72). (b) k-fold cross validation. (c) Calculated considering the samples from all three groups. (d) kNN, k-nearest-neighbor classification (k = 5). (e) Based on the first five PLS components.

instead of full PC decomposition. Hence, PCA scores of the first 10 extracted PCs instead of the original variables were used for the calculation. A new data matrix, X: 72 × 10, containing the scores of the 72 honey samples on the first 10 principal components was used as the input for the LDA and kNN employed in the development of classification models. In the next step, we compared the HS-GC-IMS-profiling data with data obtained from the 1H NMR profiling of the same samples to evaluate the discrimination quality of the 3D-VOC profiles. For this purpose, both the 1H NMR fingerprints generated in a previously published study38 for the same set of honey samples and the HS-GC-IMS data in this study were separately analyzed by PCA−LDA, each following the same procedure, and the results were then used for a direct comparison. The predictive accuracies of both LDA models were estimated by using a k-fold-cross-validation (k = 10) approach. To enable a direct comparison between the two LDA models, we ensured that the same data-set splits were employed for the HS-GC-IMS and 1H NMR data. Finally, after it was validated by a full cross-validation procedure, the predictive power of the obtained LDA model was shown by classifying new samples. For this, PCA−LDA models were applied to an external test set of 10 authentic honey samples (3 canola, 4 acacia, and 3 honeydew honeys) in order to evaluate predictions of class membership of honey samples and to compare the classification abilities of the HS-GC-IMS and 1H NMR spectra. For both methods, the same set of samples was used to train and test the model. These independent test-set samples were recorded under the same analytical conditions, including microscopic pollen analysis. Figure 3 shows the LDA score plots of the first two discriminant functions for the classification of both the training and test sets after the analyses of the honey samples on both the HS-GC-IMS and 1H NMR platforms.
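The procedure described above (PCA for dimensionality reduction, LDA on the retained scores, 10-fold cross validation with identical splits for both data blocks, and a scan over 2 to 20 PCs) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic placeholders, the function and variable names are hypothetical, and the sketch refits PCA inside each training fold, which the text does not specify.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)

def fake_fingerprints(n_vars, noise):
    """Synthetic placeholder data: 3 classes x 24 samples with shifted means."""
    means = rng.normal(0.0, 1.0, size=(3, n_vars))
    return np.vstack([m + rng.normal(0.0, noise, size=(24, n_vars))
                      for m in means])

y = np.repeat([0, 1, 2], 24)                       # acacia, canola, honeydew
X_ims = fake_fingerprints(n_vars=400, noise=2.0)   # stands in for HS-GC-IMS
X_nmr = fake_fingerprints(n_vars=300, noise=1.5)   # stands in for 1H NMR

# A single splitter with a fixed seed yields identical folds for both blocks.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

def cv_accuracy(X, n_pcs):
    # PCA is refit on each training fold (assumption of this sketch), so the
    # held-out samples never influence the projection.
    model = make_pipeline(PCA(n_components=n_pcs), LinearDiscriminantAnalysis())
    return cross_val_score(model, X, y, cv=cv).mean()

# Scan the number of retained PCs; keep the one with the lowest CV error.
acc_by_pcs = {k: cv_accuracy(X_ims, k) for k in range(2, 21)}
best_k = max(acc_by_pcs, key=acc_by_pcs.get)

acc_ims = acc_by_pcs[best_k]
acc_nmr = cv_accuracy(X_nmr, best_k)
print("selected number of PCs:", best_k)
print("cross-validated accuracy (IMS stand-in):", round(acc_ims, 3))
print("cross-validated accuracy (NMR stand-in):", round(acc_nmr, 3))
```

Sharing one seeded splitter is what makes the two accuracies directly comparable: any difference then stems from the data blocks rather than from how the samples happened to be partitioned.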
In Figure 3a, it can be seen that the HS-GC-IMS fingerprints allow good separation of the acacia, canola, and honeydew honeys based on the first and second discriminant functions. The separation of the three classes is quite clear with only a slight overlap between the acacia and canola samples. The first discriminant function


Figure 3. Linear discriminant analysis (LDA) of (a) HS-GC-IMS data and (b) 1H NMR data. The score plots for the classifications include the training set (72 honey samples) and external test set (black symbols: A = acacia, C = canola, and H = honeydew).

separates the canola honeys from the acacia honeys, and the second discriminant function separates the canola honeys from the honeydew honeys. After cross validation, the PCA−LDA and kNN models classified the samples into their respective botanical-origin groups with overall classification accuracies of 98.6% and 86.1%, respectively. The best result was obtained by the PCA−LDA model. As a comparison, PLS-DA was performed on the basis of the first five PLS components. This approach delivered an overall accuracy comparable to that of PCA−LDA, at 97.0%, with the same variables distinguishing the classes from one another. Depending on the separation problem, the PCA−LDA approach may have an advantage over PLS-DA, because it does not assume multicollinearity among the independent variables. From Figure 3b, it can be seen that the discrimination quality of the generated VOC fingerprints of the honeys matches the classification results obtained with the 1H NMR fingerprints, which show an overall accuracy between 95.8 and 100%. The classification results for the HS-GC-IMS and 1H NMR fingerprints by different chemometric methods are summarized in Table 2. Thus, our results demonstrate that VOC profiling by HS-GC-IMS in combination with multivariate statistics is an efficient tool for the rapid and cost-effective classification of the different botanical origins of honey samples. After the external validation of the models, all the samples from the test set were correctly recognized by the PCA−LDA model for both the HS-GC-IMS and 1H NMR data at the significance level described by the posterior probability. Thus, the predicted labels of all the samples matched the true labels of the tested honey samples. The results of the external validation are summarized in Table 3.
The estimated posterior probability of each class for the test data gives information about the expected probability that a future sample will be correctly classified when performing class prediction. The estimates were obtained using the kernel density function in MATLAB 9.2. The posterior probabilities were between 93.8 and 100%, except for two acacia-honey samples (A1 and A2, see also Table 3) in the HS-GC-IMS-based model with probability estimates of 62.2 and 76.9%. When taking a closer look at these two acacia test-set samples in the HS-GC-IMS-based model (Figure 3a), it becomes evident that these two samples are located closer to the cluster of canola samples, which is not the case for these samples in the 1H NMR data (Figure 3b). From the results of microscopic analysis, it was found that the

Table 3. Validation Results Using External Test Sets of Honey Samples with the Corresponding Posterior-Probability Estimates

                                          posterior probability (%)
no.   true label (a)   predicted label (a)   GC-IMS   1H NMR
1     A1               A                     62.2     100
2     A2               A                     76.9     99.9
3     A3               A                     97.2     100
4     A4               A                     93.8     100
5     C1               C                     99.9     100
6     C2               C                     99.5     100
7     C3               C                     99.9     100
8     H1               H                     99.9     100
9     H2               H                     99.9     100
10    H3               H                     100      100

(a) A = acacia honey, C = canola honey, and H = honeydew honey.

acacia pollen profile in these two acacia honeys was less typical because of the presence of other dominant pollen types, particularly canola and sunflower, which could be the reason for the shift toward the canola cluster. Consequently, whereas the relevant discriminant information in the 1H NMR spectra is generally determined by the major sugar signals rather than the minor components, the 3D-VOC profiles obtained by HS-GC-IMS seem to better reflect the pollen composition in a honey sample. This means that, at least for the honey samples analyzed here, 1H NMR analysis tends toward an overoptimistic separation of classes in comparison to the more conservative separation based on the HS-GC-IMS data. However, as this study is intended to be a proof of concept, this finding should be considered tentative, as a larger number of test-set samples from more honey varieties is required to verify this trend. In summary, the results of the present study demonstrate that our setup consisting of resolution-optimized HS-GC-IMS combined with chemometric protocols is suitable for the fast and cost-efficient discrimination of the botanical provenance of honeys. A direct comparison of the discriminant analyses of the HS-GC-IMS and 1H NMR data showed that the discrimination ability of the model using information extracted from VOC spectra is comparable to that obtained from 1H NMR data. These findings also suggest that HS-GC-IMS-based VOC profiling can serve as an important complementary and


(5) Castro-Vázquez, L.; Díaz-Maroto, M. C.; Pérez-Coello, M. S. Food Chem. 2007, 103, 601−606.
(6) Odeh, I.; Abulafi, S.; Dewik, H.; Alnajjar, I.; Imam, A.; Dembitsky, V.; Hanus, L. Food Chem. 2007, 101, 1393−1397.
(7) Jerković, I.; Mastelić, J.; Marijanović, Z. Chem. Biodiversity 2006, 3, 1307−1316.
(8) Kaškonienė, V.; Venskutonis, P. R.; Čeksterytė, V. LWT-Food Sci. Technol. 2010, 43, 801−807.
(9) Persano Oddo, L.; Piro, R. Apidologie 2004, 35, S38−S81.
(10) Babarinde, G. O.; Babarinde, S. A.; Adegbola, D. O.; Ajayeoba, S. I. J. Food Sci. Technol. 2011, 48, 628−634.
(11) Louveaux, J.; Maurizio, A.; Vorwohl, G. Bee World 1978, 59, 139−157.
(12) Cavazza, A.; Corradini, C.; Musci, M.; Salvadeo, P. J. Sci. Food Agric. 2013, 93, 1169−1175.
(13) Zhou, J.; Yao, L.; Li, Y.; Chen, L.; Wu, L.; Zhao, J. Food Chem. 2014, 145, 941−949.
(14) Karabagias, I. K.; Badeka, A.; Kontakos, S.; Karabournioti, S.; Kontominas, M. G. Food Res. Int. 2014, 55, 363−372.
(15) Radovic, B. S.; Careri, M.; Mangia, A.; Musci, M.; Gerboles, M.; Anklam, E. Food Chem. 2001, 72, 511−520.
(16) Baroni, M. V.; Nores, M. L.; Díaz, M. d. P.; Chiabrando, G. A.; Fassano, J. P.; Costa, C.; Wunderlin, D. A. J. Agric. Food Chem. 2006, 54, 7235−7241.
(17) Cuevas-Glory, L. F.; Pino, J. A.; Santiago, L. S.; Sauri-Duch, E. Food Chem. 2007, 103, 1032−1043.
(18) Jerković, I.; Marijanović, Z. Chem. Biodiversity 2009, 6 (3), 421−430.
(19) Castro Vázquez, L.; Díaz-Maroto, M. C.; Guchu, E.; Pérez-Coello, M. S. Eur. Food Res. Technol. 2006, 224 (1), 27−31.
(20) Piasenzotto, L.; Gracco, L.; Conte, L. J. Sci. Food Agric. 2003, 83 (10), 1037−1044.
(21) Robotti, E.; Campo, F.; Riviello, M.; Bobba, M.; Manfredi, M.; Mazzucco, E.; Gosetti, F.; Calabrese, G.; Sangiorgi, E.; Marengo, E. J. Chem. 2017, 2017 (4), 1−14.
(22) Moniruzzaman, M.; Rodríguez, I.; Ramil, M.; Cela, R.; Sulaiman, S. A.; Gan, S. H. Talanta 2014, 129, 505−515.
(23) Pulcini, P.; Allegrini, F.; Festuccia, N. Apiacta 2006, 21−27.
(24) Oelschlaegel, S.; Gruner, M.; Wang, P. N.; Boettcher, A.; Koelling-Speer, I.; Speer, K. J. Agric. Food Chem. 2012, 60, 7229−7237.
(25) Tette, P. A. S.; Guidi, L. R.; Bastos, E. M.; Fernandes, C.; Gloria, M. B. A. Food Chem. 2017, 229, 527−533.
(26) Trautvetter, S.; Koelling-Speer, I.; Speer, K. Apidologie 2009, 40, 140−150.
(27) Jandrić, Z.; Haughey, S. A.; Frew, R. D.; McComb, K.; Galvin-King, P.; Elliott, C. T.; Cannavan, A. Food Chem. 2015, 189, 52−59.
(28) Herrero Latorre, C.; Peña Crecente, R. M.; García Martín, S.; Barciela García, J. Food Chem. 2013, 141, 3559−3565.
(29) Gok, S.; Severcan, M.; Goormaghtigh, E.; Kandemir, I.; Severcan, F. Food Chem. 2015, 170, 234−240.
(30) Woodcock, T.; Downey, G.; Odonnell, C. Food Chem. 2009, 114, 742−746.
(31) Goodacre, R.; Radovic, B. S.; Anklam, E. Appl. Spectrosc. 2002, 56, 521−527.
(32) Fernández Pierna, J. A.; Abbas, O.; Dardenne, P.; Baeten, V. Biotechnol. Agron. Soc. Environ. 2011, 15, 75−84.
(33) Corvucci, F.; Nobili, L.; Melucci, D.; Grillenzoni, F. V. Food Chem. 2015, 169, 297−304.
(34) Schellenberg, A.; Chmielus, S.; Schlicht, C.; Camin, F.; Perini, M.; Bontempo, L.; Heinrich, K.; Kelly, S. D.; Rossmann, A.; Thomas, F.; Jamin, E.; Horacek, M. Food Chem. 2010, 121, 770−777.
(35) Dinca, O. R.; Ionete, R. E.; Popescu, R.; Costinel, D.; Radu, G. L. Food Anal. Method 2015, 8, 401−412.
(36) Boffo, E. F.; Tavares, L. A.; Tobias, A. C.; Ferreira, M. M.; Ferreira, A. G. LWT-Food Sci. Technol. 2012, 49, 55−63.
(37) Schievano, E.; Finotello, C.; Uddin, J.; Mammi, S.; Piana, L. J. Agric. Food Chem. 2016, 64, 3645−3652.
(38) Ohmenhaeuser, M.; Monakhova, Y. B.; Kuballa, T.; Lachenmeier, D. W. ISRN Anal. Chem. 2013, 2013, 1−9.

alternative tool for the reliable identification and quality control of honey and is less complex, faster, more cost-efficient, and also better associated with sensory perceptions. In particular, the HS-GC-IMS-screening approach could be a powerful supplement to a sensory evaluation.
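The posterior probabilities reported in Table 3 express how decisively a test sample is assigned to its class. The sketch below illustrates the idea with the Gaussian model underlying LDA (class means with a pooled covariance, combined through Bayes' rule). This is a deliberately simplified substitute for the kernel-density estimate in MATLAB mentioned in the text, and the two-dimensional discriminant scores, cluster centers, and test points are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 2-D discriminant scores: 24 training samples per class.
classes = ["A", "C", "H"]                      # acacia, canola, honeydew
centers = np.array([[-4.0, 0.0], [0.0, 3.0], [4.0, 0.0]])
train = [c + rng.normal(0.0, 1.0, size=(24, 2)) for c in centers]

means = np.array([t.mean(axis=0) for t in train])
# Pooled within-class covariance (N - number of classes = 72 - 3 in the divisor).
pooled = sum((t - m).T @ (t - m) for t, m in zip(train, means)) / (72 - 3)
precision = np.linalg.inv(pooled)
priors = np.full(3, 1.0 / 3.0)                 # equal class priors (assumption)

def posterior(x):
    """Bayes posterior p(class | x) for Gaussian classes sharing one covariance."""
    diff = x - means                                          # shape (3, 2)
    maha = np.einsum("ij,jk,ik->i", diff, precision, diff)    # squared distances
    log_post = np.log(priors) - 0.5 * maha     # terms common to all classes cancel
    log_post -= log_post.max()                 # guard against underflow
    p = np.exp(log_post)
    return p / p.sum()

typical = np.array([-4.0, 0.0])        # at the nominal acacia center
borderline = np.array([-2.2, 1.35])    # shifted toward the canola cluster

for name, x in [("typical", typical), ("borderline", borderline)]:
    p = posterior(x)
    print(name, "->", classes[int(np.argmax(p))], np.round(100 * p, 1))
```

A sample lying between two clusters should receive a noticeably less decisive posterior than one at a cluster center, which is the pattern seen for samples A1 and A2 in the HS-GC-IMS-based model.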



CONCLUSION

The results from this work demonstrated that a nontargeted VOC-profiling approach using a resolution-optimized HS-GC-IMS setup provides a fast, cost-efficient, and robust tool for the reliable classification of the botanical origins of honeys. This was shown by the high predictive ability of the PCA−LDA model, in which all the external test-set samples were correctly classified into the three variety groups. The potential of HS-GC-IMS profiling as an alternative to the currently used NMR-based screening methods was clearly demonstrated by a direct and objective comparison of the performances of the discriminant analyses of the HS-GC-IMS and 1H NMR fingerprints. The experimental results showed that the discrimination of the botanical origins of honeys based on spectral variations in the VOC profiles matched closely with that based on variations in the saccharide contents of the 1H NMR spectra. Moreover, we tentatively observed that the class predictions of the HS-GC-IMS-based model seemed to reflect differences in the pollen compositions of honey samples better than those based on 1H NMR data. Thus, the HS-GC-IMS-based VOC-profiling approach could help to reduce the risk of misinterpreting such data by preventing overoptimistic class separations. Consequently, our findings indicate that the HS-GC-IMS-based screening method for honey analysis can either complement traditional analyses of commercial honey samples or even be superior to currently used screening methods such as 1H NMR, because it eliminates the need for highly expensive equipment or expert knowledge.



AUTHOR INFORMATION

Corresponding Author

*Tel.: +49-(0)621 292 6484, Fax: +49-(0)621 292 6420, Email: [email protected].

ORCID

Philipp Weller: 0000-0002-5083-421X

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

The authors would like to acknowledge the Chemisches und Veterinaeruntersuchungsamt (CVUA) Karlsruhe and Freiburg for supplying honey samples and the supporting data of pollen analysis. We gratefully acknowledge the Center for Applied Research in Biomedical Mass Spectrometry (ABIMAS) for financial support.



REFERENCES

(1) Committee on the Environment, Public Health and Food Safety. On the food crisis, fraud in the food chain and the control thereof; 2013/2091(INI); European Parliament, 2013.
(2) Rowland, C. Y.; Blackman, A. J.; D’Arcy, B. R.; Rintoul, G. B. J. Agric. Food Chem. 1995, 43, 753−763.
(3) Guyot, C.; Scheirman, V.; Collin, S. Food Chem. 1999, 64, 3−11.
(4) Alissandrakis, E.; Tarantilis, P. A.; Harizanis, P. C.; Polissiou, M. J. Agric. Food Chem. 2007, 55, 8152−8157.


(39) Zheng, X.; Zhao, Y.; Wu, H.; Dong, J.; Feng, J. Food Anal. Method 2016, 9, 1470−1479.
(40) Spiteri, M.; Jamin, E.; Thomas, F.; Rebours, A.; Lees, M.; Rogers, K. M.; Rutledge, D. N. Food Chem. 2015, 189, 60−66.
(41) Consonni, R.; Cagliani, L. R.; Cogliati, C. Food Control 2013, 32, 543−548.
(42) Gerhardt, N.; Birkenmeier, M.; Kuballa, T.; Ohmenhaeuser, M.; Rohn, S.; Weller, P. IM Publications 2016, 33−37.
(43) Gerhardt, N.; Birkenmeier, M.; Sanders, D.; Rohn, S.; Weller, P. Anal. Bioanal. Chem. 2017, 409, 3933−3942.
(44) Márquez-Sillero, I.; Cárdenas, S.; Sielemann, S.; Valcárcel, M. J. Chromatogr. A 2014, 1333, 99−105.
(45) Krisilova, E. V.; Levina, A. M.; Makarenko, V. A. J. Anal. Chem. 2014, 69, 371−376.
(46) Shuai, Q.; Zhang, L.; Li, P.; Zhang, Q.; Wang, X.; Ding, X.; Zhang, W. Anal. Methods 2014, 6, 9575−9580.
(47) Garrido-Delgado, R.; Dobao-Prieto, M. M.; Arce, L.; Aguilar, J.; Cumplido, J. L.; Valcárcel, M. J. Agric. Food Chem. 2015, 63, 2179−2188.
(48) Denawaka, C. J.; Fowlis, I. A.; Dean, J. R. J. Chromatogr. A 2014, 1338, 136−148.
(49) Johnson, P. V.; Beegle, L. W.; Kim, H. I.; Eiceman, G. A.; Kanik, I. Int. J. Mass Spectrom. 2007, 262, 1−15.
(50) Kiss, A.; Heeren, R. M. A. Anal. Bioanal. Chem. 2011, 399, 2623−2634.
(51) Sanz, M. L.; Gonzalez, M.; de Lorenzo, C.; Sanz, J.; Martínez-Castro, I. Food Chem. 2005, 91, 313−317.
(52) Soria, A. C.; González, M.; de Lorenzo, C.; Martínez-Castro, I.; Sanz, J. J. Sci. Food Agric. 2005, 85, 817−824.
