Validated Modeling for German White Wine Varietal Authentication

Jun 24, 2014 - An untargeted analytical approach combined with chemometrics using the volatiles of German white wine was investigated regarding the us...
Article pubs.acs.org/JAFC

Validated Modeling for German White Wine Varietal Authentication Based on Headspace Solid-Phase Microextraction Online Coupled with Gas Chromatography Mass Spectrometry Fingerprinting

A. E. Springer,† J. Riedl,† S. Esslinger,† T. Roth,‡ M. A. Glomb,‡ and C. Fauhl-Hassek*,†

† Department Safety in the Food Chain, Bundesinstitut für Risikobewertung (BfR), Federal Institute for Risk Assessment, Max-Dohrn-Straße 8-10, D-10589 Berlin, Germany
‡ Institute of Chemistry, Food Chemistry, Martin-Luther-University Halle-Wittenberg, Kurt-Mothes-Straße 2, D-06120 Halle/Saale, Germany

Supporting Information

ABSTRACT: An untargeted analytical approach combined with chemometrics using the volatiles of German white wine was investigated regarding the usefulness for verifying botanical origin. A total of 198 wine samples of Riesling, Müller-Thurgau, Silvaner, Pinot Gris, and Pinot Blanc were examined applying headspace solid-phase microextraction online coupled with gas chromatography mass spectrometry. The resultant three-dimensional raw data were processed by available metabolomics software. After data treatment, a partial least-squares discriminant analysis (PLS-DA) model was validated. External samples were correctly classified for 97% Silvaner, 93% Riesling, 91% Pinot Gris/Blanc, and 80% Müller-Thurgau. This model was related to monoterpenoids, C13-norisoprenoids, and esters. Further, 100% prediction for a two-class model of Riesling versus Pinot Gris/Blanc was confirmed by 74 additional samples measured independently. Hence, the strategy applied was, in particular, reliable and relevant for white wine varietal classification. In addition, the superior classification performance of the Riesling class was revealed.

KEYWORDS: white wine, aroma fingerprinting, botanical origin, pattern recognition, validation, ROC curve analysis



INTRODUCTION

Wine of authentic origin is highly appreciated by consumers and constitutes an important economic factor. In Germany, for example, great importance is attached to the botanical purity of wine, and thus, labeling of the variety is very common. This provides consumers with important information on aroma characteristics and is also related to certain quality wine categories and regional origin. When analytical chemistry is used to reveal fraudulent practices, single chemical parameters often do not allow for the characterization of complex identity properties such as the botanical origin.1 For example, the concentration of shikimic acid displays limited discriminative power to determine white wine varietals and is therefore combined with the protein profile of these wines.2 As a result, more white wine varietals can be reliably determined. In this regard, multivariate statistics are of support. For verifying the origin, however, spectroscopic and spectrometric analytical methods are of particular interest. Omitting elaborate sample preparation, these techniques give multiple results in one single run. Typically, in an untargeted approach or fingerprinting,3 no compounds are pre-selected and, therefore, the raw data are used for multivariate data analysis. This allows for an unbiased parameter selection1 and prevents the loss of potentially important information that can be used for the subsequent data analysis. Nuclear magnetic resonance (NMR) or infrared spectroscopy is commonly used in this respect.4 However, mass spectrometry (MS)-based techniques allow for a more sensitive detection. In particular, the hyphenated MS techniques, such as gas chromatography (GC)−MS and liquid chromatography MS, have proven to be suitable for the untargeted approach.5,6 The hyphenation prevents ion suppression occurring in direct MS analysis.7 Moreover, the identification of important substances is possible. On the whole, to convert the three-dimensional chromatography−MS raw data for the purpose of statistical analysis, various tools are available, using mass feature extraction and retention time alignment.8,9 Volatiles are of great importance for the characterization among origins of food10 and, in particular, wine.11−13 To extract these from the wine matrix, headspace solid-phase microextraction (HS−SPME) is currently the key technique available for fingerprinting. Working free of solvents, it can be carried out automatically and online coupled to GC−MS.10 HS−SPME−GC−MS fingerprinting regarding the botanical origin has been performed on plant or food matrices.14−16 However, to our knowledge, this approach has not been used for the classification of botanical origin of white wines thus far. Other analytical techniques that were used for fingerprinting and employed volatile analytes for varietal discrimination comprised HS−SPME−GC−MS without any relevant GC separation17 or direct MS techniques, such as HS−MS18 and proton transfer reaction−MS.19

Received: April 30, 2014
Revised: June 23, 2014
Accepted: June 24, 2014
Published: June 24, 2014

© 2014 American Chemical Society
dx.doi.org/10.1021/jf502042c | J. Agric. Food Chem. 2014, 62, 6844−6851

Journal of Agricultural and Food Chemistry

Article

97%), γ-terpinene (purity > 97%), and 2-phenylethyl acetate (purity > 99%) were provided by Sigma-Aldrich. α-Terpineol (purity = 95%) was purchased from ABCR (Karlsruhe, Germany), and 1,1,6-trimethyl-1,2-dihydronaphthalene (TDN) (purity = 98%) was synthesized by VeZerf (Idar-Oberstein, Germany).

HS−SPME Sample Extraction and GC−MS Analysis. To extract samples, 10 mL of cooled wine (4 °C) and 20 μL of internal standard mix were filled into a 20 mL amber headspace vial, which was then immediately closed using headspace analysis screw caps with silicone/polytetrafluoroethylene (PTFE) septa of 1.3 mm thickness. Sample bottles were exclusively opened at the time of sample preparation. Samples of identical origin regarding grape variety were distributed equally across all sequences of measurement to ensure their comparability because of possibly biased analytical variation. For this purpose, the measurement order was randomized within a block design. HS−SPME of the wine volatiles was fully automated, with online desorption of the analytes to the GC system, using an MPS 2 device from Gerstel (Mülheim a. d. Ruhr, Germany). This ensured a steady throughput of samples and constant SPME parameters. The mono-material fiber polydimethylsiloxane (PDMS), 100 μm thickness, 10 mm length (Supelco), purchased from Sigma-Aldrich, Germany, was chosen because of its low enrichment of ethanol, which avoided overload of the GC column. The extraction time for the PDMS fiber was 45 min at a temperature of 40 °C. The prior incubation time was 15 min at 40 °C, and the samples were agitated at 250 rpm during incubation and extraction. Immediately after the extraction, the desorption of analytes was carried out by hot injection at 260 °C, splitless for 2 min, in a narrow liner (0.75 mm internal diameter, Supelco). The split/splitless injector was equipped with a Merlin Microseal (Supelco) and, therefore, operated septumless.
Bake-out of the SPME fiber was conducted for 10 min in a bake-out station at 250 °C immediately before every sample extraction. To check fiber intactness and detect possible carryover of substances, fibers were desorbed without prior extraction twice a day and once every 10−15 samples (fiber blank). New SPME fibers and fibers not used for more than 1 day were conditioned at 250 °C for 30 min. Analysis was carried out on an Agilent Technologies GC 6890 and MS detector 5973 inert (Böblingen, Germany). The desorbed sample analytes were separated on an Agilent J&W DB 1701 (30 m length, 0.25 μm film thickness, and 0.25 mm internal diameter). A helium flow of 1.1 mL min−1 was used. The initial oven temperature of 40 °C was held for 3 min and, after that, was first increased by 6 °C min−1 to 100 °C, with a hold time of 1 min, and second increased by 6 °C min−1 to 240 °C, with a hold time of 5 min. Finally, the temperature of 280 °C was held for 5 min to reduce contamination (total run time of 48 min). After transfer at 280 °C to the MS source with standard electron impact ionization at 230 °C, the quadrupole analysis (150 °C) was carried out in the range of mass/charge ratios of m/z 35−320, with 4.86 scans s−1. Another extraction device and GC−MS instrument were used for independent measurements; a different operator examined the similarly prepared samples of data set 2. The instrument used was of the same type and, therefore, identical in construction to the instrument described above. Identical analytical equipment, GC column type, and HS−SPME−GC−MS procedure were used.

Pre-processing and Data Treatment. The three-dimensional GC−MS raw data were converted to a matrix (N × M) giving the intensity values for N samples of M extracted mass features or variables, which comprise the mass/charge ratio (m/z) per time unit (scan). On the basis of the original Agilent files, we performed this pre-processing step with the MetAlign software (http://www.wageningenur.nl, version 080311).9,26,27 The result was stored as a CSV file. After this, already extracted mass features that could confound classification were removed: (1) previously known and constant GC contaminants comprising cyclosiloxanes from SPME, siloxanes from sample vial septa (m/z 207, 281, etc.), and phthalates (m/z 149), (2) exceptionally disturbing mass features detected by exploratory data analysis with principal component analysis, and (3) mass features

Wine volatiles are numerous because they derive from three factors: the grapes, the fermentation by yeast, and the storage of wine.20 Moreover, grape variety and its cultivation, oenological practice, and aging influence the specific aroma profile of a wine.11 Thus, an enormous natural and process-related variation results. Untargeted MS data of wines underpin this complexity. Hence, a clearly defined sample collection and a sufficient number of samples are of particular relevance for classification. Further, the appropriate procedure for data treatment that follows data conversion is of special interest. For example, a large number of irrelevant variables occurs in GC−MS data and could mislead interpretation of data.21 The appropriate sampling and treatment strategy is particularly important for serving the current demand for the validation of the resultant classification models.22−25 To face this challenge, the present fingerprinting study of volatiles was conducted and its suitability for German white wine varietal authentication was explored. The analysis was carried out with commercially available white wine samples, and it is therefore assumed that the natural diversity and a wide range of biological and oenological effects on the wine composition were covered. After the HS−SPME−GC−MS analysis, the raw data were converted applying an available metabolomics tool. Then, the appropriateness of the proposed method, including data treatment and processing, was shown by investigating reliability and relevance. Finally, a validated multi-class model was achieved. Additionally, details of the resultant classification model were shown by visualizing the classification performance.



MATERIALS AND METHODS

Samples. A total of 272 commercial German dry white wine samples, obtained in 2011, with a varietal purity of 100% according to the producers, were transferred to 50 mL amber glass bottles and stored at 4 °C in the dark until analysis. A total of 198 wine samples of the botanical origin Riesling (56 samples), Müller-Thurgau (34 samples), Silvaner (44 samples), Pinot Gris (30 samples), and Pinot Blanc (34 samples) formed the main data set 1. Further, 74 additional wine samples of Riesling (40 samples), Pinot Gris (20 samples), and Pinot Blanc (14 samples) origin formed data set 2. Samples were of vintage 2009 and 2010 and diverse origin regarding vineyard location in Germany. The German wine quality categories employed were (a) quality wine from specific wine-growing regions and (b) Kabinett, a quality wine with special attributes. In data set 2, the 40 samples of vintage 2009 were all Riesling and the 34 samples of vintage 2010 were all Pinot.

Chemicals. All reagents used were of analytical quality. To form the internal standard mix, 3,7-dimethyl-3-octanol (purity > 98%) and 2-hexyl-1-decanol (purity > 97%), both purchased from Sigma-Aldrich (Schnelldorf, Germany), were dissolved in absolute ethanol pro analysi (p.A.), Merck (Darmstadt, Germany). The solution contained on average 30.9 μg g−1 of 3,7-dimethyl-3-octanol and 1.5 μg g−1 of 2-hexyl-1-decanol. The mix was stored at −20 °C until analysis and renewed every 5−7 days. The compounds were chosen because of their general absence in wine. They demonstrated stability within the wine matrix and were used to control the mass feature alignment in the pre-processing step. The repeatability of their main fragment areas was tested by successive injections of 17 independent aliquots of a wine sample, each 10 mL mixed with 20 μL of internal standard mix. This resulted in 7 and 8% relative standard deviation of the peak area for 3,7-dimethyl-3-octanol (tetrahydrolinalool) and 2-hexyl-1-decanol, respectively.
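The repeatability figure quoted above is a relative standard deviation (RSD), that is, the sample standard deviation of the replicate peak areas divided by their mean. A minimal Python sketch of that calculation follows. The function name and the peak-area values are hypothetical placeholders chosen for illustration; they are not measurements from this study.

```python
import statistics


def relative_standard_deviation(values):
    """Return the RSD in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical peak areas (arbitrary units) of one internal-standard
# fragment over replicate injections; NOT data from the study.
peak_areas = [1020.0, 980.0, 1100.0, 950.0, 1010.0, 1060.0]
print(f"RSD = {relative_standard_deviation(peak_areas):.1f}%")
```

In practice, the list would hold the 17 replicate peak areas of the main fragment of each internal standard.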
The internal standard mix was prepared in the same way and used for the sample preparation of data set 2. Furthermore, butylhydroxytoluene (BHT), purchased from Sigma-Aldrich, was used for identification of a non-wine compound causing sample outliers. The reference standards (R)-(+)-limonene (purity =


Figure 1. PCA score plots based on restricted numbers of white wine samples (data set 1): (A) Riesling and Pinot (N = 120), (B) Riesling and Silvaner (N = 100), (C) Riesling and Müller-Thurgau (N = 90), (D) Pinot and Müller-Thurgau (N = 98), (E) Pinot and Silvaner (N = 108), and (F) Müller-Thurgau and Silvaner (N = 78). Data set completely treated as proposed, N × M = 198 × 952. PC n(x): n order number of PC and x explained variance in X [R2(X)] regarding PC n. Ries (circles) = Riesling; Pin (triangles) = Pinot; Silv (crosses) = Silvaner; and MueTh (stars) = Müller-Thurgau.

of a model represents the explained variance used to extract the latent variables [principal components (PCs)] from the total data in relation to the class membership (Y). Second, the goodness of prediction of Y (Q2) of a model gives an estimation of the result of internal cross-validation by leaving out 1/7 of the data in turn.29 Third, the classification power of classes was established by assigning test samples to a training model and revealing the percentage of accurately classified samples in relation to all predicted samples per class (% CC). Finally, the visual tool of receiver operating characteristic (ROC) curve analysis was employed, which has recently been recommended for evaluating the classification power of models.22,24,25,31 A ROC curve is obtained by plotting the proportion of true positives (TP), or the sensitivity, against the proportion of false positives (FP),22 or 1 minus the specificity, under a varying decision threshold of classification. Thereby, the sensitivity is defined as TP/(TP + FN), and the specificity is defined as TN/(TN + FP), where FN values are false negatives and TN values are true negatives. The optimization of each performed PLS-DA model was necessary, which means that the relevant number of PCs was determined to avoid overoptimistic modeling.
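The sensitivity and specificity definitions above translate directly into a ROC curve and its area (AUC). The following Python sketch is only an illustration under stated assumptions: the function names and the example scores are hypothetical, and the study itself carried out ROC analysis in SPSS rather than with code like this.

```python
def roc_curve(scores, is_positive):
    """Return (1 - specificity, sensitivity) points for a falling threshold."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, p in zip(scores, is_positive) if p and s >= threshold)
        fp = sum(1 for s, p in zip(scores, is_positive) if not p and s >= threshold)
        sensitivity = tp / n_pos            # TP / (TP + FN)
        specificity = (n_neg - fp) / n_neg  # TN / (TN + FP)
        points.append((1.0 - specificity, sensitivity))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical Y prediction values for six test samples of one class
# (True = sample belongs to the class); NOT data from the study.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
print(f"AUC = {auc(roc_curve(scores, labels)):.3f}")
```

Each distinct prediction value is used once as the decision threshold, so the curve needs no predefined decision rule.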
This was carried out by considering (a) the gap between R2 and Q2, which should be as small as possible, and (b) Q2, which should still be increasing.29 Furthermore, for optimization, the results of a response permutation test, examined with 100 iterations and calculated by linear regression, in general, may not exceed 30% of the permuted R2 of the random models; otherwise, overfitting of the original model was assumed.29 For the external validation of classification models, the OPLS-DA training models were made up of classes of equal size. At least 22 samples (two-class models) or 29 samples (four-class model) of each class were used to form representative training classes. To obtain these, the test samples were removed from the total set of class samples by a randomized selection per measuring block. This procedure was of advantage in comparison to a pure randomized selection because of the restricted number of possible iterations of the validation process. Thus, the chance of each sample to be a part of the test data was increased. At the same time, the analytical variation was considered. The variables were selected from an OPLS-DA training model to build a PLS-DA training model. The latter was optimized as described above, and the test samples were assigned to the PLS-DA training model, giving the prediction result % CC of each class. The

associated with the internal standards after completion of controlling the alignment step of MetAlign. Additional preliminary variable reduction to reduce noise22 was carried out, so that the mass features with occurrence in less than 80% of samples in relation to each of the individual varietal classes used for modeling were removed. Thereby, a mass feature with “no occurrence” in a sample was detectable by the constant low value set in the pre-processing step regarding intensities below the absolute threshold. A further step comprised the substitution of all retained mass feature intensities below the absolute threshold by zero to enhance usable signals of low signal-to-noise ratio. Finally, three steps of common pretreatment methods were followed:22,28,29 (1) square root transformation lowered heteroscedasticity typical for GC−MS data without changing the actual distribution (pseudo-scaling), (2) the row (sample-wise) scaling to constant total was used to compensate for the wide time range of measuring, and (3) directly prior to data analysis, the column scaling method autoscaling was used. The variable reduction and pretreatment were performed using Excel (Microsoft Office, version 2010) and SIMCA-P+ (Umetrics, version 12.01.0).

Processing and Multivariate Data Analysis. Principal component analysis (PCA) was used for visualization of the analytically derived connections among samples and between samples and mass features independent of the origin of samples (unsupervised analysis). It was performed for the purpose of exploratory data analysis. PLS-DA was used for the modeling of samples in relation to the settled classes Y (supervised analysis). A PLS-DA model was based on selected variables deriving from the variable importance in the projection (VIP) value of the variables of an orthogonal PLS-DA (OPLS-DA) model. VIP is the sum over all model dimensions of the contributions of variable influence.
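The variable reduction and the three pretreatment steps described above can be sketched in plain Python as follows. This is an illustrative reading, not the Excel and SIMCA-P+ workflow of the study: the function names and the example matrix are hypothetical, a feature is assumed to be kept when it occurs in at least 80% of the samples of at least one class, and "occurrence" is assumed to mean an intensity at or above the absolute threshold.

```python
import math


def occurrence_filter(X, classes, threshold, min_fraction=0.8):
    """Indices of features detected (>= threshold) in at least min_fraction
    of the samples of at least one class (assumed reading of the 80% rule)."""
    keep = []
    for j in range(len(X[0])):
        for label in sorted(set(classes)):
            rows = [row for row, c in zip(X, classes) if c == label]
            detected = sum(1 for row in rows if row[j] >= threshold)
            if detected >= min_fraction * len(rows):
                keep.append(j)
                break
    return keep


def pretreat(X, classes, threshold):
    """Filter, zero sub-threshold values, sqrt-transform, row-scale, autoscale."""
    keep = occurrence_filter(X, classes, threshold)
    X = [[row[j] for j in keep] for row in X]
    # Substitute retained intensities below the absolute threshold by zero.
    X = [[v if v >= threshold else 0.0 for v in row] for row in X]
    # (1) Square-root transformation.
    X = [[math.sqrt(v) for v in row] for row in X]
    # (2) Row (sample-wise) scaling to constant total: each sample sums to 1.
    X = [[v / sum(row) for v in row] if sum(row) > 0 else row for row in X]
    # (3) Autoscaling: mean-center each column, divide by its sample SD.
    n = len(X)
    out = [[0.0] * len(keep) for _ in range(n)]
    for j in range(len(keep)):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        for i in range(n):
            out[i][j] = (col[i] - mean) / sd if sd > 0 else 0.0
    return keep, out


# Hypothetical 4 x 3 intensity matrix (two classes); NOT data from the study.
X = [[400.0, 10.0, 900.0],
     [100.0, 10.0, 1600.0],
     [10.0, 10.0, 2500.0],
     [10.0, 10.0, 400.0]]
kept, X_scaled = pretreat(X, ["A", "A", "B", "B"], threshold=50.0)
print("kept feature indices:", kept)
```

The order of the steps follows the text: filtering and zero substitution first, then the square-root transformation, row scaling, and autoscaling last, directly before modeling.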
Variables with VIP values above 1 are supposed to be the most relevant for explaining Y in relation to the whole model29 and were, therefore, exclusively used to build a PLS-DA model. OPLS-DA separates the modeled variation into the Y-predictive and Y-orthogonal (Y-unrelated) variation30 and removes the latter from the model. This statistical technique was therefore most suitable to select the most contributing variables. Four diagnostic parameters were considered to evaluate the performance of a classification model. First, the goodness of fit (R2)


Table 1. Comparison of Two-Class PLS-DA Models Based on Independent Samples and Measurements (Selected Samples of Data Set 1 and Data Set 2)a

model  class     members (true class)  correct classification (%)  R2 (%)  Q2 (%)  RMSEP  selected variables (VIP > 1)  GC−MS peaks (VIP > 1)
1      Pinot     6                     100                         93      92      0.16   358                           41
       Riesling  21                    100
2      Pinot     11                    100                         85      84      0.22   175                           29
       Riesling  17                    100

a Classification of Pinot (Pinot Gris/Pinot Blanc) versus Riesling samples, with test data prediction to the training data (average of three iterations). Training set of model 1: 22 samples per class (N = 44). Training set of model 2: 23 samples per class (N = 46). RMSEP = root mean square error of prediction.

procedure was repeated 3 times (two-class models) or 10 times (four-class model). Excel (Microsoft Office, version 2010) was used for data set splitting. PCA, PLS-DA, OPLS-DA, response permutation, and VIP/variable selection were performed using SIMCA-P+ (Umetrics, version 12.01.0). ROC curve analysis was carried out with SPSS for Windows (IBM, version 12.0.2) based on the Y prediction values of the test samples assigned to the PLS-DA training models. The determination of associated signals or compounds based on mass features detected in exploratory analysis was examined by consulting the total ion chromatograms (TICs) of samples and a database match of their mass spectra with the National Institute of Standards and Technology (NIST) standard reference database (NIST, version 08).
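The data set splitting for external validation (equal-sized training classes, test samples drawn at random per measuring block) was done in Excel. A comparable procedure can be sketched in Python as shown below. The function name, the tuple layout, the six measuring blocks, and the seed are assumptions made for illustration; only the class sizes of data set 1 and the 29 training samples per class of the four-class model are taken from the text.

```python
import random


def split_per_block(samples, n_train, seed=0):
    """samples: list of (sample_id, class_label, block).
    Returns (train, test) with exactly n_train training samples per class;
    test samples are drawn at random while cycling over the blocks."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({s[1] for s in samples}):
        by_block = {}
        for s in samples:
            if s[1] == label:
                by_block.setdefault(s[2], []).append(s)
        pools = [by_block[b] for b in sorted(by_block)]
        for pool in pools:
            rng.shuffle(pool)
        n_test = sum(len(p) for p in pools) - n_train
        if n_test < 0:
            raise ValueError(f"class {label!r} has fewer than {n_train} samples")
        i = 0
        while n_test > 0:
            pool = pools[i % len(pools)]
            if pool:  # skip blocks that are already exhausted
                test.append(pool.pop())
                n_test -= 1
            i += 1
        for pool in pools:
            train.extend(pool)
    return train, test


# Class sizes of data set 1; the block assignment (i % 6) is hypothetical.
class_sizes = {"Riesling": 56, "MueTh": 34, "Silvaner": 44, "Pinot": 64}
samples = [(f"{label}-{i}", label, i % 6)
           for label, n in class_sizes.items() for i in range(n)]
train, test = split_per_block(samples, n_train=29)
print(len(train), "training samples,", len(test), "test samples")
```

Repeating the call with different seeds would give the repeated iterations of the validation process.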



RESULTS AND DISCUSSION

Exploratory Data Analysis. An untargeted approach was pursued for the classification of wine varietals, so that no GC peaks or mass features were selected a priori. Instead, the full information on 7290 scans (data set 1) was pre-processed. The time window of 10−36 min was used, and a data matrix was obtained, comprising 198 observations and the intensities of 6911 mass features for m/z 50 to 320 (N × M = 198 × 6911). During exploratory analysis using PCA, we identified and then removed disturbing mass features. These were detected as causing strikingly outlying samples by comparison of score plots (samples) with loading plots (variables). The TICs of samples were subsequently considered, followed by the examination of the MS database match. Non-wine-related mass features comprised additional siloxane signals (m/z 73, 147, 207, 281, etc.) and m/z 205 and 220 deriving from butylhydroxytoluene (BHT), identified by standard addition. All of these mass signals are known to be frequently detected in GC−MS because they derive from common non-wine compounds widely used in technical applications.32 Furthermore, wine-related mass features causing outlying samples were observed, among them the main fragments of ethyl octanoate and ethyl decanoate. The disturbance was caused by the extreme detector saturation and the oversized width of these peaks, which could not be correctly aligned in the pre-processing step. Thus, the complete set of mass features in relation to these compounds was removed. Other observed alignment errors could be corrected by adjusting the user-defined parameters in the pre-processing step. Finally, the cleared data matrix of data set 1 was of size N × M = 198 × 5638. The clustering ability of the data was also revealed by PCA, and the varietal classes used for classification were determined. Pinot Gris and Pinot Blanc wine samples showed identical centering and distributing tendencies and, therefore, formed a joint group “Pinot”.
This might be due to the fact that, from a botanical view, the two Pinot varietals are related. The similarity has been confirmed by a two-class PLS-DA model of Pinot Gris

Figure 2. Score plots of example four-class PLS-DA training models (data set 1). Classification of Pinot (Pin, triangles; Pinot Gris and Pinot Blanc) versus Silvaner (Silv, crosses) versus Müller-Thurgau (MueTh, stars) versus Riesling (Ries, circles). (A) Training model 2: N × M = 116 × 290, four PCs, 70.3% R2; 65.3% Q2. (B) Training model 7: N × M = 116 × 305, three PCs, 62.7% R2; 57.7% Q2. (C) Training model 10: N × M = 116 × 285, four PCs, 73.8% R2; 66.6% Q2.

versus Pinot Blanc wine, which resulted in a model of zero PCs, revealing the non-existence of a difference in variation between the two groups. Finally, the four disjoint groups Riesling, Müller-Thurgau, Silvaner, and Pinot were examined. The data display the most distinct variation between Pinot and Riesling wines. Figure 1A shows their grouping mainly along the axis of


Table 2. Test Sample Predictions to Four-Class PLS-DA Training Models (Data Set 1)a

class     members (true class)  correct classification (%)  Pinot (predicted)  Silvaner (predicted)  MueTh (predicted)  Riesling (predicted)  AUC
Pinot     35                    91                          31.8               2                     1                  0.2                   0.97
Silvaner  15                    97                          0.2                14.5                  0.2                0.1                   0.98
MueTh     5                     80                          0.6                0.1                   4                  0.3                   0.96
Riesling  27                    93                          0                  0.8                   1.2                25                    0.99

a Classification of Riesling versus Müller-Thurgau (MueTh) versus Silvaner versus Pinot (Pinot Gris and Pinot Blanc), with correct classification and confusion matrix of the average of 10 iterations. Training data: N × M = 116 × 289 (average). Test data: N = 82. AUC = area under the ROC curve (average).
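The correct classification rates in Table 2 follow from the averaged confusion matrix: for each true class, the correctly predicted count is divided by the number of class members. The short Python check below uses the values as read from Table 2; the variable and function names are illustrative only.

```python
classes = ["Pinot", "Silvaner", "MueTh", "Riesling"]

# Rows: true class; columns: predicted class in the same order
# (averages over 10 iterations, values as read from Table 2).
confusion = [
    [31.8, 2.0, 1.0, 0.2],
    [0.2, 14.5, 0.2, 0.1],
    [0.6, 0.1, 4.0, 0.3],
    [0.0, 0.8, 1.2, 25.0],
]


def percent_correct(matrix):
    """Per-class % CC: diagonal entry divided by the row total."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


for name, row, cc in zip(classes, confusion, percent_correct(confusion)):
    print(f"{name}: members = {sum(row):.0f}, % CC = {cc:.0f}")
```

The row totals reproduce the class member counts (35, 15, 5, and 27 test samples), which also confirms how the flattened table columns belong together.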

cases show sufficient but less variation regarding varietal differences. Discrimination within the used sample set could not be demonstrated between the vintages 2009 and 2010, between the different quality categories, and between different German vineyard locations. A different study design might be necessary to prove any differences based on these factors. Accordingly, the varietal classes found appeared virtually homogeneous because no influence of the above factors could be detected. Therefore, multi-class varietal classification was examined on the basis of all samples. However, a few samples visually appeared slightly outside a varietal class. This was due to either natural variation or unknown reasons because no further information about samples was available.

Investigating Reliability of Classification. The two-class classification of Riesling versus Pinot using PLS-DA training models resulted in 100% CC for test samples of both classes. However, we could not be certain whether this excellent classification outcome is constant or varies depending upon factors other than the difference of the varietals. Thus, our objective was to conduct a simple approach to find out whether the classification ability was indeed of general validity. Other factors of variation for a given method of analytics, data treatment, and statistical analysis are as follows: (a) the wine samples used, (b) the extraction by SPME, the GC separation, the MS ionization, and separation as well as detection, and finally, (c) the actual implementation of the given concept of feature extraction. The present experiment was performed by varying all three of these factors. This comprised new samples (data set 2) measured on a second device. The obtained data were then pre-processed with newly established and appropriate parameters. Details on the MetAlign parameters applied are given in the Supporting Information. All equivalent samples of data set 1 were selected (71 samples).
The proposed pre-processing and data treatment was equally applied to the two data sets (selected samples of data set 1 and data set 2). Then, PLS-DA training models based on selected variables were built. Model 1 was related to the selected samples of data set 1, and model 2 was related to data set 2. Table 1 shows the results of external validation for the two models. A 100% correct classification for Riesling and Pinot is always obtained, independent of the sample collection and the analytical device used. For modeling the classification question, in both cases, only one PC is necessary. The obtained difference between R2 and Q2 for each model was rather small, which demonstrates the excellent prediction ability of both models. However, the prediction of model 2 was based on a lower average percentage of explained variance (R2). This could be due to a relatively higher degree of other variation sources constituting the data and possibly to less pronounced differences between the varietals. Slight

Figure 3. ROC curve analysis of four-class PLS-DA model (data set 1). Per class, all 10 ROC curves of test sample assignment to the training models are shown for classification of (A) Riesling (Ries) versus (B) Silvaner (Silv) versus (C) Müller-Thurgau (MueTh) versus (D) Pinot (Pin; Pinot Gris and Pinot Blanc).

the first PC with an explained variation of 23% R2(X). In contrast, more PCs are needed to achieve discrimination between other groups (panels B−F of Figure 1). Thus, these


additionally verified by retention time and mass spectral accordance to reference standards. These findings revealed some varietal-related compounds of monoterpenoids and C13-norisoprenoids11,34−36 contributing to the white wine classification. These are typical white-wine-related odorants.34 For example, TDN is typical for Riesling and has a kerosene-like odor.11 Additionally, esters as alcoholic-fermentation-derived aroma compounds contributed, presumably indirectly, as previously reported, for example, regarding the classification of South African wines.37 All of the contributing compounds were favorably extracted by the used PDMS SPME fibers. Polar compound groups, such as the aliphatic alcohols and acids, were not considered for classification because these were hardly extracted. This shows the volatiles “editing” character of the SPME technique.1 In the present study, a validated German white wine prediction based on untargeted data of volatile compounds is established for the first time. The most important German varieties are represented on the basis of commercial samples from all over Germany, comprising two vintages and two quality categories. Therefore, a broad variation of samples is considered. Moreover, the contributing variables are related to important odorants. It should be noted here that the developed method of data treatment, in particular, supports the consideration of mass features with low signal-to-noise intensity in GC−MS, such as those of linalool. Therefore, compounds of low sensory threshold are also considered for classification. On the whole, the significant relevance of the analytical method and data treatment for white wine is shown. Because the method also shows its ability to achieve reliable results, it is highly adequate for German white wine varietal classification.
The untargeted HS−SPME−GC−MS approach might therefore be considered along with other methods that are used for verifying the botanical origin of white wine varietals, such as the determination of shikimic acid concentration or the application of NMR, to discriminate among a higher number of varietals. Godelmann et al.38 successfully classified German white wine according to the botanical origin using untargeted NMR data. However, the NMR study equally joined Pinot Gris and Pinot Blanc samples for varietal classification because of their similarity. Therefore, further research on Pinot Gris and Pinot Blanc wine discrimination is of interest. Moreover, the current study could be extended regarding the number of Müller-Thurgau samples. To additionally meet the requirements of the global market, the extension to internationally cultivated white wine varietals would be useful.

Performance of Classification. The correct classification rate (% CC) is widely used to express the results of a classification model. However, this depends upon not only the model but also the decision that a sample belongs to a certain class and not to the other nearest or overlapping class. Thus, the % CC normally does not give the full information about a class. ROC curve analysis overcomes the predefinition of a decision rule and visualizes the full performance of a class under varying thresholds. The area under the ROC curve (AUC) additionally gives a value to assess the classification quality of a class. Very good classification performance is considered to be achieved for an AUC value above 0.9 and good performance for a value of at least 0.8.24 An AUC value of 0.5 indicates a full lack of discriminative power. Thus, the model obtained here achieves very good performance for all classes. The AUC of all classes was above

differences in the data distribution of the two models could therefore be assumed. The two variable selections differed in the number of variables obtained from the two data sets, which correspond to a different number of peaks in the TIC. Nevertheless, both variable selections qualitatively provided a consistent prediction result. In contrast to the explained variance, the root-mean-square error of prediction (RMSEP) of the external validation is an absolute measure and is therefore most suitable for comparing models (21). The average prediction error here was only slightly higher for model 2 and, on the whole, was fairly low. Hence, factors such as samples, instrument, and feature extraction did not influence the classification of Riesling versus Pinot wine. The difference in the measuring results between the varietals was larger than the variation caused by the other factors. Therefore, the proposed strategy yields a constant varietal prediction. This suggests its applicability also to other laboratories. According to Esslinger et al., this kind of investigation is rare in the field of untargeted research but will be needed for future applications to meet control requirements (4). When reliability is defined here as the frequency with which the classification results are within an allowable error, the experiment achieves 100% reliability for an example classification question, because the two trials achieved identical results.

Multi-class Varietal Classification and Its Relevance. The present study focuses on the five most frequent German white wine varietals. These are cultivated in Germany on 77% of the white grape vineyard area (as of 2012) (33). For classification, all samples of data set 1 were used, forming the four proposed classes. On average, the PLS-DA training models were built with 289 selected variables. Three examples of score plots of the 10 training models are shown in panels A−C of Figure 2.
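The RMSEP used for comparing the two-class models can be illustrated with a minimal Python sketch. It assumes a dummy-coded response (1 for members of a class, 0 otherwise), as is common in PLS-DA; the function name `rmsep` and all numbers are hypothetical and are not taken from the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction for external test samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / len(y_true))


# Hypothetical dummy-coded class membership of four external test samples.
y_true = [1, 1, 0, 0]
# Hypothetical predicted responses of two independently built models.
y_pred_model_1 = [0.9, 0.8, 0.1, 0.2]
y_pred_model_2 = [0.8, 0.7, 0.2, 0.3]

rmsep_1 = rmsep(y_true, y_pred_model_1)
rmsep_2 = rmsep(y_true, y_pred_model_2)
print(round(rmsep_1, 3), round(rmsep_2, 3))
```

Because the RMSEP is expressed in the units of the response, the values of two models can be compared directly, which is not the case for the explained variance.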
Fairly good clustering of classes is observed. Four PCs on average were necessary to model the differences between the classes. On average, the training models explained 70% of the variance (R2). The goodness of prediction obtained was on average 64% (Q2), which is a typical result for metabolomics data (29). Because the presented model additionally showed a small difference between R2 and Q2, its fairly good prediction ability is demonstrated. In external validation, the test samples of the Riesling, Silvaner, and Pinot classes achieved on average 91−97% CC (Table 2). Such results were not achieved for the Müller-Thurgau samples (80% CC). The small number of Müller-Thurgau test samples might be the reason for the less accurate classification. Furthermore, we determined some of the substances contributing to the classification among the four classes. For this purpose, the selected variables occurring in at least 8 of the 10 PLS-DA training models were considered. The mass features concerned were then related to peaks in the TIC of several example samples and subjected to an MS database match. Identification was assumed for a match of more than 90%. It should be noted that the results gave no information on the absolute contribution of substances to the prediction because of the pretreatment applied. Further, the relation of compounds to a certain varietal class could not be revealed because of the complexity of the model. The monoterpenoids linalool, γ-terpinene*, α-terpineol*, and limonene*, the C13-norisoprenoids 1,1,6-trimethyl-1,2-dihydronaphthalene (TDN)* and vitispirane, and the esters 2-phenylethyl acetate*, methyl octanoate, and methyl decanoate were reliably identified. The marked substances (∗) were

dx.doi.org/10.1021/jf502042c | J. Agric. Food Chem. 2014, 62, 6844−6851

Journal of Agricultural and Food Chemistry

Article

(3) Koek, M. M.; Jellema, R. H.; van der Greef, J.; Tas, A. C.; Hankemeier, T. Quantitative metabolomics based on gas chromatography mass spectrometry: Status and perspectives. Metabolomics 2011, 7, 307−328.
(4) Esslinger, S.; Riedl, J.; Fauhl-Hassek, C. Potential and limitations of non-targeted fingerprinting for authentication of food in official control. Food Res. Int. 2014, 60, 189−204.
(5) Shepherd, L. V. T.; Fraser, P.; Stewart, D. Metabolomics: A second-generation platform for crop and food analysis. Bioanalysis 2011, 3, 1143−1159.
(6) Hall, R. D.; Hardy, N. W. Practical applications of metabolomics in plant biology. In Plant Metabolomics. Methods and Protocol; Hardy, N. W., Hall, R. D., Eds.; Humana Press (Springer): New York, 2012; pp 1−10.
(7) Checa, A.; Saurina, J. Metabolomics and PDO. In Food Protected Designation of Origin Methodologies and Applications; de la Guardia, M., Gonzálvez, A., Eds.; Elsevier: Oxford, U.K., 2013. Checa, A.; Saurina, J. Chapter 6 - Metabolomics and PDO. Compr. Anal. Chem. 2013, 60, 123−143.
(8) Smith, C. A.; Want, E. J.; O’Maille, G.; Abagyan, R.; Siuzdak, G. XCMS: Processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal. Chem. 2006, 78, 779−787.
(9) Lommen, A. Metalign: Interface-driven, versatile metabolomics tool for hyphenated full-scan mass spectrometry data preprocessing. Anal. Chem. 2009, 81, 3079−3086.
(10) Cajka, T.; Hajslova, J. Volatile compounds in food authenticity and traceability testing. In Food Flavors: Chemical, Sensory and Technological Properties; Jelen, H., Ed.; CRC Press: Boca Raton, FL, 2012; pp 355−411.
(11) Styger, G.; Prior, B.; Bauer, F. F. Wine flavor and aroma. J. Ind. Microbiol. Biotechnol. 2011, 38, 1145−1159.
(12) Cozzolino, D.; Smyth, H. Analytical and chemometric-based methods to monitor and evaluate wine protected designation. In Food Protected Designation of Origin Methodologies and Applications; de la Guardia, M., Gonzálvez, A., Eds.; Elsevier: Oxford, U.K., 2013. Cozzolino, D.; Smyth, H. Chapter 15 - Analytical and chemometric-based methods to monitor and evaluate wine protected designation. Compr. Anal. Chem. 2013, 60, 385−408.
(13) Caligiani, A.; Cirlini, M.; Palla, G. Volatile compounds for the authentication of wine and derived products. In Food Authentication Using Bioorganic Molecules; Sforza, S., Ed.; DEStech Publications: Lancaster, PA, 2013; pp 327−360.
(14) Tikunov, Y.; Lommen, A.; De Vos, C. H. R.; Verhoeven, H. A.; Bino, R. J.; Hall, R. D.; Bovy, A. G. A novel approach for nontargeted data analysis for metabolomics. Large-scale profiling of tomato fruit volatiles. Plant Physiol. 2005, 139, 1125−1137.
(15) Guo, J.; Yue, T.; Yuan, Y. Feature selection and recognition from nonspecific volatile profiles for discrimination of apple juices according to variety and geographical origin. J. Food Sci. 2012, 77, C1090−C1096.
(16) Malheiro, R.; Guedes De Pinho, P.; Soares, S.; César da Silva Ferreira, A.; Baptista, P. Volatile biomarkers for wild mushrooms species discrimination. Food Res. Int. 2013, 54, 186−194.
(17) Rocha, S. M.; Coutinho, P.; Barros, A.; Delgadillo, I.; Coimbra, M. A. Rapid tool for distinction of wines based on the global volatile signature. J. Chromatogr. A 2006, 1114, 188−197.
(18) Marti, M. P.; Busto, O.; Guasch, J. Application of a headspace mass spectrometry system to the differentiation and classification of wines according to their origin, variety and ageing. J. Chromatogr. A 2004, 1057, 211−217.
(19) Boscaini, E.; Mikoviny, T.; Wisthaler, A.; Hartungen, E. V.; Märk, T. D. Characterization of wine with PTR−MS. Int. J. Mass Spectrom. 2004, 239, 215−219.
(20) Bakker, J.; Clarke, R. J. Wine Flavour Chemistry, 2nd ed.; Wiley-Blackwell: Chichester, U.K., 2012.
(21) Kjeldahl, K.; Bro, R. Some common misunderstandings in chemometrics. J. Chemom. 2010, 24, 558−564.

0.95 (Table 2). However, to examine the differences between the classes in detail, we assessed the performance of each class by the shape of its curves. For a class with high prediction performance, the ROC curve first rises vertically from the point (0, 0) and then abruptly changes direction and continues horizontally toward the point (1, 1). In cases of lower predictive ability, the curve is flattened toward the diagonal line, which represents purely random prediction. To compare the class performance, the ROC curves of the 10 PLS-DA test data predictions were overlaid per class (panels A−D of Figure 3). Of all classes, the Riesling class (Figure 3A) shows the best classification performance because all curves are close to the top left corner. In contrast, some of the curves of the Müller-Thurgau class (Figure 3C) are flattened toward the diagonal line and, therefore, show the lowest overall classification performance. This again could be due to the low number of test samples. It can be seen that the height of a step in a curve is related to the number of samples of the class. Furthermore, the Pinot and Silvaner classes show more curves that are lowered in the horizontal part (panels B and D of Figure 3), indicating high FP rates. These results of the ROC curve analysis give a comprehensive impression of a class. For example, with regard to the requirement of a low FP rate, the Riesling class turns out to be the least prone to false positives, because its test samples showed a maximum of about 30% FP. In contrast, the Pinot class proves to be the most prone to false positives, because up to about 90% FP was observed. Additionally, this technique could be of interest for future investigations, for example, for adjusting a model to the requirement of a low FP outcome.
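How a ROC curve and its AUC arise from varying the decision threshold can be illustrated with a minimal Python sketch. It assumes one predicted score per test sample and a binary label (1 for members of the class of interest, 0 for all other samples); the function names `roc_points` and `auc` and all numbers are hypothetical and do not reproduce the calculations of the study.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) points obtained by lowering the threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Rank the samples by decreasing predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with identical scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical test set: 3 samples of the class of interest, 5 other samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

points = roc_points(scores, labels)
print(points)
print(round(auc(points), 3))
```

In this sketch, each sample of the class of interest raises the curve by 1/3, whereas each other sample moves it to the right by 1/5, which mirrors the observation that the step height depends on the number of samples in the class.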



ASSOCIATED CONTENT

Supporting Information

Additional information on the established pre-processing and data treatment method, with the MetAlign process including the parameter choice (Table S1) and deviations from the MetAlign processing, formulas of the applied pretreatment (Table S2), impact of the proposed data treatment on the discrimination of white wines (Figure S1), and advantage of the applied variable selection strategy (Table S3 and Figure S2). This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*Telephone: +49-30184123393. E-mail: carsten.fauhl-hassek@bfr.bund.de.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

The authors thank G. Scholten and H. Rudy from DLR Mosel, Germany, for providing the reference chemical 1,1,6-trimethyl-1,2-dihydronaphthalene (TDN).



REFERENCES

(1) Pico, Y. Chemical Analysis of Food: Techniques and Applications; Academic Press: Waltham, MA, 2012.
(2) Chabreyrie, D.; Chauvet, S.; Guyon, F.; Salagoïty, M.-H.; Antinelli, J. F.; Medina, B. Characterization and quantification of grape variety by means of shikimic acid concentration and protein fingerprint in still white wines. J. Agric. Food Chem. 2008, 56, 6785−6790.

(22) Brereton, R. G. Chemometrics for Pattern Recognition; Wiley: Chichester, U.K., 2009.
(23) Brereton, R. G. Consequences of sample size, variable selection, and model validation and optimization, for predicting classification ability from analytical data. Trends Anal. Chem. 2006, 25, 1103−1111.
(24) Broadhurst, D. I.; Kell, D. B. Statistical strategies for avoiding false discoveries in metabolomics and related experiments. Metabolomics 2006, 2, 171−196.
(25) Oliveri, P.; Downey, G. Multivariate class modeling for the verification of food-authenticity claims. Trends Anal. Chem. 2012, 35, 74−86.
(26) Lommen, A. Data (pre-)processing of nominal and accurate mass LC−MS or GC−MS data using metalign. In Plant Metabolomics. Methods and Protocol; Hardy, N. W., Hall, R. D., Eds.; Humana Press (Springer): New York, 2012; pp 229−253.
(27) De Vos, R. C. H.; Moco, S.; Lommen, A.; Keurentjes, J. J. B.; Bino, R. J.; Hall, R. D. Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. Nat. Protoc. 2007, 2, 778−791.
(28) van den Berg, R. A.; Hoefsloot, H. C. J.; Westerhuis, J. A.; Smilde, A. K.; van der Werf, M. J. Centering, scaling, and transformations: Improving the biological information content of metabolomics data. BMC Genomics 2006, 7, No. 142.
(29) Eriksson, L.; Johansson, E.; Kettaneh-Wold, N.; Trygg, J.; Wikström, C.; Wold, S. Multi- and Megavariate Data Analysis. Part I Basic Principles and Applications; Umetrics AB: Umeå, Sweden, 2006.
(30) Eriksson, L.; Johansson, E.; Kettaneh-Wold, N.; Trygg, J.; Wikström, C.; Wold, S. Multi- and Megavariate Data Analysis. Part II Advanced Applications and Method Extensions; Umetrics AB: Umeå, Sweden, 2006.
(31) Brown, C. D.; Davis, H. T. Receiver operating characteristics curves and related decision measures: A tutorial. Chemom. Intell. Lab. Syst. 2006, 80, 24−38.
(32) Hübschmann, J. Handbook of GC/MS. Fundamentals and Applications; Wiley-VCH: Weinheim, Germany, 2001.
(33) Statistics on German Wine 2013−2014; http://www.germanwines.de (press service/statistics, accessed April 2014).
(34) Fischer, U. Wine aroma. In Flavours and Fragrances. Chemistry, Bioprocessing and Sustainability; Berger, R. G., Ed.; Springer: Berlin, Germany, 2007; pp 241−267.
(35) Ebeler, S. E.; Thorngate, J. H. Wine chemistry and flavor: Looking into the crystal glass. J. Agric. Food Chem. 2009, 57, 8098−8108.
(36) Mendes-Pinto, M. M. Carotenoid breakdown products−the norisoprenoids−in wine aroma. Arch. Biochem. Biophys. 2009, 483, 236−245.
(37) Tredoux, A.; De Villiers, A.; Májek, P.; Lynen, F.; Crouch, A.; Sandra, P. Stir bar sorptive extraction combined with GC−MS analysis and chemometric methods for the classification of South African wines according to the volatile composition. J. Agric. Food Chem. 2008, 56, 4286−4296.
(38) Godelmann, R.; Fang, F.; Humpfer, E.; Schütz, B.; Bansbach, M.; Schäfer, H.; Spraul, M. Targeted and nontargeted wine analysis by 1H NMR spectroscopy combined with multivariate statistical analysis. Differentiation of important parameters: Grape variety, geographical origin, year of vintage. J. Agric. Food Chem. 2013, 61, 5610−5619.
