Real-Time Food Authentication Using a Miniature Mass Spectrometer

Article

Real-time food authentication using a miniature mass spectrometer Stefanie Gerbig, Stephan Neese, Alexander Penner, Bernhard Spengler, and Sabine Schulz Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b01689 • Publication Date (Web): 11 Sep 2017 Downloaded from http://pubs.acs.org on September 11, 2017

Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 15

Analytical Chemistry

Real-time Food Authentication Using a Miniature Mass Spectrometer

Stefanie Gerbig#, Stephan Neese#, Alexander Penner, Bernhard Spengler, Sabine Schulz*

# S.G. and S.N. contributed equally to this work

Institute of Inorganic and Analytical Chemistry, Justus Liebig University Giessen, 35392 Giessen, Germany *Corresponding Author: Sabine Schulz [email protected] Tel: 0641-9934807 Fax: 0641-9934809

Abstract

Food adulteration is a threat to public health and the economy. To detect food adulteration efficiently, rapid and easy-to-use on-site analytical methods are needed. In this study, a miniaturized mass spectrometer in combination with three ambient ionization methods was used for food authentication. The chemical fingerprints of three milk types, five fish species and two coffee types were measured using electrospray ionization, desorption electrospray ionization and low temperature plasma ionization. Minimal sample preparation was needed for the analysis of liquid and solid food samples. Mass spectrometric data were processed using the laboratory-built software MS food classifier, which allows for the definition of specific food profiles from reference data sets using multivariate statistical methods and the subsequent classification of unknown data. Applicability of the obtained mass spectrometric fingerprints for food authentication was evaluated using different data processing methods, leave-10%-out cross-validation and real-time classification of new data. Classification accuracy of 100% was achieved for the differentiation of milk types and fish species and 96.4% for coffee types in cross-validation experiments. Measurement of two milk mixtures yielded more than 94% correct classification. For real-time classification, the accuracies were comparable. Functionality of the software program and its performance are described. Processing time for a reference data set and a newly acquired spectrum was found to be 12 seconds and 2 seconds, respectively. These proof-of-principle experiments show that the combination of a miniaturized mass spectrometer, ambient ionization and statistical analysis is suitable for on-site real-time food authentication.
Keywords

Food authentication, statistical data analysis, ambient ionization, miniaturized mass spectrometer, real-time data analysis

Introduction

The globalization of markets, which is accompanied by export and import of food from various countries, reduces the traceability of food production and transport chains. This facilitates food fraud and adulteration, which can manifest in many different ways. One is the partial or total replacement of expensive ingredients by cheaper ones without declaration. Another is the adulteration of food with prohibited substances to enhance its appearance, texture or flavor. A third way to increase profits with inferior or low-cost products is the false declaration of food origin and production processes. Besides economic losses, food fraud represents a threat to human health1, e.g. if prohibited additives are toxic or contaminated with pathogens, or if non-declared substitutes and production processes cause health problems such as allergic reactions. Products which are regularly the target of food

1 ACS Paragon Plus Environment

fraud include oil, honey, coffee, tea, dairy products, meat, fish, alcoholic beverages, fruits, vegetables (as well as juices thereof) and spices2. In the European Union the right of consumers to purchase authentic food is outlined in Regulation No. 178/2002 and enforced by legal authorities of the member states, but producers and retailers of high-quality food are also interested in protecting their products from fraud. The Grocery Manufacturers Association estimates that fraud may cost the global food industry between $10 billion and $15 billion per year3 due to reduced brand equity, recalls and public fear. From the analytical point of view, food authentication remains a challenging task. On the one hand, the constantly advancing performance of analytical methods is accompanied by more sophisticated food fraud. On the other hand, appropriate methods for the verification of food origin and production processes (e.g. organic vs. conventional) are not fully established and are still under research4,5. Analytical methods used for food authentication include spectroscopic technologies, methods based on isotopic, genetic and enzymatic analysis, chromatography, electrophoresis and thermal methods6,7. Analysis either targets specific compounds such as additives and food/species-specific markers (based on enzymes, genes or metabolites) or uses the chemical fingerprint of the food (e.g. genes, elements, proteins, metabolites) to identify food fraud or adulteration8-11. Especially for the verification of food origin and production process, chemical fingerprinting of food in combination with chemometric methods has shown promising results10,12. Due to the high number of potentially adulterated food products, cheap and fast analytical methods are needed. Ideally, authentication testing of products should be performed before marketing for effective consumer protection, which would involve on-site analysis for imported perishable products such as fruits, vegetables, meat and fish.
Currently, on-site food authentication is mainly performed using infrared (IR), near-infrared (NIR) or Raman spectroscopy with portable and handheld commercial devices13. These devices record a fingerprint of the vibrational absorption spectrum of the food sample, which is compared to a database for food authentication. In some cases, individual chemical substances can be identified via their specific absorption bands, but the number of resolved spectral features is limited. Mass spectrometry (MS) is considered an alternative method that might be suitable for on-site food identification. It offers both fingerprinting of the mass spectral signal pattern and the possibility of identifying specific compounds (adulterants of food items). In addition, it is not sensitive to water, unlike IR spectroscopy, for which water sensitivity is a disadvantage with almost all food items of plant origin. Disadvantages of MS, however, are that direct contact with the material is required, that the instrumentation is larger and that the purchase price is higher. On the other hand, the spectral resolution of MS is higher, so there could be a better chance of discovering adulteration via fingerprinting methods. MS also provides higher versatility due to exchangeable ion sources. This allows the ionization and measurement of chemically very different compound classes. Once the instrumental setup is running, spectroscopic and MS analyses perform similarly in terms of time requirement and cost per sample. Cooks and Ouyang have pioneered on-site mass spectrometric analysis of condensed-phase materials having low vapor pressures. Their research focused on the development of miniaturized mass spectrometers14-17 and appropriate ionization sources for high-throughput preparation-free sample analysis18-21. A series of portable ion trap mass spectrometers was developed with improving performance14-17. The smallest instrument, called Mini 11 (ref 14), has a weight of 4 kg.
On-site applications in the fields of medicine and homeland security have been demonstrated by the detection of explosives22, drugs14,17 and pharmaceuticals14,15,22. Recently, the ability of the latest system to distinguish between different bacterial strains based on their metabonomic fingerprint has been demonstrated23. In addition, food has been analyzed for contamination with pesticides22,24, but so far the system has not been tested for food authentication. Here, we show the food authentication capabilities of the Mini 11 as exemplified for three different food groups

(milk, coffee and fish) using three different ambient ionization methods that are well suited for the respective analytical task (electrospray ionization (ESI), desorption electrospray ionization (DESI) and low temperature plasma ionization (LTP)). Food authentication in this study is based on the metabonomic fingerprints of the different food items as they are accessible by the applied ionization methods and chemometric methods. A customized computer program that allows chemometric analysis of reference food data and on-site classification of new food data in real time is introduced.

Experimental section

Chemicals and Reagents

Water and formic acid were purchased from Fluka (Neu Ulm, Germany) in LC-MS quality. Methanol and acetonitrile were bought from Merck KGaA (Darmstadt, Germany) in Uvasol quality.

Food samples

Milk: Ultra heat-treated (UHT) organic goat milk (3.2% fat), UHT cow whole milk (3.5% fat) and soy milk (1.8% fat) were purchased from a local supermarket. Two different brands of goat and soy milk and three different brands of cow milk were purchased. Milk was diluted 1:9 with water. Mixtures of cow and goat milk (50:50) and different ratios of cow and soy milk mixtures were also diluted 1:9 with water.

Coffee: 100% Arabica coffee beans and 100% Robusta coffee beans were purchased from an online coffee shop. Both coffees were produced in the Kerala region of India. Beans were analyzed with LTP-MS without further sample preparation.

Fish: Atlantic salmon (Salmo salar), pollack (Pollachius virens), trout (Oncorhynchus mykiss), redfish (Sebastes spec.) and Atlantic cod (Gadus morhua) were purchased from local supermarkets. A small piece of fish (about 1 g) was smeared on a glass slide, solid parts were removed and the glass slide was analyzed with DESI-MS.

The authenticity of all samples was confirmed by the Landesbetrieb Hessisches Landeslabor (LHL) using validated analysis procedures.
Mass spectrometry measurement

Measurements were performed on a portable prototype Mini 11 mass spectrometer (manufactured by Purdue University, USA)14 in positive ion full scan mode in the mass-to-charge (m/z) range 70-800. The whole system has a size of 20 x 28.5 x 17.5 cm, weighs 4 kg and was operated at 24 V and 6.3 A DC power supply. It features a mass resolution of R = 100 @ m/z 104, unit mass accuracy and a cycle time of about 1.5 s per single spectrum.

A homebuilt sprayer18 with fused silica capillaries for the solvent (ID 100 µm, OD 200 µm) and nitrogen transport (ID 250 µm, OD 375 µm) was used for ESI measurements. The spray voltage was set to 4 kV, the nitrogen pressure to 6 bar and the flow rate of the spray solvent to 5 µL/min. Geometrical settings were adjusted to the following values: distance of sprayer to MS inlet capillary = 5 mm, spray angle to the inlet capillary = 70°. A varying number of milk spectra was recorded for the different experiments. After analysis of each milk type, the system was rinsed with 250 µL of pure water and the inlet capillary was wiped with cotton cloth to avoid carryover.

For DESI measurements, methanol:water in the ratio 9:1 (v/v) with 0.1% formic acid was used as solvent. Flow rate, spray voltage and nitrogen pressure were the same as for the ESI experiments. Geometrical settings were adjusted to the following values: distance from sprayer to sample = 2 mm, distance of DESI sprayer to MS inlet capillary = 3 mm, spray angle (between DESI sprayer and sample plane) = 65°. For each fish species about 50 single spectra were collected.

For LTP measurements a home-built LTP probe19 with an AC voltage of 10 kV applied to the outer electrode at a frequency of 3.4 kHz and helium gas flow of 99.999% purity at a flow rate of 0.5 L/min was used.

The probe was directed onto the coffee beans at an angle of 65° and a distance of 5 mm. The distance of the LTP probe to the inlet capillary of the MS was 1 cm. About 250 single spectra were collected for each coffee type. Mass spectral data were saved in csv text format, one spectrum per file.

Data analysis

A customized program “MS Food Classifier” was written in the Java programming language for the analysis and real-time classification of the mass spectral data. The program uses principal component analysis (PCA) to compare mass spectral fingerprints of reference food items. The transformed reference data are subsequently used to classify new mass spectral data offline or in real time. The program’s user interface (Supporting Figures S-1 to S-4) is divided into six tabs. The six tabs comprise the calculation of a profile for selected food items, the generation of a score plot of a profile, the generation of a loading plot of a profile, cross-validation of a profile, the classification of new food items using an existing profile and the classification of new food items during an MS experiment, called live classification. The software code will be made publicly available in the “Research” section of Prof. B. Spengler’s research group web site (https://www.uni-giessen.de/fbz/fb08/Inst/iaac/spengler/forschung/dateien/MSclassifier/view). Details on data handling and the classification process are given below.

Data preparation

The program reads the csv text files containing the mass spectral reference data as generated by the Mini 11, which consist of intensity values at intervals of 0.0904 m/z units throughout the entire mass range. The spectral data are binned using a chosen m/z bin size, given as an input parameter of the software program. Data are subsequently normalized by dividing each bin of the spectral data by the mean intensity of the spectrum. Then PCA is performed using either the NIPALS algorithm25 or the QR algorithm26. The data preparation algorithms are built on top of the Waikato Environment for Knowledge Analysis (WEKA) data mining library27, which has become an established system in machine learning and data mining. The NIPALS algorithm calculates a user-definable number of principal components (PCs). The QR algorithm calculates all PCs and chooses a subset of at most 60 PCs28 based on a user-definable percentage of variance to be covered by PCA. Finally, the transformed reference data for all food items are saved to a profile file, which is subsequently used for the classification of unknown food items.

Classification process
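The binning and mean-intensity normalization described under Data preparation are applied to reference spectra and to new spectra alike before classification. The following Python sketch illustrates these two steps only; it is not the Java code of the MS Food Classifier. The two-column csv layout (m/z, intensity), the summing of intensities within a bin and all function names are assumptions made for illustration.

```python
import csv
import io


def read_spectrum(csv_text):
    """Parse rows of 'm/z,intensity' (assumed layout) into (mz, intensity) tuples."""
    rows = csv.reader(io.StringIO(csv_text))
    return [(float(r[0]), float(r[1])) for r in rows if r]


def bin_spectrum(points, mz_min=70.0, mz_max=800.0, bin_width=2.0):
    """Sum intensities into fixed-width m/z bins covering [mz_min, mz_max)."""
    n_bins = int(round((mz_max - mz_min) / bin_width))
    bins = [0.0] * n_bins
    for mz, intensity in points:
        if mz_min <= mz < mz_max:
            # Guard against rounding at the upper edge of the range.
            idx = min(int((mz - mz_min) / bin_width), n_bins - 1)
            bins[idx] += intensity
    return bins


def normalize_by_mean(bins):
    """Divide every bin by the mean bin intensity of the spectrum."""
    mean = sum(bins) / len(bins)
    if mean == 0:
        return list(bins)  # empty spectrum: nothing to scale
    return [b / mean for b in bins]


# Tiny made-up spectrum, binned with the 2 u bin size used in the paper.
example = "70.0,1.0\n71.0,3.0\n104.1,8.0\n799.9,4.0\n"
profile_vector = normalize_by_mean(bin_spectrum(read_spectrum(example)))
```

After normalization the mean of all bins is 1, so spectra recorded at different overall signal levels become comparable before PCA.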

New unknown spectral data are preprocessed and transformed into the PCA space of the profile according to the parameters provided with the profile file. The classification is then performed using one of the three available classifiers: Euclidean distance, Mahalanobis distance29 and linear discriminant analysis (LDA)30. A score is calculated to indicate the confidence of the classification.

Data evaluation

The program allows the generation of PCA score and loading plots to evaluate the data, to optimize the input parameters for data pre-processing and PCA transformation and to find the m/z bins that differ the most between the data groups of the profile. In addition, the program offers the possibility to carry out leave-10%-out cross-validations on given mass spectral data. Cross-validation can be used to test the stability of the profile against variations and to evaluate which set of input parameters for profile creation maximizes the classification capability. The spectra left out for validation are classified using all three classifiers and the percentage of correct classification is reported.
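The leave-10%-out cross-validation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program's implementation: the spectra are assumed to be available already as PCA score vectors, the data are split into ten disjoint folds so that every spectrum is held out exactly once, and only a nearest-centroid Euclidean classifier is shown. The function names are hypothetical.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_centroid(x, centroids):
    """Label of the group whose centre is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def leave_10_percent_out(samples, labels, seed=0):
    """Hold out about 10% of the spectra in turn and return percent correct."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    folds = [order[i::10] for i in range(10)]  # ten disjoint folds
    correct = 0
    for held_out in folds:
        held = set(held_out)
        groups = {}
        for i in order:
            if i not in held:
                groups.setdefault(labels[i], []).append(samples[i])
        # Group centres are rebuilt from the remaining 90% for every fold.
        centroids = {label: centroid(vecs) for label, vecs in groups.items()}
        correct += sum(
            nearest_centroid(samples[i], centroids) == labels[i] for i in held_out
        )
    return 100.0 * correct / len(samples)
```

In a full implementation the binning, normalization and PCA would presumably also be recomputed from the training portion of each fold; that step is omitted here to keep the sketch short.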

Program evaluation

The classification ability of the program was evaluated using mass spectral data containing around 600 spectra of the milk samples. A Lenovo T410 Notebook with Core i5 M520 processor (2 x 2.40 GHz core speed) and 2.8 GB of RAM was used for carrying out the tests. Eight leave-10%-out cross-validations were performed using different parameters for each analysis. To test the dependence of the classification process on bin size, analysis was carried out using two different bin sizes: 0.5 u and 2.0 u. For both bin sizes the log-scaled and non-log-scaled input data were analyzed using the QR algorithm26 and the NIPALS algorithm25 for PCA. The QR algorithm was set to calculate principal components with 90% variance coverage. This equals 7 PCs for the non-log-scaled data with a bin size of 2.0 u and 60 PCs for the other cross-validation settings (see Supporting Tables S-1, S-2 and S-3 for further information). The same number of PCs was used for NIPALS. Processing time and memory usage for creating a profile and classifying spectra were tested with the same data set. The data set had a file size of 109 MB. Profiles were created with the log-scaled data using bin sizes of 0.25, 0.5 and 2.0 u, both PCA algorithms and 60 PCs. All profiles were calculated five times to obtain mean values and standard deviations.

Results and Discussion

The ability to differentiate and classify food items based on their chemical fingerprint using a portable mass spectrometer in combination with chemometric methods was tested using three different food classes and three different ambient ionization sources. Three types of milk were analyzed using electrospray ionization in a dilute-and-shoot approach representing the analysis of liquid food samples. Five fish species were analyzed using DESI, exemplifying the fingerprinting of (semi)-polar non-volatile analytes in condensed-phase food samples.
Two coffee types were measured using LTP ionization as a proof-of-concept for detecting the volatile chemical fingerprint of condensed products. Food groups were selected from the top 10 list of most frequently adulterated food, including liquid and solid samples of plant and animal origin. Analytical tasks for these food groups represent common food authentication issues. Ionization methods for the food classes were chosen based on two factors: accessible compound classes and minimal sample preparation. Some compound classes are known to be well suited for food authentication and best accessible with certain ambient ionization methods31,32. With this selection of experiments we aimed to show the performance of the instrument for a broad range of samples.

Analysis of milk species with ESI MS

Milk is often affected by food adulteration. In developing countries, water is the most common adulterant. Since water decreases the nutritional value, taste and appearance of milk, other potentially hazardous substances such as melamine may be added to mimic natural milk, potentially causing serious health problems33. Alternatively, premium milk is often diluted with cheaper milk. For example, milk from more exotic species such as yak, buffalo or camel has been diluted with cow or goat milk, goat milk has been adulterated with cow milk34 and cow milk has been mixed with soy milk.35 For consumers either intolerant or allergic to certain (undeclared) milk types this may cause serious health problems.36 In order to evaluate the possibility of differentiating goat, cow and soy milk based on their chemical fingerprint, milk samples were diluted 1:9 with water and were directly analyzed with the Mini 11 using electrospray ionization. Figure 1 shows the sum spectra (n=20) obtained from a) cow milk, b) goat milk and c) soy milk. Clear differences can be seen between the three milk types concerning number of peaks and peak intensities.
Figure 1d shows a score plot of the three pure milk types as well as two mixtures (cow/soy (1:1) and cow/goat (1:1)). The spectra of

the three pure milk samples are well differentiated in the plot, while data points representing the mixture of cow and goat milk are located in close vicinity to those two types. The data points of the cow/soy milk mixture are located close to the group of cow milk, indicating high similarity to cow milk spectra. Additional PCA score plots of the milk mixtures and the respective pure milk types can be found in Supporting Figure S-5. An extended analysis of cow/soy milk mixtures with different ratios can be found in the Supporting Material in Figure S-6. About 200 spectra of each pure milk type, 170 spectra of the cow/goat milk mixture and 50 spectra of cow/soy milk mixtures were collected to serve as a reference data set. The pure milk data set was extensively analyzed with our in-house developed program MS food classifier in order to test the quality of the data for food differentiation, to optimize food classification and to evaluate the overall program performance.

Figure 1: ESI mass spectra of a) cow milk, b) goat milk and c) soy milk. Sum of 20 spectra each. d) PCA score plot of the milk data set. Each point represents a mass spectrum of cow milk (yellow), goat milk (blue) and soy milk (red), a 1:1 mixture of cow and soy milk (black) and a mixture of cow and goat milk (green). Data was non-log-scaled. QR Algorithm was used with 0.99 variance coverage and a bin size of 2 u. e) PCA loading plot of the milk data set.

The MS food classifier is based on principal component analysis (PCA) for data reduction and variance analysis of a reference data set and offers several classifiers for the assignment of mass spectral data to predefined food groups. For data preprocessing, the bin size into which the mass-to-charge range of the data set is subdivided is user-definable, and the user can select between normalized peak intensities and log10-transformed normalized peak intensities. In addition, two PCA algorithms, NIPALS and QR, are implemented, and Euclidean distance, Mahalanobis distance or the calculated coefficient from linear discriminant analysis (LDA) can be used for the classification of mass spectral data. The correct implementation of the program’s PCA transformation was confirmed with commercial software (Unscrambler X, Version 10.1, CAMO Software). In order to evaluate the quality of the PCA-transformed reference data set for food classification and the influence of different settings for data analysis, leave-10%-out cross-validation was applied. The influence of the different settings on the classification accuracy for this data set was evaluated using eight different combinations of bin size, PCA algorithm, intensity scaling and number of principal components (Supporting Table S-1). Overall more than 85% of the spectra in this data set were classified correctly through

cross-validation with all setting combinations. This indicates that the overall data quality is good. The classifiers Mahalanobis distance and LDA coefficient resulted in 100% correct classification for all combinations. The Euclidean distance proved to be less suited. This classifier is based on the assumption of a normal distribution of data around the group centre, which was not observed for the milk data set using the normalized peak intensities. Log10-transformation of the normalized peak intensities led to more normally distributed data groups, hence improving the performance of the Euclidean distance to 99.9% correct classification. No significant influence of the PCA algorithm and bin size on the classification accuracy was observed for this data set. Processing time and memory usage were evaluated for the PCA transformation. A comparison of both parameters for different bin sizes and PCA algorithms can be found in the Supporting Information. Figure S-7 shows processing time and memory usage for the different algorithms, while Figure S-8 provides related information about the dependence of processing time and memory usage on the two parameters bin size and type of classification method. In order to save time and memory space, a bin size of 2 u and the QR algorithm are recommended. The classifiers LDA coefficient and Mahalanobis distance worked equally well; Euclidean distance is not recommended. Since log10-transformation of the peak intensities gives more weight to low-abundance peaks and chemical noise in the PCA, usage of the non-log-transformed peak intensities is recommended. PCA of 600 reference spectra and classification of these spectra is performed in less than 30 seconds using these settings. Figure 1 d) shows the PCA score plot of the milk data set including the pure milk types and two mixtures (1:1) using a bin size of 2 u, non-log-scaled peak intensities and the QR algorithm with 0.99 variance coverage (only the first 3 PCs are displayed).
The milk types are well separated into five groups. Soy milk showed the highest variability within the group, explaining the low performance of the Euclidean distance for classification (85% correct classification, soy milk spectra were misclassified as cow milk spectra) in comparison to the other classifiers (100% correct classification). We assume that the high deviation of soy milk data points is related to the observation that soy milk did not form a stable homogeneous mixture in water. After approx. 1 minute, the liquid in the syringe showed formation of separate phases. The syringe was shaken to minimize this effect, but it was probably not entirely avoided. Figure 1 e) shows the corresponding PCA loading plot. The mass-to-charge ratios 104, 112, 116, 156, 364 and 366 contributed the most to the group separation. The signal at m/z = 104 can be assigned to choline and the signal at m/z = 366 to the sodium adduct of disaccharides. Reference mass spectra of all three milk types were recorded with a high-resolving orbital trapping mass spectrometer to verify the identity of those signals. Although the Mini 11 has relatively poor mass resolution, signals with ∆ m/z > 3 should be resolved. The cross-validation was also carried out for the milk mixtures using the LDA coefficient and yielded 94% correct classification for the cow and goat milk mixture (Figure S-5) and 97.5% correct classification for the cow and soy milk mixture (Figure S-6). All pure milk types and mixtures were included in the classification model. We also tested the repeatability and stability of the method using, on the one hand, a larger cohort of milk spectra comprising different brands of milk and, on the other hand, around 6600 spectra from all milk measurements that we have performed during a period of more than one year. These data are shown in the Supporting Material in Figures S-9 and S-10.
The cross-validation yielded 99.8% correct classification using LDA coefficient for different milk brands (Figure S-9) and 86 % correct classification for all milk measurements done so far (Figure S-10). The good group separation in the PCA plot as well as the high correct-classification rates in cross-validation experiments show the ability of the Mini 11 to generate a high-quality reference data set for milk species classification and hence authentication. The pure milk reference data set was used for the real-time classification of unknown milk in the following.
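The different behaviour of the Euclidean and Mahalanobis classifiers discussed above for a group with high internal variability can be illustrated with a small Python sketch. It is restricted to two principal components so that the inverse covariance matrix can be written in closed form, and the score values are made up for illustration; they are not measured data, and the function names are hypothetical.

```python
import math


def mean_vec(points):
    """Centre of a group of 2-D score points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def covariance_2d(points):
    """Sample covariance entries (sxx, sxy, syy) of 2-D points."""
    mx, my = mean_vec(points)
    n = len(points) - 1
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return sxx, sxy, syy


def euclidean(x, group):
    """Plain distance to the group centre; ignores the spread of the group."""
    return math.dist(x, mean_vec(group))


def mahalanobis(x, group):
    """Distance to the group centre scaled by the group's own covariance."""
    mx, my = mean_vec(group)
    sxx, sxy, syy = covariance_2d(group)
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mx, x[1] - my
    # Closed-form inverse of the 2x2 covariance matrix applied to (dx, dy).
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


def classify(x, groups, distance):
    """Assign x to the group with the smallest distance."""
    return min(groups, key=lambda name: distance(x, groups[name]))


# Synthetic scores: a compact "cow" group and a widely spread "soy" group.
groups = {
    "cow": [(-0.2, -0.1), (0.2, 0.1), (-0.2, 0.1), (0.2, -0.1)],
    "soy": [(2.0, -0.5), (10.0, 0.5), (2.0, 0.5), (10.0, -0.5)],
}
point = (2.5, 0.0)  # lies inside the spread of the "soy" group
by_euclidean = classify(point, groups, euclidean)
by_mahalanobis = classify(point, groups, mahalanobis)
```

In this synthetic case the point is closer to the centre of the compact group in absolute terms, so the Euclidean rule assigns it to "cow", whereas the Mahalanobis rule accounts for the large spread of the "soy" group and assigns it there.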

Real-time classification

The ability to classify freshly recorded data in real time, based on a previously measured reference data set, was tested using goat milk as the unknown milk sample. A new sample of goat milk was measured about one month after measuring the reference data set. The MS food classifier was executed in the “live classification mode” in parallel to the Mini 11 control software. As soon as a new mass spectrum was saved by the Mini 11 control software, this spectrum was analyzed by the MS food classifier. The result of the classification analysis was shown in a window in the lower right corner of the laptop screen (see Figure S-11 in the Supporting Information). From top to bottom, the file name of the spectrum, the identified milk species, a confidence score and the distance to the group center of the identified milk species are given in the result window. Mahalanobis distance was used as the classifier for this experiment. Video S-1 shows the live experiment. Classification results for consecutively measured spectra were saved in a csv file. Of 102 recorded single spectra of goat milk, only one was misclassified as soy milk, showing that a high rate of correct classification can also be achieved in a real-time experiment. The simple sample preparation, the operation of the Mini 11 with predefined settings and the fully automated data interpretation allow users to perform food authentication tests on the Mini 11 in real time, even without analytical expertise. This proof-of-principle experiment points out the potential of the method in food authentication. In this first demonstration of using a portable mass spectrometer for food authentication we showed that milk types can be differentiated with this system. In the future, milk mixtures as well as larger sample cohorts will be analyzed to validate these findings.
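The live classification workflow, in which each spectrum file is classified as soon as the instrument software has saved it, can be sketched in Python as a simple folder watcher. How the MS food classifier actually detects new files is not described in the text, so the polling approach, the function names and the `classify_file` callback are assumptions for illustration only.

```python
import os
import time


def new_spectrum_files(folder, seen):
    """Return csv files in folder that have not been processed yet, oldest first."""
    names = [
        n for n in os.listdir(folder)
        if n.lower().endswith(".csv") and n not in seen
    ]
    names.sort(key=lambda n: os.path.getmtime(os.path.join(folder, n)))
    return names


def watch(folder, classify_file, poll_seconds=0.5, max_polls=None):
    """Poll folder and call classify_file(path) once for every new csv file.

    With max_polls=None the loop runs until interrupted; a finite value is
    useful for testing. Returns a list of (file name, classification result).
    """
    seen = set()
    results = []
    polls = 0
    while max_polls is None or polls < max_polls:
        for name in new_spectrum_files(folder, seen):
            seen.add(name)
            path = os.path.join(folder, name)
            results.append((name, classify_file(path)))
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
    return results
```

With a cycle time of about 1.5 s per spectrum, a polling interval well below one second keeps up with the instrument. A production version would also need to make sure a file has been written completely before reading it.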
As shown for the milk example, liquid food samples can be easily analyzed with this system using electrospray ionization after a simple dilution step. The majority of food samples, however, are not liquid; the following experiments therefore focused on the analysis of solid food samples.
Analysis of fish species with DESI MS
Fish fraud can concern species, geographical origin or production method. Since 2002, European Union directives and regulations regarding fishery and aquaculture products have established that this information must be labeled on the product (Council Regulation (EC) No. 104/2000 and 2065/2001 of the European Parliament) in order to guarantee market transparency. Substitution of expensive rare species with more common and cheaper fish species is one of the most common types of food fraud37-39. While the species is easily distinguishable in most cases if the fish is intact, identification becomes quite challenging for several species after processing, e.g. to filet. Among the most frequently mislabeled white fish species (> 10 % of samples mislabeled) in Europe are grouper, common sole, redfish and hake40. Trout, Atlantic cod, pollack, seabass and saithe are subject to food fraud to a lesser extent (≤ 5 % of samples mislabeled)40. A common substitute for cod is Alaska pollack, and for salmon it is steelhead trout41. Here, we analyzed filets of five fish species with the Mini 11 using DESI. The same setup was used for ESI and DESI measurements; during DESI measurements the ESI sprayer is directed at the sample surface. The only sample preparation step was to smear a small piece of filet onto a glass slide. The glass slide was then placed underneath the ESI sprayer. After impact of the charged solvent droplets on the glass surface, non-volatile metabolites were extracted by the droplets, desorbed into the gas phase and ionized. About fifty spectra were recorded for each fish species and subsequently analyzed with the MS food classifier.
Spectra of the fish groups showed very high similarity, allowing no immediate differentiation of the fish species (Supporting Figure S-12). The MS food classifier software was able to bring out the differences between the groups of this data set. The best choice of data processing variables (bin size, intensity scaling, PCA algorithm) was evaluated analogously to the milk data set and mainly confirmed the previous optimal settings of bin size (2 u) and PCA algorithm (QR) (Supporting Table S-3). Figure 2a shows the PCA score plot for the fish data set using the QR algorithm with 0.99 variance coverage, normalized intensity scaling and a bin size of 2 u. The fish groups are not fully separated by the first three principal components and overlap to some extent with other fish species. However, for each group a group center can be recognized, indicating that a classification with good accuracy might be possible using all principal components calculated.
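
The data processing variables named above (a bin size of 2 u, normalized intensity scaling and a variance coverage of 0.99) can be sketched as follows. This is an illustration under stated assumptions and not the implementation of the MS food classifier: the m/z range, the choice of normalizing to the summed intensity, the function names and the example eigenvalues are all hypothetical, and the QR-based PCA itself is not reproduced.

```python
import math

def bin_spectrum(peaks, mz_min, mz_max, bin_size=2.0):
    """Sum the intensities of (m/z, intensity) pairs into fixed-width bins."""
    n_bins = math.ceil((mz_max - mz_min) / bin_size)
    bins = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            # Guard against a rounding edge case at the upper limit.
            idx = min(int((mz - mz_min) / bin_size), n_bins - 1)
            bins[idx] += intensity
    return bins

def normalize(bins):
    """Scale the bins so that they sum to 1 (assumed form of normalization)."""
    total = sum(bins)
    return [b / total for b in bins] if total > 0 else list(bins)

def components_for_coverage(eigenvalues, coverage=0.99):
    """Smallest number of leading components whose variance share reaches coverage."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= coverage:
            return k
    return len(ordered)

# Made-up peak list: two peaks fall into the 100-102 bin, one into 102-104.
peaks = [(100.3, 10.0), (101.9, 30.0), (103.2, 60.0)]
binned = bin_spectrum(peaks, mz_min=100.0, mz_max=106.0, bin_size=2.0)
scaled = normalize(binned)

# Made-up PCA eigenvalues: four components are needed to cover 99 % of the variance.
n_components = components_for_coverage([5.0, 3.0, 1.5, 0.45, 0.05], coverage=0.99)
```

A coarser bin size reduces the number of variables entering the PCA at the cost of merging neighboring signals, which is presumably why the bin size was evaluated as a processing variable.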

Figure 2a) PCA score plot of the MS data of redfish (blue), pollack (green), trout (yellow), cod (black) and salmon (red). The QR algorithm was used with 0.99 variance coverage and a bin size of 2 u. Since the depicted figure does not capture every angle, four additional figures from different points of view are provided in the Supporting Information (Figure S-13). b) Loading plot with assignment of data points that strongly contribute to the separation of data. The assignment was based on reference measurements using a high-resolution orbital trapping mass spectrometer.

This was evaluated using the cross-validation and classification function of the MS food classifier (Supporting Table S-3). Among the classifiers, the LDA coefficient was the best choice. In contrast to the milk data set, Mahalanobis distance performed the worst of all classifiers, probably due to the lower number of spectra per group and a comparable normal distribution of data around the group center. Cross-validation revealed that 100 % of the fish spectra were correctly identified using the LDA coefficient as classifier with the profile shown in Figure 2a. Figure 2b shows the loading plot that was generated based on the data shown in Figure 2a. Typical components of fish that contribute to the separation of the fish species were determined. Additional figures from other points of view corresponding to Figure 2a can be found in the Supporting Information (Figure S-13). Furthermore, we analyzed fresh and frozen fish samples, but this analysis, carried out using samples of salmon and cod, showed no separation in the PCA score plots (data not shown). The possibility of differentiating fish species and their common fraud substitutes on a portable mass spectrometer in combination with DESI, as exemplified here for salmon and trout as well as cod and pollack, is of high relevance for on-site real-time fish authentication, providing an alternative to infrared spectroscopy42-44. Further research is needed using more biological replicates of different origins, but these first results clearly demonstrate the suitability of portable mass spectrometry for this analytical task.
Analysis of coffee with LTP MS
In addition to the non-volatile chemical profile shown above, the volatile chemical profile of food can also be employed to identify different food species such as coffee varieties. The global coffee market is almost entirely dominated by Arabica and Robusta varieties45. Arabica coffee in general has a higher-quality aroma and a higher commercial value than Robusta coffee. Blending Arabica coffee with the cheaper Robusta coffee without declaration is therefore the major fraudulent manipulation in this sector. In this study Arabica and Robusta coffee beans were analyzed with LTP MS, since the aroma profile of each coffee type created by the roasting procedure features volatile and semi-volatile compounds that can be readily analyzed by plasma-based ionization techniques. LTP is a non-destructive ionization method and can be operated under ambient conditions and on site.46,47 These characteristics are well suited for food authentication and especially for combination with the Mini 11. In order to minimize the influence of the geographical origin on the classification of the coffee beans, two species from the Kerala region in India were chosen for the first study. The beans were subjected to analysis by placing the whole, untreated beans in front of the inlet capillary of the Mini 11, while probing the surface with the protruding plasma tip of the LTP ion source.
Around 20 coffee beans were analyzed to record 250 spectra of each coffee type, taking into account the natural variation of the samples. Supporting Figures S-14 a) and b) depict typical spectra of the respective coffee types. Thirty single spectra were summed to give the displayed signal patterns. Only slight variations around m/z 150, m/z 380 and m/z 700 can be seen in the spectra, minimizing the chance for visual differentiation of the data. Molecular markers like 16-O-methylcafestol, which occurs only in Robusta coffee and is used to exclude blending of Arabica coffee with Robusta beans in HPLC-MS experiments48, were not detected due to the limited sensitivity and mass resolution of the Mini instrument. The PCA score plot of the coffee reference set, which is shown in Figure 3a, was created using a bin size of 2 u and the QR algorithm with 99 % variance coverage. In total, 500 spectra were subjected to statistical analysis, forming two closely neighboring data groups. Despite the very high spectral similarity, the statistical separation of the data groups was sufficient for the classification of unknown samples. The quality of the data set was tested via cross-validation, yielding 96.4 % correct classification using the LDA coefficient. Furthermore, the reference data set was used for a live-classification experiment which was performed two months after the measurement of the reference data (Video S-2). Whole Arabica (A) and Robusta (R) beans were analyzed consecutively by low temperature plasma ionization. Recorded mass spectra were classified with our customized software in real time. As shown in Video S-2, the method presented here allows accurate classification even when changing samples within less than a minute, demonstrating its potential for high-throughput analyses.
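
How a cross-validated classification rate such as the one above is obtained can be shown with a short sketch. The cross-validation scheme and the details of the LDA coefficient are not specified in this section, so the sketch uses leave-one-out cross-validation with a simple nearest-centroid classifier as a stand-in; this is a substitution and not the classifier used in the study. The function names and the toy score vectors are hypothetical.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid(point, training):
    """Predict the label whose group centroid lies closest to the point."""
    groups = {}
    for vector, label in training:
        groups.setdefault(label, []).append(vector)
    return min(groups, key=lambda lab: squared_distance(point, centroid(groups[lab])))

def leave_one_out_accuracy(samples):
    """Classify each sample with a model built from all other samples."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        if nearest_centroid(vector, training) == label:
            correct += 1
    return correct / len(samples)

# Made-up 2-D scores for two well separated groups.
samples = [
    ([0.0, 0.1], "Arabica"), ([0.2, 0.0], "Arabica"), ([0.1, 0.3], "Arabica"),
    ([1.0, 1.1], "Robusta"), ([1.2, 0.9], "Robusta"), ([0.9, 1.0], "Robusta"),
]
accuracy = leave_one_out_accuracy(samples)

# If all 500 coffee spectra entered the cross-validation (an assumption),
# a rate of 96.4 % corresponds to this many correctly classified spectra.
implied_correct = round(0.964 * 500)
```

Under the assumption that all 500 spectra were cross-validated, the reported 96.4 % would correspond to 482 correctly classified and 18 misclassified spectra.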
Our method thus offers analytical performance comparable to that of infrared spectroscopy49 in terms of classification accuracy, throughput, sample preparation, ease of use and capability for on-site analysis. Figure 3b shows the loading plot of the presented data, indicating those mass bins that contribute to the separation of data points. Different chemical compounds were assigned using parallel exact mass analysis with a high-resolution orbital trapping mass spectrometer. Future experiments will include a broader variety of coffee species and roasting grades as well as the analysis of ground coffee beans. The aroma profile depends strongly on the roasting process of the beans and can vary according to the grade of roasting, as shown in a study combining photoionization (PI) and time-of-flight (TOF) mass spectrometry.50 The aforementioned experiments will give the opportunity to analyze mixtures of different coffee types and to test the classification quality on those data sets.

Figure 3a) PCA score plot of the non-log-scaled MS data of Robusta (blue) and Arabica (red) coffee beans. The QR algorithm was used with 0.99 variance coverage and a bin size of 2 u. Cross-validation yielded 96.4 % correct classification with the LDA coefficient. b) Loading plot corresponding to the data set in a). The assignment of the m/z bins that contribute to the separation of data points is based on reference measurements using a high-resolution orbital trapping mass spectrometer.

Conclusions
The aim of this study was to show that the combination of ambient ionization, a miniaturized mass spectrometer and statistical data analysis has the potential to become a powerful tool for on-site real-time food authentication. The performance of this combination was evaluated in three proof-of-principle experiments targeting frequently adulterated food groups. It was shown that the chemical fingerprints from the food samples, obtained either directly or after minimal sample preparation via ambient ionization mass spectrometry with a miniaturized MS system, were of sufficient quality to differentiate food subclasses. Classification accuracy for the differentiation of goat, cow and soy milk was found to be greater than 99 % in cross-validation and real-time experiments using statistical analysis. Mixtures of these milk types were identified correctly in > 94 % of all cases. The classification accuracies for five fish species (cod, pollack, salmon, trout and redfish) and two coffee species (Arabica and Robusta) were determined to be 100 % and 96.4 %, respectively, in cross-validation experiments. In-house developed software allowed fast and customized analysis of the data produced by the miniaturized mass spectrometer as well as the implementation of a real-time classification feature for freshly acquired data. Real-time classification, which is essential for on-site analysis, was demonstrated for the food groups of milk and coffee, showing that accurate classification and high-throughput analysis are possible with this setup. It has also been shown that different ambient ionization methods are suitable for food authentication with a miniaturized mass spectrometer. The appropriate ionization method can thus be chosen for food analysis, reducing the need for sample preparation and enhancing the food subclass authentication capabilities by targeting the most selective chemical substance class. Further research is needed to fully evaluate the performance of this combination for larger sample cohorts. The first results presented here indicate that the performance of the system in terms of classification accuracy, sampling speed and ease of use is comparable to currently suggested real-time on-site analytical methods for food authentication.
Supporting Information
Supporting Information is available: detailed view of the developed statistical data analysis tool “MS food classifier”, cross-validation data for the employed classifiers, calculation of time and memory consumption for statistical analysis, mass spectra of fish and coffee samples, and PCA score plots of extended sample sets of milk analyses.
Acknowledgements
Financial support by the State of Hesse (LOEWE Research Focus ‘Ambiprobe’) and by the Justus Liebig University (research grant for junior academic staff) is gratefully acknowledged. The Landesbetrieb Hessisches Landeslabor is gratefully acknowledged for validating the authenticity of our samples. Thermo Fisher Scientific is gratefully acknowledged for providing access to an orbital trapping mass spectrometer.
Safety considerations
MeOH was used as DESI spray solvent. While performing the analysis, a flexible exhaust was placed above the DESI sprayer. All ionization methods used require high voltage to produce ions. To ensure safe handling, the experimental area was shielded and the current was limited to 0.01 mA.
References
(1) Spink, J.; Moyer, D. C. J Food Sci 2011, 76, R157-R163.
(2) Committee on the Environment, Public Health and Food Safety, Report on the food crisis, fraud in the food chain and the control thereof (A7-0434/2013); 2013.
(3) Grocery Manufacturers Association and A.T. Kearney, Consumer product fraud: deterrence and detection 2010.
(4) Capuano, E.; Boerrigter-Eenling, R.; van der Veer, G.; van Ruth, S. M. J Sci Food Agr 2013, 93, 12-28.
(5) Riedl, J.; Esslinger, S.; Fauhl-Hassek, C. Anal Chim Acta 2015, 885, 17-32.
(6) Sun, D.-W. Modern Techniques for Food Authentication; Elsevier Academic Press Inc., Amsterdam 2008, 720.
(7) Danezis, G. P.; Tsagkaris, A. S.; Camin, F.; Brusic, V.; Georgiou, C. A. TrAC Trend Anal Chem 2016, 85, 123-132.
(8) Kelly, S.; Heaton, K.; Hoogewerff, J. Trends Food Sci Tech 2005, 16, 555-567.
(9) Ortea, I.; O'Connor, G.; Maquet, A. J Prot 2016, 147, 212-225.
(10) Cubero-Leon, E.; Penalver, R.; Maquet, A. Food Res Int 2014, 60, 95-107.
(11) Mafra, I.; Ferreira, I.; Oliveira, M. Eur Food Res Technol 2008, 227, 649-665.
(12) Ellis, D. I.; Brewster, V. L.; Dunn, W. B.; Allwood, J. W.; Golovanov, A. P.; Goodacre, R. Chem Soc Rev 2012, 41, 5706-5727.
(13) Ellis, D. I.; Muhamadali, H.; Haughey, S. A.; Elliott, C. T.; Goodacre, R. Anal Meth 2015, 7, 9401-9414.
(14) Gao, L.; Sugiarto, A.; Harper, J. D.; Cooks, R. G.; Ouyang, Z. Anal Chem 2008, 80, 7198-7205.
(15) Li, L.; Chen, T.-C.; Ren, Y.; Hendricks, P. I.; Cooks, R. G.; Ouyang, Z. Anal Chem 2014, 86, 2909-2916.
(16) Gao, L.; Song, Q.; Patterson, G. E.; Cooks, R. G.; Ouyang, Z. Anal Chem 2006, 78, 5994-6002.
(17) Hendricks, P. I.; Dalgleish, J. K.; Shelley, J. T.; Kirleis, M. A.; McNicholas, M. T.; Li, L.; Chen, T.-C.; Chen, C.-H.; Duncan, J. S.; Boudreau, F.; Noll, R. J.; Denton, J. P.; Roach, T. A.; Ouyang, Z.; Cooks, R. G. Anal Chem 2014, 86, 2900-2908.
(18) Takats, Z.; Wiseman, J. M.; Gologan, B.; Cooks, R. G. Anal Chem 2004, 76, 4050-4058.
(19) Harper, J. D.; Charipar, N. A.; Mulligan, C. C.; Zhang, X. R.; Cooks, R. G.; Ouyang, Z. Anal Chem 2008, 80, 9097-9104.
(20) Liu, J.; Wang, H.; Cooks, R. G.; Ouyang, Z. Anal Chem 2011, 83, 7608-7613.
(21) Liu, J.; Wang, H.; Manicke, N. E.; Lin, J.-M.; Cooks, R. G.; Ouyang, Z. Anal Chem 2010, 82, 2463-2471.
(22) Mulligan, C. C.; Talaty, N.; Cooks, R. G. Chem Comm 2006, 1709-1711.
(23) Pulliam, C. J.; Wei, P.; Snyder, D. T.; Wang, X.; Ouyang, Z.; Pielak, R. M.; Cooks, R. G. Analyst 2016, 141, 1633-1636.
(24) Soparawalla, S.; Tadjimukhamedov, F. K.; Wiley, J. S.; Ouyang, Z.; Cooks, R. G. Analyst 2011, 136, 4392-4396.
(25) Miyashita, Y.; Itozawa, T.; Katsumi, H.; Sasaki, S.-I. J Chemometr 1990, 4, 97-100.
(26) Sharma, A.; Paliwal, K. K.; Imoto, S.; Miyano, S. Int J Mach Learn Cybern 2013, 4, 679-683.
(27) Hall, M.; Frank, E.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I. H. SIGKDD Explor. Newsl. 2009, 11, 10-18.
(28) Balog, J.; Szaniszlo, T.; Schaefer, K. C.; Denes, J.; Lopata, A.; Godorhazy, L.; Szalay, D.; Balogh, L.; Sasi-Szabo, L.; Toth, M.; Takats, Z. Anal Chem 2010, 82, 7343-7350.
(29) De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D. L. Chemometr and Intel Lab 2000, 50, 1-18.
(30) Xanthopoulos, P.; Pardalos, P. M.; Trafalis, T. B. In Robust Data Mining; Springer New York: New York, NY, 2013, pp 27-33.
(31) Campbell, D. I.; Dalgleish, J. K.; Cotte-Rodriguez, I.; Maeno, S.; Cooks, R. G. Rapid Commun Mass Spectrom 2013, 27, 1828-1836.
(32) Huang, M.-Z.; Cheng, S.-C.; Cho, Y.-T.; Shiea, J. Anal Chim Acta 2011, 702, 1-15.
(33) Handford, C. E.; Campbell, K.; Elliott, C. T. Compr Rev Food Sci F 2016, 15, 130-142.
(34) Calvano, C. D.; De Ceglie, C.; Monopoli, A.; Zambonin, C. G. J Mass Spectrom 2012, 47, 1141-1149.
(35) Jaiswal, P.; Jha, S. N.; Borah, A.; Gautam, A.; Grewal, M. K.; Jindal, G. Food Chem 2015, 168, 41-47.
(36) Sampson, H. A. Journal Allergy Clin Immun 2003, 111, S540-S547.
(37) Helyar, S. J.; Lloyd, H. A.; de Bruyn, M.; Leake, J.; Bennett, N.; Carvalho, G. R. Plos One 2014, 9 (6), doi: 10.1371/journal.pone.0098691.
(38) Mariani, S.; Griffiths, A. M.; Velasco, A.; Kappel, K.; Jerome, M.; Perez-Martin, R. I.; Schroder, U.; Verrez-Bagnis, V.; Silva, H.; Vandamme, S. G.; Boufana, B.; Mendes, R.; Shorten, M.; Smith, C.; Hankard, E.; Hook, S. A.; Weymer, A. S.; Gunning, D.; Sotelo, C. G. Front Ecol Environ 2015, 13, 536-540.
(39) Benard-Capelle, J.; Guillonneau, V.; Nouvian, C.; Fournier, N.; Le Loet, K.; Dettai, A. Peerj 2015, doi: 10.7717/peerj.714.
(40) European Commission, Fish substitution, http://ec.europa.eu/food/safety/official_controls/food_fraud/fish_substitution/tests/index_en.htm 2015, date of access: 17.03.2017.
(41) U.S. Food and Drug Administration, Seafood Species Substitution and Economic Fraud 2014.
(42) O'Brien, N.; Hulse, C. A.; Pfeifer, F.; Siesler, H. W. J near Infrared Spec 2013, 21, 299-305.
(43) Alamprese, C.; Casiraghi, E. Lwt-Food Sci Technol 2015, 63, 720-725.
(44) Ottavian, M.; Facco, P.; Fasolato, L.; Novelli, E.; Mirisola, M.; Perini, M.; Barolo, M. J Agr Food Chem 2012, 60, 639-648.
(45) Toci, A. T.; Farah, A.; Pezza, H. R.; Pezza, L. Critical Rev Anal Chem 2016, 46, 83-92.
(46) Kamm, W.; Dionisi, F.; Fay, L. B.; Hischenhuber, C.; Schmarr, H. G.; Engel, K. H. J Am Oil Chem Soc 2002, 79, 1109-1113.
(47) Hendricks, P. I.; Dalgleish, J. K.; Shelley, J. T.; Kirleis, M. A.; McNicholas, M. T.; Li, L. F.; Chen, T. C.; Chen, C. H.; Duncan, J. S.; Boudreau, F.; Noll, R. J.; Denton, J. P.; Roach, T. A.; Ouyang, Z.; Cooks, R. G. Anal Chem 2014, 86, 2900-2908.
(48) Wiley, J. S.; Shelley, J. T.; Cooks, R. G. Anal Chem 2013, 85, 6545-6552.
(49) Barbin, D. F.; Felicio, A.; Sun, D. W.; Nixdorf, S. L.; Hirooka, E. Y. Food Res Int 2014, 61, 23-32.
(50) Czech, H.; Schepler, C.; Klingbeil, S.; Ehlert, S.; Howell, J.; Zimmermann, R. J Agr Food Chem 2016, 64, 5223-5231.
