Article pubs.acs.org/jpr

Authentication of Fish Products by Large-Scale Comparison of Tandem Mass Spectra

Tune Wulff,*,†,‡ Michael Engelbrecht Nielsen,† André M. Deelder,‡ Flemming Jessen,† and Magnus Palmblad‡

† National Food Institute, Technical University of Denmark, Mørkhøj Bygade 19, Soborg 2860, Denmark
‡ Leiden University Medical Center (LUMC), Center for Proteomics and Metabolomics, Albinusdreef 2, 2333 ZA Leiden, The Netherlands



Supporting Information

ABSTRACT: Authentication of food is a major concern worldwide to ensure that food products are correctly labeled in terms of which animals are actually processed for consumption. Normally authentication is based on species recognition by comparison of selected sequences of DNA or protein. We here present a new robust, proteome-wide tandem mass spectrometry method for species recognition and food product authentication. The method does not use or require any genome sequences or selection of tandem mass spectra but uses all acquired data. The experimental steps were performed in a simple, standardized workflow including protein extraction, digestion, and data analysis. First, a set of reference spectral libraries was generated using unprocessed muscle tissue from 22 different fish species. Query tandem mass spectrometry data sets from “unknown” fresh muscle tissue samples were then searched against the reference libraries. The number of matching spectra could unambiguously identify the origin of all fresh samples. A number of processed samples were also analyzed to further test the robustness and applicability of the method. The results clearly show that the method is also able to correctly identify heavily processed samples.

KEYWORDS: phylogeny, tandem mass spectrometry, authentication, DNA barcoding, spectral libraries, fish, cod, food, salmon



INTRODUCTION

Authentication of food products is a major concern throughout the food industry, best illustrated by several recent scandals in which low value species were sold as higher value species, stressing the necessity for continuous control. The main problem is that even in mildly processed products it is often impossible to visually identify the origin of a product, making the use of molecular methods necessary for species validation. Within seafood, authorities have introduced legislation to ensure that products are correctly labeled, requiring information on species, production method, and geographical origin.1 However, mislabeling of seafood products is internationally recognized as a major issue, and multiple studies have addressed the magnitude of the problem.2−4 The mentioned studies and national control programs have revealed massive fraud within both fish and other seafood products, and the media attention has increased the public awareness of the problem. In addition, as a result of the rapidly growing world population, there is a growing demand for fish and other seafood products, resulting in a steady increase in the tons of fish being produced yearly, from 20 million tons in the 1950s to currently over 150 million tons.5 As a consequence of these growing demands, fish and fish products are increasingly moved across borders, and new species are being introduced as alternatives to overfished species.

This emphasizes the necessity of an efficient control program to discover cases of fraud in relation to authentication, thereby increasing consumers' trust in seafood products. Because of the magnitude of the problem, different methods have been implemented to identify the origin of a sample. In recent years, DNA-based methods have increasingly been implemented for authentication because of their high discriminative power.6,7 However, DNA-based procedures are challenged when working with processed products, from which it can be difficult to extract DNA in a standardized manner.8,9 This means that alternative methods need to be available and developed to ensure that the correct species can be assigned from samples regardless of their state of processing. Protein-based methods are a well-established methodology for authentication of fish species,1 with different types of isoelectric focusing being the most commonly used approaches.10−13 The workflow often includes 2-DE-based mapping of protein patterns, frequently followed by identification of protein spots of interest.12,14 Other proteomic approaches include studies taking advantage of specific mass spectrometry

Special Issue: Agricultural and Environmental Proteomics
Received: June 30, 2013


dx.doi.org/10.1021/pr4006525 | J. Proteome Res. XXXX, XXX, XXX−XXX

Journal of Proteome Research

Article

quenched by adding 8 μL of freshly prepared 10% TFA. Finally, the samples were centrifuged at 2500g for 10 min, after which the supernatants were collected and stored at −80 °C until LC−tandem mass spectrometry.
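As a quick check on the reagent amounts given in the Digestion section, the following sketch computes the amounts of DTT and iodoacetamide and the protein-to-trypsin ratio from the stated volumes and concentrations. The function and variable names are illustrative only, and the protein amount is the approximate value (∼150 μg) quoted in the text.

```python
def nanomoles(volume_ul: float, conc_mm: float) -> float:
    """Amount in nmol for a volume in uL at a concentration in mM.

    1 uL x 1 mM = 1e-6 L x 1e-3 mol/L = 1e-9 mol = 1 nmol.
    """
    return volume_ul * conc_mm


# Values taken from the digestion protocol in the text.
dtt_nmol = nanomoles(4, 60)      # 4 uL of 60 mM DTT
iaa_nmol = nanomoles(6, 100)     # 6 uL of 100 mM iodoacetamide
trypsin_ug = 6 * 0.25            # 6 uL of trypsin at 0.25 ug/uL
protein_ug = 150                 # approximate protein amount per digest

# Protein-to-enzyme mass ratio (substrate : trypsin).
protein_to_trypsin = protein_ug / trypsin_ug

print(f"DTT: {dtt_nmol:.0f} nmol, iodoacetamide: {iaa_nmol:.0f} nmol")
print(f"trypsin: {trypsin_ug} ug, protein:trypsin = {protein_to_trypsin:.0f}:1")
```

Under these assumptions iodoacetamide is in 2.5-fold molar excess over DTT, and the trypsin load corresponds to roughly 1:100 (w/w) enzyme to protein.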

profiles15 or using specific peptides or proteins as means to correctly identify the origin of the sample.16−18 Besides their important role for authentication, molecular methods are widely used to establish phylogenetic relationships between species. Within aquatic animals, differences in protein sequences have been used for classification;17,19 however, phylogenetic trees are predominantly founded on differences in selected DNA sequences.20 This means that within both authentication and phylogenetics, the vast majority of molecular methods are based on the comparison of selected sequences of either proteins or DNA. However, two of the authors recently demonstrated that molecular phylogenetics is also possible by direct pairwise comparison of tandem mass spectra from primate sera.21 The method is completely independent of the availability of genomic information and in principle uses all acquired spectra rather than selecting a few for de novo sequencing and comparison between sequences. In this study, we set out to establish an automated and standardized workflow, using existing, well-tested software, for species identification of a wide range of commercially important fish species, comparing raw and differentially processed samples using a database of spectral libraries of individual species.

Liquid Chromatography−Tandem Mass Spectrometry

The in-solution digests of the muscle extracts were separated using splitless NanoLC-Ultra 2D plus (Eksigent, Dublin, CA) parallel ultra-high-pressure liquid chromatography (UHPLC) systems with an additional loading pump for fast sample loading and desalting. Each UHPLC system was configured with two 300 μm i.d. 5 mm PepMap C18 trap columns (Thermo Fisher Scientific) and two 15 cm 300 μm i.d. ChromXP C18 columns (Eksigent). All samples were separated by 45 min linear gradients from 4 to 33% acetonitrile in 0.05% formic acid at a constant flow rate of 4 μL/min. The UHPLC systems were coupled online to amaZon ETD speed high-capacity 3D ion traps with standard ESI sources (Bruker Daltonics, Bremen, Germany). After each MS scan, up to 10 abundant multiply charged species in the m/z 300−1300 range were automatically selected for MS/MS but excluded for 1 min after being selected twice. The UHPLC systems were controlled using HyStar 3.4 with a plug-in from Eksigent and the amaZon ion traps by trapControl 7.0, all from Bruker.
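The data-dependent selection rule described above (up to 10 abundant multiply charged precursors in the m/z 300−1300 range, excluded for 1 min after being selected twice) can be illustrated with a minimal sketch. This is not the instrument's actual control logic: the `Peak` type, the `select_precursors` function, and the 0.5 m/z bin used to decide whether two precursors are "the same" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float
    intensity: float
    charge: int


def select_precursors(peaks, time_s, state, max_precursors=10,
                      mz_range=(300.0, 1300.0), exclude_after=2,
                      exclusion_s=60.0, mz_bin=0.5):
    """Pick precursors for MS/MS from one MS scan.

    `state` maps a binned m/z key to (times selected, excluded-until time)
    and is updated in place so it can be carried from scan to scan.
    """
    # Keep only multiply charged peaks inside the allowed m/z window.
    candidates = [p for p in peaks
                  if p.charge >= 2 and mz_range[0] <= p.mz <= mz_range[1]]
    # Most abundant first.
    candidates.sort(key=lambda p: p.intensity, reverse=True)

    chosen = []
    for p in candidates:
        if len(chosen) == max_precursors:
            break
        key = round(p.mz / mz_bin)  # assumed tolerance for "same precursor"
        count, excluded_until = state.get(key, (0, 0.0))
        if time_s < excluded_until:
            continue  # still on the exclusion list
        count += 1
        if count >= exclude_after:
            # Selected twice: exclude for the next `exclusion_s` seconds.
            state[key] = (0, time_s + exclusion_s)
        else:
            state[key] = (count, 0.0)
        chosen.append(p)
    return chosen
```

Carrying the same `state` dictionary through consecutive scans reproduces the behavior described in the text: a precursor can be fragmented twice and is then skipped until the exclusion window has passed.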



MATERIAL AND METHODS

Muscle tissue samples from 22 different species were included in the experiment (taxonomic information is provided as Supporting Information). All included species were assigned by experienced zoologists, after which muscle samples were taken from the intact animal. In the laboratory, ∼20 mg of muscle tissue was taken from original samples and used for protein extraction. Proteins were extracted in 100 μL of urea buffer [8 M urea, 40 mM MgCl2, and 50 U/mL benzonase (Sigma-Aldrich, Zwijndrecht, Netherlands)] with 0.5 mm zirconium oxide beads by homogenization using an air-cooled Bullet Blender (Next Advance, Averill Park, NY). The procedure involved 3 min of homogenization followed by incubation for 12 min at 4 °C, then 1 min of homogenization and incubation for 12 min at 4 °C. Finally, samples were centrifuged at 16 000g for 30 min at 4 °C, after which the supernatants were collected. The final protein concentration was measured using a bicinchoninic acid (BCA) protein assay kit (Thermo Fisher Scientific, Etten-Leur, Netherlands), and the supernatant was stored at −80 °C. As a control and possible out-group for phylogenetic analysis, an E. coli whole-cell lysate was also included. The preparation of this sample has been previously described.22

Molecular Phylogenetics

Raw fish muscle tissue proteomes from 22 fish species were analyzed using compareMS2, exactly as previously described,21 using MGF files generated by DataAnalysis (Bruker), each containing 2000 tandem mass spectra as in the previous study. The compareMS2 output, that is, the fraction of shared tandem mass spectra between each pair of data sets, was combined into a distance matrix in the MEGA format, and a UPGMA tree was generated by MEGA 5.05 (ref 24) with default settings. Figure 1 shows a simplified overview of the workflow.

Authentication of Raw Fish Samples

The compareMS2 analysis workflow was designed for N × N comparisons, in which all LC-MS/MS data sets are compared pairwise. For identifying an individual unknown sample, it is more convenient to perform a 1 × N comparison of one query data set from this sample against a collection of previously acquired data sets in a database. This can be achieved with any well-designed spectral library software by creating spectral libraries from reference samples and searching the unknown sample against all libraries to find which reference sample matches the largest number of spectra in the unknown. We first created 24 small individual spectral libraries from the samples used for the phylogenetic analysis, including the E. coli whole-cell lysate standard and a blank (water injection). This was accomplished by searching each LC-MS/MS data set with X!Tandem25 and the k-score plugin against a random sequence database. A total of 40 543 random sequences were generated from all May 2013 UniProt Danio rerio sequences (organism: “Danio rerio” AND keyword: 181; canonical sequences and isoforms, also 40 543 sequences) using make_random (http://www.ms-utils.org/make_random.html). The actual sequences used are not important as long as they are sufficient in number to produce peptides at most nominal masses in the range of tryptic peptides and have no bias toward or against a particular species. In the X!Tandem search, a mass measurement error of 2.5 Da plus “isotope error” was allowed, and carbamidomethylation of cysteines and methionine oxidation were considered as the only fixed and variable

Digestion

In-solution digestion was essentially performed as previously described.23 First, 4 μL of 60 mM DTT was added to ∼150 μg protein in 20 μL (volume adjusted by adding 50 mM ammonium bicarbonate), followed by incubation at 56 °C for 45 min to reduce cysteines. Next, 6 μL of 100 mM iodoacetamide was added, and samples were incubated at room temperature in the dark for 1 h. Samples were then diluted 1:4 in 120 μL of 50 mM ammonium bicarbonate, after which volumes were reduced using an Amicon Ultra-0.5 3K centrifugal filter device (Merck Millipore, Billerica, MA), first by centrifuging for 30 min at 14 000g, after which samples were collected in a fresh tube by reversing the column and centrifuging at 1000g for 2 min. Each sample was digested at 37 °C overnight (14 h) by adding 6 μL (0.25 μg/μL) of trypsin (sequencing grade, Promega, Madison, WI). Digestion was


and some poor or noisy spectra. Replicate analyses of the same 24 samples (including an E. coli standard and water blank) were searched against each of the 24 libraries for an initial “sanity check”, storing the number of tandem mass spectra matching with a SpectraST dot product of 0.7 or higher. Dot products above 0.7 represent good SpectraST matches with typical false-discovery rates below 1%.
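To make the match criterion concrete, the sketch below counts how many query spectra reach a normalized dot product of 0.7 or higher against any spectrum in a library. It is a generic cosine-type score on unit-width m/z bins, not SpectraST's actual scoring (which applies its own intensity scaling, binning and candidate selection); the function names and the 1 m/z bin width are assumptions, and precursor-mass filtering is deliberately omitted.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) pairs into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def normalized_dot(a, b):
    """Cosine similarity between two binned spectra (0 = disjoint, 1 = identical)."""
    shared = set(a) & set(b)
    numerator = sum(a[k] * b[k] for k in shared)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return numerator / norm if norm else 0.0


def count_matches(query_spectra, library_spectra, threshold=0.7):
    """Number of query spectra with at least one library match >= threshold."""
    library = [bin_spectrum(s) for s in library_spectra]
    matches = 0
    for spectrum in query_spectra:
        q = bin_spectrum(spectrum)
        if any(normalized_dot(q, ref) >= threshold for ref in library):
            matches += 1
    return matches
```

Running `count_matches` for one query data set against each reference library yields one match count per library, which is the quantity tabulated in the authentication results.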

Figure 2. Phylogenetic tree built solely from tandem mass spectrometry data. With only minor differences (described in the text), the tree agrees with the consensus phylogeny of these fish species.
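The tree-building step can be sketched in a few lines. The example below converts pairwise shared-spectrum fractions into distances and clusters them with UPGMA. The study itself used compareMS2 and MEGA for this; the conversion `distance = 1 − shared fraction` and the numeric fractions are assumptions chosen purely for illustration, as are the function names.

```python
from itertools import combinations


def fractions_to_distances(shared):
    """Assumed conversion: distance = 1 - fraction of shared spectra."""
    return {frozenset(pair): 1.0 - frac for pair, frac in shared.items()}


def upgma(names, dist):
    """UPGMA clustering; `dist` maps frozenset({a, b}) -> distance.

    Returns the tree topology as nested tuples of leaf names.
    """
    # Each cluster id maps to (subtree, number of leaves in it).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {frozenset((i, j)): dist[frozenset((names[i], names[j]))]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(d, key=d.get)          # closest pair of clusters
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            d_ik = d.pop(frozenset((i, k)))
            d_jk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del d[pair]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Hypothetical shared-spectrum fractions, for illustration only.
shared = {
    ("cod", "haddock"): 0.60,
    ("cod", "salmon"): 0.20,
    ("haddock", "salmon"): 0.25,
}
tree = upgma(["cod", "haddock", "salmon"], fractions_to_distances(shared))
print(tree)
```

With these made-up fractions the two gadoids are joined first and salmon attaches to that cluster afterwards, mirroring how closely related species end up adjacent in the tree.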

Authentication of Processed Food Products

Not only raw fish samples but also fish that had been cooked, deep fried, smoked, or prepared in other ways were included in the experiment to demonstrate the broad applicability of the method. We analyzed 47 additional samples, prepared at different times, and analyzed 5 months apart on different amaZon speed instruments using different LC systems with different columns (albeit still of the same models and types and in the same laboratory). Four technical replicates of each sample analysis were performed. The results are summarized in Tables 1 and 2. Raw muscle samples from tuna, haddock, pollock, cod, rainbow trout, and salmon were further processed in the following ways:

Deep frying: Fillets were breaded in flour and egg white and deep-fried in sunflower oil at 160 °C for 4 min. Two grams of processed fish meat was sampled.

Steaming and autoclaving: Equally sized pieces of fish (50 g) were cut from the fillet and placed in a bowl with 4 mL of water. Steamed samples were steamed in a microwave oven for 1 min at 800 W, and 2 g of the steamed fish was sampled. Autoclaved samples were placed in a 20 mL blue-cap laboratory bottle (Pyrex) and autoclaved at 121 °C for 15 min, after which 2 g of sample was taken.

Proteins were extracted as previously described. Each of the 4 × 47 data sets was then searched against each of the 24 libraries, and the number of matching spectra (dot product > 0.7) was recorded. In addition, the deep-fried samples were also searched against Gallus gallus sequences (UniProt June 2013, organism: “Gallus gallus” AND reviewed:yes, canonical sequences only, 2255 sequences including ovalbumin) to

Figure 1. General overview of species identification by standard bottom-up proteomics methods. Reference muscle samples from identified fish species are processed and analyzed. All spectra are placed into a SpectraST library with random peptide identifications, and all libraries are collected in a database. The query sample is then analyzed by the same or a similar method, and the data are searched against each library in the library database. At no point are sequence data used or individual spectra selected.
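The final step of this workflow, choosing the reference library that matches the most query spectra, amounts to a simple ranking. A minimal sketch follows; the function name and the match counts are hypothetical and serve only to show the shape of the 1 × N comparison.

```python
def rank_libraries(match_counts):
    """Return (best, second_best) as (library, count) pairs.

    `match_counts` maps a reference library name to the number of query
    spectra that matched it above the dot-product threshold.
    """
    if len(match_counts) < 2:
        raise ValueError("need at least two reference libraries to rank")
    ranked = sorted(match_counts.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0], ranked[1]


# Hypothetical match counts for one query data set, for illustration only.
counts = {"cod": 4000, "pollock": 3000, "haddock": 2900, "E. coli": 12}
best, second = rank_libraries(counts)
print(f"best match: {best[0]} ({best[1]} spectra)")
print(f"second best: {second[0]} ({second[1]} spectra)")
```

Reporting the runner-up alongside the best match, as done in Table 2, gives a direct sense of how well separated the top assignment is from its closest relative in the library collection.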

modifications, respectively. The X!Tandem results were converted to pepXML26 and analyzed by PeptideProphet.27 As the sequence database was randomly generated, we did not expect to find any good matches but kept all results, including those with PeptideProphet probability zero. Spectral libraries were then generated with SpectraST using a zero-probability cutoff to accept all tandem mass spectra that pass the default SpectraST filters into the library, including all good


Table 1. Replicate Analyses of the 22 Samples Also Used to Generate the Libraries^a

query (“unknown”) species                 matching spectra  reference library
rainbow trout (Oncorhynchus mykiss)       4728 ± 742        rainbow trout (Oncorhynchus mykiss)
Atlantic salmon (Salmo salar)             3579 ± 486        Atlantic salmon (Salmo salar)
pink salmon (Oncorhynchus gorbuscha)      3539 ± 673        pink salmon (Oncorhynchus gorbuscha)
chum salmon (Oncorhynchus keta)           3971 ± 710        chum salmon (Oncorhynchus keta)
brown trout (Salmo trutta)                4199 ± 625        brown trout (Salmo trutta)
Arctic char (Salvelinus alpinus)          4284 ± 361        Arctic char (Salvelinus alpinus)
African catfish (Clarias gariepinus)      4254 ± 768        African catfish (Clarias gariepinus)
European seabass (Dicentrarchus labrax)   3163 ± 417        European seabass (Dicentrarchus labrax)
tilapia species (Oreochromis sp.)         3324 ± 262        tilapia species (Oreochromis sp.)
John Dory (Zeus faber)                    4526 ± 755        John Dory (Zeus faber)
Greenland halibut (R. hippoglossoides)    5914 ± 914        Greenland halibut (R. hippoglossoides)
Atlantic halibut (H. hippoglossus)        4038 ± 637        Atlantic halibut (H. hippoglossus)
common sole (Solea solea)                 4240 ± 643        common sole (Solea solea)
Patagonian grenadier (M. magellanicus)    5555 ± 1185       Patagonian grenadier (M. magellanicus)
Southern hake (Merluccius australis)      5132 ± 535        Southern hake (Merluccius australis)
cod (Gadus morhua)                        4796 ± 699        cod (Gadus morhua)
haddock (Melanogrammus aeglefinus)        4154 ± 802        haddock (Melanogrammus aeglefinus)
pollock (Pollachius virens)               4266 ± 682        pollock (Pollachius virens)
pollack (Pollachius pollachius)           3935 ± 659        pollack (Pollachius pollachius)
yellowfin tuna (Thunnus albacares)        4000 ± 528        yellowfin tuna (Thunnus albacares)
albacore (Thunnus alalunga)               3801 ± 620        albacore (Thunnus alalunga)
skipjack tuna (Katsuwonus pelamis)        3802 ± 745        skipjack tuna (Katsuwonus pelamis)

^a Reference libraries were generated from fresh muscle tissue from the 22 different fish species listed in the table. The highest number of matching spectra is indicated together with the corresponding species from the reference library. The numbers of matching spectra are an average of four technical replicates with standard deviation.

Table 2. Best Matching (and Second Best) Spectral Library for 25 “Unknown” Samples Found by the Automated Workflow Incorporating the SpectraST Search Engine^a

                                                     best match                           second match
query (“unknown”) species            sample state    matching spectra  reference library  matching spectra  reference library
cod (Gadus morhua)                   fresh           4259 ± 811        cod                3094 ± 821        pollock
cod (Gadus morhua)                   steamed         4859 ± 643        cod                3456 ± 439        pollock
cod (Gadus morhua)                   autoclaved      4197 ± 508        cod                3052 ± 370        pollock
cod (Gadus morhua)                   deep fried      4629 ± 805        cod                3392 ± 570        pollock
haddock (Melanogrammus aeglefinus)   fresh           3873 ± 716        haddock            2783 ± 742        pollock
haddock (Melanogrammus aeglefinus)   steamed         4653 ± 808        haddock            3323 ± 507        pollock
haddock (Melanogrammus aeglefinus)   autoclaved      3598 ± 487        haddock            2529 ± 342        pollock
haddock (Melanogrammus aeglefinus)   deep fried      4540 ± 754        haddock            3155 ± 487        pollock
pollock (Pollachius virens)          fresh           4266 ± 682        pollock            2985 ± 605        cod
pollock (Pollachius virens)          steamed         3302 ± 1008       pollock            2600 ± 793        cod
pollock (Pollachius virens)          autoclaved      3331 ± 380        pollock            2701 ± 297        cod
pollock (Pollachius virens)          deep fried      3421 ± 463        pollock            2737 ± 357        cod
yellowfin tuna (Thunnus albacares)   fresh           3039 ± 590        yellowfin tuna     3069 ± 686        albacore
yellowfin tuna (Thunnus albacares)   steamed         3508 ± 731        yellowfin tuna     3490 ± 730        albacore
tongol tuna (Thunnus tonggol)*       autoclaved      2666 ± 261        albacore           2468 ± 236        skipjack tuna
tongol tuna (Thunnus tonggol)*       canned (oil)    1713 ± 192        skipjack tuna      1391 ± 99         albacore
tongol tuna (Thunnus tonggol)*       canned (water)  2287 ± 321        skipjack tuna      2084 ± 222        albacore
rainbow trout (Oncorhynchus mykiss)  fresh           4153 ± 883        rainbow trout      3839 ± 1011       brown trout (3/4)
rainbow trout (Oncorhynchus mykiss)  fresh           2973 ± 599        rainbow trout      2519 ± 438        brown trout (3/4)
rainbow trout (Oncorhynchus mykiss)  autoclaved      4415 ± 576        rainbow trout      4017 ± 541        brown trout
rainbow trout (Oncorhynchus mykiss)  smoked          4274 ± 573        rainbow trout      3835 ± 484        chum salmon
rainbow trout (Oncorhynchus mykiss)  steamed         4543 ± 684        rainbow trout      3995 ± 529        chum salmon
Atlantic salmon (Salmo salar)        fresh           2865 ± 477        Atlantic salmon    2827 ± 602        brown trout (3/4)
Atlantic salmon (Salmo salar)        steamed         3418 ± 160        Atlantic salmon    3145 ± 237        brown trout (3/4)
Atlantic salmon (Salmo salar)        smoked          3535 ± 453        brown trout        3477 ± 452        rainbow trout (3/4)

^a Reference libraries were generated from fresh muscle tissue from 22 different fish species. The sample state indicates how the sample was processed prior to analysis and identification. The highest number of matching spectra (dot product > 0.7) for both best and second-best match is indicated together with the corresponding species from the reference library. The listed number of matching spectra is an average of four technical replicates with standard deviation. * = species not in library. The second best matching species is most often the same for all four replicates.
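The overall success rates quoted in the Results follow from simple counting over Tables 1 and 2. The short check below reproduces the totals; the species-level count (169) is taken as reported in the text rather than derived, and the variable names are illustrative.

```python
samples_table_1 = 22      # samples in Table 1
samples_table_2 = 25      # samples in Table 2
replicates = 4            # technical replicates per sample

total_analyses = (samples_table_1 + samples_table_2) * replicates

# The text reports that only the eight analyses of canned tongol tuna
# (two canned samples x four replicates) missed the correct genus.
wrong_genus = 2 * replicates
correct_genus = total_analyses - wrong_genus

correct_species = 169     # as reported in the text


def percent(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print(f"total analyses: {total_analyses}")
print(f"correct genus: {correct_genus} ({percent(correct_genus, total_analyses)}%)")
print(f"correct species: {correct_species} ({percent(correct_species, total_analyses)}%)")
```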


underlying analyses identified the correct species, 96% (180/188) the correct genus, and all analyses the correct family. Table 2 also includes the second best-matching species. This is always a close relative of the correctly identified or best matching species. Only for the eight analyses of the canned tongol tuna (Thunnus tonggol) did the method fail to identify the correct genus. However, tongol tuna was not included in the collection of spectral libraries, making precise identification impossible. The numbers of matches were also considerably smaller than for any of the other samples, which could be explained by the state of these samples, influenced by the canning process and perhaps also the greasy nature of the tuna in oil. In general, canned products are considered to be heavily processed products,6 making correct authentication more difficult in this type of sample. DNA-based studies are challenged by the fact that the DNA is heavily fragmented, reducing the success rate in canned products.6,29 In this study, the reference spectral libraries are all based on fresh samples. However, because there is no limit to the number of spectral libraries that can be included in the collection of reference libraries, spectral libraries from processed products can also readily be included, further increasing the discriminating power of the method for all types of fish food products.

Fish rarely appear in products in which the species can readily be identified by visual inspection. Most often, they occur in processed forms, making the use of molecular identification a necessity. The study therefore included raw, steamed, autoclaved, and deep-fried samples from cod, pollock, and haddock, all from the Gadidae family but from three different genera (Gadus, Pollachius, and Melanogrammus). This approach meant that the methodology could be tested on increasingly processed samples, and by using closely related species, the discrimination power of the method could simultaneously be challenged. Commercially, deep frying is highly relevant because the included species are used for fish and chips, and recent scandals in the U.K. and Ireland2 have shown that in 7% (U.K.) and 28% (Ireland) of the tested cod products, cod was replaced with pollock or haddock but was still sold as the more expensive cod.

The method presented here uses all tandem mass spectra when comparing an unknown sample to a collection of reference spectral libraries. Protein composition and post-translational modifications such as oxidation (phenotype), as well as the amino acid sequence of the proteins (genotype), contribute to the similarity score, here defined as the number of spectra matching the reference library. Processing of the food product would therefore be expected to influence the overall similarity by changing which peptides are observed. For closely related species, these induced differences may be significant when compared with the species-to-species differences, challenging the discrimination power of the method. All individual samples from cod, haddock, and pollock were correctly identified. As shown in Figure 3, the number of matching spectra is significantly higher when matching cod against the cod reference compared with cod versus haddock or cod versus pollock. Likewise, as shown in Table 2, most other samples could be correctly identified regardless of how the fish had been processed or prepared. This shows that the method is very versatile because it is useful not only for fresh samples but, importantly, also for heavily processed ones. In fact, processing has a relatively minor effect on the number of matches. Only the canning procedures (p < 5 × 10^−7), autoclaving (p = 0.0017), and steaming (p = 0.0085) resulted in a significantly different number of matches compared with the

check whether egg proteins had made it into the interior of the fried fillets or otherwise contaminated the samples.



RESULTS AND DISCUSSION

Molecular Phylogenetics

The phylogenetic analysis distinguished and successfully arranged the 22 different species in groups of closely related families based on the comparison of 2000 tandem mass spectra from small pieces of muscle tissue (Figure 2). Only one of the included species, namely, Pollachius pollachius, is misplaced in this phylogenetic tree, being grouped with flatfish instead of with the other members of the Gadidae family. Likewise, it would be expected that Zeus faber, being of the superorder Acanthopterygii, would group together with the included tuna species because they are from the same superorder. All included members of the family Salmonidae are correctly grouped together; however, Salmo salar should be next to Salmo trutta,28 but the phylogenetic distances within this family are minor. The reason for the observed discrepancies could be that the phylogenetic tree is generated from a single example of each species and from a single analysis of each sample. Any changes due to storage or differential analysis of the samples therefore influence the clustering. However, Figure 2 clearly illustrates the potential of the method, following a simple workflow with no interpretation of the data. In fact, most of the species represented in this study and most commercially important fish are still not completely sequenced, illustrating a major advantage of our approach, as it is completely independent of any genomic or protein sequence database. No peptide identifications were used in any stage of the analysis, making the method equally applicable to all organisms. Yet another advantage of the presented method is the unbiased use of all acquired data rather than a manual selection of a small subset. The 2000 tandem mass spectra automatically selected (as the 2000 DataAnalysis “compounds” of highest signal) represent the bulk of the high-quality spectra in the data sets, in contrast to other published methods that use a few manually selected spectra or peptides as molecular markers for species identification.17,18 The latter case can be challenging because the selection of a few spectra easily leads to bias and can include time-consuming manual interpretation of the data. By comparing different samples based on thousands (or tens of thousands) of tandem spectra, the method is essentially comparing the proteomes of the different samples, with sensitivity toward both amino acid sequences and protein expression levels.

Authentication

As seen in Table 1, all technical replicates of the samples used to construct the spectral library found the highest number of matching spectra (dot product >0.7) against the spectral library constructed from a different data set from the same sample. All technical replicates were run on a different ion trap and using a different LC column than the one used to generate data in the libraries, showing the robustness of the method. The order of the next few matches agreed with what could be predicted from the evolutionary relationship between these species and the previously described molecular phylogenetic analysis. In the follow-up study summarized in Table 2, all new raw and prepared samples could be successfully identified to the correct species, with four minor exceptions, clearly showing the robustness and versatility of the method. When summarizing the data in the two Tables, a total of 90% (169/188) of all


AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: +45 61796585. Fax: +45 45884774.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We wish to thank Hans Dalebout and Suzanne van der Plas-Duivesteijn for technical assistance. This work was supported by the Danish Strategic Research Council grant 3304-FVFP-08K-08 and COST Action FA1002 and the Dutch Organization for Scientific Research (NWO) Vidi grant VI-917.11.398.

Figure 3. Comparison of the number of matches between a query species (cod, N = 20) and the reference libraries from cod, two other members of the Gadidae family (haddock and pollock), and a more distantly related species (salmon). All individual cod samples and technical replicates match best to the cod reference, and significantly better than either the haddock (p = 3.1 × 10^−16) or pollock (p = 3.8 × 10^−17) reference (one-sided paired t test).

REFERENCES

(1) Carrera, M.; Canas, B.; et al. Proteomics for the assessment of quality and safety of fishery products. Food Res. Int. 2012, DOI: 10.1016/j.foodres.2012.10.027.
(2) Filonzi, L.; Chiesa, S.; Vaghi, M.; Marzano, F. N. Molecular barcoding reveals mislabelling of commercial fish products in Italy. Food Res. Int. 2010, 43, 1383−1388.
(3) Miller, D.; Jessel, A.; Mariani, S. Seafood mislabelling: comparisons of two western European case studies assist in defining influencing factors, mechanisms and motives. Fish Fish. 2012, 13, 345−358.
(4) Wong, E. H. K.; Hanner, R. H. DNA barcoding detects market substitution in North American seafood. Food Res. Int. 2008, 41, 828−837.
(5) FAO. The State of World Fisheries and Aquaculture; Food and Agriculture Organization of the United Nations: Rome, 2012.
(6) Rasmussen, R. S.; Morrissey, M. T. Application of DNA-Based Methods to Identify Fish and Seafood Substitution on the Commercial Market. Compr. Rev. Food Sci. Food Saf. 2009, 8, 118−154.
(7) Wong, L. L.; Peatman, E.; Lu, J. G.; Kucuktas, H.; He, S. P.; Zhou, C. J.; Na-nakorn, U.; Liu, Z. J. DNA Barcoding of Catfish: Species Authentication and Phylogenetic Assessment. PLoS One 2011, 6.
(8) Carrera, M.; Canas, B.; Gallardo, J. M. Fish Authentication. In Proteomics in Foods - Principles and Applications; Toldrá, F., Nollet, L. M. L., Eds.; Springer: New York, 2013; Vol. 2; pp 205−222.
(9) Sentandreu, M. A.; Fraser, P. D.; Halket, J.; Patel, R.; Bramley, P. M. A proteomic-based approach for detection of chicken in meat mixes. J. Proteome Res. 2010, 9, 3374−83.
(10) Scobbie, A. E.; Mackie, I. M. The Use of Sodium Dodecyl-Sulfate Polyacrylamide-Gel Electrophoresis in Fish Species Identification - a Procedure Suitable for Cooked and Raw Fish. J. Sci. Food Agric. 1988, 44, 343−351.
(11) Pineiro, C.; Barros-Velazquez, J.; Perez-Martin, R. I.; Gallardo, J. M. Specific enzyme detection following isoelectric focusing as a complimentary tool for the differentiation of related Gadoid fish species. Food Chem. 2000, 70, 241−245.
(12) Martinez, I.; Friis, T. J. Application of proteome analysis to seafood authentication. Proteomics 2004, 4, 347−354.
(13) Rodrigues, P. M.; Silva, T. S.; Dias, J.; Jessen, F. PROTEOMICS in aquaculture: Applications and trends. J. Proteomics 2012, 75, 4325−4345.
(14) Carrera, M.; Canas, B.; Pineiro, C.; Vazquez, J.; Gallardo, J. M. Identification of commercial hake and grenadier species by proteomic analysis of the parvalbumin fraction. Proteomics 2006, 6, 5278−5287.
(15) Mazzeo, M. F.; De Giulio, B.; Guerriero, G.; Ciarcia, G.; Malorni, A.; Russo, G. L.; Siciliano, R. A. Fish Authentication by MALDI-TOF Mass Spectrometry. J. Agric. Food Chem. 2008, 56, 11071−11076.
(16) Berrini, A.; Tepedino, V.; Borromeo, V.; Secchi, C. Identification of freshwater fish commercially labelled ″perch″ by

fresh samples (two-sided, unpaired t test). To make the study as comparable to commercial processing as possible, the preparation of the deep-fried samples included breading the fillets in flour and egg white. To test whether this contaminated the samples, the LC-MS/MS data sets were searched against chicken sequences. No chicken ovalbumin or any other chicken protein was identified in any of the deep-fried samples, indicating that the egg white stays on the outside of the fish. For all samples in the study, protein extraction and all subsequent steps were performed in a completely standardized and simple workflow using well-tested protocols and software without any need for adaptation, even for the deep-fried samples in which cooking oil may enter the sampled tissue. Here we analyzed the samples using unmodified, commercially available, and relatively inexpensive ion traps, performing data-dependent tandem mass spectrometry with collision-induced dissociation. Virtually all existing ion-trap, triple quadrupole, and quadrupole-time-of-flight instruments can carry out this type of analysis. The sample preparation techniques are standard procedures and can be replicated in any protein chemistry or mass spectrometry laboratory. Starting with the sample preparation, all steps are simple and uniform, and the same workflow worked for fresh as well as differently processed samples, enabling a uniform and automated protocol. All included software was used as previously described and with default settings, except for the generation of the spectral libraries from tandem mass spectra regardless of identification (zero probability threshold for inclusion in library). No “tweaking” or optimization was needed in any stage for the method to work, suggesting that the method is inherently robust.
The findings confirm that proteome-wide comparison by direct spectral matching without peptide or protein identification is a powerful tool to identify samples of unknown species origin, even in processed fish products.
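The principle of identification by direct spectral matching can be illustrated with a minimal sketch. It is not the software used in this study: the binning scheme, the normalized dot-product score, the threshold of 0.7, the function names, and the toy peak lists are all assumptions made for illustration. Each query spectrum is compared with every spectrum in a reference library, the number of query spectra with at least one match is counted per library, and the libraries are ranked by that count.

```python
import math
from collections import Counter


def binned(spectrum, bin_width=1.0):
    """Convert (m/z, intensity) peaks into a unit-length sparse vector.

    bin_width is an illustrative assumption, not a value from the study.
    """
    vec = Counter()
    for mz, intensity in spectrum:
        vec[int(mz / bin_width)] += math.sqrt(intensity)
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else {}


def similarity(a, b):
    """Normalized dot product between two binned spectra (0 to 1)."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def count_matches(query_spectra, library_spectra, threshold=0.7):
    """Count query spectra that match at least one library spectrum."""
    lib = [binned(s) for s in library_spectra]
    count = 0
    for q in query_spectra:
        qv = binned(q)
        if any(similarity(qv, lv) >= threshold for lv in lib):
            count += 1
    return count


def rank_libraries(query_spectra, libraries, threshold=0.7):
    """Rank reference libraries by number of matching query spectra."""
    scores = {
        name: count_matches(query_spectra, spectra, threshold)
        for name, spectra in libraries.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data: two tiny "libraries" and one "unknown" sample.
libraries = {
    "cod": [
        [(100.1, 10), (200.2, 50), (300.3, 20)],
        [(150.0, 30), (250.0, 30)],
    ],
    "salmon": [
        [(110.1, 10), (210.2, 50), (310.3, 20)],
        [(150.0, 30), (250.0, 30)],
    ],
}
query = [
    [(100.2, 12), (200.1, 45), (300.4, 18)],
    [(150.1, 28), (250.2, 33)],
]

ranking = rank_libraries(query, libraries)
print(ranking)
```

In this toy case, one spectrum is shared by both libraries and one is specific to a single library, so the best and second-best libraries differ in their match counts. No peptide or protein is identified at any step; only spectra are compared.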



ASSOCIATED CONTENT

Supporting Information

Taxonomic information from 22 different species. This material is available free of charge via the Internet at http://pubs.acs.org.

dx.doi.org/10.1021/pr4006525 | J. Proteome Res. XXXX, XXX, XXX−XXX

