
Article

Authentication of closely related fish and derived fish products using tandem mass spectrometry and spectral library matching
Merel Anne Nessen, Dennis Jorden van der Zwaan, Sander Grevers, Hans Dalebout, Martijn Staats, Esther Kok, and Magnus Palmblad
J. Agric. Food Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jafc.5b05322 • Publication Date (Web): 18 Apr 2016


Journal of Agricultural and Food Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Agricultural and Food Chemistry

Authentication of closely related fish and derived fish products using tandem mass spectrometry and spectral library matching

Merel A. Nessen†, Dennis J. van der Zwaan†, Sander Grevers‡, Hans Dalebout‡, Martijn Staats†, Esther Kok*†, Magnus Palmblad*‡

† RIKILT Wageningen UR, P.O. Box 230, 6700 AE Wageningen, The Netherlands

‡ Center for Proteomics and Metabolomics, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands

* Corresponding Authors: Tel. +31 317 480252, E-mail: [email protected] and Tel.: +31 71 5269582, E-mail: [email protected]

The authors declare no competing financial interest.

ACS Paragon Plus Environment

Abstract

Proteomics methodology has seen increased application in food authentication, including tandem mass spectrometry of targeted species-specific peptides in raw, processed or mixed food products. We have previously described an alternative principle that uses untargeted data acquisition and spectral library matching, essentially spectral counting, to compare and identify samples without the need for genomic sequence information in food species populations. Here we present an interlaboratory comparison demonstrating how a method based on this principle performs in a realistic context. We also increasingly challenge the method by using data from different types of mass spectrometers, by trying to distinguish closely related and commercially important flatfish, and by analyzing heavily contaminated samples. The method was found to be robust in different laboratories and 94-97% of the analyzed samples were correctly identified, including all processed and contaminated samples.

Key Words

Food authentication
Species identification
Mass spectrometry
Proteomics
Spectral libraries

Introduction

Fraud has likely occurred since the origin of trade, as it has always been lucrative to make more profit out of inferior products. Food products such as wine, beer and bread were commonly subjected to adulteration in the past.1 Nowadays, globalization and industrialization allow distribution of (in principle) good quality food to a large part of the world’s population. At the same time it has become a daunting task to trace the origin of products and the composition of processed and “mixed” products. This was illustrated by recent incidents in Europe, where beef in processed foods such as lasagna and meatballs was adulterated or substituted by horse meat,2 and by studies in the United Kingdom and Ireland revealing that cheaper substitutes were used in battered fish marketed as the traditional “fish and chips” dish.3,4

Several studies on the authenticity of meat and fish products have revealed structural inadequacies in some food supply chains in this respect, showing between 17% and 68% of the analyzed samples to contain undeclared species.5-8 However, two surveys focusing on Western European countries (United Kingdom and France) show a lower rate of false labelling (5.5% and 3.7%, respectively).9,10 This discrepancy could be explained by a significant difference in the occurrence of food fraud between countries, as suggested by Bénard-Capelle et al.10 Another reason may be that the increased enforcement of correct labelling after the recent food scandals in Western Europe has been successful in discouraging similar fraud, as recently described in a European study on (mis)labelling in the seafood supply chain.11,12

For fishery and aquaculture products in the European Union, specific labelling instructions are given in Regulation (EU) No 1169/2011 on the provision of food information to consumers, which has applied since 13 December 2014. This regulation stipulates that the name of the food shall be its legal name, i.e. the scientific name of the species. Identification of fresh fish traditionally takes place by visual inspection, examining the anatomy and morphology of the fish. For closely related fish species this can be a challenge. Especially after removal of the skin, leaving the bare fillets, identification of the different (flat)fish species is troublesome. For further processed fish products, authentication becomes a very difficult if not impossible task.

Molecular and chemical methods have been developed to assist in the identification of fish species, such as immunochemistry and DNA-based methods. In particular the latter are routinely used for species identification in food safety and quality assessments, mainly utilizing real-time PCR and DNA barcoding methods.13 DNA barcoding is a method for assigning taxonomy to species using standardized short DNA sequences.14 A more recent and emerging application is DNA metabarcoding, which involves Next-Generation Sequencing (NGS) of DNA barcodes for the simultaneous detection of multiple species in complex samples.15 Advantages of DNA-based methods are the robustness, low detection limits and high specificity of the analysis. The drawback of DNA-based methods is that species can only be identified by specific, known target sequences. Furthermore, due to the low detection limits, even small traces that are introduced unintentionally, either in the food production process or in the analysis process, will be identified.

In recent years, proteomics has been increasingly applied to assess the quality and safety of fish and fish-derived products (for reviews see refs 16 and 17). Proteomics is the large-scale study of the expression, structure and function of proteins in specific cells, tissues or organisms. Mass spectrometry is the most commonly used technology in proteomics, especially in so-called ‘shot-gun’ or ‘bottom-up’ proteomics, and allows identification and quantification of thousands of proteins. In bottom-up proteomics, proteins are typically enzymatically cleaved into peptides and the obtained digest is analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Proteins can be identified by matching the peptide fragmentation spectra against a protein sequence database, comparing the experimental peptide spectra with the predicted peptide spectra generated from the sequence database.

Protein identification in proteomics is generally restricted by the availability of (genetic) sequence information. This is not a problem for humans or common model organisms whose genomes are completely sequenced and well annotated. However, most data analysis methods in proteomics are inapplicable when the species is unknown and no sequence information is available for that species.

For authentication, therefore, current methods mainly focus on the detection and quantification of species-specific peptides for which sequence information is available.18-24 However, the (genetic) sequence information for many fish species is often limited to the mitochondrial cytochrome c oxidase I (COI) and cytochrome b genes, two regions that are often used for DNA barcoding.13 Selection of species-specific protein or peptide biomarkers can therefore be a labor-intensive task, requiring well-defined samples of the species of interest, as well as of all closely related species. An example is the selection and de novo sequencing by mass spectrometry of a species-specific, thermostable and allergenic protein, parvalbumin, allowing discrimination between 11 different species from the unsequenced Merlucciidae family.25-27

To overcome the limited availability of genome sequences of many animal species, authentication by mass spectrometry and spectral library matching may be a robust and reliable alternative to current methods for the identification of species. For microorganisms in clinical settings, Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF-MS) has become a common tool to rapidly identify species and define proper treatment.28 The MALDI-TOF-MS analysis results in a single spectrum (or technically the sum of multiple spectra from the same sample), consisting of m/z values of peptides, proteins (e.g. ribosomal proteins) and other cell components of the investigated microorganism. This mass spectral fingerprint can be matched against spectra from known isolates in a reference database, allowing identification of the species. This approach was recently applied to identify scallop species29 and to determine the origin of meat and gelatin.30 MALDI-TOF allows fast analysis of untreated (intact) samples, though the information in the obtained profiles is limited and might be insufficient for more complex (processed) samples and closely related species. Furthermore, for the identification of microorganisms, standard culture conditions need to be used for the MALDI-TOF spectra to match the reference.

Alternatively, proteome-wide tandem mass spectrometry (MS/MS) can be used.31-33 These approaches typically use tandem mass spectrometry data from protein digests and, like the MALDI-TOF method, can use the spectra directly and therefore do not require any genome sequence information. Figure 1 shows an overview of the workflow for the identification of animal species. First, one spectral library is created for each species from the tandem mass data (ignoring the chromatographic dimension). For authentication, an unknown sample is analyzed using the same or a similar workflow, and the tandem mass data are used to query all spectral libraries to find the best matching reference. Identification takes place by counting the number of shared tandem mass spectra between the investigated sample and the reference spectral libraries. In a previous study using this approach, we could correctly identify 22 analyzed fish samples.32 In addition, 47 further fish samples, both fresh and processed (steamed, smoked, fried, autoclaved or canned), were analyzed. All samples except one, a smoked salmon, were correctly identified simply by counting shared spectra. Three samples of a tuna species that was not included in the reference libraries matched best to one of the two tuna species that were included. This demonstrated that the method allows identification at the genus or family level, using the closest relative in the reference database, when the correct species is not present in the database.

In food safety and quality control, robust application of analysis methods is a prerequisite. Sample preparation and analysis at different laboratories, utilizing different preparation techniques and different types of tandem mass spectrometers, should ideally give identical results. Furthermore, processed samples mixed with additional ingredients are common, and a method should allow identification of the correct species in these kinds of samples as well. We therefore challenged the previously developed method and applied it, without modification or fine tuning, to the identification and discrimination of closely related flatfish species in two different laboratories, using two types of common tandem mass spectrometers. In addition, we analyzed samples of a battered and fried fish product obtained from a local fish store to investigate the applicability to heavily contaminated samples.

Materials and methods

Samples

Five different flatfish were prepared in this study. European plaice (Pleuronectes platessa), rock sole (Lepidopsetta bilineata), turbot (Scophthalmus maximus) and common dab (Limanda limanda) were collected by IMARES-WUR in Den Helder. Yellowfin sole (Limanda aspera) was obtained from a local fish shop and originated from the Pacific Ocean. The identity of the five selected flatfish was determined by morphological experts from IMARES Wageningen UR and by DNA barcoding at RIKILT Wageningen UR. Samples were stored at -20 °C until further preparation and analysis.

For the ion trap libraries, samples of one fish from each of the five selected flatfish species were prepared in triplicate. For the Q-Exactive libraries, samples of the same fish were prepared in triplicate for European plaice, common dab, yellowfin sole and rock sole. For the “unknown” samples, samples of the same European plaice, common dab and yellowfin sole were prepared in triplicate at two laboratories, on two different days at each laboratory, giving a total of four sets of nine samples.
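The design of the “unknown” sample set can be checked by enumerating it. The following is a minimal Python sketch; the laboratory and day labels are placeholders chosen for illustration, not identifiers used in the study.

```python
from itertools import product

species = ["European plaice", "common dab", "yellowfin sole"]
labs = ["lab 1", "lab 2"]      # placeholder labels
days = ["day 1", "day 2"]      # two preparation days per laboratory
replicates = [1, 2, 3]         # triplicate preparation

# One entry per prepared "unknown" sample.
samples = list(product(labs, days, species, replicates))

# Group the samples into sets, one set per (lab, day) combination.
sets = {}
for lab, day, sp, rep in samples:
    sets.setdefault((lab, day), []).append((sp, rep))

print(len(samples))                               # total number of samples
print({key: len(val) for key, val in sets.items()})  # samples per set
```

Three species in triplicate give nine samples per set, and two laboratories with two days each give four sets, so 36 samples in total.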

A battered and fried fish, known as “kibbeling” in the Netherlands, was obtained from a local fish store. Samples containing either ~90% fish (and ~10% batter) or 10-20% fish (and 80-90% batter) were taken from the fried battered fish and prepared in triplicate freshly before analysis.

Sample preparation

Of each fish sample, ~20 mg of muscle tissue was taken and proteins were extracted in 100 µL urea buffer (8 M urea, 40 mM MgCl2 (lab 1) or MgSO4 (lab 2), 50 U/mL benzonase). Samples were homogenized either for 3 minutes using 0.5 mm zirconium oxide beads (Next Advance Inc., Averill Park, NY) in an air-cooled Bullet Blender® (Next Advance Inc., Averill Park, NY) on speed 8 (lab 1) or twice at 6500 rpm for 2 × 20 seconds using 2.3 mm zirconium silica beads (Precellys, Bertin, France) in a PRECELLYS® 24 homogenizer (Precellys, Bertin, France) (lab 2). After incubation for 12 minutes at 4 °C, the samples were centrifuged at 16,000 x g for 30 minutes at 4 °C and the supernatant was collected and transferred to LoBind Eppendorf tubes (Eppendorf, Hamburg, Germany). Final protein concentration was determined using a micro bicinchoninic acid (BCA) protein assay kit (Thermo Fisher Scientific).

Tryptic digestion

To either 150 µg of proteins in 20 µL (volume adjusted with 50 mM ammonium bicarbonate) (lab 1) or 10 µL protein extract and 10 µL 50 mM ammonium bicarbonate (20 µL final volume) (lab 2), 4 µL 60 mM dithiothreitol (10 mM final concentration) was added. Cysteines were reduced for 45 minutes at 60 °C, after which alkylation of free thiols was achieved by addition of 6 µL 100 mM iodoacetamide (20 mM final concentration) and incubation for 1 hour at room temperature. Excess chemicals were removed by diluting the sample 1:4 with 50 mM ammonium bicarbonate and centrifuging for 30 minutes at 14,000 x g in an Amicon Ultra-0.5 mL 3K centrifugal filter device (Merck Millipore, Billerica, MA). The concentrate was transferred to a LoBind Eppendorf tube (Eppendorf, Hamburg, Germany) and trypsin (sequencing grade, Promega, Madison, WI) was added for overnight digestion at 37 °C (enzyme:protein ratio ~1:100). Tryptic digestion was stopped by addition of 8 µL trifluoroacetic acid (TFA). Samples were centrifuged for 10 minutes at 2500 x g and either transferred to a 1.2 mL Ultra Recovery Clear MS-vial (Grace, Columbia, MD) for direct MS analysis or stored at -20 °C until further analysis.
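The stated final concentrations of the reducing and alkylating agents follow from simple dilution arithmetic. The following is a minimal Python sketch, assuming volumes are additive and that iodoacetamide is added to the 24 µL present after the dithiothreitol step; the function name is illustrative only.

```python
def final_concentration(stock_mM, added_uL, start_uL):
    """Concentration (mM) after adding added_uL of a stock to start_uL of sample."""
    return stock_mM * added_uL / (start_uL + added_uL)

# 4 µL of 60 mM dithiothreitol added to the 20 µL protein sample.
dtt_mM = final_concentration(stock_mM=60, added_uL=4, start_uL=20)

# 6 µL of 100 mM iodoacetamide added to the resulting 24 µL.
iaa_mM = final_concentration(stock_mM=100, added_uL=6, start_uL=24)

print(dtt_mM, iaa_mM)
```

Under these assumptions the calculation reproduces the 10 mM and 20 mM final concentrations given in the protocol.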

Mass spectrometric analysis

Ion Trap

For ion trap analyses, 2 µL of sample (~10 µg protein digest) was loaded and desalted on a C18 PepMap 300 µm x 5 mm, 300 Å precolumn (Thermo Scientific), after which peptides were separated by reversed-phase liquid chromatography using two identical MicroLC columns (3 µm, ChromXP C18CL, 120 Å, 150 x 0.3 mm) (Eksigent, Dublin, CA) coupled in parallel and connected to a splitless NanoLC-Ultra 2D plus system (also Eksigent) with a linear gradient of 45 minutes from 4 to 35% solvent B at a flow rate of 4 µL/min (solvent A: 0.05% formic acid, solvent B: 95% acetonitrile, 0.05% formic acid). While the gradient was applied to one column after sample injection, the other column was being washed and equilibrated. A 6-port column selection valve was used to direct the eluent from the column running the gradient to the mass spectrometer and divert the wash from the other column to waste. The column selection valve was connected to an amaZon speed ETD ion trap (Bruker Daltonics) configured with an Apollo II ESI source. After each MS scan, up to 10 abundant multiply charged species in the mass range of m/z 300-1,300 were selected for tandem mass spectrometry and actively excluded for one minute after having been selected twice. The LC system was controlled by HyStar 3.2 and the ion trap by trapControl 7.1.
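The precursor selection rule described for the ion trap (up to 10 multiply charged species per MS scan, excluded for one minute after two selections) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the instrument's control logic: ranking candidates by intensity, keying precursors by m/z rounded to one decimal, and the function and parameter names are all illustrative choices.

```python
def select_precursors(ms1_scans, top_n=10, mz_range=(300.0, 1300.0),
                      max_selections=2, exclusion_s=60.0):
    """Return, per MS scan, the precursor m/z values chosen for MS/MS.

    ms1_scans: list of (time_s, peaks); peaks are (mz, intensity, charge) tuples.
    Precursors are keyed by m/z rounded to one decimal (an assumed tolerance).
    """
    times_selected = {}   # precursor key -> selections since last exclusion
    excluded_until = {}   # precursor key -> time at which exclusion ends
    chosen = []
    for time_s, peaks in ms1_scans:
        # Keep multiply charged species inside the selection window.
        candidates = [p for p in peaks
                      if p[2] >= 2 and mz_range[0] <= p[0] <= mz_range[1]]
        candidates.sort(key=lambda p: p[1], reverse=True)  # most intense first
        picked = []
        for mz, _intensity, _charge in candidates:
            key = round(mz, 1)
            if excluded_until.get(key, float("-inf")) > time_s:
                continue  # still on the exclusion list
            picked.append(mz)
            times_selected[key] = times_selected.get(key, 0) + 1
            if times_selected[key] >= max_selections:
                excluded_until[key] = time_s + exclusion_s
                times_selected[key] = 0
            if len(picked) == top_n:
                break
        chosen.append((time_s, picked))
    return chosen


# Hypothetical scans: the same three peaks observed at four time points.
scans = [(t, [(500.25, 1e5, 2), (650.40, 5e4, 3), (420.10, 9e4, 1)])
         for t in (0.0, 1.0, 2.0, 62.0)]
result = select_precursors(scans)
```

In this toy input the two multiply charged precursors are fragmented in the first two scans, skipped in the third while excluded, and become eligible again once the one-minute exclusion has expired. The singly charged peak is never selected.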

Q-Orbitrap

For Q-Orbitrap analyses, 5 µL of sample (~35 µg protein digest) was loaded on an UltiMate 3000 UHPLC system (Thermo Scientific Dionex) equipped with a UPLC BEH300 C18 150 mm x 1 mm column (Waters) placed in a column oven at 50 °C. Peptides were separated with a linear gradient of 45 minutes from 5 to 35% solvent B at a flow rate of 0.1 mL/min (solvent A: 5% acetonitrile, 0.05% formic acid, solvent B: 95% acetonitrile, 0.05% formic acid). The LC system was coupled to a Q-Exactive Orbitrap MS (Thermo Scientific) configured with a heated electrospray ionization II source in positive mode. MS scans were recorded in a mass range of m/z 300-2,000 at a resolution of 70,000 with an AGC target of 3e6. After each MS scan, up to the 10 most abundant multiply charged ions were selected for fragmentation. MS2 scans were recorded at a resolution of 17,500 with an AGC target of 1e5 and a maximum fill time of 50 ms, using either a fixed NCE of 25 or a stepped NCE of 20, 25 and 30 for fragmentation in the HCD cell.

Data analysis

LC-MS/MS datasets were converted to .mzXML34 format using compassXport (Bruker) for ion trap data and msconvert35 for Q-Exactive data. For conversion of the Q-Exactive .raw data, the vendor's software Xcalibur (Thermo Scientific) needs to be installed as well.

Generation of libraries

Libraries of the five (ion trap) or four (Q-Orbitrap) selected flatfish were generated as described by Wulff et al.32 and added to our database. Spectral libraries were generated using SpectraST 4.0 by first searching a randomized zebrafish protein sequence database with X!Tandem36 and including all results in the PeptideProphet analysis,37 as previously described.32 From version 5.0, SpectraST can directly build spectral libraries from unidentified peptides.

Analysis of query data

Converted LC-MS/MS datasets were analyzed using SpectraST version 4.0 (as part of Trans-Proteomic Pipeline version 4.8.0 PHILAE, Build 201411201551-6764 (mingw-i686)) under Debian Linux. Each dataset was searched against 27 fish libraries present in our database. For the analysis against Q-Orbitrap libraries, an additional four (flat)fish libraries were present. The number of spectral hits with a dot product of 0.7 or higher was returned (dot products above 0.7 represent good SpectraST matches with typical false discovery rates below 1%). The exact number of spectral hits returned might depend on the specific TPP and SpectraST versions installed and used for the analysis.
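To make the hit criterion concrete, the sketch below computes a normalized dot product between two peak lists and counts the query spectra that reach the 0.7 threshold. It is a simplified stand-in and not SpectraST's implementation: the 1 m/z bin width, the square-root intensity scaling, the absence of any precursor m/z filter, and all function names are assumptions made for illustration.

```python
import math

def binned_vector(peaks, bin_width=1.0):
    """Sum square-root-scaled intensities into fixed-width m/z bins."""
    vec = {}
    for mz, intensity in peaks:
        b = int(mz // bin_width)
        vec[b] = vec.get(b, 0.0) + math.sqrt(intensity)
    return vec

def dot_product(query, library, bin_width=1.0):
    """Normalized dot product (cosine similarity) of two peak lists, 0 to 1."""
    q = binned_vector(query, bin_width)
    lib = binned_vector(library, bin_width)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in lib.values())))
    if norm == 0.0:
        return 0.0
    return sum(v * lib.get(b, 0.0) for b, v in q.items()) / norm

def count_hits(query_spectra, library_spectra, threshold=0.7):
    """Number of query spectra whose best library match reaches the threshold."""
    return sum(1 for q in query_spectra
               if any(dot_product(q, s) >= threshold for s in library_spectra))

# Hypothetical peak lists as (m/z, intensity) pairs.
spectrum_a = [(175.1, 100.0), (300.2, 400.0), (413.3, 250.0)]
spectrum_b = [(500.3, 50.0), (610.4, 80.0)]
hits = count_hits([spectrum_a, spectrum_b], [spectrum_a])
```

Here `spectrum_a` matches itself with a dot product of 1, while `spectrum_b` shares no bins with the library spectrum and scores 0, so one hit is counted.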

Results and discussion

Spectral library matching has been shown to be valuable for species identification of both microorganisms and animal species.31-33 To assess the applicability of this approach to food safety and quality control, an interlaboratory comparison was carried out and the compatibility of data from different mass spectrometers was investigated, using closely related flatfish species and contaminated processed fish samples.

Selection of flatfish samples and libraries

For the identification of closely related flatfish, three commercially common flatfish, European plaice, common dab and yellowfin sole, were selected. European plaice and common dab are common in the seas of Northern Europe and the North-Eastern part of the Atlantic Ocean, while yellowfin sole mainly originates from the Pacific Ocean. Similar in taste, yellowfin sole is a common substitute, especially for European plaice. Morphological discrimination of the species is readily feasible when the skin is still present on the fish: European plaice has distinctive orange to red dots on its brownish skin, which are absent in the two Limanda species. However, fillets without skin are difficult or impossible to identify.

Spectral libraries of tandem mass data from 22 fish species had already been generated in a previous study, including three flatfish: Atlantic halibut and Greenland halibut (Pleuronectidae) and common sole (Soleidae). In addition to the three flatfish selected for the analysis, two further flatfish were chosen to add to the library database to further investigate the specificity of the approach. Rock sole is a species that, like European plaice, common dab and yellowfin sole, belongs to the Pleuronectidae family, while turbot is part of the more distinct Scophthalmidae family. After preparation and analysis of all samples, the overall best scoring library (i.e. most spectral hits) for each species was selected to be added to the existing species database. Together with the spectral libraries of the flatfish already present in the database, a total of eight flatfish spectral libraries are now available.

Identification of flatfish by spectral library matching

Identification of the flatfish was accomplished by analysis of the obtained tandem mass spectrometry data of each sample against the spectral libraries of all species available at our laboratory, using SpectraST.38 For each query dataset, the total number of tandem mass spectra that have a good match to the tandem mass spectrometry data in each species library (dot product > 0.7) is returned. The species whose library returns the highest number of spectral matches is considered the identity of the (unknown) fish sample.
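The decision rule can be written down in a few lines. The following Python sketch ranks libraries by hit count and also reports how close the runner-up comes to the best match; the hit counts are invented for illustration and are not data from this study, and the function name is hypothetical.

```python
def identify(hits_per_library):
    """Return (best library, runner-up library, runner-up hits / best hits)."""
    ranked = sorted(hits_per_library.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_hits), (second, second_hits) = ranked[0], ranked[1]
    return best, second, second_hits / best_hits

# Hypothetical hit counts for one query sample (not data from the study).
hits = {
    "European plaice": 1200,
    "common dab": 720,
    "turbot": 410,
    "European squid": 35,
}
best, second, ratio = identify(hits)
print(best, second, round(ratio, 2))
```

The ratio between the runner-up and the best match gives a simple indication of how clear-cut an identification is. The sketch assumes at least two libraries and a non-zero best hit count.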

In Figure 2, results typically obtained for each of the three flatfish are presented for the ion trap data. The total number of spectral matches to each of the spectral libraries in the database is given. The European plaice (Figure 2A) matches the correct library for all analyzed samples, with the second best library match at around 60% (55% - 65%) of the total spectral hits of the first library. A clear distinction can be made between this species and the other flatfish libraries in the database. For the two Limanda species, the common dab (Figure 2B) and yellowfin sole (Figure 2C), the number of spectral hits clearly shows that these two species are more closely related. The second best match for these two species was always found to be the other Limanda species, with around 90% (82% - 99%) of the total number of spectral matches of the best match. The single incorrect identification of the flatfish was a yellowfin sole sample, which was falsely identified as common dab (Supporting Information 1, Figure S1). Overall, out of the 36 samples analyzed, high quality data that could be used for the spectral library matching were obtained from 34 samples. Of these 34 samples, only one was incorrectly identified (yellowfin sole as common dab), so 97% of the samples were identified correctly.

Apart from the identity of the species, information on the number of spectral matches to other species is also obtained, with phylogenetically more closely related species having a higher number of spectral matches than less related species. Closely related species will have more shared proteins and more identical peptide sequences and therefore more tandem mass spectra in common, which is reflected in our analyses as depicted in Figure 2. The flatfish match best with the other flatfish libraries in the database (in green), after which the samples are closest to the other fish (in blue). The numbers of library matches to mammals (in red) and birds (in orange), however, show a shallow, almost flat descent in the number of spectral hits. These hits most likely come from conserved regions of (abundant) muscle proteins that are highly conserved among metazoans. The last column in the graphs represents the spectral hits against the European squid (dark blue), a cephalopod mollusk that is phylogenetically the most distantly related to the investigated flatfish. From these observations the evolutionary distance to a species can be estimated: when an unknown sample matches equally well to all species in one clade, the mammals in this case, then one can conclude that the species does not belong to that clade.

Spectral library matching for species identification is robust

The method was further investigated by comparing the preparation and analysis of the flatfish samples at different laboratories and on different days, without any optimization of the method. As a measure of the repeatability and reproducibility, the total number of spectral hits to the correct library is used. This takes into account both the quality of the sample itself and the tandem mass spectrometry data (and therewith the performance of the mass spectrometer). Here, the repeatability has been defined as the variation in the number of spectral hits of the samples prepared and analyzed on a single day (i.e. within-day variation). For the reproducibility, multiple days are taken into account. The within-lab reproducibility is determined by the variation in spectral hits between the two different days at each laboratory, while for the between-lab reproducibility the results of all analyses are taken into account. In Figure 3, the average number of spectral hits and the corresponding standard deviation are given as a measure of the repeatability and of the within-lab and between-lab reproducibility.
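The three measures can be illustrated with a short calculation. The text does not state how the percentage variation is computed; the Python sketch below assumes it is the relative standard deviation (sample standard deviation divided by the mean) and that replicates are pooled across days or laboratories. The hit counts are invented for illustration.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical hit counts to the correct library for one species:
# (lab, day) -> counts for the triplicate samples. Not data from the study.
hits = {
    ("lab 1", "day 1"): [900, 1000, 1100],
    ("lab 1", "day 2"): [950, 1000, 1050],
    ("lab 2", "day 1"): [700, 800, 900],
    ("lab 2", "day 2"): [760, 800, 840],
}

# Repeatability: variation within each single day at one laboratory.
repeatability = {key: rsd_percent(counts) for key, counts in hits.items()}

# Within-lab reproducibility: both days of one laboratory pooled.
within_lab = {}
for lab in sorted({lab_name for lab_name, _ in hits}):
    pooled = [x for (lab_name, _), counts in hits.items()
              if lab_name == lab for x in counts]
    within_lab[lab] = rsd_percent(pooled)

# Between-lab reproducibility: all analyses pooled.
between_lab = rsd_percent([x for counts in hits.values() for x in counts])
```

With these made-up counts, the first set (900, 1000, 1100) has a mean of 1000 and a sample standard deviation of 100, giving a repeatability of 10%.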

The variation in the number of spectral hits between the samples prepared on the same day was found to be good to very good, with variations between 1.0% and 27.9% (see Supporting Information 1, Table S1). Earlier experiments showed that triplicate analyses of the same sample resulted in a variation of about 20% in the number of spectral hits to the correct library,32 which is of the same order of magnitude as the variation observed in this study between different samples of the same species. The samples of the common dab showed a variation of 1.0% to 27.0%, with an average of 10.1%. The European plaice (6.3% - 20.5%, average of 15.1%) and the yellowfin sole (3.0% - 27.9%, average of 16.6%) follow.

The within-lab reproducibility was found to be comparable to the repeatability, with variations in the total number of hits between 6.5% and 25.8%. The between-lab variation was found to be slightly higher, but still good, with variations in the total number of spectral hits of 21.7% for European plaice, 13.2% for common dab and 17.7% for yellowfin sole.

In conclusion, the method was found to be reproducible and robust, with a variation in the total number of spectral hits of the same order of magnitude as that of replicate analyses of a single sample. Furthermore, the approach is easily implemented and applied at other laboratories, without the need for optimization of sample preparation or data analysis.

Data of different mass spectrometers is compatible

In addition to the reproducibility of the method, the use of different types of mass spectrometers was investigated. In previous species identification studies that used spectral library matching with SpectraST, ion trap mass spectrometers have been used.31, 32, 39 In this study, tandem mass spectrometry data was also acquired on a Q-Orbitrap mass spectrometer. The two mass spectrometers implement collision-induced dissociation differently, with the Q-Orbitrap making use of a higher-energy collision-induced dissociation (HCD) cell. In addition, the ion trap ramps the collision energy for fragmentation, while the Q-Orbitrap uses either a fixed collision energy or a three-stepped collision energy. This will influence the ions observed in the spectra and have an impact on how well the tandem mass spectra match.

To investigate the compatibility of the ion trap and Q-Orbitrap data, libraries of four flatfish (European plaice, common dab, yellowfin sole and rock sole) were also generated on the Q-Orbitrap using a three-stepped collision energy. A stepped collision energy was chosen because it generates fragmentation spectra that more closely resemble the ion trap fragmentation spectra, which increases the number of spectral hits. All obtained data (ion trap and Q-Orbitrap) was analyzed against both the database with ion trap libraries and the database with Q-Orbitrap libraries. In Figure 4C a representative example is given of peptide fragmentation spectra generated by the ion trap and Q-Orbitrap mass spectrometers. In butterfly plots, matching query tandem mass spectra (in blue on the top) and library tandem mass spectra (in red on the bottom) generated by both mass spectrometers are shown. The butterfly plot in the middle shows the difference in fragmentation pattern between the Q-Orbitrap (query data, on the top) and the ion trap (library data, on the bottom), which results in a lower resemblance and thus a lower dot product, although the two spectra are still correctly matched.
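To make the notion of a dot product between a query and a library spectrum concrete, here is a minimal sketch of a normalized dot product (cosine similarity) on binned peak lists. It is not SpectraST's actual scoring code: the 1 m/z bin width, the square-root intensity scaling, the function names and the peak lists are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum square-root-scaled intensities into fixed-width m/z bins."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        bins[int(mz // bin_width)] += math.sqrt(intensity)
    return bins


def normalized_dot_product(query, library, bin_width=1.0):
    """Cosine similarity of two peak lists: 0 (no shared bins) to 1 (identical)."""
    q = bin_spectrum(query, bin_width)
    lib = bin_spectrum(library, bin_width)
    dot = sum(value * lib.get(index, 0.0) for index, value in q.items())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in lib.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical (m/z, intensity) peak lists, not taken from Figure 4C.
query_spectrum = [(175.1, 40.0), (288.2, 100.0), (401.3, 60.0), (530.3, 20.0)]
library_spectrum = [(175.1, 10.0), (288.2, 90.0), (401.3, 80.0), (659.4, 30.0)]
print(f"dot product: {normalized_dot_product(query_spectrum, library_spectrum):.2f}")
```

In this sketch, peaks present in only one of the two spectra, or shared peaks with different relative intensities, lower the score without necessarily preventing a match, which is the behaviour described for the cross-instrument comparison.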

An overview of the results of the spectral library matching can be found in Figure 4. Both the average number of spectral hits to the first (correct) match, with standard deviation, and the percentage of correct identifications are given. The ion trap data matches with an average of around 3,500 spectral hits to the correct library in the ion trap database. This is almost three times higher than the result for the Q-Orbitrap data matching to the correct library in the Q-Orbitrap database (on average around 1,200 spectral hits), even though the libraries contain a similar total number of tandem mass spectra (10,617±1,158 for the ion trap libraries, 7,808±964 for the Q-Orbitrap libraries). The difference can be partially explained by the use of different collision energy settings for the Q-Orbitrap data of the samples compared with the libraries. For the MS analysis of the samples a fixed collision energy was used, while for the libraries a stepped collision energy was employed, resulting in about 10% fewer spectral hits (Supporting Information 1, Table S2). Regardless of the difference in total number of spectral hits, the numbers of correct identifications for the ion trap and Q-Orbitrap are comparable, 97% (33 out of 34) and 94% (32 out of 34) respectively, when the data is matched to the libraries of the same mass spectrometer.
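The identification rates quoted above follow from a simple rule. The sketch below assumes that a sample is assigned to the library with the highest number of spectral hits and that the rate is the percentage of samples assigned to their true species; the function names and hit counts are hypothetical.

```python
def best_library(hits_per_library):
    """Return the name of the library with the most spectral hits."""
    return max(hits_per_library, key=hits_per_library.get)


def identification_rate(samples):
    """Percentage of samples whose best-matching library is the true species.

    `samples` is a list of (true_species, hits_per_library) pairs.
    """
    correct = sum(1 for truth, hits in samples if best_library(hits) == truth)
    return 100.0 * correct / len(samples)


# Hypothetical hit counts for two query samples against three libraries.
samples = [
    ("European plaice", {"European plaice": 3500, "common dab": 2900, "yellowfin sole": 2800}),
    ("common dab", {"European plaice": 2700, "common dab": 3100, "yellowfin sole": 3150}),
]
print(best_library(samples[0][1]))
print(f"{identification_rate(samples):.0f}% correct")
```

With 33 of 34 samples assigned correctly, the same calculation gives the 97% reported for the ion trap data.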

To investigate the possibility of using a single database containing libraries derived from one type of instrument, query data and libraries from different sources were combined. Analysis of the ion trap query data against the Q-Orbitrap libraries shows a large decrease in the total number of spectral hits compared with the ion trap libraries: a reduction of more than five times is observed (624 vs 3,427). When fixed collision energy libraries were used in the analysis, the number of spectral hits dropped even further, to around 300 (Supporting Information 1, Table S2).

Similar results are obtained for the analysis of Q-Orbitrap query data against ion trap libraries. On average, fewer than 100 spectra could be matched against the correct library, which is more than ten times lower than for the Q-Orbitrap query data against the Q-Orbitrap libraries (~1,200 spectral hits) and six times lower than for the ion trap data against the Q-Orbitrap libraries (~600 spectral hits). Nevertheless, 88% of the samples were correctly identified, about 5 percentage points higher than for the ion trap data against the Q-Orbitrap libraries (82.5%).

Even though the absolute total number of spectral hits (to the correct library) seems to be of less importance, it does give information about the quality of the sample and the quality of the data (both query and library). In addition, if the species of the investigated sample is not part of the database, the data will match to the most closely related species. In this case, comparison with the (average) total number of spectral hits of a correct identification might reveal the incorrect identification. For very closely related species, such as the Limanda species investigated here, this will be difficult though, as the variation in average spectral hits between the two species is lower than the variation between samples of the same species. Furthermore, a deviation of the total number of spectral hits (compared to a standard) can be an indication that the investigated sample is actually a mixture, and the number of spectral matches can potentially be used to calculate the ratio in which the species are present in the mixture. We have recently shown this possibility for mixtures of cow and horse meat,40 though for closely related species this will be a challenge, as will be further discussed in the future perspectives.
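One way such a comparison to a standard could be set up is sketched below. It flags a sample whose number of hits to its best-matching library lies far from the mean of authentic reference samples. The threshold of two standard deviations, the function name and all numbers are assumptions for illustration, not part of the method as published.

```python
from statistics import mean, stdev


def deviates_from_reference(sample_hits, reference_hits, n_sd=2.0):
    """True if sample_hits lies more than n_sd standard deviations from the reference mean."""
    if len(reference_hits) < 2:
        raise ValueError("need at least two reference measurements")
    return abs(sample_hits - mean(reference_hits)) > n_sd * stdev(reference_hits)


# Hypothetical hits to the best-matching library for authentic reference samples.
reference = [3400, 3600, 3500, 3300, 3700]
print(deviates_from_reference(3450, reference))
print(deviates_from_reference(1500, reference))
```

A flagged sample would only be a prompt for closer inspection: poor sample quality, a species missing from the database, or a mixture could all cause such a deviation.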

Identification of contaminated processed fish

Like many fish, flatfish is often served à la Meunière (lightly floured and fried in butter) or battered and deep fried. We therefore further challenged the method by analyzing battered and fried fish sold as “kibbeling” at a local fish store. Traditionally, this Dutch snack is made from cod, although it is now common to use other species as well. The fried fish meat was mixed with pieces of the fried batter from the same sample to determine the degree of contamination at which the fish can still be correctly identified. At both mixing ratios (about 90% and about 10% fish), the fried fish was correctly identified as cod, as is shown in Figure 5, even when the sample mostly consists of batter. The total number of spectral hits for the samples containing only 10% fish is about half of the total number of spectral hits for the 90% fish samples (~2,000 hits vs ~3,500 hits), showing that the quality of the sample does influence the total number of spectral hits, but not proportionally to the purity. Furthermore, independent of the amount of fish in the sample, a similar trend in the number of spectral hits to successive species libraries is observed. Pollock and haddock both belong to the same family as cod (Gadidae), and are the second and third best match, respectively, with about 75% of the number of spectral hits to the cod library. Salmon, included for comparison with Figure 3 in Wulff et al.,32 is a far more distantly related species, belonging to Protacanthopterygii, a different superorder of the infraclass Teleostei than cod (a Neoteleost).41 The last common ancestor of cod and salmon lived ~240 Mya, which is clearly reflected in the results, with far fewer spectra matched to the salmon reference library. This example shows that the method also allows identification of species in processed fish samples, even when the sample mostly consists of other components. It also mimics a forensic scenario where only scraps or crumbs may be available for analysis.
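The trend across successive species libraries can be expressed by ranking the libraries and scaling each hit count to the best match. The sketch below does this for hypothetical hit counts that were chosen only to mirror the pattern described for the kibbeling samples; they are not the measured values.

```python
def relative_to_best(hits_per_library):
    """Rank libraries by hits and express each as a fraction of the best match."""
    best = max(hits_per_library.values())
    if best <= 0:
        raise ValueError("no spectral hits to any library")
    ranked = sorted(hits_per_library.items(), key=lambda item: item[1], reverse=True)
    return [(name, hits / best) for name, hits in ranked]


# Hypothetical hit counts, chosen only to mirror the pattern described above.
hits = {"cod": 2000, "pollock": 1520, "haddock": 1480, "salmon": 400}
for name, fraction in relative_to_best(hits):
    print(f"{name}: {fraction:.0%}")
```

Because the fractions are relative to the best match, the same profile can be compared between samples with very different absolute hit counts, such as the 90% and 10% fish samples.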

Future perspectives

Here we have shown that untargeted tandem mass spectrometry and simple spectral library matching can be applied reproducibly to identify closely related fish species and processed, contaminated fish products. The method can be optimized and improved to increase the specificity and improve the reliability. The fish libraries could be extended by inclusion of several individual fish, preferably caught at different locations and of different ages, in each species library. Library data files could be merged into a single, extended library per species, in an approach similar to that described by Önder et al.39 This will produce richer libraries, including information on variation within populations from different geographic origins, for further analysis. Furthermore, the database should be extended by adding more flatfish species, so that a wider range of flatfish can be confidently identified and more insight is obtained into the specificity for other closely related (flat)fish.

Another topic that deserves detailed investigation is the discrimination of species in mixtures of two or more closely related species without knowledge and targeting of species-specific peptides. This is likely even more challenging than the identification of a single fish species in a processed sample containing mainly plant material, which is easy to distinguish from fish proteins. More distantly related species such as horse and cow can be identified and relatively quantified in mixtures.40 However, in closely related species, the high fraction of shared peptides and tandem mass spectra challenges relative quantitation based solely on spectral counting (essentially measuring a small difference between two large numbers). In the current, simple data analysis, only the total number of spectra matching the reference samples is used. This simplicity emphasizes the inherent robustness and general nature of the method, but detection of closely related species in mixtures may necessitate information on species-specific peptides and tandem mass spectra.
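The remark about measuring a small difference between two large numbers can be illustrated with a toy model. This is purely an assumption for illustration, not the authors' quantitation method: in a mixture containing a fraction p of species A, hits to library A are taken to be proportional to p + (1 - p)·s and hits to library B to (1 - p) + p·s, where s is the fraction of spectra shared between the two species. Solving for p gives the estimator below; the function name and all numbers are hypothetical.

```python
def estimate_fraction_a(hits_a, hits_b, shared):
    """Estimate the fraction of species A in a two-species mixture.

    Toy model (an assumption, not the authors' method): hits_a is
    proportional to p + (1 - p) * shared and hits_b to (1 - p) + p * shared,
    where p is the fraction of A and `shared` the fraction of spectra that
    match both libraries.
    """
    if not 0.0 <= shared < 1.0:
        raise ValueError("shared must be in [0, 1)")
    contrast = (hits_a - hits_b) / (hits_a + hits_b)
    return 0.5 * (1.0 + (1.0 + shared) / (1.0 - shared) * contrast)


# With 90% shared spectra, 20 extra hits out of ~1,900 shift the estimate a lot.
print(f"{estimate_fraction_a(970, 930, shared=0.9):.2f}")
print(f"{estimate_fraction_a(990, 930, shared=0.9):.2f}")
# With 10% shared spectra (distantly related species) the same change matters little.
print(f"{estimate_fraction_a(730, 370, shared=0.1):.2f}")
print(f"{estimate_fraction_a(750, 370, shared=0.1):.2f}")
```

In this model the factor (1 + s)/(1 - s) amplifies any counting noise, so the closer the species, the less reliable an estimate based on total hit counts alone becomes.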

Currently, DNA-based methods are most often used for the confirmation of species identity. It has to be stressed that the method described and challenged here is not intended to replace DNA-based methods, but is meant to complement existing ones. In specific situations, such as for heavily processed samples or samples adulterated or contaminated by food products containing little or no DNA, mass spectrometry can be a better solution. Another promising application of spectral library matching is the identification of specific tissues, as has been demonstrated in earlier work on zebrafish.42 Whereas DNA has a high specificity and a low detection limit, it cannot easily discriminate between different tissues derived from the same species. The difference in expression (levels) of proteins in different tissues will allow our method to differentiate tissues and to be applied to products of organ meats.

Tandem mass spectrometry and spectral library matching have been successfully applied to identify closely related flatfish species. The method is simple, robust, easily transferable to different laboratories and allows the use of different types of mass spectrometers. One database, consisting of spectral libraries of different species from one type of mass spectrometer, also allows identification of species based on data derived from other mass spectrometers, albeit at a slightly lower identification rate. In addition, the method correctly identifies the species in heavily contaminated fish samples. In the field of food quality and safety, the method is of particular value in cases where DNA-based methods have problems, such as (heavily) processed samples, foods adulterated with protein products and differentiation of tissues.

Acknowledgements

The authors thank Marleen Voorhuijzen and Theo Prins at RIKILT Wageningen UR and Hilde van Pelt-Heerschap at IMARES Wageningen UR for identification of the flatfish, Klaas Wubs and Marco Blokland at RIKILT Wageningen UR for technical support and Rob Marissen at LUMC for help with implementing the reference library searches on a Web server and the MassIVE submission.

Associated Content

Supporting Information

Supporting Information Available: Supporting Information 1: Contains Figure S1 (Limanda species are sometimes difficult to discriminate), Table S1 (Repeatability and reproducibility of number of spectral hits for the identification of flatfish using tandem mass spectrometry and spectral library matching) and Table S2 (Comparison of Q-Orbitrap libraries, recorded with stepped (S) or fixed (F) collision energy). This material is available free of charge via the Internet at http://pubs.acs.org.

Data

The Q-Exactive Orbitrap and amaZon ion trap data from the flatfish used to generate and query the libraries are available on MassIVE (ftp://[email protected], with password ‘a’).

References

1. Shears, P., Food fraud – a current issue but an old problem. British Food Journal 2010, 112, 198-213.
2. http://ec.europa.eu/food/food/horsemeat/.
3. http://www.which.co.uk/news/2014/09/which-investigation-uncovers-fish-fraud-379594/.
4. http://cordis.europa.eu/news/rcn/32023_en.html.
5. Cawthorn, D.-M.; Steinman, H. A.; Hoffman, L. C., A high incidence of species substitution and mislabelling detected in meat products sold in South Africa. Food Control 2013, 32, 440-449.
6. Galal-Khallaf, A.; Ardura, A.; Mohammed-Geba, K.; Borrell, Y. J.; Garcia-Vazquez, E., DNA barcoding reveals a high level of mislabeling in Egyptian fish fillets. Food Control 2014, 46, 441-445.
7. Stamatis, C.; Sarri, C. A.; Moutou, K. A.; Argyrakoulis, N.; Galara, I.; Godosopoulos, V.; Kolovos, M.; Liakou, C.; Stasinou, V.; Mamuris, Z., What do we think we eat? Single tracing method across foodstuff of animal origin found in Greek market. Food Research International 2015, 69, 151-155.
8. Kane, D. E.; Hellberg, R. S., Identification of species in ground meat products sold on the U.S. commercial market using DNA-based methods. Food Control 2016, 59, 158-163.
9. Helyar, S. J.; Lloyd, H. A.; de Bruyn, M.; Leake, J.; Bennett, N.; Carvalho, G. R., Fish product mislabelling: failings of traceability in the production chain and implications for illegal, unreported and unregulated (IUU) fishing. PloS one 2014, 9, e98691.
10. Bénard-Capelle, J.; Guillonneau, V.; Nouvian, C.; Fournier, N.; Le Loët, K.; Dettai, A., Fish mislabelling in France: substitution rates and retail types. PeerJ 2015, 2, e714.
11. http://ec.europa.eu/food/safety/official_controls/food_fraud/fish_substitution/index_en.htm.
12. Mariani, S.; Griffiths, A. M.; Velasco, A.; Kappel, K.; Jérôme, M.; Perez-Martin, R. I.; Schröder, U.; Verrez-Bagnis, V.; Silva, H.; Vandamme, S. G.; Boufana, B.; Mendes, R.; Shorten, M.; Smith, C.; Hankard, E.; Hook, S. A.; Weymer, A. S.; Gunning, D.; Sotelo, C. G., Low mislabeling rates indicate marked improvements in European seafood market operations. Frontiers in Ecology and the Environment 2015, 13, 536-540.
13. Hebert, P. D.; Cywinska, A.; Ball, S. L.; deWaard, J. R., Biological identifications through DNA barcodes. Proceedings. Biological sciences / The Royal Society 2003, 270, 313-21.
14. Hebert, P. D.; Gregory, T. R., The promise of DNA barcoding for taxonomy. Systematic biology 2005, 54, 852-9.
15. Ji, Y.; Ashton, L.; Pedley, S. M.; Edwards, D. P.; Tang, Y.; Nakamura, A.; Kitching, R.; Dolman, P. M.; Woodcock, P.; Edwards, F. A.; Larsen, T. H.; Hsu, W. W.; Benedick, S.; Hamer, K. C.; Wilcove, D. S.; Bruce, C.; Wang, X.; Levi, T.; Lott, M.; Emerson, B. C.; Yu, D. W., Reliable, verifiable and efficient monitoring of biodiversity via metabarcoding. Ecology letters 2013, 16, 1245-57.
16. Carrera, M.; Cañas, B.; Gallardo, J. M., Proteomics for the assessment of quality and safety of fishery products. Food Research International 2013, 54, 972-979.
17. Tedesco, S.; Mullen, W.; Cristobal, S., High-Throughput Proteomics: A New Tool for Quality and Safety in Fishery Products. Current Protein & Peptide Science 2014, 15, 118-133.
18. Sentandreu, M. A.; Fraser, P. D.; Halket, J.; Patel, R.; Bramley, P. M., A proteomic-based approach for detection of chicken in meat mixes. Journal of proteome research 2010, 9, 3374-83.
19. Ortea, I.; Canas, B.; Gallardo, J. M., Selected tandem mass spectrometry ion monitoring for the fast identification of seafood species. Journal of chromatography. A 2011, 1218, 4445-51.
20. Carrera, M.; Canas, B.; Gallardo, J. M., Rapid direct detection of the major fish allergen, parvalbumin, by selected MS/MS ion monitoring mass spectrometry. Journal of proteomics 2012, 75, 3211-20.
21. Barik, S. K.; Banerjee, S.; Bhattacharjee, S.; Das Gupta, S. K.; Mohanty, S.; Mohanty, B. P., Proteomic analysis of sarcoplasmic peptides of two related fish species for food authentication. Applied biochemistry and biotechnology 2013, 171, 1011-21.
22. Carrera, M.; Canas, B.; Gallardo, J. M., The sarcoplasmic fish proteome: pathways, metabolic networks and potential bioactive peptides for nutritional inferences. Journal of proteomics 2013, 78, 211-20.
23. von Bargen, C.; Dojahn, J.; Waidelich, D.; Humpf, H. U.; Brockmeyer, J., New sensitive high-performance liquid chromatography-tandem mass spectrometry method for the detection of horse and pork in halal beef. Journal of agricultural and food chemistry 2013, 61, 11986-94.
24. von Bargen, C.; Brockmeyer, J.; Humpf, H. U., Meat authentication: a new HPLC-MS/MS based method for the fast and sensitive detection of horse and pork in highly processed food. Journal of agricultural and food chemistry 2014, 62, 9428-35.
25. Carrera, M.; Canas, B.; Pineiro, C.; Vazquez, J.; Gallardo, J. M., Identification of commercial hake and grenadier species by proteomic analysis of the parvalbumin fraction. Proteomics 2006, 6, 5278-87.
26. Carrera, M.; Canas, B.; Vazquez, J.; Gallardo, J. M., Extensive de novo sequencing of new parvalbumin isoforms using a novel combination of bottom-up proteomics, accurate molecular mass measurement by FTICR-MS, and selected MS/MS ion monitoring. Journal of proteome research 2010, 9, 4393-406.
27. Carrera, M.; Canas, B.; Lopez-Ferrer, D.; Pineiro, C.; Vazquez, J.; Gallardo, J. M., Fast monitoring of species-specific peptide biomarkers using high-intensity-focused-ultrasound-assisted tryptic digestion and selected MS/MS ion monitoring. Analytical chemistry 2011, 83, 5688-95.
28. Clark, A. E.; Kaleta, E. J.; Arora, A.; Wolk, D. M., Matrix-assisted laser desorption ionization-time of flight mass spectrometry: a fundamental shift in the routine practice of clinical microbiology. Clinical microbiology reviews 2013, 26, 547-603.
29. Stephan, R.; Johler, S.; Oesterle, N.; Näumann, G.; Vogel, G.; Pflüger, V., Rapid and reliable species identification of scallops by MALDI-TOF mass spectrometry. Food Control 2014, 46, 6-9.
30. Flaudrops, C.; Armstrong, N.; Raoult, D.; Chabrière, E., Determination of the animal origin of meat and gelatin by MALDI-TOF-MS. Journal of Food Composition and Analysis 2015, 41, 104-112.
31. Onder, O.; Shao, W.; Kemps, B. D.; Lam, H.; Brisson, D., Identifying sources of tick blood meals using unidentified tandem mass spectral libraries. Nature communications 2013, 4, 1746.
32. Wulff, T.; Nielsen, M. E.; Deelder, A. M.; Jessen, F.; Palmblad, M., Authentication of fish products by large-scale comparison of tandem mass spectra. Journal of proteome research 2013, 12, 5253-9.
33. Shao, W.; Zhang, M.; Lam, H.; Lau, S. C., A peptide identification-free, genome sequence-independent shotgun proteomics workflow for strain-level bacterial differentiation. Scientific reports 2015, 5, 14337.
34. Pedrioli, P. G.; Eng, J. K.; Hubley, R.; Vogelzang, M.; Deutsch, E. W.; Raught, B.; Pratt, B.; Nilsson, E.; Angeletti, R. H.; Apweiler, R.; Cheung, K.; Costello, C. E.; Hermjakob, H.; Huang, S.; Julian, R. K.; Kapp, E.; McComb, M. E.; Oliver, S. G.; Omenn, G.; Paton, N. W.; Simpson, R.; Smith, R.; Taylor, C. F.; Zhu, W.; Aebersold, R., A common open representation of mass spectrometry data and its application to proteomics research. Nature biotechnology 2004, 22, 1459-66.
35. Chambers, M. C.; Maclean, B.; Burke, R.; Amodei, D.; Ruderman, D. L.; Neumann, S.; Gatto, L.; Fischer, B.; Pratt, B.; Egertson, J.; Hoff, K.; Kessner, D.; Tasman, N.; Shulman, N.; Frewen, B.; Baker, T. A.; Brusniak, M. Y.; Paulse, C.; Creasy, D.; Flashner, L.; Kani, K.; Moulding, C.; Seymour, S. L.; Nuwaysir, L. M.; Lefebvre, B.; Kuhlmann, F.; Roark, J.; Rainer, P.; Detlev, S.; Hemenway, T.; Huhmer, A.; Langridge, J.; Connolly, B.; Chadick, T.; Holly, K.; Eckels, J.; Deutsch, E. W.; Moritz, R. L.; Katz, J. E.; Agus, D. B.; MacCoss, M.; Tabb, D. L.; Mallick, P., A cross-platform toolkit for mass spectrometry and proteomics. Nature biotechnology 2012, 30, 918-20.
36. Craig, R.; Beavis, R. C., TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004, 20, 1466-7.
37. Keller, A.; Nesvizhskii, A. I.; Kolker, E.; Aebersold, R., Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Analytical chemistry 2002, 74, 5383-92.
38. Lam, H.; Deutsch, E. W.; Eddes, J. S.; Eng, J. K.; King, N.; Stein, S. E.; Aebersold, R., Development and validation of a spectral library searching method for peptide identification from MS/MS. Proteomics 2007, 7, 655-67.
39. Onder, O.; Shao, W.; Lam, H.; Brisson, D., Tracking the sources of blood meals of parasitic arthropods using shotgun proteomics and unidentified tandem mass spectral libraries. Nature protocols 2014, 9, 842-50.
40. Ohana, D.; Dalebout, H.; Marissen, R. J.; Wulff, T.; Bergquist, J.; Deelder, A. M.; Palmblad, M., Identification of meat products by shotgun spectral matching. Food chemistry 2016, 203, 28-34.
41. Near, T. J.; Eytan, R. I.; Dornburg, A.; Kuhn, K. L.; Moore, J. A.; Davis, M. P.; Wainwright, P. C.; Friedman, M.; Smith, W. L., Resolution of ray-finned fish phylogeny and timing of diversification. Proceedings of the National Academy of Sciences of the United States of America 2012, 109, 13698-703.
42. van der Plas-Duivesteijn, S. J.; Mohammed, Y.; Dalebout, H.; Meijer, A.; Botermans, A.; Hoogendijk, J. L.; Henneman, A. A.; Deelder, A. M.; Spaink, H. P.; Palmblad, M., Identifying proteins in zebrafish embryos using spectral libraries generated from dissected adult organs and tissues. Journal of proteome research 2014, 13, 1537-44.

Funding Sources

The research at RIKILT Wageningen UR has been financially supported by the Dutch Ministry of Economic Affairs. MP has been financially supported by the Dutch Organization of Scientific Research (NWO) via VIDI grant 917.11.398.

Figures

Figure 1, Workflow for the identification of closely related flatfish by tandem mass spectrometry and spectral library matching.

Figure 2, Identification of flatfish by tandem mass spectrometry and spectral library matching. Column graph representation of total number of spectral hits against all species spectral libraries (n=59) of representative samples of (A) European plaice, (B) common dab and (C) yellowfin sole. The samples match best to the flatfish spectral libraries in the database (in green), after which the other fish (blue), mammals (red), birds (orange) and European squid (dark blue) follow. Insert zooms in on the flatfish libraries. In dark green the Pleuronectidae family, to which the three investigated flatfish belong, in lighter greens the Scophthalmidae and Soleidae family. Columns indicated with an asterisk are the flatfish spectral libraries added to the database at a different moment from the other fish spectral libraries.

Figure 3, Variation of total number of spectral matches for the identification of (closely related) flatfish. On four days three samples of three different flatfish (European plaice (white), common dab (striped) and yellowfin sole (dotted)) were prepared at two different laboratories and analyzed on a Bruker Daltonics amaZon speed ion trap mass spectrometer. The number of spectral matches to the correct species was used to calculate the average number of spectral matches and the standard deviation. A. Within-day variation (n=3 per species*). B. Within-lab variation (n=6 per species*). C. Between-lab variation (n=12 per species*). *due to poor sample and data quality two yellowfin sole samples are missing (no spectral hits)

Figure 4, Compatibility of spectral library matching with different fragmentation techniques. Results from a direct comparison of flatfish species identification from tandem mass spectra from a Bruker Daltonics amaZon speed ion trap and a Thermo Scientific Q-Exactive Orbitrap mass spectrometer. Both the query (‘unknown’) and reference library data were varied. A. Pie chart of average number of spectral hits, including correct identification rates. B. Average number of spectral hits for each analysis. The liquid chromatography system, ion source and instrument settings are all important. This is not a competitive comparison between instruments, but a validation that query data can come from a different instrument than was used to generate the reference libraries, as long as it contains collision-induced dissociation tandem mass spectra from tryptic peptides. For the results presented here, SpectraST version 4.0 was used. Only correct identifications were used to calculate the average number of spectral hits. C. Typical example of matching tandem mass spectra from yellowfin sole samples. The spectra are almost certainly from a conserved region of tropomyosin (MEIQELQLK), as was identified using Mascot. With the method presented here, however, no peptide identifications are used to identify biological species; only information on shared fragmentation spectra of peptides is used for identification. From top to bottom: matching amaZon ion trap tandem mass spectra, Q-Exactive Orbitrap tandem mass spectrum (top) from yellowfin sole best matching the amaZon ion trap library spectrum (bottom), and matching Q-Exactive Orbitrap tandem mass spectra. In this case, the Orbitrap spectrum contains more signal at low m/z but less immediately below the precursor (neutral loss peaks). All spectra are plotted to relative intensity scale.

Figure 5, Positive identification of fish species of battered and fried fish at 10% fish content. A sample of battered and fried cod was mixed in a 9:1 and 1:9 ratio of fish to batter. Both at ~90% and ~10% fish, cod was positively identified, showing the method to be robust and applicable to fish samples contaminated with additional ingredients.
