Article
New Quantitative Structure-Fragmentation Relationship strategy for chemical structure identification using as descriptor the calculated enthalpy of formation for the fragments produced in electron ionization mass spectrometry. Case study: tetrachlorinated biphenyls Nicolae Dinca, Simona Dragan, Mihael Dinca, Eugen Sisu, and Adrian Covaci Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/ac5003728 • Publication Date (Web): 28 Apr 2014 Downloaded from http://pubs.acs.org on May 2, 2014
Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
New Quantitative Structure-Fragmentation Relationship strategy for chemical structure identification using as descriptor the calculated enthalpy of formation for the fragments produced in electron ionization mass spectrometry. Case study: tetrachlorinated biphenyls

Nicolae Dinca 1, Simona Dragan 2, Mihael Dinca 2, Eugen Sisu 2, Adrian Covaci 3*

1 “Aurel Vlaicu” University, 310330 Arad, Romania
2 University of Medicine and Pharmacy “Victor Babes”, 300041 Timisoara, Romania
3 Toxicological Center, University of Antwerp, 2610 Wilrijk, Belgium

* Corresponding author: Adrian Covaci, fax: +32-3-265-2722; e-mail: [email protected]

ABSTRACT
Differential mass spectrometry correlated with quantum chemical calculations (QCC-∆MS) has been shown to be an efficient tool for the chemical structure identification (CSI) of isomers with similar mass spectra. For this type of analysis, we report here a new strategy based on ordering (ORD) and linear correlation (LCOR) algorithms and their coupling, to filter the most probable structures corresponding to similar mass spectra belonging to a group with dozens of isomers (e.g., tetrachlorinated biphenyls, TeCBs). This strategy compares the enthalpies of formation (∆fH) obtained by QCC for some isobaric ions from the EI-MS spectra with the corresponding relative intensities. The result of CSI is provided in the form of lists of decreasing probabilities calculated for all the position-isomeric structures using the specialized software package CSI-Diff-MS Analysis 3.1.1. The simulation of CSI with ORD, LCOR and their coupling for six TeCBs (IUPAC no. 44, 46, 52, 66, 74, and 77) has allowed finding the best semi-empirical molecular-orbital methods for several of their common isobaric fragments. The study of algorithms and strategy for the entire group of TeCBs (42 isomers) was made with one of the variants found optimal for the computation of ∆fH using semi-empirical molecular orbital methods of HyperChem: AM1 for M+, [M-4Cl]+· ions and RM1 for [M-Cl]+, [M-2Cl]+. The analytical performance of ORD, LCOR and their coupling was assessed by simulating the CSI of an analyte of known structure, using a decreasing number of isomeric standards, s = 5, 4, 3, and 2. Compared with the results obtained by classical library search for TeCB isomers, the novel strategies of assigning structures of isomers with very similar mass spectra based on ORD, LCOR and their coupling were much more efficient, because they provide the correct structure at the top of the probability list. The databases used in these CSI do not contain mass spectra as in the case of library search, but series of ∆fH values obtained by QCC. These techniques are capable of relating relative intensities to the chemical structures of analytes via the ∆fH of ions, which turns out to be a good Quantitative Structure-Fragmentation Relationship (QSFR) descriptor.

Key words: structural identification, differential mass spectrometry, tetrachlorinated biphenyls, quantum chemical calculation, formation enthalpy, structure refining algorithm, Quantitative Structure-Fragmentation Relationship, QSFR descriptor

INTRODUCTION
One of the applications of differential mass spectrometry (∆MS) is the chemical structure identification (CSI) of isomers with similar mass spectra using the enthalpies of formation (∆fH), obtained by quantum chemical calculation (QCC), of the ions formed in the mass spectrometer. In this way, structures could be identified by ∆MS for groups of isomers, such as position isomers of nitrobenzophenones or their dimethyl acetals,1-3 of polychlorinated biphenyls (PCBs),4 or endo- and exo-diastereomers of some alcohols.5 In the same context, stereochemical studies of cis- and trans-1,3-dioxane derivatives,6 or α- and β-mannofuranose acetals may be mentioned.7
The differential techniques presented in these works use an ordering algorithm (ORD) based on the principle that the more stable an ion is, the more abundant it is in the mass spectrum. Applied to common isobaric ions in similar mass spectra of isomers, ORD can finely quantify the correlation between the intensities of the ions and their enthalpies of formation. Thus, an ascending series of intensities corresponds to a descending series of enthalpies.4
Chemical identification by the interpretation of fragmentation schemes or library search needs diagnostic ions that provide spectrum uniqueness and reduce the risk for false positives.8 In contrast, ∆MS uses only the common ions in the mass spectra. In similar mass spectra, diagnostic ions are most often lacking and the only analytically-exploitable difference between spectra remains the difference between the intensities of common ions. In this case, as shown in the studies mentioned above, CSI using QCC and ∆MS leads to good results. This approach extends the investigative power of mass spectrometry beyond the use of spectral libraries or interpretation of fragmentation schemes for the identification of compounds with similar spectra.4,9
Establishing the best QCC methods and calculating series of ∆fH values for ions and radicals is the first step in the strategy of QCC-∆MS analysis because these techniques require thermochemical data as close as possible to the real values that govern the fragmentations in the mass spectrometer. Recently, we reported a method of determining the best series of ∆fH values obtained with semi-empirical QCC methods, which used the simulation of CSI by the ORD algorithm of six chemical standards (TeCBs 44, 46, 52, 66, 74, and 77).10 The lists of probabilities of possible structural assignments were obtained by the specialized software package, Chemical Structure Identification by Differential Mass Spectra (CSI-Diff-MS Analysis 3.1.1; BET2 Software, Königsbrunn, Germany).9 The best results were obtained using ∆fH calculated by the semi-empirical methods AM1, MINDO3 and MNDO for M+ ions, RM1 for [M-Cl]+ and [M-2Cl]+ ions, PM3 for [M-3Cl]+ and AM1 for [M-4Cl]+.10 Although not considered the most efficient, these calculation methods are faster than other methods. Even so, we performed over 33,000 clicks for each of them, because QCC software is not yet adapted for the rapid creation of thermochemical databases for a large number of fragments. Yet, this is worth the effort, because these databases do not have to be recalculated for each analysis. The series of calculated ∆fH values can be reused in any analysis of the same analytes by MS, the only restriction being that the spectra must be acquired under the same conditions and with the same instrumental setup.
To date, the ORD algorithm has been used successfully in analyses where n possible structures were attributed to n isomers with n similar mass spectra (n = 2 to 6). These cases often do not correspond to actual experimental situations, in which only a few standards from a large group of isomers are available and the analyte can have any of the structures proposed for the unknown isomers. We recently built such an analytical situation using as an example the TeCB group, for which we already knew several good QCC methods10 and for which identifying the various isomers is of practical importance. PCBs are a group of 209 congeners with varying degrees of chlorination, which are important environmental11 and food12 contaminants. In addition, epidemiological studies have shown that PCBs can affect reproduction and neuro-development, the thyroid, nervous, immune and cardiovascular systems, leading to disturbances in growth and lipid metabolism, and finally to diabetes and obesity.13
Regarding our study, the question that arises is whether ORD can correctly select the ∆fH values (and implicitly the chemical structure) corresponding to the analyte from the series of values calculated for all 42 possible structures of TeCBs. Can the efficiency of the QCC-∆MS methods for the 6 standards be extended to the entire group of TeCB isomers? What is the minimum number of standards, isomeric with the analyte, that can ensure a good analysis? How can the performance of such a computational method be improved using complementary algorithms?
To answer these questions, we aimed here at investigating: i) the ORD-QCC-∆MS strategy for qualitative analysis; ii) a new algorithm, LCOR-QCC-MS, based on the Linear CORrelation of the relative ionic current (IC) with the ∆fH values; and iii) the coupling of the ORD and LCOR algorithms for the improvement of analytical performance and their applications for TeCBs.

EXPERIMENTAL
Materials
Reference standards of TeCBs (IUPAC no. 44, 46, 52, 66, 74, and 77) were obtained from Dr. Ehrenstorfer Laboratories (Augsburg, Germany) at a concentration of 10 ng/µL in isooctane. After appropriate dilution, a mixture containing these six isomers was prepared in isooctane at a concentration of 1 ng/µL.

Mass spectrometry
An Agilent 6890 series gas chromatograph (Waldbronn, Germany) equipped with a DB-5 capillary column (30 m x 0.25 mm x 0.25 µm) was coupled to an Agilent 5973 mass-selective detector (MSD) operating in electron ionization (EI) mode at 70 eV. The heated zone temperature for the injector was set at 260°C, the mass spectrometer interface at 280°C, the quadrupole mass analyzer at 150°C and the ion source at 230°C. Helium was used as a carrier gas at a constant flow rate of 1.3 mL/min and the mixture containing the TeCBs was injected in splitless mode. The oven temperature was programmed from an initial temperature of 100°C, increased to 250°C at a rate of 22.5°C/min and then further increased to 310°C at 5.5°C/min, where it was held for 5 min. The MSD was operated in full-scan acquisition mode between m/z 50 and 400 to obtain the Total Ion Chromatogram (TIC).
Automatic background subtraction was applied to obtain clean and interference-free mass spectra. For each isomer, the averaged mass spectrum was obtained on identical intervals (500 000 to 2 000 000 relative units) of ion abundance on the front side of the chromatographic peak. The averaged mass spectra were normalized to the TIC (100%), to offer comparable intensities for the common ions of the isomers.9 It is very important that all resulting mass spectra are obtained under identical analytical conditions, so that the differences in intensity between common isobaric ions are due exclusively to structural differences.4 The tabular mass spectra were imported into the CSI-Diff-MS Analysis 3.1.1 software as comma-separated values files, *.csv (Figure S-1).

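The normalization to the TIC described above can be illustrated with a minimal sketch. It is not the CSI-Diff-MS implementation: the function name `normalize_to_tic` and the abundance values are hypothetical, and the spectrum is assumed to be already averaged and background-subtracted.

```python
def normalize_to_tic(spectrum):
    """Scale a spectrum {m/z: abundance} so that all abundances sum to 100%."""
    total = sum(spectrum.values())
    if total <= 0:
        raise ValueError("spectrum has no ion current")
    return {mz: 100.0 * abundance / total for mz, abundance in spectrum.items()}


# Hypothetical averaged abundances (arbitrary units) for one isomer.
raw = {150: 12000.0, 220: 30000.0, 256: 8000.0, 290: 50000.0}
normalized = normalize_to_tic(raw)
```

After this step each value is a percentage of the total ion current, so the same ion can be compared across the spectra of different isomers.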
The common ions for CSI
The minimum necessary number of common ions for CSI with ORD is determined by the number of isomers used in the probability matrices. Thus, the maximum number of isomers that can be identified with n ions of each spectrum is 2^n.4 For the six TeCB isomers used in this study, three common isobaric ions are enough. Using a larger number of ions than the minimum required can improve the selectivity of the method if their series of ∆fH values have been correctly computed. With a personal computer and the CSI-Diff-MS Analysis 3.1.1 software, matrices of 6 spectra x 6 structures for 5 ions can be calculated in less than 10 s. Larger matrices may require more powerful computers and/or longer computing periods.
The most suitable ions for CSI by QCC-∆MS are those with predictable structures and for which correct thermodynamic data can easily be computed. For TeCBs, the ions M+, [M-Cl]+, [M-2Cl]+, [M-3Cl]+ and [M-4Cl]+ and the corresponding isotopic peaks, e.g., those at m/z 290, 256, 220, 185 and 150, respectively, can be used.10

The ∆fH database
The formation enthalpies (∆fH) were calculated for the molecule M, the molecular ion M+ and for the common ions produced by successive loss of chlorine atoms: [M-Cl]+, [M-2Cl]+, [M-3Cl]+ and [M-4Cl]+. The geometries of the molecules, ions and radicals were optimized with the MM+ force field and re-optimized with the semi-empirical methods AM1, MINDO3, MNDO, PM3 and RM1,14,15 using the Restricted Hartree-Fock (RHF) operators for molecules or ions and Unrestricted Hartree-Fock (UHF) for radicals or radical-ions. The convergence limit of the Self-Consistent-Field (SCF) was set at 10^-5 and the accelerated convergence procedure was used. For the optimization of the geometries, the Polak-Ribiere conjugate gradient method was used, with a total root-mean-square (RMS) gradient set at 10^-2 kcal/(mol·Å), the molecule being considered in vacuum, since these ions are isolated in a mass spectrometer.16 The semi-empirical molecular-orbital methods (AM1, MINDO3, MNDO, PM3, and RM1) were employed as available in the HyperChem 8.0.10 software (Hypercube, Inc.).
For the considered ions, the fragmentation enthalpies can be obtained using equation (1):

∆fH (fragmentation) = ∆fH (ion) + n·∆fH (Cl·) + E (electron) - ∆fH (molecule)    (1)

where n is the number of cleaved chlorine atoms (n = 0 to 4).
If it is accepted that the isomeric ions of TeCBs are formed by the same mechanism, the terms n·∆fH (Cl·) and E (electron) most probably have the same values for these ions. Upon ordering (ORD) and linear correlation (LCOR), these terms can be neglected without affecting the results in the CSI probability lists. Thus, equation (1) is transformed into (2), and instead of ∆fH (fragmentation), ∆fH (relative) can be used with the same results:

∆fH (relative) = ∆fH (ion) - ∆fH (molecule)    (2)

where ∆fH (relative) is the formation enthalpy of the respective ion measured above the level of the molecular enthalpy. When several isomeric ionic structures result from a fragmentation, the minimum ∆fH was used, because the corresponding ion is the most stable and makes the essential contribution to the ionic current (IC).4,10 From here onwards, we use only the relative minimum of the enthalpy of formation, simply given as ∆fH. The values calculated in this way were imported into the ∆fH library of the CSI-Diff-MS Analysis 3.1.1 software (Figure S-2, Tables S-1 and S-2).

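The step from equation (1) to equation (2) rests on the fact that adding the same constant to every value in a series changes neither its ordering nor its linear correlation with another series. The sketch below checks this numerically. All numbers are hypothetical, `constant_terms` stands in for n·∆fH (Cl·) + E (electron) under the assumption that it is identical for isomeric ions, and the Pearson coefficient is used as the correlation measure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def order_of(values):
    """Indices of the values sorted in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical data for one ion type in four isomers.
dfh_relative = [210.0, 214.5, 219.0, 226.0]   # equation (2), kcal/mol
ic = [9.1, 8.4, 7.9, 6.2]                     # relative ionic currents, %
constant_terms = 150.0                        # assumed equal for all isomers

# Equation (1) differs from equation (2) only by the constant terms.
dfh_fragmentation = [h + constant_terms for h in dfh_relative]

r_relative = pearson_r(dfh_relative, ic)
r_fragmentation = pearson_r(dfh_fragmentation, ic)
same_order = order_of(dfh_relative) == order_of(dfh_fragmentation)
```

Both coefficients agree to within rounding error and the ordering is unchanged, which is why dropping the constant terms does not alter the ORD or LCOR results.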
LCOR algorithm
The LCOR (Linear CORrelation) algorithm was designed to complement the ORD algorithm, since the latter cannot estimate how close the calculated ∆fH values are to the experimental ones. LCOR uses a correlation coefficient (R) to quantify the inverse linear relationship between the ∆fH values calculated for ions of the same type and the corresponding relative ionic currents (IC) in the similar mass spectra of isomers (Equation S-1). In this paper, we use IC in place of relative intensity (RI), because the latter abbreviation can be mistaken for the retention index (RI). The best correlation between ∆fH and IC corresponds to R = -1, which means that in a correct analysis, the real structures must be those that provide the best inverse correlation. To run this algorithm, a minimum of three pairs of values (∆fH; IC) is necessary, and consequently the mass spectra of at least three isomers. The use of a linear correlation function is justified since the ∆fH and IC values lie within narrow intervals and, over a narrow range, any curve can be approximated by a straight line. The fact that a series of increasing enthalpies corresponds to a series of decreasing ionic intensities4 explains the inverse linearity (the negative slope of the regression line).
To achieve the ORD-LCOR coupling, R was converted into probability using the equation (3):

PLCOR(%) = 100(1-R)/2    (3)

The probability corresponding to the coupling, PORD-LCOR, expresses the degree of simultaneous fulfillment of the conditions of correct succession and proximity (correlation) of the experimental values (IC) to the corresponding theoretical ones (∆fH). PORD-LCOR was calculated with equation (4):

PORD-LCOR = PORD · PLCOR    (4)

The total probability (P) results from the simultaneous fulfillment of the conditions ORD and LCOR for all the common ions (m) considered, according to equation (5):

P = P1·P2·…·Pm    (5)

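Equations (3) to (5) can be written out as a small sketch. It is not the CSI-Diff-MS code. The function names and sample values are hypothetical, R is taken to be the Pearson coefficient (the exact form of Equation S-1 is not reproduced here), and the probabilities multiplied in equations (4) and (5) are assumed to be fractions between 0 and 1 rather than percentages.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient R of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (dfH, IC) pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def p_lcor(r):
    """Equation (3): R = -1 gives 100%, R = +1 gives 0%."""
    return 100.0 * (1.0 - r) / 2.0


def p_coupled(p_ord, p_lcor_value):
    """Equation (4), assuming both inputs are fractions between 0 and 1."""
    return p_ord * p_lcor_value


def total_probability(per_ion):
    """Equation (5): product of the per-ion probabilities (fractions)."""
    total = 1.0
    for p in per_ion:
        total *= p
    return total


# Hypothetical values for one ion type in the spectra of three isomers.
dfh = [210.0, 214.5, 219.0]   # relative enthalpies, kcal/mol
ic = [9.1, 8.4, 7.9]          # relative ionic currents, % of TIC
r = pearson_r(dfh, ic)
probability = p_lcor(r)
```

In this example the enthalpies rise while the ionic currents fall, so R is close to -1 and the LCOR probability is close to 100%. Because equation (5) is a product, a single poorly correlated ion lowers the total probability of the whole assignment.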
The similarity of mass spectra
The similarities of the mass spectra of the standards and analyte, computed from their degrees of overlapping,9 are involved in the selection of the algorithm and of the standards that should be used for the optimum filtration of the structures. After importing the mass spectra, the CSI-Diff-MS program can display the table of similarity that includes the values for all possible pairs of spectra (Figure S-3). For the six TeCB congeners used in this study, the similarity ranges between 76% and 96%. We can distinguish two groups of high similarity: group 1 (PCB 44, 46, and 52) and group 2 (PCB 66, 74, and 77) with values ranging between 92% and 96%.10

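The exact overlap measure used by CSI-Diff-MS is not given in the text. As an assumption only, one simple way to express the degree of overlapping of two spectra is to normalize both to 100% and sum, over the shared m/z values, the smaller of the two intensities. The function name and the sample spectra below are hypothetical.

```python
def overlap_similarity(spec_a, spec_b):
    """Percent overlap of two spectra {m/z: abundance} after TIC normalization."""
    total_a = sum(spec_a.values())
    total_b = sum(spec_b.values())
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both spectra need a positive total ion current")
    shared = set(spec_a) & set(spec_b)
    return sum(
        min(100.0 * spec_a[mz] / total_a, 100.0 * spec_b[mz] / total_b)
        for mz in shared
    )


# Hypothetical spectra of two isomers (arbitrary abundance units).
isomer_a = {150: 10.0, 220: 30.0, 290: 60.0}
isomer_b = {150: 20.0, 220: 30.0, 290: 50.0}
similarity = overlap_similarity(isomer_a, isomer_b)
```

With this definition identical spectra score 100% and spectra with no common ions score 0%, so values in the 92% to 96% range would indicate nearly superimposable spectra.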
CSI probabilities list
We calculated the probabilities of the CSI list using the CSI-Diff-MS 3.1.1 software, the database of enthalpies of formation (∆fH) calculated by HyperChem 8.0.10 and the mass spectra generated by EI-MS (see above). The LCOR algorithm was run with experimental subroutines of CSI-Diff-MS. With the ORD and LCOR algorithms, the experimental and calculated data are compared in all possible variants of structural assignments and the results are presented as a list of decreasing probabilities of these variants (Figure S-4).9
Although the relative error, accuracy and precision are used mainly in quantitative analysis, these analytical parameters can be calculated for each of these probability lists using the following equations:

Relative error (%) = 100·∆rank S / N    (6)
Accuracy (%) = 100(N - ∆rank S) / N    (7)
Selectivity (%) = 100(N - rank S + rank P) / N    (8)

where rank S is the rank of the real structure in the list of probabilities, i.e., the number of structures with probability greater than or equal to that of the real structure; ∆rank S = (rank S - true rank), with true rank = 1; rank P is the number of distinct probabilities among the first rank S positions of the list; and N is the number of possible structures for the analyte. For the ORD lists, rank S is most often greater than rank P, because several structures frequently share the same probability. For the LCOR lists, rank S = rank P because two structures with the same probability cannot exist. Precision is estimated by the probabilities in the lists.

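Equations (6) to (8) can be applied to a probability list as in the sketch below. The function name `csi_metrics` and the sample list are hypothetical. Ties are detected by exact equality of the probabilities, which is an assumption that suits rounded list values but may need a tolerance for raw floating-point output.

```python
def csi_metrics(probabilities, true_index):
    """Compute rank S, rank P and equations (6)-(8) for one probability list.

    probabilities: one probability (%) per candidate structure.
    true_index: position of the real structure in that list.
    """
    n = len(probabilities)
    p_true = probabilities[true_index]
    at_or_above = [p for p in probabilities if p >= p_true]
    rank_s = len(at_or_above)        # structures at least as probable as the real one
    rank_p = len(set(at_or_above))   # distinct probabilities among them
    delta_rank_s = rank_s - 1        # true rank = 1
    return {
        "rank_s": rank_s,
        "rank_p": rank_p,
        "relative_error": 100.0 * delta_rank_s / n,           # equation (6)
        "accuracy": 100.0 * (n - delta_rank_s) / n,           # equation (7)
        "selectivity": 100.0 * (n - rank_s + rank_p) / n,     # equation (8)
    }


# Hypothetical list for N = 5 candidate structures; the real structure is at index 2.
metrics = csi_metrics([100.0, 100.0, 90.0, 90.0, 80.0], true_index=2)
best = csi_metrics([100.0, 90.0, 80.0], true_index=0)
```

In the first list four structures are at least as probable as the real one but only two distinct probabilities occur among them, giving rank S = 4 and rank P = 2. The second list reproduces the ideal case in which the real structure stands alone at the top.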
Determining the best series of enthalpies
The best series of ∆fH values, calculated with various QCC methods, can be determined by simulating the CSI with the studied algorithm for standards from the respective group of isomers. The ∆fH values which lead to the correct assignment of the structures with the highest rank, selectivity and probability were found to be the most appropriate. For an accurate, selective and precise CSI, even mixtures of ∆fH values calculated by different methods can be used, provided that the same method is used for a certain ion type.10

RESULTS AND DISCUSSION
Optimal semi-empirical method for LCOR
ORD and LCOR are operationally different, and it would be useful to know whether there are series of ∆fH values that could provide the best CSI simultaneously for the two algorithms and the ions used. In this case, the same set of values could be used to run both algorithms, and the analysis time would be substantially reduced. The scores obtained for LCOR, shown alongside those of ORD in Table 1, are encouraging: there are ∆fH series calculated by two methods (bold/grey formatted), AM1 for the ions M+ and [M-4Cl]+ and RM1 for [M-Cl]+ and [M-2Cl]+, which provide good results for both algorithms. In the rest of the study, we used the optimal series of ∆fH values corresponding to these two methods for all TeCB isomers. The ∆fH series for M+ resulting from MINDO3, MNDO and PM3 are only slightly better than those of AM1, and their use would not justify the computational load of three additional methods. Since LCOR did not provide good results for the ion [M-3Cl]+ with any of the semi-empirical methods, this ion was not used.

CSI with ORD-QCC-∆MS
The analytical performance of ORD was studied by simulating the CSI of an analyte of known structure, using a decreasing number of isomeric standards, s = 5, 4, 3, and 2. The roles of standards and analyte were played in turn by all six TeCBs. All six possible cases were run only for CSI with 5 standards; for the other situations, when fewer than 5 standards were used, representative variants were selected. The analyte was sought among the 42-s possible isomeric structures. Its structure was selected by using ORD to establish the ∆fH values most appropriate for the ion intensities in its mass spectrum.
The results in the CSI probability lists obtained by ORD are shown in Tables 2, 3, 4 and 5, column 3. The first number is the rank of the probability corresponding to the correct structure (rank P). The second is the number of isomers that have a probability greater than or equal to that of the correct structure (rank S), also referred to as the number of confounding structures. The third number is the probability (PORD) corresponding to the degree of ordering of the ∆fH-IC values. The best result < rank P / rank S / P > is < 1 / 1 / 100% >.
For each list, the relative error, accuracy and selectivity were calculated using equations (6), (7) and (8). Their average values, grouped in three situations determined by the similarity of the spectra of the standards and analyte (cases a, b and c), are given in Table 6. The three analytical situations which can occur are:
(a) both the standards and the analyte have mass spectra with high similarities (e.g. above 92%), that is, they belong to the same similarity group;
(b) the standards belong to one similarity group, and the analyte to the other;
(c) the standards come from both similarity groups, having diverse similarities (e.g. 75-96%).
Certainly, the calculated analytical parameters can give an overview of the quality or shortcomings of the analysis, but what matters is whether the filtering algorithm succeeds in bringing the real structure to the first place, or at least among the first places, of the list. In other words, rank S is the most important parameter for CSI.
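The filtering of candidate structures by ordering can be sketched as follows. The exact formula for PORD is not given in the text, so the score used here, the percentage of isomer pairs in which the higher enthalpy goes with the lower ionic current, is only an assumption in the spirit of the ORD principle. The function names, candidate labels and values are all hypothetical.

```python
from itertools import combinations


def ordering_score(dfh, ic):
    """Percent of isomer pairs in which the higher dfH has the lower IC."""
    pairs = list(combinations(range(len(dfh)), 2))
    if not pairs:
        raise ValueError("at least two isomers are required")
    inverse = sum(
        1 for i, j in pairs if (dfh[i] - dfh[j]) * (ic[i] - ic[j]) < 0
    )
    return 100.0 * inverse / len(pairs)


def rank_candidates(std_dfh, std_ic, analyte_ic, candidate_dfh):
    """Score every candidate structure for the analyte and sort by score."""
    scores = {
        name: ordering_score(std_dfh + [h], std_ic + [analyte_ic])
        for name, h in candidate_dfh.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data for one ion type: three standards and one analyte.
std_dfh = [210.0, 215.0, 220.0]   # calculated dfH of the standards, kcal/mol
std_ic = [9.0, 8.0, 7.0]          # measured IC of the standards, %
analyte_ic = 8.5                  # measured IC of the analyte, %
candidates = {"A": 212.0, "B": 213.5, "C": 218.0}   # calculated dfH per candidate

ranked = rank_candidates(std_dfh, std_ic, analyte_ic, candidates)
```

Candidates A and B both fit the ordering perfectly and cannot be told apart, while C breaks one of the six pairs. Ties of this kind illustrate how several structures can share one probability in an ordering-based list.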
For ORD, selectivity is the one that limits the performance. It increases in the order (b)
6 for the six standards),4 filtering of structures with ORD, LCOR and their coupling was
much more efficient. Thus, the ORD-LCOR structural filter could successfully complement the library search of isomers with similar mass spectra. Since differential algorithms do not use databases of standard spectra, but databases of ∆fH calculable for all isomeric structures, important steps toward a de novo structural analysis are made, which would require minimal pre-knowledge.17 The CSI-Diff-MS software platform offers the possibility to perform this type of analysis with ORD and LCOR.

CONCLUSIONS
Although this study was conducted on TeCBs using EI-MS spectra and semi-empirical QCC methods, we have no reason to believe that the kinetic and thermodynamic laws of fragmentation, implicitly confirmed in this exercise, would not apply to other groups of isomers, instruments, conditions or QCC methods, or could not be exploited for analytical purposes there. Thus, we can consider that the CSI by QCC-MS-∆MS strategy with the ORD and LCOR algorithms described here has a high degree of generality. In this study, ∆fH was shown to be a good descriptor of fragmentations, and the differential techniques proved capable of relating the relative ionic current to the chemical structures of analytes, thus representing true Quantitative Structure-Fragmentation Relationship methods.

Acknowledgments
The authors acknowledge BET2 Software for the LCOR experimental subroutines of the CSI-Diff-MS software. Part of this work was supported by the Romanian National Authority for Scientific Research (CNCS-UEFISCDI) through project PN-II-PCCA-2011-142.

Supporting Information Available
This material is available free of charge via the Internet at http://pubs.acs.org

References
382
(1) Dinca, N.; ȘiȘu, E.; ȘiȘu, I.; Oprean, I.; Csunderlik, C.; Mracec, M. Rev. Roum. Chim.
383 384 385 386 387
2002, 47(3-4), 379-385. (2) Dinca, N.; ȘiȘu, E.; ȘiȘu, I.; Csunderlik, C.; Oprean. I. Rev. Chim.-Bucharest 2002, 53(5), 332-336. (3) Dinca, N.; ȘiȘu, E.; Mracec, M.; Oprean, I.; Sander, O. Rev. Roum. Chim. 2004, 49(3-4), 331-338.
388
(4) Dinca, N. In Applications of Mass Spectrometry in Life Safety; Popescu, C.; Zamfir, A.D.;
389
Dinca N., Ed.; NATO Public Diplomacy Division & Springer: Dordrecht, 2008; pp. 221-
390
233.
391 392
(5) Dinca, N.; Stanescu, M.D.; ȘiȘu, E.; Mracec, M. Rev. Chim.-Bucharest 2004, 55(5), 347-350.
393
(6) Harja, F.; Bettendorf, C.; Grosu, I.; Dinca, N. In Applications of Mass Spectrometry in
394
Life Safety; Popescu, C.; Zamfir, A.D.; Dinca N., Ed.; NATO Public Diplomacy
395
Division & Springer: Dordrecht, 2008; pp. 185-191.
396 397 398
(7) Rafaila, M.; Pascariu, M.C.; Gruia, A.; Penescu, M.; Purcarea, V.L.; Medeleanu, M.; Rusnac, L.M.; Davidescu, C.M. Farmacia, 2013, 61(1), 116-126. (8) Stein, S. Anal. Chem. 2012, 84(17), 7274−7282.
ACS Paragon Plus Environment
12
Page 13 of 20
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Analytical Chemistry
399 400
(9) Patent Number: DE102005028944-A1, Assignee: C. Bettendorf, Inventors: C. Bettendorf, N. Dinca, 2007; http://www.bet2-soft.de
(10) Dinca, N.; Covaci, A. Rapid Commun. Mass Sp. 2012, 26(17), 2033-2040.
(11) Covaci, A.; Gheorghe, A.; Voorspoels, S.; Maervoet, J.; Steen Redekker, E.; Blust, R.; Schepens, P. Environ. Int. 2005, 31(3), 367-375.
(12) Voorspoels, S.; Covaci, A.; Neels, H. Environ. Toxicol. Pharm. 2008, 25(2), 179-182.
(13) Hamers, T.; Kamstra, J. H.; Cenijn, P. H.; Pencikova, K.; Palkova, L.; Simeckova, P.; Vondracek, J.; Andersson, P. L.; Stenberg, M.; Machala, M. Toxicol. Sci. 2011, 121(1), 88-100.
(14) Dewar, M.J.S.; Zoebisch, E.G.; Healy, E.F.; Stewart, J.J.P. J. Am. Chem. Soc. 1985, 107(13), 3902-3909.
(15) Stewart, J.J.P. J. Comput. Aid. Mol. Des. 1990, 4(1), 1-103.
(16) Holmes, J.L.; Aubry, C.; Mayer, P.M. Assigning Structures to Ions in Mass Spectrometry; CRC Press: Boca Raton, 2006.
(17) Kind, T.; Fiehn, O. Bioanal. Rev. 2010, 2(1-4), 23-60.
Table 1. Ranking of semi-empirical QCC methods for the ions of TeCB standards, based on the scores obtained using the ORD and LCOR algorithms of the CSI-Diff-MS software. For ORD, a higher number of stars corresponds to a higher degree of confidence in the calculated ∆fH series (ref 10). For LCOR, rank S and probability (rank S / P_LCOR %) are shown. The closer rank S (out of 720 possible variants) is to 1 and the probability is to 100%, the better the calculated ∆fH values. The optimal ∆fH series used in this paper are formatted grey.
Semi-empirical method: AM1, MINDO3, MNDO, PM3, RM1
Columns for each method: ORD-∆MS, LCOR-MS
M+·
[M-Cl]+
[M-2Cl]+
[M-3Cl]+
[M-4Cl]+
** 11 / 98.8%
11 / 94.4% ** 9 / 94.6% -
58 / 76.4% -
** 1 / 99.8%
** 9 / 98.5%
* 55 / 82.0% 38 / 88.0% -
230 / 64.9%
** 10 / 96.8%
LCOR-MS ORD-∆MS
* 7 / 95.0% -
LCOR-MS
32 / 94.2%
12 / 94.3%
67 / 86.2%
* 68 / 75.7%
* 10 / 94.3%
* 11 / 95.9%
** 34 / 77.1%
*** 4 / 94.9%
**** 1 / 99.5%
* 85 / 74.2%
* 19 / 89.1% 206 / 65.0% 151 / 69.0% 134 / 75.0%
Table 2. Results of structural filtering using five TeCB standards and the calculated ∆fH database. For each of the 6 possible variants of selecting the standards and the analyte, there are N = 42 − 5 = 37 possible structures for the analyte. Each result is given as rank P / rank S / P: the first number is rank P, the second is rank S (confounding structures), and the third is the probability P.
5 standards (1)            Analyte (2)   Case (3)   ORD Algorithm (4)   LCOR Algorithm (5)   ORD-LCOR Coupling (6)
PCBs 46, 52, 66, 74, 77    PCB 44        c          1 / 1 / 95%         2 / 2 / 93.2%        1 / 1 / 88.5%
PCBs 44, 52, 66, 74, 77    PCB 46        c          1 / 2 / 95%         1 / 1 / 93.2%        1 / 1 / 88.5%
PCBs 44, 46, 66, 74, 77    PCB 52        c          1 / 1 / 95%         1 / 1 / 93.2%        1 / 1 / 88.5%
PCBs 44, 46, 52, 74, 77    PCB 66        c          1 / 2 / 95%         3 / 3 / 93.2%        1 / 1 / 88.5%
PCBs 44, 46, 52, 66, 77    PCB 74        c          1 / 2 / 95%         1 / 1 / 93.2%        1 / 1 / 88.5%
PCBs 44, 46, 52, 66, 74    PCB 77        c          1 / 1 / 95%         1 / 1 / 93.2%        1 / 1 / 88.5%
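In every row of Table 2 the probability listed for the ORD-LCOR coupling equals, to one decimal place, the product of the ORD and LCOR probabilities (0.95 × 0.932 ≈ 0.885). The short Python sketch below only checks that arithmetic. Treating the coupled probability as a simple product is an assumption inferred from the tabulated numbers, not a formula stated in this section, and the function name `coupled_probability` is hypothetical.

```python
def coupled_probability(p_ord_percent, p_lcor_percent):
    """Product of two probabilities given in percent, returned in percent.

    Assumption: the coupled value is the plain product of the two
    single-algorithm probabilities (inferred from Table 2, not stated there).
    """
    return p_ord_percent * p_lcor_percent / 100.0


# Table 2 lists P(ORD) = 95% and P(LCOR) = 93.2% in every row.
p_coupled = coupled_probability(95.0, 93.2)
print(f"{p_coupled:.1f}%")  # prints 88.5%
```

Not every entry of the later tables matches this product to the last digit, so the relation is best read as a consistency check rather than a definition.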
Table 3. Results (rank P / rank S / P) of the analysis with four TeCB standards using the calculated ∆fH database. Six variants of the standards and the analyte were selected from the 30 possible. For each variant there are N = 42 − 4 = 38 possible structures for the analyte.
4 standards (1)        Analyte (2)   Case   ORD Algorithm (3)   LCOR Algorithm (4)   ORD-LCOR Coupling (5)
PCBs 44, 46, 52, 66    PCB 74        c      1 / 2 / 95%         1 / 1 / 92.1%        1 / 1 / 88.1%
PCBs 44, 46, 52, 66    PCB 77        c      1 / 1 / 97%         1 / 1 / 92.7%        1 / 1 / 89.3%
PCBs 46, 52, 66, 77    PCB 44        c      1 / 1 / 97%         2 / 2 / 92.1%        1 / 1 / 89.3%
PCBs 46, 52, 66, 77    PCB 74        c      1 / 2 / 95%         1 / 1 / 92.4%        1 / 1 / 87.8%
PCBs 46, 66, 74, 77    PCB 44        c      1 / 1 / 95%         2 / 2 / 98.2%        1 / 1 / 93.3%
PCBs 46, 66, 74, 77    PCB 52        c      1 / 1 / 95%         1 / 1 / 92.4%        1 / 1 / 87.8%
Table 4. Results (rank P / rank S / P) of the analysis with three TeCB standards using the calculated ∆fH database. Twelve variants of the standards and the analyte were selected from the 60 possible. For each variant there are N = 42 − 3 = 39 possible structures for the analyte.
3 standards (1)    Analyte (2)   Case   ORD Algorithm (3)   LCOR Algorithm (4)   ORD-LCOR Coupling (5)
PCBs 44, 46, 52    PCB 66        b      1 / 25 / 95%        6 / 6 / 90.3%        6 / 6 / 85.8%
PCBs 44, 46, 52    PCB 74        b      1 / 25 / 95%        4 / 4 / 90.7%        4 / 4 / 86.2%
PCBs 44, 46, 52    PCB 77        b      1 / 25 / 95%        2 / 2 / 89.0%        2 / 2 / 84.5%
PCBs 44, 46, 66    PCB 52        c      1 / 1 / 95%         1 / 1 / 90.3%        1 / 1 / 85.8%
PCBs 44, 46, 66    PCB 74        c      1 / 2 / 95%         1 / 1 / 98.5%        1 / 1 / 93.5%
PCBs 44, 46, 66    PCB 77        c      1 / 1 / 100%        1 / 1 / 98.6%        1 / 1 / 98.6%
PCBs 44, 74, 77    PCB 46        c      1 / 3 / 95%         1 / 1 / 98.9%        1 / 1 / 93.9%
PCBs 44, 74, 77    PCB 52        c      1 / 7 / 91%         1 / 1 / 91.0%        1 / 1 / 82.8%
PCBs 44, 74, 77    PCB 66        c      1 / 2 / 91%         3 / 3 / 97.5%        1 / 1 / 88.7%
PCBs 66, 74, 77    PCB 44        b      1 / 20 / 91%        5 / 5 / 97.5%        5 / 5 / 88.7%
PCBs 66, 74, 77    PCB 46        b      1 / 13 / 91%        3 / 3 / 97.8%        3 / 3 / 89.0%
PCBs 66, 74, 77    PCB 52        b      1 / 20 / 91%        3 / 3 / 96.5%        3 / 3 / 87.8%
Table 5. Results (rank P / rank S / P) of the analysis with two TeCB standards using the calculated ∆fH database. Twenty variants of the standards and the analyte were selected from the 60 possible. For each analytical variant there are N = 42 − 2 = 40 possible structures for the analyte.
2 standards (1)   Analyte (2)   Case   ORD Algorithm (3)   LCOR Algorithm (4)   ORD-LCOR Coupling (5)
PCBs 44, 46       PCB 52        a      1 / 1 / 91%         1 / 1 / 54.8%        1 / 1 / 49.9%
PCBs 44, 46       PCB 66        b      1 / 28 / 100%       5 / 5 / 99.1%        5 / 5 / 99.1%
PCBs 44, 46       PCB 74        b      1 / 28 / 100%       4 / 4 / 99.1%        4 / 4 / 99.1%
PCBs 44, 46       PCB 77        b      1 / 28 / 100%       2 / 2 / 99.6%        2 / 2 / 99.6%
PCBs 44, 66       PCB 46        c      1 / 3 / 100%        2 / 2 / 99.1%        1 / 1 / 99.1%
PCBs 44, 66       PCB 52        c      1 / 14 / 91%        1 / 1 / 89.2%        1 / 1 / 81.2%
PCBs 44, 66       PCB 74        c      1 / 2 / 91%         1 / 1 / 98.7%        1 / 1 / 89.8%
PCBs 44, 66       PCB 77        c      1 / 1 / 100%        2 / 2 / 98.2%        1 / 1 / 98.2%
PCBs 46, 74       PCB 44        c      1 / 1 / 100%        2 / 2 / 99.1%        1 / 1 / 99.1%
PCBs 46, 74       PCB 52        c      1 / 1 / 100%        1 / 1 / 91.2%        1 / 1 / 91.2%
PCBs 46, 74       PCB 66        c      1 / 14 / 91%        2 / 2 / 98.5%        1 / 1 / 89.6%
PCBs 46, 74       PCB 77        c      1 / 14 / 91%        1 / 1 / 99.0%        1 / 1 / 90.1%
PCBs 74, 77       PCB 44        b      1 / 20 / 91%        6 / 6 / 98.7%        6 / 6 / 89.8%
PCBs 74, 77       PCB 46        b      1 / 20 / 91%        4 / 4 / 99.0%        4 / 4 / 90.1%
PCBs 74, 77       PCB 52        b      1 / 21 / 91%        5 / 5 / 97.4%        5 / 5 / 88.6%
PCBs 74, 77       PCB 66        a      1 / 2 / 83%         11 / 11 / 51.3%      8 / 8 / 88.6%
PCBs 44, 52       PCB 46        a      1 / 2 / 91%         3 / 3 / 54.8%        3 / 3 / 49.9%
PCBs 46, 52       PCB 44        a      1 / 1 / 91%         1 / 1 / 54.8%        1 / 1 / 49.9%
PCBs 66, 74       PCB 77        a      1 / 1 / 83%         1 / 1 / 51.3%        1 / 1 / 42.6%
PCBs 66, 77       PCB 74        a      1 / 2 / 83%         1 / 1 / 51.3%        1 / 1 / 42.6%
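The variant counts quoted in the captions of Tables 2-5 (6, 30, 60 and 60 possible variants) and the candidate counts N = 42 − k can be reproduced with a few lines of Python. The sketch assumes that a "variant" is a choice of k standards out of the six available TeCBs (PCBs 44, 46, 52, 66, 74, 77) followed by a choice of one analyte among the remaining 6 − k. This reading reproduces the quoted numbers but is not spelled out in the captions, and the function names are hypothetical.

```python
from math import comb

N_TECB_CONGENERS = 42       # total in the captions' N = 42 - k
N_AVAILABLE_STANDARDS = 6   # PCBs 44, 46, 52, 66, 74, 77


def variant_count(k):
    """Number of (standard set, analyte) variants with k standards.

    Assumption: choose k of the 6 TeCBs as standards, then take one of the
    remaining 6 - k as the analyte.
    """
    return comb(N_AVAILABLE_STANDARDS, k) * (N_AVAILABLE_STANDARDS - k)


def candidate_structures(k):
    """Possible analyte structures once k congeners are fixed as standards."""
    return N_TECB_CONGENERS - k


for k in (5, 4, 3, 2):
    print(k, variant_count(k), candidate_structures(k))
```

For k = 5, 4, 3 and 2 the loop gives 6, 30, 60 and 60 variants with 37, 38, 39 and 40 candidate structures, matching the captions.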
Table 6. Averages of the analytical parameters corresponding to the ORD and LCOR algorithms and their coupling in the 3 cases of similarity of the standards and analyte, for the variants shown in Tables 2-5. The values formatted bold/grey correspond to the analytical variants recommended for the analysis of TeCBs.
Analytical parameter (1)     Case (2)   Similarity of standards (3)   Similarity of the analyte with the standards (4)   ORD Algorithm (5)   LCOR Algorithm (6)   ORD-LCOR Coupling (7)
Average relative error (%)   a          high                          high                                               1.25                5                    3.75
Average relative error (%)   b          high                          small                                              55                  7.8                  7.8
Average relative error (%)   c          various                       various                                            6.3                 1.1                  0
Average accuracy (%)         a          high                          high                                               98.75               95                   96.25
Average accuracy (%)         b          high                          small                                              45                  92.2                 92.2
Average accuracy (%)         c          various                       various                                            93.7                98.9                 100
Average selectivity (%)      a          high                          high                                               98.8                100                  100
Average selectivity (%)      b          high                          small                                              45                  100                  100
Average selectivity (%)      c          various                       various                                            94.5                100                  100
Average precision (%)        a          high                          high                                               87                  53                   54.0
Average precision (%)        b          high                          small                                              94.2                96.2                 96.2
Average precision (%)        c          various                       various                                            95.2                94.9                 90.3
Rank S interval              a          high                          high                                               1-2                 1-11                 1-8
Rank S interval              b          high                          small                                              13-28               2-6                  2-6
Rank S interval              c          various                       various                                            1-14                1-3                  1
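As a check on how Table 6 condenses Tables 2-5, the case-a entries for average precision and rank S interval can be recomputed from the six case-a variants, all of which appear in Table 5. The sketch assumes that "average precision" is the arithmetic mean of the probabilities P over the variants of a case and that the rank S interval is the minimum-to-maximum range of rank S. Neither definition is given explicitly in this section, and the names `case_a` and `summarize` are hypothetical.

```python
from statistics import mean

# Case-a variants of Table 5 as (rank P, rank S, P in %) for each algorithm,
# in table order: analytes PCB 52, 66, 46, 44, 77, 74.
case_a = {
    "ORD": [(1, 1, 91.0), (1, 2, 83.0), (1, 2, 91.0),
            (1, 1, 91.0), (1, 1, 83.0), (1, 2, 83.0)],
    "LCOR": [(1, 1, 54.8), (11, 11, 51.3), (3, 3, 54.8),
             (1, 1, 54.8), (1, 1, 51.3), (1, 1, 51.3)],
    "ORD-LCOR": [(1, 1, 49.9), (8, 8, 88.6), (3, 3, 49.9),
                 (1, 1, 49.9), (1, 1, 42.6), (1, 1, 42.6)],
}


def summarize(rows):
    """Return (mean probability in %, (min rank S, max rank S))."""
    ranks_s = [rank_s for _, rank_s, _ in rows]
    probabilities = [p for _, _, p in rows]
    return mean(probabilities), (min(ranks_s), max(ranks_s))


for name, rows in case_a.items():
    avg_p, (lo, hi) = summarize(rows)
    print(f"{name}: average precision {avg_p:.2f}%, rank S {lo}-{hi}")
```

Under these assumptions the means come out at about 87.0%, 53.1% and 53.9%, with rank S ranges of 1-2, 1-11 and 1-8, in line with the case-a rows of Table 6 (87, 53 and 54.0; 1-2, 1-11 and 1-8).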