Article
Influence of mass resolving power in orbital ion-trap mass spectrometry-based metabolomics
Lukáš Najdekr, David Friedecký, Ralf Tautenhahn, Tomáš Pluskal, Junhua Wang, Yingying Huang, and Tomáš Adam
Anal. Chem., Just Accepted Manuscript • Publication Date (Web): 03 Nov 2016. Downloaded from http://pubs.acs.org on November 3, 2016
Analytical Chemistry
Influence of mass resolving power in orbital ion-trap mass spectrometry-based metabolomics
Lukáš Najdekr(1,2), David Friedecký(1,2), Ralf Tautenhahn(3), Tomáš Pluskal(4), Junhua Wang(3), Yingying Huang(3), Tomáš Adam(1,2)
(1) Laboratory of Metabolomics, Institute of Molecular and Translational Medicine, Palacký University in Olomouc, Hněvotínská 5, 775 15 Olomouc, Czech Republic
(2) University Hospital Olomouc, I.P. Pavlova 185/6, 779 00 Olomouc, Czech Republic
(3) Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, 95134 CA, USA
(4) Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, MA 02142-1479, USA
Abstract
Modern separation methods in conjunction with high resolution accurate mass (HRAM) spectrometry can provide an enormous number of features characterized by exact mass and chromatographic behavior. Higher mass resolving power usually requires longer scanning times, and thus fewer data points are acquired across the target peak. This could cause difficulties for quantification, feature detection and deconvolution. The aim of this work was to describe the influence of mass spectrometry resolving power on profiling metabolomics experiments.
From metabolic databases (HMDB, LipidMaps, KEGG), a list of compounds (41 474) was compiled and potential adducts and isotopes were calculated (622 110 features). The number of distinguishable masses was calculated for up to 3840k resolution. To evaluate these models, human plasma samples were analyzed by LC-HRMS on an Orbitrap Elite hybrid mass spectrometer (Thermo Fisher Scientific, CA, USA) at resolving power settings of 15k (7.8 Hz) up to a maximum of 480k (1.2 Hz). The software packages XCMS 1.44, MZmine 2.13.1 and Compound Discoverer 2.0.0.303 were used for evaluation.
In plasma samples, the number of detected features increased sharply up to a resolving power of 60k in both positive and negative mode. Beyond this value, however, it either flattened out or decreased owing to technical limitations.
In conclusion, the mass resolving powers that retrieved the highest amount of information in profiling analyses of metabolite-rich biofluids on the Orbitrap Elite were around 60 000–120 000 FWHM. The region between 400 and 800 m/z was influenced the most by resolution.
Graphical Abstract
Introduction
Analysis of complex samples by modern separation methods in conjunction with high resolution accurate mass (HRAM) spectrometry can yield an enormous number of features characterized by exact mass and chromatographic behavior. High resolution mass analyzers are usually based on FT-ICR, double-focusing magnetic sectors, reflectron time-of-flight mass analyzers or ion traps. The last two techniques are predominantly used in analyses of biological samples. A resolution of several tens of thousands FWHM (full width at half maximum) with high speed data acquisition up to 100 Hz can be achieved with current time-of-flight (TOF) instruments, for which scan rate is independent of resolution. In contrast, mass spectrometers based on an orbital ion trap using fast Fourier transformation (FFT) allow a resolution of up to 500 000 FWHM (at 200 m/z) at the expense of lower acquisition rates. Hence, their higher mass resolving power usually requires longer scanning times, and consequently fewer data points are acquired across the studied peak. This could cause problems for feature detection, the deconvolution of peaks and quantification. Mass spectrometry measurement with a precision of four decimal places is crucial for molecular formula prediction. With increasing resolution, the number of compounds with apparently identical m/z (isobaric matrix interferences) decreases. In many analyses of highly complex samples (e.g., metabolomics, proteomics), the balance between speed of mass spectral acquisition and mass resolution is an issue.
Chromatographic separation of complex biological matrices is still a considerable challenge. The human serum metabolome is chemically highly variable and consists of many classes of metabolites, including lipids (e.g., glycerolipids, phospholipids), amino acids, hydroxycarboxylic acids, purines, etc. Analysis of such complex matrices is usually very difficult and requires several different separation techniques (liquid chromatography, gas chromatography, capillary electrophoresis) (1-2). Furthermore, concentration levels of metabolites can vary over six orders of magnitude. Many studies have reported that the right choice of separation methods may significantly improve the number of detected features (3-5). Despite the high efficiency and selectivity of available separation methods, HRAM spectrometry data remain crucial.
The aim of this work was to describe the relationship between mass resolving power, acquisition speed and feature detection by both theoretical and experimental methods. In the first part, we compiled a list of 41 474 metabolites available in public databases (HMDB, LipidMaps, KEGG) and calculated 622 110 potential adducts and isotopes. Values of the partition coefficient (LogP) for each metabolite were retrieved from the databases if available. The resulting lists were used for subsequent in silico calculations. In the second part of the study, human plasma was analyzed at different mass spectral resolutions and the experimental data were compared with the theoretically predicted behavior of a high resolution mass spectrometer.
Materials and Methods
Chemicals
The solvents acetonitrile, methanol and water (all LC-MS quality) and acetone (HPLC quality), as well as formic acid, were purchased from Sigma-Aldrich (St. Louis, USA).
Samples
Plasma samples from healthy volunteers were collected at the University Hospital Olomouc (Czech Republic). The samples were pooled and then stored at -80°C until analysis. Written informed consent according to the Declaration of Helsinki by the World Medical Association (WMA) was obtained from the volunteers for all samples used in the analyses.
In silico calculations
To obtain a comprehensive list of compounds known to constitute the human metabolome, a list of positively ionizable metabolites was compiled from the HMDB, LipidMaps and KEGG databases (41 474 metabolites in total after removing duplicates). All calculations were performed using R software (6) in conjunction with the package Rdisop (7-10). For each metabolite, the isotopic pattern was generated from the chemical formula. From the database-generated list, adducts ([M+H]+, [M+NH4]+, [M+Na]+, [M+K]+, [M+ACN+H]+) were calculated for the M, M+1 and M+2 isotopes (622 110 features). Mass distribution graphs for 15 000, 30 000, 60 000, 120 000, 240 000, 480 000, 960 000, 1 920 000 and 3 840 000 FWHM at 400 m/z were then plotted. The resolution of orbital ion trap based spectrometers is not constant across the mass range, so a correction was applied for each m/z (Figure S-6). By removing isobars from the metabolite list (41 474) based on m/z, a list of unique m/z values was generated (15 722). For each unique m/z in the list, the theoretical mass spectrometry peak width [m/z - x; m/z + x] was calculated, where x = (m/z) / (R * (400/(m/z))^(1/2)) and R is the resolving power setting at 400 m/z. The entire final list of 622 110 features was then searched against each interval to determine the number of features not detectable due to isobaric matrix interferences within the calculated range of each unique m/z (15 722).
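The interval search described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the function names `half_width` and `count_overlaps` are hypothetical, the count includes the unique m/z itself whenever it appears in the feature list, and the only formula used is the one given in the text.

```python
from bisect import bisect_left, bisect_right
from math import sqrt


def half_width(mz: float, resolving_power: float, ref_mz: float = 400.0) -> float:
    """Half-width x of the interval [mz - x, mz + x].

    Implements x = mz / (R * sqrt(ref_mz / mz)), where R is the resolving
    power setting at ref_mz (400 m/z in the text).
    """
    return mz / (resolving_power * sqrt(ref_mz / mz))


def count_overlaps(unique_mz, all_features, resolving_power):
    """For each unique m/z, count the features that fall inside its interval."""
    feats = sorted(all_features)
    counts = {}
    for mz in unique_mz:
        x = half_width(mz, resolving_power)
        lo = bisect_left(feats, mz - x)   # first feature >= mz - x
        hi = bisect_right(feats, mz + x)  # one past the last feature <= mz + x
        counts[mz] = hi - lo
    return counts


# Toy data (made up for illustration, not taken from the study)
features = [400.000, 400.005, 400.010, 401.000]
print(count_overlaps([400.000], features, 15_000))
print(count_overlaps([400.000], features, 60_000))
```

Sorting the feature list once and using binary search keeps each lookup logarithmic, which matters when several hundred thousand features are searched against thousands of intervals.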
The influence of resolution on the number of detected peaks was calculated for m/z up to 2000. The list of generated in silico features (622 110) was filtered to give unique m/z values (227 060). The first value from the list of unique m/z was taken and the peak width based on resolution and its m/z was calculated. All m/z values lying within the peak width were grouped and removed from the list. The final number of groups was considered to be the number of peaks detectable in the mass spectrum for the given resolution and mass range.
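The grouping procedure above amounts to a greedy pass over the unique m/z values. The sketch below assumes the list is processed in ascending order, so that only values above the current seed can still be in the list; the text does not state the ordering, and the function names are hypothetical.

```python
from math import sqrt


def peak_half_width(mz: float, resolving_power: float, ref_mz: float = 400.0) -> float:
    """Half-width of the peak at mz for a resolving power defined at ref_mz."""
    return mz / (resolving_power * sqrt(ref_mz / mz))


def count_detectable_peaks(mz_values, resolving_power, max_mz: float = 2000.0) -> int:
    """Greedy grouping: take the first remaining m/z, absorb every value that
    lies within its peak width, count one group, and repeat."""
    remaining = sorted({m for m in mz_values if m <= max_mz})
    groups = 0
    i = 0
    while i < len(remaining):
        seed = remaining[i]
        x = peak_half_width(seed, resolving_power)
        j = i + 1
        # Values are sorted, so only the upper bound needs checking.
        while j < len(remaining) and remaining[j] <= seed + x:
            j += 1
        groups += 1
        i = j  # everything in [i, j) was grouped and removed
    return groups


# Toy data (made up for illustration)
values = [400.000, 400.005, 400.010, 400.030]
for r in (15_000, 60_000):
    print(r, count_detectable_peaks(values, r))
```

Because the peak width is computed from the seed value only, the result depends on the processing order; a different ordering assumption could give slightly different group counts.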
Sample preparation and LC-MS method
Samples were prepared using a method modified from Yuan et al. (11). A pooled human plasma sample (500 µL) was deproteinized with a mixture of acetonitrile, acetone and methanol (v/v 1:1:1, 1500 µL, -80°C), vortex mixed and incubated overnight at -80°C. Samples were centrifuged (24 400 x g, 15 min, 4°C), freeze-dried and re-suspended in 1 mL of 10% methanol and 90% water. Before analysis, samples were centrifuged again in order to remove debris and other solid particles.
The LC method followed that of Wang, J. et al. (12) using a Dionex UltiMate 3000 Rapid Separation LC system (Thermo Fisher Scientific, MA, USA). Samples were analyzed on an Acquity UPLC BEH C18, 2.1 x 100 mm, 1.7 µm column (Waters, MA, USA). The mobile phase consisted of water with 0.1% formic acid (mobile phase A) and methanol with 0.1% formic acid (mobile phase B). A flow rate of 0.35 mL/min was used with the following elution gradient: t=0.0, 0.5% B; t=4.0, 70% B; t=4.5, 98% B; t=10.4, 98% B; t=10.6, 0.5% B; t=15.0 min, 0.5% B. The column temperature was set at 40°C and the injection volume was 2 µL. Peaks in the retention window from 1 to 15 minutes were chosen for data processing. The same LC method was used for both ionization modes (13).
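As a small illustration, the gradient table above can be encoded as (time, %B) breakpoints and queried at any time point. The sketch assumes linear ramps between consecutive breakpoints, which the text does not state explicitly; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % mobile phase B), copied from the gradient in the text
GRADIENT = [(0.0, 0.5), (4.0, 70.0), (4.5, 98.0), (10.4, 98.0), (10.6, 0.5), (15.0, 0.5)]


def percent_b(t_min: float) -> float:
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


for t in (0.0, 2.0, 7.0, 15.0):
    print(t, percent_b(t))
```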
An Orbitrap Elite hybrid mass spectrometer (Thermo Fisher Scientific, MA, USA) was operated in either positive or negative mode at 15 000 (transient = 24 ms; 7.8 Hz), 30 000 (transient = 48 ms; 7.7 Hz), 60 000 (transient = 96 ms; 6.9 Hz), 120 000 (transient = 192 ms; 4 Hz), 240 000 (transient = 384 ms; 2.3 Hz) and 480 000 (transient = 768 ms; 1.2 Hz) FWHM at 400 m/z over the ranges 70–500 m/z and 300–2000 m/z (acquisition at 480 000 FWHM was possible owing to the use of a Tune Plus Developer's Kit, kindly provided by Thermo Fisher Scientific, MA, USA). Two mass range regions were chosen in order to increase sensitivity and ensure one scan per spectrum (according to the Mathieu equation). To eliminate variance due to data acquisition, analyses of plasma samples were performed in sextuplicate for each mass spectrometry resolution. Settings of the electrospray ionization were as follows: heater temperature 250°C; sheath gas 35 arbitrary units; auxiliary gas 15 arbitrary units; capillary temperature 300°C and source voltage +3.0 kV. Thermo Tune Plus 2.7.0.1103 SP1 was used as instrument control software and data were acquired in centroid mode using Thermo Xcalibur 2.2 SP1.48 software (Thermo Fisher Scientific, MA, USA).
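To make the trade-off between resolving power and sampling concrete, the reported scan rates can be converted into data points across a chromatographic peak. The sketch below uses the scan rates listed above; the 3 s peak width is a hypothetical value chosen for illustration and is not taken from the study, and the function names are likewise assumptions.

```python
from math import floor

# Resolving power (FWHM at 400 m/z) -> scan rate in Hz, as reported in the text
SCAN_RATE_HZ = {
    15_000: 7.8,
    30_000: 7.7,
    60_000: 6.9,
    120_000: 4.0,
    240_000: 2.3,
    480_000: 1.2,
}


def points_per_peak(peak_width_s: float, scan_rate_hz: float) -> int:
    """Whole number of scans that fit across a peak of the given base width."""
    return floor(peak_width_s * scan_rate_hz)


def min_peak_width_s(min_points: int, scan_rate_hz: float) -> float:
    """Narrowest peak (in seconds) that still receives min_points scans."""
    return min_points / scan_rate_hz


ASSUMED_PEAK_WIDTH_S = 3.0  # hypothetical peak width, for illustration only
for resolving_power, rate in SCAN_RATE_HZ.items():
    print(resolving_power, points_per_peak(ASSUMED_PEAK_WIDTH_S, rate))
```

With these assumptions, a narrow peak that is sampled more than twenty times at 15 000 FWHM receives only a handful of points at 480 000 FWHM, which is the effect the peak detection results later depend on.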
LC-MS data processing
The acquired dataset from the plasma samples was processed using the three most frequently used software packages, which are based on different feature detection algorithms, i.e., XCMS 1.44 (in the R software environment), Compound Discoverer 2.0.0.303 and MZmine 2.13.1. The centWave algorithm in XCMS was used to detect regions of interest (ROI) around a particular m/z value. The Continuous Wavelet Transform (CWT) was applied to the intensity values of the ROI and local maxima in the CWT coefficients for each scale were determined (14). Peak detection algorithms are mainly influenced by the parameters ppm mass error (ppm) and signal-to-noise ratio (snthresh). Various values of these parameters were tested (ppm = 2, 4, 6, 8, 10, 12, 14, 16, 18, 20; snthresh = 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30) and, after detailed study of the results, "ppm = 8" and "snthresh = 20" were chosen as the best settings in order to obtain fewer false positive features (noisy features). These findings correspond to the recently published work by Glauser et al. (15). For details of the settings for each peak detection algorithm, peak grouping and retention time correction methods, see the Supporting Information. After processing, the number of peaks was counted for each data file individually. In the case of XCMS and MZmine, no deisotoping module or package was used.
Retention time correction in each software was performed for individual sextuplicates. The processed lists of features for the ranges 70–500 m/z and 300–2000 m/z for each resolving power were merged at m/z 400 in order to obtain the number of features in the spectra per resolving power. The coefficient of variation (CV) was calculated from the detected peak areas across six replicate injections. Peaks with CV > 30% were considered as noise and removed from further calculations.
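The CV filter described above can be sketched as follows. This is an illustration under stated assumptions: the sample standard deviation (n - 1 denominator) is used because the text does not specify which estimator was applied, and the function names and toy peak areas are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(areas) -> float:
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    m = mean(areas)
    if m == 0:
        return float("inf")  # treat an all-zero feature as maximally variable
    return 100.0 * stdev(areas) / m


def filter_by_cv(features: dict, max_cv: float = 30.0) -> dict:
    """Keep features whose replicate peak areas have CV <= max_cv percent."""
    return {name: areas for name, areas in features.items()
            if cv_percent(areas) <= max_cv}


# Toy peak areas across six replicate injections (made up for illustration)
features = {
    "stable_feature": [100, 105, 95, 102, 98, 100],
    "noisy_feature": [10, 200, 5, 150, 20, 300],
}
print(sorted(filter_by_cv(features)))
```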
Results and Discussion
In this work, we employed both theoretical and experimental methods to investigate the relation between mass spectrometry resolving power, scan speed and capability of feature detection in a metabolomics study.
The effect of separation could not be included in the calculations owing to the unpredictable behavior of compounds during separation (e.g., lipids with very similar exact mass but different formula and chromatographic behavior) and extensive variability of chromatographic methods. Thus, the presented calculations are only valid for flow injection analysis metabolomics experiments and "worst case scenarios" in separation methods.
In silico calculations
In silico calculations were performed to examine distributions of overlaps of m/z representing metabolites, isotopes and adducts (622 110) over the range 50–2000 m/z. The first step was filtering the combined metabolite list to identify unique m/z values. These unique m/z values are plotted on the X axis in Figure 1, whereas the Y axis shows the number of m/z values that lie within the interval [m/z - x; m/z + x], as described in the Materials and Methods. Hence, the coordinates of each dot shown in Figure 1 represent a unique m/z value (X axis) and the number of features that are apparently identical at a given resolution and not recognizable within the curve of a mass spectrometry peak with a Gaussian profile (Y axis). The variance (sigma squared) of the mass spectral peak decreases as the resolving power increases. Thus, the number of indistinguishable features decreased with increasing resolving power. Two major regions with the highest number of m/z overlaps can be seen in Figure 1. The first one is the region between 400 m/z and 600 m/z, which corresponds to short peptides (di-, tri-, tetra-), secosteroids and partially to lipids with lower m/z (e.g., glycerophosphocholines, glycerophosphoethanolamines, long chain fatty acids). The second region lies between 750 m/z and 1050 m/z and corresponds mainly to lipids. The three colored lines shown in Figure 1 indicate three different quantiles (0.99; 0.75; 0.50) of the dot density distribution.
Figure 2 depicts the maximum number and median of the calculated overlapping m/z values at a particular resolving power. The number of m/z masked by isobaric matrix interferences decreased according to a power function with a limit at one. Above a resolving power of 240 000 FWHM, the maximum number of indistinguishable features did not decrease. Evaluation of the structure of the data revealed that this was caused by isobaric compounds with high structural diversity. For example, m/z 244.1549 corresponds to an M+K ion of mass 205.1951 (C15H24), which applies to a group of sesquiterpenes and prenols with 130 possible overlaps. The other most abundant feature overlaps (m/z 205.1956, 298.2746, 322.2746, 450.3219) can mostly be attributed to various lipid classes and adducts corresponding to those lipids (see Figure 1). The overall abundance of lipids in the compiled metabolite list is 37.23% (Fatty acyls: 6.28%; Glycerolipids: 9.35%; Glycerophospholipids: 10.65%; Polyketides: 3.46%; Prenol lipids: 1.85%; Saccharolipids: 0.03%; Sphingolipids: 1.34%; Sterol lipids: 4.27%). The median values (dashed line) show that even with very high resolving power, it is not possible to separate all the features fully. At a resolving power of 3 840 000 FWHM, a maximum of 35.2% of features were represented by a specific m/z with no overlaps, whereas for a typical resolving power of 60 000 FWHM, only 3.63% of features could be separated.
Individual increments of features generated in silico were compared across m/z ranges (Figure 3). In the range 0 to 600 m/z (Figure 3a), the ratio increased markedly from 1.47 (0–100 m/z) up to 23.11 (501–600 m/z). In the range 600–1400 m/z, the opposite trend was observed, from 14.43 to 2.80 (Figure 3b). The curves in Figure 3c show similar trends in the range 1400–2000 m/z (ratios 2.85 to 3.67) but a different dependence than observed at lower m/z because the data plateaued at high resolution above 960 000 FWHM. Thus, in these theoretical calculations, a resolution of millions FWHM still had an effect on the calculated number of detectable unique masses.
LC-MS data
We analyzed plasma samples at different resolutions up to 480 000 FWHM to investigate the influence of resolution on the number of detected features. The analysis lasted 15 minutes with gradient elution and a peak capacity P=167 (N = 90 000 – 576 000 N/m). Total ion chromatograms and extracted ion chromatograms of selected isomeric compounds are provided in the Supporting Information (Figure S-1, Figure S-3). Three different software packages were used for processing the LC-MS data (data shown in Figure 4). XCMS, MZmine and Compound Discoverer yielded similar trends, i.e., a sharp increase in the number of detected features with a maximum at 60 000 FWHM in both positive and negative mode (120 000 FWHM for Compound Discoverer in positive mode). When all peaks are considered, regardless of CV, the trends peak at 120 000 FWHM in positive and 60 000 FWHM in negative mode. This finding suggests that many noise peaks are detected during peak picking at a resolution of 120 000 FWHM in positive mode (see Supporting Information Figures S-7, S-8 and S-9). Each software is capable of producing different types of lists of features. XCMS detects all features without any further filtering, referred to as "raw features" (for isotope and/or adduct grouping, other software modules should be used). MZmine is capable of identifying "raw features" or, if the deisotoping module is applied, "isotopic features". Thus, for ready comparison, XCMS raw feature and MZmine raw feature lists were used to generate the plots shown in Figure 4. The "Unknown Detector" module in Compound Discoverer is capable of detecting features only with the minimum number of isotopes set to one or more, generating an "isotopic feature" list. Numbers obtained by Compound Discoverer (Figure 4C) represent the sum of compounds present in the mass spectrum as several different ion species grouped as one (grouped isotopes and adducts). For the abovementioned reasons, the absolute numbers in Figure 4 are not strictly comparable and only the trends should be considered. All the software packages yielded an approximately five times higher number of features for the positive mode compared to the negative mode. This observation may originate from the fact that plasma metabolites are predominantly ionized in positive mode. The physical-chemical properties of the compounds and mobile phase composition may also contribute to this observed phenomenon (16). Owing to the lower number of features observed in negative mode, higher resolution is less crucial there.
In plasma samples in positive mode at 60 000 FWHM, 6778 features (MZmine, Figure 4B) were detected; when all peaks were considered regardless of CV, 10 168 features were detected at 120 000 FWHM (MZmine, Supporting Information Figure S-8a). The error bars at higher resolutions probably result from a larger number of individual ion signals, which presents a challenge for the peak detection algorithms. In contrast, the number of features revealed by the in silico calculations at 60 000 FWHM was 49 529 (Figure S-4). Although both analyses took account of metabolites, isotopes and the most common adducts, the number of features for the plasma samples should in theory be even higher because it includes fragments, noise features and other features possibly generated by the electrospray ionization. The discrepancy in the number of features may have several causes. A large number of compounds listed in the databases are present in biological samples at concentrations below the limit of detection for current profiling methods and non-targeted metabolite extraction (e.g., hormones, neurotransmitters), and the databases also contain exogenous compounds (drugs, food metabolites, xenobiotics, etc.). Other compounds may be chemically and/or biologically unstable, and thus lost. Poor ionizability of certain compound classes may also decrease the number of detected features. Another limitation is that some compounds are not retained or are trapped on the column, and are thus undetectable. Further, isobars may show unpredictable behavior under the given separation modes (e.g., reverse phase, aqueous normal phase, HILIC).
Figure 5 presents histograms of m/z values (by XCMS) from plasma samples showing the distribution of individual data points in Figure 4. The overall trend in the curves mostly follows that observed in the in silico calculations (Figure 1). The region 300–800 m/z showed a strong dependence on resolution in positive mode (Figure 5a). In contrast, the region 800–1400 m/z showed almost the same number of features for 60 000 and 120 000 FWHM (Figure 5a), suggesting less need for high resolution in this region. The resolving power of orbital ion trap analyzers is not constant across the m/z range (Figure S-6). This effect results in lower resolving power in regions with higher m/z values and thus fewer detected features. In the negative ionization mode (Figure 5b), all curves at resolutions from 15 000 to 120 000 FWHM showed similar profiles. The number of detected features with m/z above 400 at resolutions of 240 000 and 480 000 FWHM was significantly decreased due to insufficient scan frequency (data points). This issue may be overcome by using an Orbitrap mass spectrometer capable of higher scanning speed.
In mass spectrometers based on an orbital ion trap, a high resolving power is achieved by using longer acquisition of ions in the trap (17), thus lowering the frequency of data points (Figure S-2). It is generally accepted that there should be a minimum of four points per peak for automatic feature detection algorithms (14). In order to minimize the influence of this parameter, we conducted an experiment where the minimum number of data points per peak was set to 3 (centWave). Regardless of the resolving power, a higher number of features was detected (Figure S-5). However, detailed inspection of the data revealed that most of the reported peaks were false positive hits. This may suggest that 60 000–120 000 FWHM is a good compromise in terms of resolution and scan speed for metabolomics on the mass spectrometer used in this study.
Although very high resolution may not be suitable for general untargeted metabolomics experiments, it can be very useful for defining isotopic distributions and determining elemental composition.
Our compiled list covered metabolites present in a given biological system without taking into account differences in tissue/biofluid distribution. It also contains exogenous compounds (drugs, xenobiotics, food and plant metabolites), which may be present to varying degrees in biological samples depending on their nature. The in silico calculations in this study were focused on human plasma and it would be interesting to see their application in plant metabolomics, where many metabolites are preferably ionized in negative mode. A different scenario may also appear in lipidomics or glycomics, which are heavily influenced by the high number of structural isomers.
Conclusion
The aim of this work was to address theoretically and experimentally the relation between mass spectrometry resolution and capability of feature detection in a metabolomics experiment. In silico calculations showed that with increasing resolution, more features can be detected (limited by the maximum number of features possible for the particular biological matrix). LC-MS data showed that the best resolution was 60 000–120 000 FWHM in positive and 60 000 FWHM in negative ionization mode for ESI. Our findings thus suggest that in current metabolomic studies, a resolution of at least 60 000 FWHM is necessary to retrieve the highest amount of information.
Funding:
The infrastructural part of this project (Institute of Molecular and Translational Medicine) was supported by NPU I (LO1304) and Czech Science Foundation Grant 15-34613L. Tomáš Pluskal is a Simons Foundation Fellow of the Helen Hay Whitney Foundation.
References
310
(1)
Kuehnbaum, N. L.; Britz-Mckibbin, P. Chem. Rev. 2013, 113, 2437–2468.
311 312 313 314 315
(2)
Psychogios, N.; Hau, D. D.; Peng, J.; Guo, A. C.; Mandal, R.; Bouatra, S.; Sinelnikov, I.; Krishnamurthy, R.; Eisner, R.; Gautam, B.; Young, N.; Xia, J.; Knox, C.; Dong, E.; Huang, P.; Hollander, Z.; Pedersen, T. L.; Smith, S. R.; Bamforth, F.; Greiner, R.; McManus, B.; Newman, J. W.; Goodfriend, T.; Wishart, D. S. PLoS One 2011, 6, e16957.
316
(3)
Contrepois, K.; Jiang, L.; Snyder, M. Mol. Cell. Proteomics 2015, 14, 1684–1695.
317 318
(4)
Zhang, T.; Creek, D. J.; Barrett, M. P.; Blackburn, G.; Watson, D. G. Anal. Chem. 2012, 84, 1994–2001.
319 320
(5)
Zhang, R.; Watson, D. G.; Wang, L.; Westrop, G. D.; Coombs, G. H.; Zhang, T. J. Chromatogr. A 2014, 1362, 168–179.
321 322 323
(6)
R Core Team (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.Rproject.org/
324
(7)
Böcker, S.; Letzel, M. C.; Lipták, Z.; Pervukhin, A. Bioinformatics 2009, 25, 218–224. 14 ACS Paragon Plus Environment
(8) Böcker, S.; Lipták, Z.; Martin, M.; Pervukhin, A.; Sudek, H. Bioinformatics 2008, 24, 591–593.
(9) Böcker, S.; Letzel, M.; Lipták, Z.; Pervukhin, A. Proc. Work. Algorithms Bioinforma. (WABI 2006) 2006, 4175, 12–23.
(10) Böcker, S.; Lipták, Z. Algorithmica (New York) 2007, 48, 413–432.
(11) Yuan, M.; Breitkopf, S. B.; Yang, X.; Asara, J. M. Nat. Protoc. 2012, 7, 872–881.
(12) Wang, J.; Christison, T. T.; Misuno, K.; Lopez, L.; Huhmer, A. F.; Huang, Y.; Hu, S. Anal. Chem. 2014, 86, 5116–5124.
(13) Dunn, W. B.; Broadhurst, D.; Begley, P.; Zelena, E.; Francis-McIntyre, S.; Anderson, N.; Brown, M.; Knowles, J. D.; Halsall, A.; Haselden, J. N.; Nicholls, A. W.; Wilson, I. D.; Kell, D. B.; Goodacre, R. Nat. Protoc. 2011, 6, 1060–1083.
(14) Tautenhahn, R.; Bottcher, C.; Neumann, S. BMC Bioinformatics 2008, 9, 504.
(15) Glauser, G.; Grund, B.; Gassner, A.-L.; Menin, L.; Henry, H.; Bromirski, M.; Schutz, F.; McMullen, J.; Rochat, B. Anal. Chem. 2016, acs.analchem.5b04689.
(16) Cech, N. B.; Enke, C. G. Mass Spectrom. Rev. 2002, 20, 362–387.
(17) Zubarev, R. A.; Makarov, A. Anal. Chem. 2013, 85, 5288–5296.
Figure 1: In silico calculation of mass distribution at resolutions of 15 000 (A), 120 000 (B) and 960 000 (C) FWHM. The X axis shows the number of unique m/z values filtered from the compiled list of metabolites, whereas the Y axis shows the number of m/z values that fit into the interval [m/z - x; m/z + x], where x is based on the resolution. The lines denote different quantiles (from the top: 0.99, 0.75 and 0.50, respectively). The colors of the individual dots indicate polarity: green = polar, red = non-polar (based on their logP value, octanol/water).
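
The counting described in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the authors' script: the function names are hypothetical, the half-width x is assumed to equal one FWHM (m/z divided by the resolving power), and the resolving power is assumed to be specified at m/z 200 and to fall off with the square root of m/z, as is commonly stated for orbital ion traps. The caption itself only says that x is "based on the resolution".

```python
from bisect import bisect_left, bisect_right
from math import sqrt


def window_half_width(mz, resolution, ref_mz=200.0, orbitrap_scaling=True):
    """Half-width x of the interval [m/z - x, m/z + x].

    Assumption: x equals one FWHM, i.e. m/z divided by the resolving power.
    Assumption: the nominal resolution applies at ref_mz and scales with
    sqrt(ref_mz / mz); set orbitrap_scaling=False for a constant value.
    """
    r = resolution * sqrt(ref_mz / mz) if orbitrap_scaling else resolution
    return mz / r


def count_neighbours(mz_values, resolution, ref_mz=200.0, orbitrap_scaling=True):
    """For each m/z, count the listed values inside its window (itself included)."""
    mzs = sorted(mz_values)
    counts = []
    for mz in mzs:
        x = window_half_width(mz, resolution, ref_mz, orbitrap_scaling)
        lo = bisect_left(mzs, mz - x)   # first index with value >= mz - x
        hi = bisect_right(mzs, mz + x)  # first index with value > mz + x
        counts.append(hi - lo)
    return mzs, counts


if __name__ == "__main__":
    toy_list = [200.0000, 200.0010, 200.0200, 400.0]  # illustrative values only
    for res in (15_000, 120_000, 960_000):
        print(res, count_neighbours(toy_list, res)[1])
```

A count of 1 means that the m/z value has no neighbour inside its window; larger counts correspond to values that would overlap at that resolution under the stated assumptions.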
Figure 2: Regression of overlapping features based on resolution (in silico calculation). The full line represents the maximum value of indistinguishable features according to the resolving power. The dashed line shows the median value of indistinguishable features in the list of m/z for each resolving power. The percentage of m/z values represented in mass spectra by a single value is shown by the red line.
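
The three quantities plotted in Figure 2 can be derived from per-m/z overlap counts. The following Python sketch is a hedged illustration only: `summarize_overlaps` is a hypothetical helper, and it assumes that each count includes the m/z value itself, so that a count of 1 means "represented by a single value".

```python
from statistics import median


def summarize_overlaps(counts):
    """Summarize overlap counts obtained at one resolving power.

    counts[i] is the number of listed m/z values that fall inside the
    window of value i, the value itself included (an assumption of this sketch).
    """
    if not counts:
        raise ValueError("counts must not be empty")
    single = sum(1 for c in counts if c == 1)
    return {
        "max": max(counts),                             # full line
        "median": median(counts),                       # dashed line
        "percent_single": 100.0 * single / len(counts), # red line
    }


if __name__ == "__main__":
    print(summarize_overlaps([2, 2, 1, 1]))  # toy counts, not data from the study
```

Repeating the summary for each resolving power gives one point per curve.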
Figure 3: Relative increase of detected features from in silico calculations divided into 100 m/z bins. The Y axis represents the ratio of detected features at the specific resolution standardized by the value for 15 000 FWHM. The X axis represents the resolution from 15 000 to 3 840 000 FWHM.
Figure 4: Number of detected features in plasma samples. Each panel shows results from a different software tool in both positive (yellow line) and negative (blue line) mode: A) XCMS (raw features), B) MZmine (raw features), C) Compound list (features grouped as compounds).
Figure 5: Histograms of m/z values in plasma samples in positive (A) and negative (B) mode (XCMS). Each point on the lines represents the frequency of m/z values within a 50 Da window. Numbers in the legend show resolution and scan speed, respectively.
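
The binning behind Figure 5 can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' plotting code: `mz_histogram` is a hypothetical name, and the "50 Da window" is interpreted here as fixed, non-overlapping bins starting at 0 (a sliding window would be an equally plausible reading).

```python
from collections import Counter


def mz_histogram(mz_values, bin_width=50.0):
    """Count m/z values per fixed-width bin; keys are the lower bin edges."""
    bins = Counter(int(mz // bin_width) for mz in mz_values)
    return {index * bin_width: n for index, n in sorted(bins.items())}


if __name__ == "__main__":
    toy_features = [75.1, 99.9, 100.0, 149.9, 420.5]  # illustrative values only
    for edge, n in mz_histogram(toy_features).items():
        print(f"{edge:.0f}-{edge + 50:.0f} Da: {n}")
```

One such histogram per acquisition setting (resolution and scan speed) would correspond to one line in the figure.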
Keywords: untargeted metabolomics, Orbitrap, high resolution, peak-picking, resolving power, mass spectrometry
Abbreviations:
CV: Coefficient of variation
HRAM: High resolution accurate mass
HMDB: Human Metabolome Database
KEGG: Kyoto Encyclopedia of Genes and Genomes
LC: Liquid chromatography
LC-HRMS: Liquid chromatography-high resolution mass spectrometry
LC-MS: Liquid chromatography-mass spectrometry
HPLC: High performance liquid chromatography
FWHM: Full width at half maximum
FFT: Fast Fourier transform
TOF: Time-of-flight
WMA: World Medical Association
CWT: Continuous wavelet transform
ROI: Regions of interest
Table of contents:
Sample preparation, page S-3
Peak detection algorithm settings, page S-3
Figure S-1: Total ion chromatograms, page S-5
Figure S-2: Lower number of datapoints, page S-7
Figure S-3: Separation of isomeric compounds on the column, page S-9
Figure S-4: Number of detectable compounds based on the list of unique masses, page S-10
Figure S-5: XCMS peak picking with 3 points/peak, page S-11
Figure S-6: Dependency of m/z and resolution in Orbitrap based mass spectrometers, page S-12
Figure S-7: All detected features by Compound Discoverer, page S-13
Figure S-8: All detected features by MZmine, page S-14
Figure S-9: All detected features by XCMS, page S-15
Graphical abstract