Article
Accelerating Lipidomic Method Development through in silico Simulation Paul D Hutchins, Jason D Russell, and Joshua J. Coon Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.9b01234 • Publication Date (Web): 12 Jul 2019 Downloaded from pubs.acs.org on July 16, 2019
Analytical Chemistry
Accelerating Lipidomic Method Development through in silico Simulation
Paul D. Hutchins1,3, Jason D. Russell2,3, and Joshua J. Coon1,2,3,4,*
1Department of Chemistry, University of Wisconsin–Madison, Madison, WI 53706, USA
2Morgridge Institute for Research, Madison, WI 53715, USA
3Genome Center of Wisconsin, Madison, WI 53706, USA
4Department of Biomolecular Chemistry, University of Wisconsin–Madison, Madison, WI 53706, USA
* Corresponding author: [email protected]
ABSTRACT
Judicious selection of mass spectrometry (MS) acquisition parameters is essential for effectively profiling the broad diversity and dynamic range of biomolecules. Typically, acquisition parameters are individually optimized to maximally characterize analytes from each new sample matrix. This time-consuming process often ignores the synergistic relationship between MS method parameters, producing sub-optimal results. Here we detail the creation of an algorithm which accurately simulates LC-MS/MS lipidomic data acquisition performance for a benchtop quadrupole-Orbitrap MS system. By coupling this simulation tool with a genetic algorithm for constrained parameter optimization, we demonstrate the efficient identification of LC-MS/MS method parameter sets individually suited for specific sample matrices. Finally, we utilize the in silico simulation to examine how continued developments in MS acquisition speed and sensitivity will further increase the power of MS lipidomics as a vital tool for impactful biochemical analysis.
INTRODUCTION
Continued improvements to mass spectrometer (MS) acquisition speed and sensitivity over the last two decades have reinforced its role as the premier bioanalytical tool for unbiased compound identification and quantitation.1,2 These advances in MS hardware and data acquisition strategies now permit the rapid profiling of thousands of biomolecules from complex matrices3–5 and have opened the door to the system-wide characterization of biological processes.6,7 To translate these technological advances into improved biomolecular measurements, proper selection of instrumental operating parameters through rigorous method development is still required.8
Traditionally, the optimization of acquisition settings for new instrument platforms or sample types involves the sequential iteration of select parameters to improve various method performance metrics (e.g., number of identifications and run-to-run overlap).9–12 This pragmatic approach is necessary to ensure sufficient instrument performance within a reasonable timeframe as thousands of potential method parameter combinations exist.13 Although practical, the success of this iterative process heavily depends on user expertise and the time allotted for method optimization.
Several approaches have been developed to expedite the MS method development cycle including statistical design of experiment (DOE) methods,8,14 derivation of vital method parameters,15 and software for method performance extraction.16 Although each of these approaches seeks to rapidly uncover interdependent relationships between MS parameters and ensure optimal performance, these methods still require numerous LC-MS/MS experiments. For the field of chromatographic separations, tools exist which accelerate method development by simulating the separation of mixtures in silico.17 These tools model chromatographic systems with sufficient accuracy to predict the optimal operating conditions for theoretical compound mixtures. Such computational simulation is highly attractive as it performs rapid optimization on new sample types with minimal a priori knowledge of the sample’s chromatographic behavior. Although powerful, in silico simulation has not been developed for bioanalytical MS method optimization likely due, in part, to the complexity of ion manipulation, the growing array of data acquisition techniques, and the diverse applications of LC-MS/MS analysis.
For the emerging fields of discovery lipidomic analysis via LC-MS/MS, the use of different instrumental methods18 leads, in part, to a wide range of analytical performance.19 To empower the rapid selection of optimal lipidomic MS parameters, we describe an algorithm which simulates LC-MS/MS data-dependent acquisition (DDA) of complex lipid extracts on a quadrupole-Orbitrap MS platform (Q Exactive HF). By focusing on a single instrument platform and biomolecule class, we derive equations which accurately model the ion flux, ion transmission, quadrupole isolation, fragmentation efficiency, and spectral noise under various method parameter combinations. From several experimental inputs and user-supplied method settings, the algorithm rapidly simulates data-dependent MS analysis of the sample and predicts the number of lipid identifications for a given experiment. After outlining the creation and validation of the simulation, we couple the in silico model to a modified genetic algorithm to efficiently explore the method parameter space for several sample types and return near-optimal method parameter sets. Finally, we model the effect continued improvements to MS acquisition speed and sensitivity will have on lipidomic LC-MS/MS analysis and its continued growth into a mature -omics technology.
EXPERIMENTAL SECTION
Materials and Reagents
The lipid reference standard mixture was purchased from Avanti Polar Lipids (Differential Ion Mobility System Suitability Lipidomix, Alabaster, AL). Trifluoroacetic acid (TFA), acetonitrile (ACN), and 1.0 M NaOH were purchased from Fisher Scientific (Hampton, NH). Acetonitrile/aqueous sodium trifluoroacetate solution (1:1 ACN/0.1% TFA, pH 3.5) was prepared according to Moini et al.20 Pierce negative ion calibration solution (Calmix) was purchased from Thermo Scientific (Rockford, IL). Lipids were extracted from HAP1 and yeast cells using a CHCl3/MeOH/H2O (6:3:2, v/v/v) solvent system, an aliquot of the organic layer was dried under vacuum, and the dried lipid material was resuspended in ACN/IPA/H2O (65:30:5, v/v/v). For mouse cecum and plasma samples, lipids were extracted using an MTBE/MeOH/H2O (10:3:2.5, v/v/v) solvent system, an aliquot of the organic layer was dried under vacuum, and the dried lipid material was resuspended in MeOH/toluene (9:1). Further sample preparation details are provided in Supplemental Methods.
LC-MS/MS Acquisition
Lipid extracts were chromatographically separated using an Acquity CSH C18 column (50 °C, 2.1 × 100 mm, 1.7 µm particle size; Waters Corporation) coupled to a Vanquish Binary Pump (Thermo Scientific). Mobile phase A consisted of 10 mM ammonium acetate in 70:30 (v/v) ACN/H2O with 250 μL/L acetic acid and mobile phase B consisted of 10 mM ammonium acetate in 90:10 (v/v) IPA/ACN with 250 μL/L acetic acid. Ten microliters of the lipid extracts were injected onto the column using the Vanquish autosampler (Thermo Scientific), ionized using a heated ESI (HESI) source (Thermo Scientific), and analyzed on the Q Exactive HF platform (Thermo Scientific). LC-MS/MS data were acquired using alternating positive and negative scan cycles (i.e., polarity switching) consisting of a single MS scan followed by multiple data-dependent MS/MS scans. Complete chromatography and MS acquisition parameters are supplied in Table S1.
LC-MS/MS Data Processing
All lipid identification from LC-MS/MS data was performed using LipiDex (v1.0.2).20 Briefly, MS/MS scans were extracted from each LC-MS/MS experiment and converted to the Mascot Generic Format (MGF) using ProteoWizard (ProteoWizard Software Foundation, v3.0).21 The resulting MS/MS spectra were searched against the LipiDex HCD Acetate library in LipiDex to generate putative lipid identifications based on spectral similarity to in silico lipid fragmentation spectra. Retention time alignment and chromatographic feature finding were performed using Compound Discoverer (Thermo Scientific, v2.1). Putative MS/MS identifications were then assigned to chromatographic features and filtered in LipiDex to generate final lipid identifications. Complete data processing parameters are provided in Tables S2 and S3.
RESULTS AND DISCUSSION
Simulating Data-Dependent Acquisition
During DDA, precursor ions are selected for MS/MS analysis based on their relative intensity from a survey MS scan (Figure 1A). Once sampled, each unique ion population is often excluded from subsequent re-analysis for a pre-determined period to reduce redundant MS/MS sampling. Although remarkably effective, this straightforward data acquisition strategy can suffer from poor run-to-run identification overlap and depth of biomolecule identification due to DDA’s semi-stochastic MS/MS sampling which targets the most abundant MS1 features.22,23 Although the acquisition speed and sensitivity of modern mass spectrometers have significantly increased the depth and reproducibility of DDA methods,24 proper tuning of instrument parameters is still required to harness these improvements.
With this goal in mind, we developed a software tool which simulates the lipidomic data acquisition and identification performance of a benchtop quadrupole-Orbitrap (Q Exactive HF) LC-MS/MS system. We demonstrate the rapid selection of method parameters to maximize lipid identifications in a single experiment. The Q Exactive HF’s relatively straightforward instrument geometry and widespread use in the lipidomics community18 render it an ideal target for lipidomic simulation. Although several MS simulation tools exist for proteomic analysis, these tools focus on generating “ground truth” datasets to benchmark data acquisition25 or peptide identification26–29 algorithms and do not model the effect of varied MS operating parameters on biomolecule identification. Derivation of the complex factors governing ion transport, fragmentation, and analysis from first principles is, at this time, not possible and is not the focus of this manuscript. Instead, we aim to descriptively model data-dependent acquisition with sufficient accuracy to return near-optimal method parameter sets and accelerate the lipidomic method development process.
As outlined in Figure 1B, the simulation first replicates the Q Exactive HF data-dependent scan cycle based on user-supplied instrument parameters (e.g., Orbitrap transient time) and method settings (e.g., MS/MS isolation width). For each scan cycle, the survey MS scan is reconstructed using user-supplied lists which contain chromatographic features and potential lipid identifications. The generation of a chromatographic feature table allows the simulation to distinguish unique biochemical features from background noise and other co-eluting chromatographic peaks. When combined with a user-supplied LC-MS survey experiment, these data accurately represent the incoming ion flux generated by the LC-ESI system (Figure S1). After MS scan creation, MS/MS scans are “triggered” with user-provided DDA settings. For each scan, we simulate the theoretical signal-to-noise (S/N) ratio (ratio of the most intense to least intense fragment ion)30 and precursor ion fraction (PIF). Based on these two metrics and whether the chromatographic feature stems from an identifiable lipid species, the predicted identification success of the MS/MS scan is recorded. The outlined scan cycle simulation process is then repeated for the entire LC gradient and the cumulative number of identified lipid features is calculated.
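The scan-cycle logic described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Feature` class, the `simulate_dda` function, the m/z matching tolerance, and the fixed exclusion window are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One peak in a reconstructed survey MS scan (hypothetical structure)."""
    mz: float
    intensity: float


def simulate_dda(scans, top_n=2, exclusion_s=10.0, mz_tol=0.01):
    """Return the (time_s, mz) MS/MS events triggered by a TopN method.

    scans: list of (time_s, [Feature, ...]) survey scans in time order.
    A precursor is skipped while it is still under dynamic exclusion.
    """
    excluded_until = {}  # precursor m/z -> time at which exclusion ends
    events = []
    for now_s, survey in scans:
        candidates = [
            f for f in survey
            if not any(abs(f.mz - mz) <= mz_tol and now_s < t_end
                       for mz, t_end in excluded_until.items())
        ]
        # DDA targets the most intense eligible features first.
        candidates.sort(key=lambda f: f.intensity, reverse=True)
        for feat in candidates[:top_n]:
            events.append((now_s, feat.mz))
            excluded_until[feat.mz] = now_s + exclusion_s
    return events
```

In a fuller model, each triggered event would then be scored for S/N and PIF before being counted as an identification.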
Modeling MS/MS Acquisition
Accurate simulation of lipid data acquisition and identification performance requires in-depth modeling of the entire MS/MS acquisition cycle. Accordingly, we describe the systematic characterization of this process which includes ion transmission, quadrupole isolation, lipid fragmentation efficiency, and spectral noise (Figure 1C). For each triggered MS/MS scan, ions of a given m/z range are selected by the quadrupole mass filter (QMF) and collected in the C-trap. As shown in Figure 2A, the time required (injection time) to collect the requisite number of charges (AGC target) is heavily dependent on the MS/MS isolation width (IW) and the intensity of the precursor. We investigated the interdependence of these parameters by varying the MS/MS AGC target and isolation width for a constant ion flux of an ACN/NaTFA (aq) solution. MS analysis of this sample yielded singly-charged NaTFA clusters across a wide m/z range.31 To mathematically correct MS/MS injection times for the co-isolation of NaTFA isotopes, the MS triggered a single-ion monitoring (SIM) scan with the same isolation width after each scan. As anticipated, we observed higher MS/MS injection times at lower isolation widths and higher AGC target values (Figure S2A). Each measured injection time was converted to a transmission efficiency (E) value (Figure S2B) according to the following equation:

$$E = \frac{\text{\# of charges}}{IT \times \text{Parent Intensity}}$$

Following the method of Remes et al.,32 we fit a curve to each transmission efficiency profile using the following equation:

$$E = \frac{a - b}{1 + \left(\frac{IW}{c}\right)^{d}} + b$$

The values of b, c, and d were found to be linearly dependent on precursor m/z (Figure S2D), which permitted the calculation of a theoretical MS/MS injection time for any precursor ion population given the MS/MS AGC target and isolation width. To validate the derived relationship, we estimated the MS/MS injection time for nine lipidomic experiments collected with various MS/MS AGC target and isolation width settings. As seen in Figure 2B, we observed a strong correlation between the actual and simulated injection time values spanning several orders of magnitude.
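To make the injection time calculation concrete, the sketch below rearranges the transmission efficiency definition (E = charges / (IT × parent intensity)) to solve for IT. It assumes a four-parameter logistic form, E = (a − b) / (1 + (IW/c)^d) + b, for the fitted curve. The function names, the coefficient values used in any call, and the cap at a maximum injection time are illustrative assumptions rather than values from the manuscript.

```python
def linear_in_mz(mz, slope, intercept):
    """Fit parameters b, c, d are described as linear in precursor m/z."""
    return slope * mz + intercept


def transmission_efficiency(iw, a, b, c, d):
    """Assumed four-parameter logistic fit of efficiency vs. isolation width."""
    return (a - b) / (1.0 + (iw / c) ** d) + b


def injection_time(agc_target, parent_intensity, efficiency, max_it):
    """Solve E = charges / (IT * intensity) for IT, capped at the method's
    maximum injection time (units must be consistent with the intensity)."""
    it = agc_target / (efficiency * parent_intensity)
    return min(it, max_it)
```

With a < b the efficiency rises from a at very narrow isolation widths toward the plateau b, which matches the reported trend of longer injection times at lower isolation widths.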
Precursor ion populations isolated for MS/MS analysis are rarely pure and often contain multiple co-isolated species whose m/z falls within the QMF’s isolation width.22 To effectively simulate quadrupole transmission profiles, we measured the relative intensity of several Calmix ions while varying the MS/MS isolation width and isolation center (Figure 2C). The resulting quadrupole transmission peaks were then fit with a flat-topped Gaussian according to the following equation (Figure S3A):

$$I = e^{-ax^{2} - bx^{4}}$$

For all m/z values analyzed, the a term was dependent on the isolation width and m/z value, the b term was dependent on the isolation width, and the apex m/z offset was dependent on the isolation width and m/z value (Figure S3B-D). Based on these equations, the simulation models the quadrupole transmission peak for any given m/z and isolation width with sufficient accuracy to calculate the purity of any given MS/MS ion population (Figure 2D). For each simulated MS/MS scan, the algorithm computes a PIF value using the intensity of nearby mass spectral peaks, the simulated transmission profile, and the MS/MS isolation width setting (Figure 2E). To validate our PIF estimation, we collected SIM scans of the isotopic cluster of several Calmix ions using various isolation widths. For all analyzed isolation widths, the relative error for the monoisotopic PIF did not exceed 12% (Figure 2F).
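A PIF calculation of this kind might look like the following sketch. It assumes the flat-topped Gaussian I = exp(−a·x² − b·x⁴), with x the m/z offset from the apex of the transmission peak. The function names, the matching tolerance, and the a and b values used in any call are hypothetical. In the described model a, b, and the apex offset would themselves be computed from the isolation width and m/z.

```python
import math


def quad_transmission(offset_mz, a, b):
    """Relative transmission at a given m/z offset from the peak apex."""
    return math.exp(-a * offset_mz ** 2 - b * offset_mz ** 4)


def precursor_ion_fraction(target_mz, peaks, apex_mz, a, b, mz_tol=0.005):
    """Fraction of transmitted ion signal that belongs to the target.

    peaks: list of (mz, intensity) pairs near the isolation window.
    """
    target_signal = 0.0
    total_signal = 0.0
    for mz, intensity in peaks:
        transmitted = intensity * quad_transmission(mz - apex_mz, a, b)
        total_signal += transmitted
        if abs(mz - target_mz) <= mz_tol:
            target_signal += transmitted
    return target_signal / total_signal if total_signal > 0 else 0.0
```

A narrower window (larger a or b) suppresses off-center interferers and raises the PIF, at the cost of lower transmission for the target itself.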
Once collected in the C-trap, lipid precursor ions are fragmented via higher-energy collisional dissociation (HCD) in the HCD cell and the resulting fragments are used to identify the lipid class and side chain moieties.33 Fragment ion abundances for lipid species vary widely due to the chemical diversity of lipids and the collision energy used for fragmentation.33,34 Accurate prediction of these fragmentation patterns, especially absolute intensities, is not yet routine.35 To approximate lipid fragmentation, we analyzed the fragmentation efficiency (here defined as the ratio between precursor intensity and the most intense fragment ion) from an LC-MS/MS analysis of HAP1 cell extracts (Figure S4A). Once log-transformed, the observed fragmentation efficiencies followed a normal distribution (x̄: -1.5, σ: 0.6, Figure S4B). Based on this approximation, all simulated MS/MS scans are assigned a random fragmentation efficiency drawn from this distribution. To avoid creating a non-deterministic simulation, the S/N of each MS/MS scan is simulated fifty times and considered successful only if the majority of replicate simulations exhibit sufficient S/N.
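The replicate-and-vote step can be illustrated as follows. This sketch makes several assumptions that are not stated in the text: the log transform is taken as base 10, the most intense fragment is modeled as precursor signal multiplied by the sampled efficiency, and a seeded generator is used so that the example is reproducible. The function name and arguments are hypothetical.

```python
import random

LOG_MEAN = -1.5  # mean of log-transformed fragmentation efficiency (from text)
LOG_SD = 0.6     # standard deviation of the same distribution (from text)


def majority_passes(precursor_signal, noise, sn_threshold=3.0,
                    n_replicates=50, seed=0):
    """Simulate S/N n_replicates times; succeed only on a majority vote."""
    rng = random.Random(seed)
    passes = 0
    for _ in range(n_replicates):
        # Assumption: base-10 log-normal fragmentation efficiency.
        efficiency = 10 ** rng.gauss(LOG_MEAN, LOG_SD)
        top_fragment = precursor_signal * efficiency
        if top_fragment / noise > sn_threshold:
            passes += 1
    return passes > n_replicates / 2
```

Voting over replicates makes the outcome for a given scan far less sensitive to any single random draw, which is the stated purpose of the fifty-fold repetition.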
After collisional dissociation, product ions are mass-analyzed in the Orbitrap to generate MS/MS spectra. When the MS collects insufficient lipid ions for analysis, the signal corresponding to noise dominates the spectrum (Figure S5A). For the Orbitrap mass analyzer, non-chemical noise arises from the thermal noise of the pre-amplifier.36 As observed previously,9,37 we found this spectral noise (here defined as the least intense ion in a spectrum) for an MS/MS of Calmix ions to be inversely dependent on the injection time and the square root of the transient length (Figure 2G, Figure S5B-D).

$$N = \frac{k}{IT \times \sqrt{Res}}$$

We validated this relationship on several LC-MS/MS experiments collected using varying MS/MS resolution settings and observed a strong correlation between the experimentally-measured and simulated noise levels over approximately three orders of magnitude (Figure 2H).
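As a worked illustration of the stated dependence (noise inversely proportional to the injection time and to the square root of the transient length), the sketch below uses the resolution setting as a stand-in for transient length. The constant k, the function names, and the units are assumptions; k would have to be fit to measured spectra.

```python
import math


def spectral_noise(k, injection_time_ms, resolution):
    """Noise level, assuming N = k / (IT * sqrt(Res))."""
    return k / (injection_time_ms * math.sqrt(resolution))


def signal_to_noise(top_fragment_intensity, k, injection_time_ms, resolution):
    """S/N taken as the most intense fragment over the modeled noise level."""
    return top_fragment_intensity / spectral_noise(
        k, injection_time_ms, resolution)
```

Under this form, doubling the injection time halves the noise, whereas the resolution must be quadrupled for the same reduction.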
Predicting Identification Probability
Numerous software tools exist which analyze lipid tandem mass spectra using theoretical databases of in silico spectra38 or fragmentation rules.34 For any given spectrum, the intensity of structurally-informative fragments relative to interfering spectral peaks largely determines the software’s ability to decipher the lipid fragmentation patterns.20 Even if an MS/MS targets a bona fide lipid for which the software has an in silico spectrum, the MS/MS event will not give rise to a confident lipid identification if it contains insufficient fragment ions. Accordingly, the probability a given lipid MS/MS spectrum will yield an identification is mainly dependent on the spectral signal-to-noise ratio and precursor ion fraction.
To determine the minimum S/N required for lipid identification, we collected MS/MS scans for five phospholipid and glycerolipid reference standards using different MS/MS injection times. All spectra were searched in LipiDex using a precursor and product ion tolerance of 0.01 Da and the spectral similarity score (dot product) recorded. As seen in Figure 3A, similarity scores dropped considerably for MS/MS spectra with S/N less than 3. Indeed, the similarity scores below this S/N threshold appeared random and followed no discernable trend. Even when fragment ions are sufficiently abundant, the probability of a confident lipid identification decreases if the spectrum contains co-fragmented species.34,39 To elucidate the minimum PIF required to generate lipid identifications, we selected 135 pairs of isobaric lipid spectra, mixed them at varying ratios in silico, and searched them against LipiDex’s spectral libraries (Figure 3B). We observed significantly fewer spectral matches at or below the 1:1 mixing ratio. Based on these two analyses, the simulation considers all lipid MS/MS spectra to be successful if S/N > 3 and PIF > 0.5. Note these thresholds are specific to the method of lipid identification used in LipiDex and would likely differ for other lipid identification algorithms. Additionally, these thresholds do not account for instances where co-fragmented lipid species share common fragment m/z values. In such cases, the minimum PIF for a confident lipid identification would be significantly higher.
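The two thresholds and the in silico mixing experiment can be sketched as below. The cosine similarity used here is a generic stand-in for a dot-product score and is not LipiDex's scoring function. Spectra are represented as hypothetical dictionaries keyed by exact fragment m/z, whereas a real search would match peaks within a mass tolerance.

```python
import math


def mix_spectra(target, interferer, target_fraction):
    """Blend two spectra (dict of m/z -> intensity) at a given ratio."""
    def normalized(spectrum):
        total = sum(spectrum.values())
        return {mz: i / total for mz, i in spectrum.items()}

    mixed = {}
    for mz, i in normalized(target).items():
        mixed[mz] = mixed.get(mz, 0.0) + target_fraction * i
    for mz, i in normalized(interferer).items():
        mixed[mz] = mixed.get(mz, 0.0) + (1.0 - target_fraction) * i
    return mixed


def cosine_similarity(spec_a, spec_b):
    """Generic normalized dot product between two spectra."""
    dot = sum(v * spec_b.get(mz, 0.0) for mz, v in spec_a.items())
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predicted_identified(sn, pif, is_identifiable, sn_min=3.0, pif_min=0.5):
    """Apply the stated rule: success requires S/N > 3 and PIF > 0.5."""
    return bool(is_identifiable and sn > sn_min and pif > pif_min)
```

Because both comparisons are strict, a spectrum mixed exactly 1:1 (PIF = 0.5) is counted as unsuccessful, in line with the observation of fewer matches at or below that ratio.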
Simulating Acquisition Performance
The number of identified lipid features in an LC-MS/MS experiment results from the intricate interplay between numerous MS acquisition parameters. For an in silico simulation tool to have utility, it must accurately mirror this complexity for a wide range of method setting combinations. Accordingly, we validated the number of predicted lipid identifications using two HAP1 cell LC-MS/MS datasets. First, we systematically varied the MS/MS topN and MS/MS maximum injection time values for a 30-minute fast polarity switching LC-MS/MS method and measured the number of lipid identifications generated by each experiment. For complex samples which contain numerous chromatographic features across a wide dynamic range, these two parameters largely control the balance between MS/MS quantity and quality.13 Next, we varied the MS/MS isolation width and AGC target setting across nine LC-MS/MS experiments to generate data with varying levels of spectral signal-to-noise and co-fragmentation. For each collected LC-MS/MS experiment, we performed an LC-MS/MS simulation with the same method parameters.
As shown in Figure 4A-B, the number of unique lipids identified for the first experimental set ranged between 490 and 590 lipids while the simulated lipid identification performance varied between 400 and 611 lipids. Encouragingly, the simulation accurately reproduced small relative differences in ID performance (average percent error: 10.5%) even though the MS/MS maximum injection times were only incremented by 5 ms for each experiment. Impressed with this level of accuracy, we further explored the simulation’s ability to model the rate and depth of MS/MS lipid identification across an LC-MS/MS gradient as spectral complexity fluctuates. As displayed for two experiments in Figures S6 and S7, the simulation modeled the nuanced changes in MS/MS acquisition rate and depth of identification.
For the second experimental dataset (isolation width/AGC target), the number of unique lipid identifications only varied between 483 and 527 (Figure 4C). Despite this small range of outcomes, the simulation accurately predicted lipid identification performance with an average error of 12.3% and precisely mirrored the minor identification differences between each experiment (Figure 4D). Together, these experiments suggest the algorithm accurately simulates the effect of different method parameter sets on MS/MS speed, acquisition, and quality.
Rapid Method Optimization
LC-MS/MS method development involves the careful selection of method parameters likely to provide the maximum number of confident identifications. Although a single polarity switching LC-MS/MS experiment can be simulated in less than 30 seconds using the outlined approach (Intel® i7 CPU, 2.67 GHz), the routine exploration of lipid identification performance for all method parameter combinations is infeasible. Even if the potential values for a given parameter are restricted, there still exist thousands of potential combinations. Typically, the analyst’s intuition and previous experiments serve to exclude many of these potential methods.
To reduce the time required to determine the optimal parameter set, we implemented a modified genetic algorithm for parameter optimization. Genetic algorithms simultaneously optimize multiple parameters by mirroring the evolutionary process of selection and recombination based on genotype fitness. This class of optimization algorithm is readily parallelized and particularly effective for time-consuming fitness calculations.40 As implemented here (Figure 5A), the genetic algorithm begins with a pseudo-random initial population of method parameter sets and simulates the number of unique lipid identifications generated by each set. From this population, the algorithm selects a user-defined number of the top-performing parameter sets to pair together. Each pair then exchanges parameter values to generate two new parameter sets. To reduce the likelihood of premature convergence to a sub-optimal solution, each parameter value has a probability of being randomly modified. This process repeats until the population has converged to a single optimal parameter set.
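The selection, recombination, and mutation loop described above can be sketched generically as follows. This is not the authors' modified algorithm: the parameter grid, population sizes, mutation rate, and the `toy_fitness` placeholder (standing in for the simulated number of lipid identifications) are all hypothetical, and the loop runs for a fixed number of generations rather than testing for convergence.

```python
import random

# Hypothetical discrete search space for three method parameters.
PARAM_CHOICES = {
    "top_n": [2, 5, 10, 15, 20],
    "max_it_ms": [10, 20, 30, 50, 75, 100],
    "isolation_width": [0.4, 0.7, 1.0, 1.4, 2.0],
}


def crossover(p1, p2, rng):
    """Swap each parameter between two parents with probability 0.5."""
    c1, c2 = dict(p1), dict(p2)
    for key in PARAM_CHOICES:
        if rng.random() < 0.5:
            c1[key], c2[key] = p2[key], p1[key]
    return c1, c2


def mutate(individual, rng, rate):
    """Randomly reassign each parameter with probability `rate`."""
    return {k: (rng.choice(PARAM_CHOICES[k]) if rng.random() < rate else v)
            for k, v in individual.items()}


def genetic_search(fitness, pop_size=20, n_parents=6, generations=15,
                   mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [{k: rng.choice(v) for k, v in PARAM_CHOICES.items()}
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Keep the top performers as parents for the next generation.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            for child in crossover(p1, p2, rng):
                children.append(mutate(child, rng, mutation_rate))
        population = children[:pop_size]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best
    return best


def toy_fitness(params):
    """Placeholder objective; a real run would call the LC-MS/MS simulation."""
    return -(abs(params["top_n"] - 10)
             + abs(params["max_it_ms"] - 50) / 10
             + abs(params["isolation_width"] - 0.7) * 5)
```

Calling `genetic_search(toy_fitness)` returns the best parameter set seen. Because each fitness evaluation is independent, the per-generation scoring is the natural place to parallelize when the fitness function is a slow simulation.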
To evaluate our genetic algorithm’s utility, we first performed an exhaustive simulation of all method parameter combinations for a 30-minute polarity switching LC-MS/MS experiment. Only instrument parameter combinations with maximum duty cycles (time between MS scans of the same polarity) below 1200 ms were examined. This duty cycle limitation ensured the majority of chromatographic features could be quantified accurately (>10 points across the peak) while reducing the number of potential parameter sets to 9,217 (see Supplemental Methods for algorithm description). Simulated unique lipid identifications for these methods ranged from 389 to 634 lipids (Figure S8). Using this dataset to return the simulation result quickly, we performed 100 head-to-head comparisons between the genetic algorithm and a random parameter search of equal iterations. As seen in Figure 5B, the genetic algorithm outperformed the random search 93 out of 100 times and, on average, would have taken 24 minutes to complete (estimated for Intel® Core i7, 3.30 GHz). Encouragingly, the genetic algorithm always returned either the first (634 lipid IDs) or second-ranked (633 lipid IDs) parameter set, suggesting the algorithm rarely converges on a sub-optimal solution (Figure 5C).
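The benchmarking idea, precomputing every allowed parameter combination once so that optimizers can be compared by table lookup, can be sketched as follows. The grid, the `simulate` callable, and the `constraint` callable are hypothetical. In particular, the constraint shown in the usage note is a simple stand-in and not the duty cycle calculation described in the Supplemental Methods.

```python
import itertools
import random


def build_lookup(grid, simulate, constraint=lambda params: True):
    """Exhaustively evaluate every allowed combination of a parameter grid.

    Returns (names, table), where table maps a tuple of parameter values
    (ordered as in names) to the simulated result.
    """
    names = sorted(grid)
    table = {}
    for combo in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        if constraint(params):
            table[combo] = simulate(params)
    return names, table


def random_search(table, n_evals, rng):
    """Baseline: best result among n_evals randomly chosen table entries."""
    sampled = rng.sample(sorted(table), min(n_evals, len(table)))
    return max(table[combo] for combo in sampled)
```

For example, `build_lookup({"top_n": [5, 10], "max_it_ms": [20, 50, 100]}, simulate, constraint=lambda p: p["top_n"] * p["max_it_ms"] <= 500)` keeps only the combinations whose worst-case MS/MS time fits an assumed budget. A head-to-head comparison then gives the optimizer and `random_search` the same number of evaluations and compares their best results.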
With the validated genetic algorithm in hand, we explored the utility of in silico method optimization using four different sample types: HAP1 cells, mouse cecum, mouse plasma, and yeast cells. These samples were chosen as they contain varying levels of lipid complexity (1115, 614, 570, and 249 identifiable lipids, respectively). For each sample, the simulation was provided with a sample-specific survey LC-MS experiment, a list of chromatographic features, and a list of potential lipid identifications. As shown in Figure 5D-G, the algorithm returned different optimal parameter sets for each sample type with simulated total unique lipid identifications of 634, 277, 228, and 136, respectively. We were encouraged to observe that the TopN, MS/MS IT, and MS/MS IW parameters all scaled with the lipidomic complexity of the sample. These sample-specific parameter sets highlight the importance of optimizing method parameters for individual sample types and the potential for in silico simulation to expedite method development.
Next-Generation Lipidomic Analysis
6
The vast structural diversity of lipid species mandate the MS platform be operated in both positive
7
and negative polarities to obtain optimal ionization and fragmentation.33 Such dual-polarity data
8
can either be collected in sequential LC-MS/MS experiments or if supported by the MS platform,
9
the same experiment. Rapid polarity switching methods are advantageous when used for large
10
sample cohort analysis as they halve total acquisition time and simplify data integration.10 Despite
11
these benefits, the use of fast polarity switching presents unique analytical challenges. After the
12
instrument reverses polarity, it cannot collect or analyze any ion populations until the instrument’s
13
electrostatic fields stabilize41 (~230 ms for the Q Exactive HF).
14
dramatically reduces the fraction of the duty cycle used for MS analysis. Additionally, the total
15
duty cycle must be split between both positive and negative ion acquisition, further decreasing
16
the obtainable MS/MS sampling depth for a given polarity.
This stabilization period
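
The cost of polarity switching can be made concrete with a rough duty-cycle calculation. This is a sketch under stated assumptions, not the simulation's actual timing model: one full cycle is taken to be an MS scan plus TopN MS/MS scans in each polarity, with one stabilization period after each of the two polarity reversals. The per-scan durations passed in are hypothetical inputs; only the ~230 ms stabilization time comes from the text.

```python
def cycle_summary(top_n, ms1_ms, msms_ms, switch_ms=230.0):
    """Summarize one positive + negative scan cycle.

    top_n     -- MS/MS scans per polarity per cycle
    ms1_ms    -- duration of one MS scan (hypothetical input)
    msms_ms   -- duration of one MS/MS scan (hypothetical input)
    switch_ms -- stabilization time after each polarity reversal
    """
    per_polarity_ms = ms1_ms + top_n * msms_ms
    cycle_ms = 2 * (per_polarity_ms + switch_ms)   # two polarities, two reversals
    return {
        "cycle_ms": cycle_ms,
        "analysis_fraction": 2 * per_polarity_ms / cycle_ms,
        "msms_per_s_per_polarity": top_n / (cycle_ms / 1000.0),
    }


if __name__ == "__main__":
    # Illustrative scan durations only; not measured instrument values.
    for switch in (230.0, 0.0):
        print(switch, cycle_summary(top_n=3, ms1_ms=100.0, msms_ms=50.0, switch_ms=switch))
```

With these illustrative inputs, the stabilization periods consume nearly half of each cycle, and the MS/MS rate available to each polarity is roughly half of what it would be with instantaneous switching.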

This inherent challenge is evident from the performance of the optimal methods predicted in Figure 5D-G. Although the samples’ differing lipid diversity produced unique optimal parameter sets and lipid identifications, the actual percentage of possible lipids identified remained relatively unchanged (57%, 45%, 40%, and 54%, respectively). This apparent threshold is likely due to the broad dynamic range over which lipid species are present in each sample (~4.5 orders of magnitude, Figure S9). Even for a relatively simple yeast cell extract, the LC-MS/MS method must acquire MS/MS spectra at a sufficient rate to keep pace with the rate of LC peak elution. This acquisition speed requirement mandates reduced MS/MS injection times, generating low S/N spectra for low-abundance species. Insufficient profiling depth even when operating under optimal settings is evident for the simulated analysis of the HAP1 cell extract presented in Figure 6A. Here, we display the maximum lipid identification depth attained for each 0.25 min chromatogram segment. We define peak depth to be the number of chromatographic features, excluding isotopes, of higher intensity than the least intense identified lipid. During the period of greatest spectral complexity (7.5-12.5 min), this peak depth can rise to 200 features. Despite the in silico instrument acquiring nearly six MS/MS spectra per second during this period, the optimal 35 ms MS/MS injection times are insufficient to generate lipid identifications for many species. To overcome this apparent identification threshold for a single-shot LC-MS/MS lipidomics experiment on the Q Exactive HF platform, concomitant improvements to instrument acquisition speed and sensitivity must be realized.
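
The peak depth definition given above can be written as a short function. This is a minimal sketch that assumes each chromatographic feature is assigned to a 0.25 min segment by its apex retention time and that depth is evaluated within each segment; the `Feature` fields and the function name are hypothetical and are not drawn from the published software.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Feature:
    rt_min: float             # apex retention time in minutes
    intensity: float
    is_isotope: bool = False
    identified: bool = False  # True if the feature yielded a lipid identification


def peak_depth_by_segment(features, segment_min=0.25):
    """Map segment index -> number of non-isotope features more intense than
    the least intense identified lipid in that segment (0 if none identified)."""
    segments = defaultdict(list)
    for f in features:
        if not f.is_isotope:                       # isotopes are excluded
            segments[int(f.rt_min // segment_min)].append(f)

    depth = {}
    for seg, members in sorted(segments.items()):
        identified = [f.intensity for f in members if f.identified]
        if not identified:
            depth[seg] = 0
            continue
        floor = min(identified)                    # least intense identified lipid
        depth[seg] = sum(f.intensity > floor for f in members)
    return depth


if __name__ == "__main__":
    demo = [
        Feature(7.60, 1e6, identified=True),
        Feature(7.60, 5e5),
        Feature(7.70, 2e5, identified=True),
        Feature(7.70, 1e5),
        Feature(7.65, 3e5, is_isotope=True),
        Feature(7.80, 1e4),
    ]
    print(peak_depth_by_segment(demo))
```

In the demo, the segment covering 7.50 to 7.75 min has a depth of 2, because two non-isotope features are more intense than the weakest identified lipid in that window.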

Using our in silico simulation tool, we sought to explore how these developments might further improve the Q Exactive HF platform for high-throughput, deep lipidomic profiling. Accordingly, we created 30 different instruments in silico with increased ion flux (1x to 100x) and decreased polarity switching time (233 ms to 0 ms). Although a 0 ms polarity switching time is theoretically impossible, it is not unrealistic to anticipate a 100x boost in sensitivity. Such gains have already been observed for several lipid classes on a Q Exactive Plus instrument using a packed nanobore column operated at 600 nL/min.42 For each theoretical instrument, the genetic algorithm selected the optimal method parameter set for HAP1 analysis (Figure 6B). As expected, each instrument generated increased lipid annotations, and the instrument with the highest ion flux increase and lowest polarity switching time identified the most lipids (872 lipids). Despite the significant increase in simulated lipid identifications, 243 of the 1115 identifiable lipids remain unidentified. Together, these observations suggest that the simulated hardware improvements to the Q Exactive HF platform will not provide sufficient sensitivity to reproducibly profile the majority of low-abundance lipids. Instead, novel intelligent data acquisition techniques23,43–45 will likely be needed to reduce the number of uninformative MS/MS spectra and unlock comprehensive lipidomic coverage in a single LC-MS/MS experiment for the Q Exactive HF.
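
The instrument sweep described above can be sketched as a simple grid enumeration. The text states only the two ranges (1x to 100x ion flux, 233 ms to 0 ms switching time) and the total of 30 instruments; the 5 × 6 grid spacing below, the `toy_simulated_ids` scoring function, and all names are assumptions made for illustration. In practice the scoring callable would be replaced by a full genetic algorithm optimization for each instrument.

```python
import math
from itertools import product

# Hypothetical grid spacing; only the end points and the total (30) are from the text.
ION_FLUX_GAINS = (1, 3, 10, 30, 100)
SWITCH_TIMES_MS = (233, 200, 150, 100, 50, 0)


def build_instruments():
    """Enumerate every combination of ion flux gain and polarity switching time."""
    return [{"ion_flux_gain": g, "switch_time_ms": t}
            for g, t in product(ION_FLUX_GAINS, SWITCH_TIMES_MS)]


def toy_simulated_ids(instrument):
    """Toy stand-in for 'optimize the method, then count simulated lipid IDs'."""
    return (600.0
            + 100.0 * math.log10(instrument["ion_flux_gain"])
            + (233 - instrument["switch_time_ms"]) / 3.0)


def best_instrument(instruments, score=toy_simulated_ids):
    return max(instruments, key=score)


if __name__ == "__main__":
    instruments = build_instruments()
    print(len(instruments), "instruments; best:", best_instrument(instruments))
    # Bookkeeping from the text: 1115 identifiable lipids, 872 identified at best.
    print("unidentified:", 1115 - 872)
```

The toy score increases with ion flux and decreases with switching time, so it selects the 100x, 0 ms instrument by construction; it does not reproduce the simulated identification counts in Figure 6B.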

CONCLUSIONS
Here, we describe a proof-of-concept software tool which simulates lipidomic LC-MS/MS data acquisition on the Q Exactive HF platform and accurately calculates the number of potential unique lipid identifications for a given parameter set. By deriving equations which descriptively model the isolation, collection, fragmentation, and analysis of lipid precursors for MS/MS analysis, we demonstrate rapid replication of whole DDA experiments. When coupled with a genetic algorithm for parameter optimization, this approach ensures methods are precisely tuned for a given sample type, increasing experimental lipid identification depth and reducing overall method development time. Additionally, in silico simulation circumvents the need for extensive prior experimental analyses to tease out the complex co-dependence of method parameters.

Currently, it remains unclear how much each equation would need to be tuned for any given instrument. With some minor modifications and a carefully designed infusion experiment, the described software tool could automatically re-derive the foundational descriptive models and recalibrate the simulation tool for a given instrument. The addition of a more comprehensive model of false lipid identification to the genetic algorithm’s fitness score would also further enhance the power of the outlined approach. Several method parameters, including MS/MS isolation width and MS resolution, must be balanced to minimize false lipid identifications while maximizing acquisition speed. Although the outlined simulation does include a simplified model of false lipid identifications (i.e., MS/MS PIF and S/N), the construction of a more robust model for putative lipid identification confidence could greatly increase the likelihood that the simulation tool returns a parameter set which generates the highest number of correct lipid identifications. Such a development would likely require a more complex simulation of lipid class fragment m/z and intensity. Additionally, the outlined in silico algorithm could be repurposed to generate optimal parameter sets for proteomic or metabolomic data collection, provided a sufficiently accurate prediction model for MS/MS identification probability could be derived. Although this manuscript is limited in scope to a single analyte class and instrument platform, we believe in silico simulation represents a promising approach for accelerating method development and empowering the generation of high-quality MS datasets for comprehensive and impactful biochemical profiling experiments.

ASSOCIATED CONTENT
Supporting Information
Supplemental methods, LC-MS/MS analysis parameters, Compound Discoverer processing parameters, LipiDex processing parameters, simulating LC-MS ion flux, modeling ion accumulation, modeling quadrupole transmission profiles, modeling MS/MS fragmentation efficiency, modeling spectral noise, simulated LC-MS/MS acquisition and lipid identification, dynamic range of lipid identification, simulated performance of method parameter sets, and dynamic range of lipids in four tissues. Raw data are available on Chorus (Project ID 1559). The simulation software is available at https://github.com/coongroup.

AUTHOR INFORMATION
Corresponding Author
*Phone: 608-263-1718. E-mail: [email protected].
Notes
The authors declare no competing financial interests.

ACKNOWLEDGMENTS
We gratefully acknowledge support from the National Institutes of Health Grant P41 GM108538 (awarded to J.J.C.) and the Morgridge Institute for Research Metabolism Theme. We also acknowledge support from the Great Lakes Bioenergy Research Center, U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research under Award Numbers DE-SC0018409 and DE-FC02-07ER64494. Additionally, we thank the Pagliarini Lab for generating the HAP1 cell and yeast lipid extracts and Vanessa Linke for generating the mouse cecum and plasma lipid extracts.

REFERENCES
(1) Aksenov, A. A.; Da Silva, R.; Knight, R.; Lopes, N. P.; Dorrestein, P. C. Global chemical analysis of biology by mass spectrometry. Nat. Rev. Chem. 2017, 1.
(2) Walther, T. C.; Mann, M. Mass spectrometry–based proteomics in cell biology. J. Cell Biol. 2010, 190 (4), 491–500.
(3) Hebert, A. S.; Richards, A. L.; Bailey, D. J.; Ulbrich, A.; Coughlin, E. E.; Westphall, M. S.; Coon, J. J. The One Hour Yeast Proteome. Mol. Cell. Proteomics 2014, 13 (1), 339–347.
(4) Cajka, T.; Fiehn, O. Comprehensive analysis of lipids in biological systems by liquid chromatography-mass spectrometry. TrAC - Trends Anal. Chem. 2014, 61, 192–206.
(5) Blaženović, I.; Kind, T.; Sa, M. R.; Ji, J.; Vaniya, A.; Wancewicz, B.; Roberts, B. S.; Torbasinovic, H.; Lee, T.; Mehta, S. S.; Showalter, M. R.; Song, H.; Kwok, J. F.; Jahn, D.; Kim, J.; Fiehn, O. Structure Annotation of All Mass Spectra in Untargeted Metabolomics. Anal. Chem. 2019, 91 (3), 2155–2162.
(6) Piazza, I.; Kochanowski, K.; Cappelletti, V.; Fuhrer, T.; Noor, E.; Sauer, U.; Picotti, P. A Map of Protein-Metabolite Interactions Reveals Principles of Chemical Communication. Cell 2018, 172 (1–2), 358–372.e23.
(7) Stefely, J. A.; Kwiecien, N. W.; Freiberger, E. C.; Richards, A. L.; Jochem, A.; Rush, M. J. P.; Ulbrich, A.; Robinson, K. P.; Hutchins, P. D.; Veling, M. T.; Guo, X.; Kemmerer, Z. A.; Connors, K. J.; Trujillo, E. A.; Sokol, J.; Marx, H.; Westphall, M. S.; Hebert, A. S.; Pagliarini, D. J.; Coon, J. J. Mitochondrial protein functions elucidated by multi-omic mass spectrometry profiling. Nat. Biotechnol. 2016, 34 (11), 1191–1197.
(8) Hecht, E. S.; Oberg, A. L.; Muddiman, D. C. Optimizing Mass Spectrometry Analyses: A Tailored Review on the Utility of Design of Experiments. J. Am. Soc. Mass Spectrom. 2016, 27 (5), 767–785.
(9) Kelstrup, C. D.; Jersie-Christensen, R. R.; Batth, T. S.; Arrey, T. N.; Kuehn, A.; Kellmann, M.; Olsen, J. V. Rapid and deep proteomes by faster sequencing on a benchtop quadrupole ultra-high-field Orbitrap mass spectrometer. J. Proteome Res. 2014, 13 (12), 6187–6195.
(10) Yamada, T.; Uchikata, T.; Sakamoto, S.; Yokoi, Y.; Fukusaki, E.; Bamba, T. J. Development of a lipid profiling system using reverse-phase liquid chromatography coupled to high-resolution mass spectrometry with rapid polarity switching and an automated lipid identification software. J. Chromatogr. A 2013, 1292, 211–218.
(11) Ruzicka, J.; Mchale, K. J.; Peake, D. A. Data Acquisition Parameters Optimization of Quadrupole Orbitrap for Global Lipidomics on LC-MS/MS Time Frame; 2014.
(12) Kalli, A.; Smith, G. T.; Sweredoski, M. J.; Hess, S. Evaluation and Optimization of Mass Spectrometric Settings during Data-dependent Acquisition Mode: Focus on LTQ-Orbitrap Mass Analyzers. J. Proteome Res. 2013, 12 (7), 3071–3086.
(13) Randall, S. M.; Cardasis, H. L.; Muddiman, D. C. Factorial experimental designs elucidate significant variables affecting data acquisition on a quadrupole Orbitrap mass spectrometer. J. Am. Soc. Mass Spectrom. 2013, 24 (10), 1501–1512.
(14) Riter, L. S.; Vitek, O.; Gooding, K. M.; Hodge, B. D.; Julian, R. K. Statistical design of experiments as a tool in mass spectrometry. J. Mass Spectrom. 2005, 40 (5), 565–579.
(15) Sun, B.; Kovatch, J. R.; Badiong, A.; Merbouh, N. Optimization and Modeling of Quadrupole Orbitrap Parameters for Sensitive Analysis toward Single-Cell Proteomics. J. Proteome Res. 2017, 16 (10), 3711–3721.
(16) Huffman, G.; Specht, H.; Chen, A. T.; Slavov, N. DO-MS: Data-Driven Optimization of Mass Spectrometry Methods. J. Proteome Res. 2019, 18 (6), 2493–2500.
(17) Jeong, L. N.; Sajulga, R.; Forte, S. G.; Stoll, D. R.; Rutan, S. C. Simulation of elution profiles in liquid chromatography-I: Gradient elution conditions, and with mismatched injection and mobile phase solvents. J. Chromatogr. A 2016, 1457, 41–49.
(18) Bowden, J. A.; Ulmer, C. Z.; Jones, C. M.; Koelmel, J. P.; Yost, R. A. NIST lipidomics workflow questionnaire: an assessment of community-wide methodologies and perspectives. Metabolomics 2018, 14 (5), 53.
(19) Bowden, J. A.; Heckert, A.; Ulmer, C. Z.; Jones, C. M.; Koelmel, J. P.; Abdullah, L.; Ahonen, L.; Alnouti, Y.; Armando, A. M.; Asara, J. M.; Bamba, T.; Barr, J. R.; Bergquist, J.; Borchers, C. H.; Brandsma, J.; Breitkopf, S. B.; Cajka, T.; Cazenave-Gassiot, A.; Checa, A.; Cinel, M. A.; Colas, R. A.; Cremers, S.; Dennis, E. A.; Evans, J. E.; Fauland, A.; Fiehn, O.; Gardner, M. S.; Garrett, T. J.; Gotlinger, K. H.; Han, J.; Huang, Y.; Neo, A. H.; Hyötyläinen, T.; Izumi, Y.; Jiang, H.; Jiang, H.; Jiang, J.; Kachman, M.; Kiyonami, R.; Klavins, K.; Klose, C.; Köfeler, H. C.; Kolmert, J.; Koal, T.; Koster, G.; Kuklenyik, Z.; Kurland, I. J.; Leadley, M.; Lin, K.; Maddipati, K. R.; McDougall, D.; Meikle, P. J.; Mellett, N. A.; Monnin, C.; Moseley, M. A.; Nandakumar, R.; Oresic, M.; Patterson, R.; Peake, D.; Pierce, J. S.; Post, M.; Postle, A. D.; Pugh, R.; Qiu, Y.; Quehenberger, O.; Ramrup, P.; Rees, J.; Rembiesa, B.; Reynaud, D.; Roth, M. R.; Sales, S.; Schuhmann, K.; Schwartzman, M. L.; Serhan, C. N.; Shevchenko, A.; Somerville, S. E.; St. John-Williams, L.; Surma, M. A.; Takeda, H.; Thakare, R.; Thompson, J. W.; Torta, F.; Triebl, A.; Trötzmüller, M.; Ubhayasekera, S. J. K.; Vuckovic, D.; Weir, J. M.; Welti, R.; Wenk, M. R.; Wheelock, C. E.; Yao, L.; Yuan, M.; Zhao, X. H.; Zhou, S. Harmonizing lipidomics: NIST interlaboratory comparison exercise for lipidomics using SRM 1950–Metabolites in Frozen Human Plasma. J. Lipid Res. 2017, 58 (12), 2275–2288.
(20) Hutchins, P. D.; Russell, J. D.; Coon, J. J. LipiDex: An Integrated Software Package for High-Confidence Lipid Identification. Cell Syst. 2018, 6 (5), 621–625.e5.
(21) Chambers, M. C.; MacLean, B.; Burke, R.; Amodei, D.; Ruderman, D. L.; Neumann, S.; Gatto, L.; Fischer, B.; Pratt, B.; Egertson, J.; Hoff, K.; Kessner, D.; Tasman, N.; Shulman, N.; Frewen, B.; Baker, T. A.; Brusniak, M. Y.; Paulse, C.; Creasy, D.; Flashner, L.; Kani, K.; Moulding, C.; Seymour, S. L.; Nuwaysir, L. M.; Lefebvre, B.; Kuhlmann, F.; Roark, J.; Rainer, P.; Detlev, S.; Hemenway, T.; Huhmer, A.; Langridge, J.; Connolly, B.; Chadick, T.; Holly, K.; Eckels, J.; Deutsch, E. W.; Moritz, R. L.; Katz, J. E.; Agus, D. B.; MacCoss, M.; Tabb, D. L.; Mallick, P. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 2012, 30 (10), 918–920.
(22) Michalski, A.; Cox, J.; Mann, M. More than 100,000 detectable peptide species elute in single shotgun proteomics runs but the majority is inaccessible to data-dependent LC-MS/MS. J. Proteome Res. 2011, 10 (4), 1785–1793.
(23) Bailey, D. J.; McDevitt, M. T.; Westphall, M. S.; Pagliarini, D. J.; Coon, J. J. Intelligent Data Acquisition Blends Targeted and Discovery Methods. J. Proteome Res. 2014, 13 (4), 2152–2161.
(24) Hebert, A. S.; Thöing, C.; Riley, N. M.; Kwiecien, N. W.; Shiskova, E.; Huguet, R.; Cardasis, H. L.; Kuehn, A.; Eliuk, S.; Zabrouskov, V.; Westphall, M. S.; McAlister, G. C.; Coon, J. J. Improved Precursor Characterization for Data-Dependent Mass Spectrometry. Anal. Chem. 2018, 90 (3), 2333–2340.
(25) Goldfarb, D.; Wang, W.; Major, M. B. MSAcquisitionSimulator: data-dependent acquisition simulator for LC-MS shotgun proteomics. Bioinformatics 2016, 32 (8), 1269–1271.
(26) Smith, R.; Prince, J. T. JAMSS: proteomics mass spectrometry simulation in Java. Bioinformatics 2015, 31 (5), 791–793.
(27) Bielow, C.; Aiche, S.; Andreotti, S.; Reinert, K. MSSimulator: Simulation of Mass Spectrometry Data. J. Proteome Res. 2011, 10 (7), 2922–2929.
(28) Noyce, A. B.; Smith, R.; Dalgleish, J.; Taylor, R. M.; Erb, K. C.; Okuda, N.; Prince, J. T. Mspire-Simulator: LC-MS shotgun proteomic simulator for creating realistic gold standard data. J. Proteome Res. 2013, 12 (12), 5742–5749.
(29) Schulz-Trieglaff, O.; Pfeifer, N.; Gröpl, C.; Kohlbacher, O.; Reinert, K. LC-MSsim – a simulation software for liquid chromatography mass spectrometry data. BMC Bioinformatics 2008, 9.
(30) Lam, H.; Deutsch, E. W.; Eddes, J. S.; Eng, J. K.; Stein, S. E.; Aebersold, R. Building Consensus Spectral Libraries for Peptide Identification in Proteomics. Nat. Methods 2008, 5 (10), 873–875.
(31) Moini, M.; Jones, B. L.; Rogers, R. M.; Jiang, L. Sodium trifluoroacetate as a tune/calibration compound for positive- and negative-ion electrospray ionization mass spectrometry in the mass range of 100–4000 Da. J. Am. Soc. Mass Spectrom. 1998, 305 (98).
(32) Remes, P. M.; Senko, M. W. Methods for predictive automatic gain control for hybrid mass spectrometers. U.S. Patent 9202681B2, 2015.
(33) Murphy, R. C.; Fiedler, J.; Hevko, J. Analysis of nonvolatile lipids by mass spectrometry. Chem. Rev. 2001, 101 (2), 479–526.
(34) Hartler, J.; Triebl, A.; Ziegl, A.; Trötzmüller, M.; Rechberger, G. N.; Zeleznik, O. A.; Zierler, K. A.; Torta, F.; Cazenave-Gassiot, A.; Wenk, M. R.; Fauland, A.; Wheelock, C. E.; Armando, A. M.; Quehenberger, O.; Zhang, Q.; Wakelam, M. J. O.; Haemmerle, G.; Spener, F.; Köfeler, H. C.; Thallinger, G. G. Deciphering lipid structures based on platform-independent decision rules. Nat. Methods 2017, 14 (12), 1171–1174.
(35) Neumann, S.; Böcker, S. Computational mass spectrometry for metabolomics: identification of metabolites and small molecules. Anal. Bioanal. Chem. 2010, 398 (7–8), 2779–2788.
(36) Makarov, A.; Denisov, E.; Lange, O.; Horning, S. Dynamic range of mass accuracy in LTQ Orbitrap hybrid mass spectrometer. J. Am. Soc. Mass Spectrom. 2006, 17 (7), 977–982.
(37) Makarov, A.; Denisov, E. Dynamics of ions of intact proteins in the Orbitrap mass analyzer. J. Am. Soc. Mass Spectrom. 2009, 20 (8), 1486–1495.
(38) Kind, T.; Liu, K.-H.; Lee, D. Y.; DeFelice, B.; Meissen, J. K.; Fiehn, O. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nat. Methods 2013, 10 (8), 755–758.
(39) Koelmel, J. P.; Kroeger, N. M.; Ulmer, C. Z.; Bowden, J. A.; Patterson, R. E.; Cochran, J. A.; Beecher, C. W. W.; Garrett, T. J.; Yost, R. A. LipidMatch: an automated workflow for rule-based lipid identification using untargeted high-resolution tandem mass spectrometry data. BMC Bioinformatics 2017, 18 (1), 331.
(40) Haupt, R. L.; Haupt, S. E. Practical Genetic Algorithms, Second Edition; Wiley: New York, 2004.
(41) Makarov, A.; Kholomeev, A. Mass spectrometer power sources with polarity switching. U.S. Patent 9058964B2, 2008.
(42) Danne-Rasche, N.; Coman, C.; Ahrends, R. Nano-LC/NSI MS Refines Lipidomics by Enhancing Lipid Coverage, Measurement Sensitivity, and Linear Dynamic Range. Anal. Chem. 2018, 90 (13), 8093–8101.
(43) Koelmel, J. P.; Kroeger, N. M.; Gill, E. L.; Ulmer, C. Z.; Bowden, J. A.; Patterson, R. E.; Yost, R. A.; Garrett, T. J. Expanding Lipidome Coverage Using LC-MS/MS Data-Dependent Acquisition with Automated Exclusion List Generation. J. Am. Soc. Mass Spectrom. 2017, 28 (5), 908–917.
(44) Broeckling, C. D.; Hoyes, E.; Richardson, K.; Brown, J. M.; Prenni, J. E. Comprehensive Tandem-Mass-Spectrometry Coverage of Complex Samples Enabled by Data-Set-Dependent Acquisition. Anal. Chem. 2018, 90 (13), 8020–8027.
(45) Bailey, D. J.; Rose, C. M.; McAlister, G. C.; Brumbaugh, J.; Yu, P.; Wenger, C. D.; Westphall, M. S.; Thomson, J. A.; Coon, J. J. Instant spectral assignment for advanced decision tree-driven mass spectrometry. Proc. Natl. Acad. Sci. 2012, 109 (22), 8411–8416.

Figure 1. Simulating MS Acquisition. (A) Fast polarity switching data-dependent scan cycle used for lipidomics data acquisition. (B) Overview of seed information and processing steps for in silico LC-MS/MS simulation. (C) Q-Exactive HF schematic and instrument performance characteristics modelled.
Figure 2. Instrument Performance Simulation and Validation. (A) Lipid MS/MS injection times extracted from LC-MS/MS analysis of HAP1 cell extracts using various MS/MS isolation width and AGC target values. (B) Correlation of simulated and experimental MS/MS injection times from LC-MS/MS analysis of HAP1 cell extracts. (C) Experimental quadrupole transmission peaks for m/z 524 vertically offset for visual clarity. (D) Simulated quadrupole transmission peaks for m/z 524 vertically offset for visual clarity. (E) Simulated quadrupole transmission peak for PC 18:1/18:1 [M+H]+ isotope cluster. (F) Error of simulated precursor ion fraction across varying MS/MS isolation width settings for MS/MS analysis of negative Calmix ions. (G) MS/MS noise band intensity for negative Calmix ions measured using varied MS/MS resolution settings. (H) Correlation of simulated and experimental MS/MS noise band intensities from LC-MS/MS analysis of HAP1 cell extracts.
Figure 3. Modeling Lipid Spectral Matches. (A) Spectral match score for lipid reference standard MS/MS across various MS/MS signal-to-noise ratios. (B) Number of lipid spectral matches from lipid reference standard MS/MS mixed in silico at varying ratios with isobaric lipid MS/MS.
Figure 4. Simulation Accuracy for Varied Method Parameters. Lipid spectral matches, identified features, and unique lipid species identified for (A) experimental and (B) simulated LC-MS/MS acquisition of HAP1 cell extracts across varied MS/MS maximum injection time and top N values. Lipid spectral matches, identified features, and unique lipid species identified for (C) experimental and (D) simulated LC-MS/MS acquisition of HAP1 cell extracts across varied MS/MS isolation width and AGC target values.
Figure 5. Genetic Algorithm for Rapid Method Optimization. (A) Overview of genetic algorithm for selection of optimal method parameter sets. (B) Simulated number of unique HAP1 lipid identifications and (C) rank for the optimal method parameter set returned by the genetic algorithm and by a random parameter search of equal iterations. Number of simulated unique lipid identifications and the optimal method parameter set returned by the genetic algorithm for simulated LC-MS/MS analysis of (D) HAP1 cells, (E) mouse cecum, (F) mouse plasma, and (G) yeast cells.
Figure 6. Theoretical MS Improvements for Increased Lipid Identification. (A) Maximum peak depth for all identifiable lipids (blue) and lipids identified using optimal parameter set (yellow) for 0.25 min segments of simulated LC-MS/MS acquisition of HAP1 cell extracts. (B) Simulated unique HAP1 lipid identifications using reduced polarity switching times and improved ion flux.
For TOC Only