Accelerating Lipidomic Method Development ... - ACS Publications


Article

Accelerating Lipidomic Method Development through in silico Simulation Paul D Hutchins, Jason D Russell, and Joshua J. Coon Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.9b01234 • Publication Date (Web): 12 Jul 2019 Downloaded from pubs.acs.org on July 16, 2019


Analytical Chemistry is published by the American Chemical Society, 1155 Sixteenth Street N.W., Washington, DC 20036. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Accelerating Lipidomic Method Development through in silico Simulation

Paul D. Hutchins1,3, Jason D. Russell2,3, and Joshua J. Coon1,2,3,4,*

1Department of Chemistry, University of Wisconsin–Madison, Madison, WI 53706, USA

2Morgridge Institute for Research, Madison, WI 53715, USA

3Genome Center of Wisconsin, Madison, WI 53706, USA

4Department of Biomolecular Chemistry, University of Wisconsin–Madison, Madison, WI 53706, USA

* Corresponding author: [email protected]


ABSTRACT

Judicious selection of mass spectrometry (MS) acquisition parameters is essential for effectively profiling the broad diversity and dynamic range of biomolecules. Typically, acquisition parameters are individually optimized to maximally characterize analytes from each new sample matrix. This time-consuming process often ignores the synergistic relationship between MS method parameters, producing sub-optimal results. Here we detail the creation of an algorithm which accurately simulates LC-MS/MS lipidomic data acquisition performance for a benchtop quadrupole-Orbitrap MS system. By coupling this simulation tool with a genetic algorithm for constrained parameter optimization, we demonstrate the efficient identification of LC-MS/MS method parameter sets individually suited for specific sample matrices. Finally, we utilize the in silico simulation to examine how continued developments in MS acquisition speed and sensitivity will further increase the power of MS lipidomics as a vital tool for impactful biochemical analysis.

INTRODUCTION

Continued improvements to mass spectrometer (MS) acquisition speed and sensitivity over the last two decades have reinforced its role as the premier bioanalytical tool for unbiased compound identification and quantitation.1,2 These advances in MS hardware and data acquisition strategies now permit the rapid profiling of thousands of biomolecules from complex matrices3–5 and have opened the door to the system-wide characterization of biological processes.6,7 To translate these technological advances into improved biomolecular measurements, proper selection of instrumental operating parameters through rigorous method development is still required.8

Traditionally, the optimization of acquisition settings for new instrument platforms or sample types involves the sequential iteration of select parameters to improve various method performance metrics (e.g., number of identifications and run-to-run overlap).9–12 This pragmatic approach is necessary to ensure sufficient instrument performance within a reasonable timeframe as thousands of potential method parameter combinations exist.13 Although practical, the success of this iterative process heavily depends on user expertise and the time allotted for method optimization.

Several approaches have been developed to expedite the MS method development cycle including statistical design of experiment (DOE) methods,8,14 derivation of vital method parameters,15 and software for method performance extraction.16 Although each of these approaches seeks to rapidly uncover interdependent relationships between MS parameters and ensure optimal performance, these methods still require numerous LC-MS/MS experiments. For the field of chromatographic separations, tools exist which accelerate method development by simulating the separation of mixtures in silico.17 These tools model chromatographic systems with sufficient accuracy to predict the optimal operating conditions for theoretical compound mixtures. Such computational simulation is highly attractive as it performs rapid optimization on new sample types with minimal a priori knowledge of the sample’s chromatographic behavior. Although powerful, in silico simulation has not been developed for bioanalytical MS method optimization likely due, in part, to the complexity of ion manipulation, the growing array of data acquisition techniques, and the diverse applications of LC-MS/MS analysis.

For the emerging field of discovery lipidomic analysis via LC-MS/MS, the use of different instrumental methods18 leads, in part, to a wide range of analytical performance.19 To empower the rapid selection of optimal lipidomic MS parameters, we describe an algorithm which simulates LC-MS/MS data-dependent acquisition (DDA) of complex lipid extracts on a quadrupole-Orbitrap MS platform (Q Exactive HF). By focusing on a single instrument platform and biomolecule class, we derive equations which accurately model the ion flux, ion transmission, quadrupole isolation, fragmentation efficiency, and spectral noise under various method parameter combinations. From several experimental inputs and user-supplied method settings, the algorithm rapidly simulates data-dependent MS analysis of the sample and predicts the number of lipid identifications for a given experiment. After outlining the creation and validation of the simulation, we couple the in silico model to a modified genetic algorithm to efficiently explore the method parameter space for several sample types and return near-optimal method parameter sets. Finally, we model the effect continued improvements to MS acquisition speed and sensitivity will have on lipidomic LC-MS/MS analysis and its continued growth into a mature -omics technology.

EXPERIMENTAL SECTION

Materials and Reagents

The lipid reference standard mixture was purchased from Avanti Polar Lipids (Differential Ion Mobility System Suitability Lipidomix, Alabaster, AL). Trifluoroacetic acid (TFA), acetonitrile (ACN), and 1.0 M NaOH were purchased from Fisher Scientific (Hampton, NH). Acetonitrile/aqueous sodium trifluoroacetate (NaTFA) solution (1:1 ACN/0.1% TFA, pH 3.5) was prepared according to Moini et al.20 Pierce negative ion calibration solution (Calmix) was purchased from Thermo Scientific (Rockford, IL). Lipids were extracted from HAP1 and yeast cells using a CHCl3/MeOH/H2O (6:3:2, v/v/v) solvent system, an aliquot of the organic layer was dried under vacuum, and the dried lipid material was resuspended in ACN/IPA/H2O (65:30:5, v/v/v). For mouse cecum and plasma samples, lipids were extracted using an MTBE/MeOH/H2O (10:3:2.5, v/v/v) solvent system, an aliquot of the organic layer was dried under vacuum, and the dry lipid material was resuspended in MeOH/toluene (9:1). Further sample preparation details are provided in Supplemental Methods.

LC-MS/MS Acquisition

Lipid extracts were chromatographically separated using an Acquity CSH C18 column (50 °C, 2.1 × 100 mm, 1.7 µm particle size; Waters Corporation) coupled to a Vanquish Binary Pump (Thermo Scientific). Mobile phase A consisted of 10 mM ammonium acetate in 70:30 (v/v) ACN/H2O with 250 μL/L acetic acid and mobile phase B consisted of 10 mM ammonium acetate in 90:10 (v/v) IPA/ACN with 250 μL/L acetic acid. Ten microliters of the lipid extracts were injected onto the column using the Vanquish autosampler (Thermo Scientific), ionized using a heated ESI (HESI) source (Thermo Scientific), and analyzed on the Q Exactive HF platform (Thermo Scientific). LC-MS/MS data were acquired using alternating positive and negative scan cycles (i.e., polarity switching) consisting of a single MS scan followed by multiple data-dependent MS/MS scans. Complete chromatography and MS acquisition parameters are supplied in Table S1.

LC-MS/MS Data Processing

All lipid identification from LC-MS/MS data was performed using LipiDex (v1.0.2).20 Briefly, MS/MS scans were extracted from each LC-MS/MS experiment and converted to the Mascot Generic Format (MGF) using ProteoWizard (ProteoWizard Software Foundation, v3.0).21 The resulting MS/MS spectra were searched against the LipiDex HCD Acetate library in LipiDex to generate putative lipid identifications based on spectral similarity to in silico lipid fragmentation spectra. Retention time alignment and chromatographic feature finding were performed using Compound Discoverer (Thermo Scientific, v2.1). Putative MS/MS identifications were then assigned to chromatographic features and filtered in LipiDex to generate final lipid identifications. Complete data processing parameters are provided in Tables S2 and S3.

RESULTS AND DISCUSSION

Simulating Data-Dependent Acquisition

During DDA, precursor ions are selected for MS/MS analysis based on their relative intensity from a survey MS scan (Figure 1A). Once sampled, each unique ion population is often excluded from subsequent re-analysis for a pre-determined period to reduce redundant MS/MS sampling. Although remarkably effective, this straightforward data acquisition strategy can suffer from poor run-to-run identification overlap and depth of biomolecule identification due to DDA’s semi-stochastic MS/MS sampling which targets the most abundant MS1 features.22,23 Although the acquisition speed and sensitivity of modern mass spectrometers have significantly increased the depth and reproducibility of DDA methods,24 proper tuning of instrument parameters is still required to harness these improvements.

With this goal in mind, we developed a software tool which simulates the lipidomic data acquisition and identification performance of a benchtop quadrupole-Orbitrap (Q Exactive HF) LC-MS/MS system. We demonstrate the rapid selection of method parameters to maximize lipid identifications in a single experiment. The Q Exactive HF’s relatively straightforward instrument geometry and widespread use in the lipidomics community18 render it an ideal target for lipidomic simulation. Although several MS simulation tools exist for proteomic analysis, these tools focus on generating “ground truth” datasets to benchmark data acquisition25 or peptide identification26–29 algorithms and do not model the effect of varied MS operating parameters on biomolecule identification. Derivation of the complex factors governing ion transport, fragmentation, and analysis from first principles is, at this time, not possible and is not the focus of this manuscript. Instead, we aim to descriptively model data-dependent acquisition with sufficient accuracy to return near-optimal method parameter sets and accelerate the lipidomic method development process.

As outlined in Figure 1B, the simulation first replicates the Q Exactive HF data-dependent scan cycle based on user-supplied instrument parameters (e.g., Orbitrap transient time) and method settings (e.g., MS/MS isolation width). For each scan cycle, the survey MS scan is reconstructed using user-supplied lists which contain chromatographic features and potential lipid identifications. The generation of a chromatographic feature table allows the simulation to distinguish unique biochemical features from background noise and other co-eluting chromatographic peaks. When combined with a user-supplied LC-MS survey experiment, these data accurately represent the incoming ion flux generated by the LC-ESI system (Figure S1). After MS scan creation, MS/MS scans are “triggered” with user-provided DDA settings. For each scan, we simulate the theoretical signal-to-noise (S/N) ratio (ratio of the most intense to least intense fragment ion)30 and precursor ion fraction (PIF). Based on these two metrics and whether the chromatographic feature stems from an identifiable lipid species, the predicted identification success of the MS/MS scan is recorded. The outlined scan cycle simulation process is then repeated for the entire LC gradient and the cumulative number of identified lipid features is calculated.
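The precursor-selection step of the scan cycle described above can be illustrated with a short Python sketch. It is not the authors' implementation: the `Peak` class, the `select_precursors` function, the m/z tolerance, and all example values are hypothetical, and only top-N intensity ranking with timed exclusion is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float
    intensity: float


def select_precursors(survey, top_n, excluded_until, now_s, exclusion_s, mz_tol=0.01):
    """Pick up to top_n of the most intense survey peaks that are not excluded.

    excluded_until maps a previously sampled m/z to the time (s) at which its
    exclusion ends; it is updated in place for each newly selected precursor.
    """
    chosen = []
    for peak in sorted(survey, key=lambda p: p.intensity, reverse=True):
        if len(chosen) == top_n:
            break
        blocked = any(
            abs(peak.mz - mz) <= mz_tol and now_s < until
            for mz, until in excluded_until.items()
        )
        if not blocked:
            chosen.append(peak)
            excluded_until[peak.mz] = now_s + exclusion_s
    return chosen


# Hypothetical survey scan, reused for two consecutive scan cycles.
survey = [Peak(760.585, 5e6), Peak(786.601, 2e6), Peak(734.569, 8e5), Peak(810.601, 1e5)]
exclusion = {}
cycle_1 = select_precursors(survey, top_n=2, excluded_until=exclusion, now_s=0.0, exclusion_s=10.0)
cycle_2 = select_precursors(survey, top_n=2, excluded_until=exclusion, now_s=1.0, exclusion_s=10.0)
```

In this toy example the second cycle skips the two precursors sampled in the first cycle and moves on to the next most intense peaks, which is the behavior that lets DDA sample progressively deeper into the survey scan.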

Modeling MS/MS Acquisition

Accurate simulation of lipid data acquisition and identification performance requires in-depth modeling of the entire MS/MS acquisition cycle. Accordingly, we describe the systematic characterization of this process which includes ion transmission, quadrupole isolation, lipid fragmentation efficiency, and spectral noise (Figure 1C). For each triggered MS/MS scan, ions of a given m/z range are selected by the quadrupole mass filter (QMF) and collected in the C-trap. As shown in Figure 2A, the time required (injection time) to collect the requisite number of charges (AGC target) is heavily dependent on the MS/MS isolation width (IW) and the intensity of the precursor. We investigated the interdependence of these parameters by varying the MS/MS AGC target and isolation width for a constant ion flux of an ACN/NaTFA (aq) solution. MS analysis of this sample yielded singly-charged NaTFA clusters across a wide m/z range.31 To mathematically correct MS/MS injection times for the co-isolation of NaTFA isotopes, the MS triggered a single-ion monitoring (SIM) scan with the same isolation width after each scan. As anticipated, we observed higher MS/MS injection times at lower isolation widths and higher AGC target values (Figure S2A). Each measured injection time was converted to a transmission efficiency (E) value (Figure S2B) according to the following equation:

$$E = \frac{\text{\# of charges}}{\mathrm{IT} \times \text{Parent Intensity}}$$

Following the method of Remes et al.,32 we fit a curve to each transmission efficiency profile using the following equation:

$$E = \frac{a - b}{1 + \left(\frac{\mathrm{IW}}{c}\right)^{d}} + b$$

The values of b, c, and d were found to be linearly dependent on precursor m/z (Figure S2D), which permitted the calculation of a theoretical MS/MS injection time for any precursor ion population given the MS/MS AGC target and isolation width. To validate the derived relationship, we estimated the MS/MS injection time for nine lipidomic experiments collected with various MS/MS AGC target and isolation width settings. As seen in Figure 2B, we observed a strong correlation between the actual and simulated injection time values spanning several orders of magnitude.
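As a rough illustration of how a fitted transmission curve translates into an injection time, the sketch below evaluates the four-parameter curve and rearranges the transmission efficiency definition to IT = charges / (E × intensity). The function names, the fit parameter values, and the optional maximum injection time cap are assumptions made for illustration; in the text the fit parameters depend on precursor m/z, whereas here they are passed in as fixed numbers.

```python
def transmission_efficiency(iw, a, b, c, d):
    """Four-parameter curve: E = (a - b) / (1 + (IW / c)**d) + b."""
    return (a - b) / (1.0 + (iw / c) ** d) + b


def injection_time_s(agc_target, precursor_flux, iw, a, b, c, d, max_it_s=None):
    """Time needed to collect agc_target charges from a precursor ion flux.

    precursor_flux is treated as charges per second before quadrupole losses
    (an assumption about units). max_it_s optionally caps the result, mimicking
    a maximum injection time setting.
    """
    efficiency = transmission_efficiency(iw, a, b, c, d)
    it_s = agc_target / (efficiency * precursor_flux)
    return it_s if max_it_s is None else min(it_s, max_it_s)


# Hypothetical fit parameters: low transmission at narrow widths, plateau near b.
FIT = dict(a=0.05, b=0.8, c=0.7, d=3.0)

narrow = injection_time_s(1e5, 1e7, iw=0.4, **FIT)
wide = injection_time_s(1e5, 1e7, iw=2.0, **FIT)
capped = injection_time_s(1e5, 1e3, iw=0.4, max_it_s=0.1, **FIT)
```

With these illustrative values the narrow isolation width requires a longer injection time than the wide one, matching the trend reported for Figure S2A.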

Precursor ion populations isolated for MS/MS analysis are rarely pure and often contain multiple co-isolated species whose m/z falls within the QMF’s isolation width.22 To effectively simulate quadrupole transmission profiles, we measured the relative intensity of several Calmix ions while varying the MS/MS isolation width and isolation center (Figure 2C). The resulting quadrupole transmission peaks were then fit with a flat-topped Gaussian according to the following equation (Figure S3A):

$$I = e^{-ax^{2} - bx^{4}}$$

For all m/z values analyzed, the a term was dependent on the isolation width and m/z value, the b term was dependent on the isolation width, and the apex m/z offset was dependent on the isolation width and m/z value (Figure S3B-D). Based on these equations, the simulation models the quadrupole transmission peak for any given m/z and isolation width with sufficient accuracy to calculate the purity of any given MS/MS ion population (Figure 2D). For each simulated MS/MS scan, the algorithm computes a PIF value using the intensity of nearby mass spectral peaks, the simulated transmission profile, and the MS/MS isolation width setting (Figure 2E). To validate our PIF estimation, we collected SIM scans of the isotopic cluster of several Calmix ions using various isolation widths. For all analyzed isolation widths, the relative error for the monoisotopic PIF did not exceed 12% (Figure 2F).
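One way to picture the PIF calculation is to weight each nearby peak by the flat-topped Gaussian transmission profile and take the target's share of the transmitted signal. The sketch below does this under simplifying assumptions: x is taken as the m/z distance from the isolation center, the apex offset is ignored, and the coefficients and peak lists are hypothetical rather than fitted values.

```python
import math


def transmission(offset_mz, a, b):
    """Flat-topped Gaussian transmission, I = exp(-a*x**2 - b*x**4)."""
    return math.exp(-a * offset_mz ** 2 - b * offset_mz ** 4)


def precursor_ion_fraction(target, neighbors, a, b):
    """Fraction of transmitted signal that belongs to the target precursor.

    target and each neighbor are (m/z, intensity) tuples; the isolation
    window is assumed to be centered on the target m/z.
    """
    target_mz, target_intensity = target
    target_signal = target_intensity * transmission(0.0, a, b)
    interfering = sum(
        intensity * transmission(mz - target_mz, a, b) for mz, intensity in neighbors
    )
    return target_signal / (target_signal + interfering)


# Hypothetical coefficients for a narrow window; the x**4 term gives the flat top.
A, B = 0.0, 50.0
pure = precursor_ion_fraction((760.585, 1e6), [], A, B)
near = precursor_ion_fraction((760.585, 1e6), [(760.685, 1e6)], A, B)
far = precursor_ion_fraction((760.585, 1e6), [(761.585, 1e6)], A, B)
```

A neighbor 0.1 m/z away is transmitted almost as efficiently as the target and roughly halves the PIF, whereas a neighbor 1 m/z away is suppressed by the steep edges of the profile.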

Once collected in the C-trap, lipid precursor ions are fragmented via higher-energy collisional dissociation (HCD) in the HCD cell and the resulting fragments are used to identify the lipid class and side chain moieties.33 Fragment ion abundances for lipid species vary widely due to the chemical diversity of lipids and the collision energy used for fragmentation.33,34 Accurate prediction of these fragmentation patterns, especially absolute intensities, is not yet routine.35 To approximate lipid fragmentation, we analyzed the fragmentation efficiency (here defined as the ratio between precursor intensity and the most intense fragment ion) from an LC-MS/MS analysis of HAP1 cell extracts (Figure S4A). Once log-transformed, the observed fragmentation efficiencies followed a normal distribution (x̄: −1.5, σ: 0.6; Figure S4B). Based on this approximation, all simulated MS/MS scans are assigned a random fragmentation efficiency drawn from this distribution. To avoid creating a non-deterministic simulation, the S/N of each MS/MS scan is simulated fifty times and considered successful only if the majority of replicate simulations exhibit sufficient S/N.
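The replicate-and-vote idea can be sketched as follows. Several details are assumptions rather than statements from the text: the log transform is taken as base 10, the efficiency is applied as fragment signal = precursor signal × efficiency, the S/N threshold defaults to 3, and a fixed seed is used so the example is reproducible.

```python
import random


def passes_sn_majority(precursor_signal, noise, sn_threshold=3.0, n_replicates=50,
                       mean_log10=-1.5, sd_log10=0.6, seed=0):
    """Return True if most replicate draws give S/N above the threshold.

    Each replicate draws a fragmentation efficiency whose log10 is normally
    distributed (mean -1.5, sd 0.6), then compares the resulting fragment
    signal with the noise level.
    """
    rng = random.Random(seed)
    passes = 0
    for _ in range(n_replicates):
        efficiency = 10.0 ** rng.gauss(mean_log10, sd_log10)
        if precursor_signal * efficiency / noise > sn_threshold:
            passes += 1
    return passes > n_replicates / 2


strong = passes_sn_majority(precursor_signal=1e9, noise=1.0)
weak = passes_sn_majority(precursor_signal=1e-3, noise=1.0)
```

Because the outcome is a majority vote, it behaves much like testing whether the median fragmentation efficiency yields sufficient S/N, which damps the scan-to-scan randomness of the individual draws.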

After collisional dissociation, product ions are mass-analyzed in the Orbitrap to generate MS/MS spectra. When the MS collects insufficient lipid ions for analysis, the signal corresponding to noise dominates the spectrum (Figure S5A). For the Orbitrap mass analyzer, non-chemical noise arises from the thermal noise of the pre-amplifier.36 As observed previously,9,37 we found this spectral noise (here defined as the least intense ion in a spectrum) for an MS/MS of Calmix ions to be inversely dependent on the injection time and the square root of the transient length (Figure 2G, Figure S5B-D):

$$N = \frac{k}{\mathrm{IT} \times \sqrt{\mathrm{Res}}}$$

We validated this relationship on several LC-MS/MS experiments collected using varying MS/MS resolution settings and observed a strong correlation between the experimentally-measured and simulated noise levels over ~three orders of magnitude (Figure 2H).
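A minimal numerical sketch of the noise relationship is given below, assuming noise scales as k / (IT × √Res) with the resolution setting standing in for transient length. The constant `k`, the function names, and the example settings are placeholders rather than fitted or reported values.

```python
import math


def spectral_noise(injection_time_ms, resolution, k=1.0):
    """Noise level taken as inversely proportional to IT and sqrt(resolution)."""
    return k / (injection_time_ms * math.sqrt(resolution))


def signal_to_noise(most_intense_fragment, injection_time_ms, resolution, k=1.0):
    """S/N as the most intense fragment divided by the modeled noise level."""
    return most_intense_fragment / spectral_noise(injection_time_ms, resolution, k)


# Doubling the injection time or quadrupling the resolution each halve the noise.
ratio_it = spectral_noise(50, 15000) / spectral_noise(100, 15000)
ratio_res = spectral_noise(50, 15000) / spectral_noise(50, 60000)
```

Under this scaling, longer injection times improve S/N faster than longer transients do, which is one reason the balance between the two settings matters for method design.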

Predicting Identification Probability

Numerous software tools exist which analyze lipid tandem mass spectra using theoretical databases of in silico spectra38 or fragmentation rules.34 For any given spectrum, the intensity of structurally-informative fragments relative to interfering spectral peaks largely determines the software’s ability to decipher the lipid fragmentation patterns.20 Even if an MS/MS targets a bona fide lipid for which the software has an in silico spectrum, the MS/MS event will not give rise to a confident lipid identification if it contains insufficient fragment ions. Accordingly, the probability a given lipid MS/MS spectrum will yield an identification is mainly dependent on the spectral signal-to-noise ratio and precursor ion fraction.

To determine the minimum S/N required for lipid identification, we collected MS/MS scans for five phospholipid and glycerolipid reference standards using different MS/MS injection times. All spectra were searched in LipiDex using a precursor and product ion tolerance of 0.01 Da and the spectral similarity score (dot product) was recorded. As seen in Figure 3A, similarity scores dropped considerably for MS/MS spectra with S/N less than 3. Indeed, the similarity scores below this S/N threshold appeared random and followed no discernable trend. Even when fragment ions are sufficiently abundant, the probability of a confident lipid identification decreases if the spectrum contains co-fragmented species.34,39 To elucidate the minimum PIF required to generate lipid identifications, we selected 135 pairs of isobaric lipid spectra, mixed them at varying ratios in silico, and searched them against LipiDex’s spectral libraries (Figure 3B). We observed significantly fewer spectral matches at or below the 1:1 mixing ratio. Based on these two analyses, the simulation considers all lipid MS/MS spectra to be successful if S/N > 3 and PIF > 0.5. Note these thresholds are specific to the method of lipid identification used in LipiDex and would likely differ for other lipid identification algorithms. Additionally, these thresholds do not account for instances where co-fragmented lipid species share common fragment m/z values. In such cases, the minimum PIF for a confident lipid identification would be significantly higher.
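The two thresholds combine into a simple decision rule, sketched here in Python. The function names are hypothetical; the helper that converts a mixing ratio into a PIF simply shows why a 1:1 mixture corresponds to a PIF of 0.5 when only two species are co-isolated.

```python
def pif_from_mixing_ratio(target_parts, interferent_parts):
    """PIF of the target when two species are mixed target:interferent."""
    return target_parts / (target_parts + interferent_parts)


def is_identified(sn, pif, identifiable, sn_min=3.0, pif_min=0.5):
    """Apply the S/N > 3 and PIF > 0.5 rule to an identifiable lipid feature."""
    return identifiable and sn > sn_min and pif > pif_min


one_to_one = pif_from_mixing_ratio(1, 1)      # 0.5, fails the strict PIF > 0.5 test
three_to_one = pif_from_mixing_ratio(3, 1)    # 0.75
```

For example, `is_identified(10.0, one_to_one, True)` evaluates to `False`, while the same spectrum at a 3:1 ratio passes, in line with the drop in spectral matches reported at or below the 1:1 mixing ratio.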

Simulating Acquisition Performance

The number of identified lipid features in an LC-MS/MS experiment results from the intricate interplay between numerous MS acquisition parameters. For an in silico simulation tool to have utility, it must accurately mirror this complexity for a wide range of method setting combinations. Accordingly, we validated the number of predicted lipid identifications using two HAP1 cell LC-MS/MS datasets. First, we systematically varied the MS/MS topN and MS/MS maximum injection time values for a 30-minute fast polarity switching LC-MS/MS method and measured the number of lipid identifications generated by each experiment. For complex samples which contain numerous chromatographic features across a wide dynamic range, these two parameters largely control the balance between MS/MS quantity and quality.13 Next, we varied the MS/MS isolation width and AGC target setting across nine LC-MS/MS experiments to generate data with varying levels of spectral signal-to-noise and co-fragmentation. For each collected LC-MS/MS experiment, we performed an LC-MS/MS simulation with the same method parameters.

As shown in Figure 4A-B, the number of unique lipids identified for the first experimental set ranged between 490 and 590 lipids while the simulated lipid identification performance varied between 400 and 611 lipids. Encouragingly, the simulation accurately reproduced small relative differences in ID performance (average percent error: 10.5%) even though the MS/MS maximum injection times were only incremented by 5 ms for each experiment. Impressed with this level of accuracy, we further explored the simulation’s ability to model the rate and depth of MS/MS lipid identification across an LC-MS/MS gradient as spectral complexity fluctuates. As displayed for two experiments in Figure S6-7, the simulation modeled the nuanced changes in MS/MS acquisition rate and depth of identification.

For the second experimental dataset (isolation width/AGC target), the number of unique lipid identifications only varied between 483 and 527 (Figure 4C). Despite this small range of outcomes, the simulation accurately predicted lipid identification performance with an average error of 12.3% and precisely mirrored the minor identification differences between each experiment (Figure 4D). Together, these experiments suggest the algorithm accurately simulates the effect of different method parameter sets on MS/MS speed, acquisition, and quality.

Rapid Method Optimization

LC-MS/MS method development involves the careful selection of method parameters likely to provide the maximum number of confident identifications. Although a single polarity switching LC-MS/MS experiment can be simulated in less than 30 seconds using the outlined approach (Intel® i7 CPU, 2.67 GHz), the routine exploration of lipid identification performance for all method parameter combinations is infeasible. Even if the potential values for a given parameter are restricted, there still exist thousands of potential combinations. Typically, the analyst’s intuition and previous experiments serve to exclude many of these potential methods.

To reduce the time required to determine the optimal parameter set, we implemented a modified genetic algorithm for parameter optimization. Genetic algorithms simultaneously optimize multiple parameters by mirroring the evolutionary process of selection and recombination based on genotype fitness. This class of optimization algorithm is readily parallelized and particularly effective for time-consuming fitness calculations.40 As implemented here (Figure 5A), the genetic algorithm begins with a pseudo-random initial population of method parameter sets and simulates the number of unique lipid identifications generated by each set. From this population, the algorithm selects a user-defined number of the top-performing parameter sets to pair together. Each pair then exchanges their parameter values to generate two new parameter sets. To reduce the likelihood of premature convergence to a sub-optimal solution, each parameter value has a probability of being randomly modified. This process repeats until the population has converged to a single optimal parameter set.
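The selection, exchange, and mutation steps described above can be sketched generically in Python. This is an illustrative toy, not the modified algorithm used in this work: the parameter grid, the `toy_fitness` stand-in for the simulated identification count, the fixed number of generations (instead of a convergence test), and the retention of the best set seen so far are all assumptions.

```python
import random


def genetic_search(grid, fitness, pop_size=20, n_parents=6, mutation_rate=0.1,
                   generations=30, seed=0):
    """Search a discrete parameter grid with selection, crossover, and mutation."""
    rng = random.Random(seed)
    names = list(grid)
    # Pseudo-random initial population of parameter sets.
    population = [{n: rng.choice(grid[n]) for n in names} for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Keep a fixed number of top-performing sets as parents.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = []
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            # Each pair exchanges parameter values to give two new sets.
            swap = {n: rng.random() < 0.5 for n in names}
            kids = [
                {n: dad[n] if swap[n] else mom[n] for n in names},
                {n: mom[n] if swap[n] else dad[n] for n in names},
            ]
            for kid in kids:
                # Random modification guards against premature convergence.
                for n in names:
                    if rng.random() < mutation_rate:
                        kid[n] = rng.choice(grid[n])
                children.append(kid)
        population = children[:pop_size]
        best = max(population + [best], key=fitness)
    return best


# Hypothetical grid and a toy fitness that peaks at a known parameter set.
GRID = {
    "top_n": [2, 5, 10, 15, 20],
    "max_it_ms": [10, 30, 50, 70, 90],
    "isolation_width": [0.4, 0.7, 1.0, 1.4],
}


def toy_fitness(p):
    return -(abs(p["top_n"] - 10) / 5 + abs(p["max_it_ms"] - 50) / 20
             + abs(p["isolation_width"] - 1.0))


best = genetic_search(GRID, toy_fitness)
```

In a real application the fitness call would be the LC-MS/MS simulation itself, so the number of fitness evaluations (population size × generations) is the quantity that determines run time.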

To evaluate our genetic algorithm’s utility, we first performed an exhaustive simulation of all method parameter combinations for a 30-minute polarity switching LC-MS/MS experiment. Only instrument parameter combinations with maximum duty cycles (time between MS scans of the same polarity) below 1200 ms were examined. This duty cycle limitation ensured the majority of chromatographic features could be quantified accurately (>10 points across the peak) while reducing the number of potential parameter sets to 9,217 (see Supplemental Methods for algorithm description). Simulated unique lipid identifications for these methods ranged from 389 to 634 lipids (Figure S8). Using this dataset to return the simulation result quickly, we performed 100 head-to-head comparisons between the genetic algorithm and a random parameter search of equal iterations. As seen in Figure 5B, the genetic algorithm outperformed the random search 93 out of 100 times and, on average, would have taken 24 minutes to complete (estimated for Intel® Core i7, 3.30 GHz). Encouragingly, the genetic algorithm always returned either the first (634 lipid IDs) or second-ranked (633 lipid IDs) parameter set, suggesting the algorithm rarely converges on a sub-optimal solution (Figure 5C).

With the validated genetic algorithm in hand, we explored the utility of in silico method optimization using four different sample types: HAP1 cells, mouse cecum, mouse plasma, and yeast cells. These samples were chosen as they contain varying levels of lipid complexity (1115, 614, 570, and 249 identifiable lipids, respectively). For each sample, the simulation was provided with a sample-specific survey LC-MS experiment, a list of chromatographic features, and a list of potential lipid identifications. As shown in Figure 5D-G, the algorithm returned different optimal parameter sets for each sample type with simulated total unique lipid identifications of 634, 277, 228, and 136, respectively. We were encouraged to observe that the TopN, MS/MS IT, and MS/MS IW parameters all scaled with the lipidomic complexity of the sample. These sample-specific parameter sets highlight the importance of optimizing method parameters for individual sample types and the potential for in silico simulation to expedite method development.

5

Next-Generation Lipidomic Analysis

The vast structural diversity of lipid species mandates that the MS platform be operated in both positive and negative polarities to obtain optimal ionization and fragmentation.33 Such dual-polarity data can either be collected in sequential LC-MS/MS experiments or, if supported by the MS platform, in the same experiment. Rapid polarity switching methods are advantageous when used for large sample cohort analysis as they halve total acquisition time and simplify data integration.10 Despite these benefits, the use of fast polarity switching presents unique analytical challenges. After the instrument reverses polarity, it cannot collect or analyze any ion populations until the instrument’s electrostatic fields stabilize41 (~230 ms for the Q Exactive HF). This stabilization period dramatically reduces the fraction of the duty cycle used for MS analysis. Additionally, the total duty cycle must be split between both positive and negative ion acquisition, further decreasing the obtainable MS/MS sampling depth for a given polarity.
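As a rough worked example of this overhead, the sketch below estimates the fraction of one duty cycle left for ion collection and analysis. It assumes two polarity reversals per cycle (positive to negative and back), which follows from defining the duty cycle as the time between MS scans of the same polarity, and it uses the 1200 ms ceiling and ~230 ms stabilization time quoted in the text. The function name is hypothetical.

```python
def usable_fraction(duty_cycle_ms, switch_ms, switches_per_cycle=2):
    """Fraction of a duty cycle not spent waiting for the fields to stabilize."""
    dead_time_ms = switch_ms * switches_per_cycle
    if dead_time_ms >= duty_cycle_ms:
        raise ValueError("polarity switching consumes the entire duty cycle")
    return 1.0 - dead_time_ms / duty_cycle_ms


# With a 1200 ms duty cycle and ~230 ms per switch, 460 ms is dead time.
print(round(usable_fraction(1200, 230), 3))  # 0.617
```

Shorter duty cycles lose proportionally more: the same 460 ms of dead time leaves a smaller share of the cycle for acquisition, and that share is then split between the two polarities.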

This inherent challenge is evident from the performance of the optimal methods predicted in Figure 5D-G. Although the samples’ differing lipid diversity produced unique optimal parameter sets and lipid identifications, the actual percentage of possible lipids identified remained relatively unchanged (57%, 45%, 40%, and 54% respectively). This apparent threshold is likely due to the broad dynamic range over which lipid species are present in each sample (~4.5 orders of magnitude, Figure S9). Even for a relatively simple yeast cell extract, the LC-MS/MS method must acquire MS/MS spectra at a sufficient rate to keep pace with the rate of LC peak elution. This acquisition speed requirement mandates reduced MS/MS injection times, generating low S/N spectra for low-abundance species. Insufficient profiling depth even when operating under optimal settings is evident for the simulated analysis of the HAP1 cell extract presented in Figure 6A. Here, we display the maximum lipid identification depth attained for each 0.25 min chromatogram segment. We define peak depth to be the number of chromatographic features, excluding isotopes, of higher intensity than the least intense identified lipid. During the period of greatest spectral complexity (7.5-12.5 mins), this peak depth can rise to 200 features. Despite the in silico instrument acquiring nearly six MS/MS spectra per second during this period, the optimal 35 ms MS/MS injection times are insufficient to generate lipid identifications for many species. To overcome this apparent identification threshold for a single-shot LC-MS/MS lipidomics experiment on the Q Exactive HF platform, concomitant improvements to instrument acquisition speed and sensitivity must be realized.
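The peak depth definition above can be written as a short function. This is a sketch under stated assumptions: the `Feature` record and its field names are hypothetical, the intensities are made-up values, and the function is applied to the features of a single chromatogram segment.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    intensity: float
    is_isotope: bool = False
    identified: bool = False


def peak_depth(features):
    """Count non-isotope features more intense than the weakest identified lipid."""
    monoisotopic = [f for f in features if not f.is_isotope]
    identified = [f.intensity for f in monoisotopic if f.identified]
    if not identified:
        return 0  # no identified lipid in this segment, so depth is undefined here
    floor = min(identified)
    return sum(1 for f in monoisotopic if f.intensity > floor)


segment = [
    Feature(1e8),
    Feature(5e7, is_isotope=True),   # isotopes are excluded from the count
    Feature(2e7, identified=True),
    Feature(1e6, identified=True),   # least intense identified lipid
    Feature(5e5),                    # below the floor, not counted
]
print(peak_depth(segment))  # 2
```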

Using our in silico simulation tool, we sought to explore how these developments might further improve the Q Exactive HF platform for high-throughput, deep lipidomic profiling. Accordingly, we created 30 different instruments in silico with increased ion flux (1x to 100x) and decreased polarity switching time (233 ms to 0 ms). Although a 0 ms polarity switching time is theoretically impossible, it is not unrealistic to anticipate a 100x boost in sensitivity. Such gains have already been observed for several lipid classes on a Q Exactive Plus instrument using a packed nanobore column operated at 600 nL/min.42 For each theoretical instrument, the genetic algorithm selected the optimal method parameter set for HAP1 analysis (Figure 6B). As expected, each instrument generated increased lipid annotations, and the instrument with the highest ion flux increase and lowest polarity switching time identified the most lipids (872 lipids). Despite the significant increase in simulated lipid identifications, 243 of the 1115 identifiable lipids remain unidentified. Together, these observations suggest that the simulated hardware improvements to the Q Exactive platform will not provide sufficient sensitivity to profile the majority of low-abundance lipids reproducibly. Instead, novel intelligent data acquisition techniques23,43–45 will likely be needed to reduce the number of uninformative MS/MS spectra and unlock comprehensive lipidomic coverage in a single LC-MS/MS experiment for the Q Exactive HF.
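A grid of theoretical instruments like the one described above can be enumerated as a Cartesian product of the two hardware settings. In the sketch below, only the ranges (1x to 100x ion flux, 233 ms to 0 ms polarity switching) and the total of 30 instruments come from the text; the individual levels are placeholders chosen so that the grid has 30 entries, and the dictionary keys are hypothetical names.

```python
from itertools import product

# Placeholder levels: the text gives only the end points of each range.
ION_FLUX_GAINS = [1, 2, 5, 10, 50, 100]      # fold increase in ion flux
POLARITY_SWITCH_MS = [233, 150, 100, 50, 0]  # polarity switching time (ms)

instruments = [
    {"ion_flux_gain": gain, "polarity_switch_ms": switch_ms}
    for gain, switch_ms in product(ION_FLUX_GAINS, POLARITY_SWITCH_MS)
]
print(len(instruments))  # 30

# Lipids still unidentified on the best simulated instrument (values from the text).
identifiable_hap1 = 1115
best_simulated_ids = 872
print(identifiable_hap1 - best_simulated_ids)  # 243
```

In the study, the genetic algorithm is run once per theoretical instrument to find that instrument's optimal parameter set; that step is omitted here.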

CONCLUSIONS

Here, we describe a proof-of-concept software tool which simulates lipidomic LC-MS/MS data acquisition on the Q Exactive HF platform and accurately calculates the number of potential unique lipid identifications for a given parameter set. By deriving equations which descriptively model the isolation, collection, fragmentation, and analysis of lipid precursors for MS/MS analysis, we demonstrate rapid replication of whole DDA experiments. When coupled with a genetic algorithm for parameter optimization, this approach ensures methods are precisely tuned for a given sample type, increasing experimental lipid identification depth and reducing overall method development time. Additionally, in silico simulation circumvents the need for extensive prior experimental analyses to tease out the complex co-dependence of method parameters.

Currently, it remains unclear how much each equation would need to be tuned for any given instrument. With some minor modifications and a carefully designed infusion experiment, the described software tool could automatically re-derive the foundational descriptive models and recalibrate the simulation tool for a given instrument. The addition of a more comprehensive model of false lipid identification to the genetic algorithm’s fitness score would also further enhance the power of the outlined approach. Several method parameters, including MS/MS isolation width and MS resolution, must be balanced to minimize false lipid identifications while maximizing acquisition speed. Although the outlined simulation does include a simplified model of false lipid identifications (i.e., MS/MS PIF and S/N), the construction of a more robust model for putative lipid identification confidence could greatly increase the likelihood that the simulation tool returns a parameter set which generates the highest number of correct lipid identifications. Such a development would likely require a more complex simulation of lipid class fragment m/z and intensity. Additionally, the outlined in silico algorithm could be repurposed to generate optimal parameter sets for proteomic or metabolomic data collection, provided a sufficiently accurate prediction model for MS/MS identification probability could be derived. Although this manuscript is limited in scope to a single analyte class and instrument platform, we believe in silico simulation represents a promising method for accelerating method development and empowering the generation of high-quality MS datasets for comprehensive and impactful biochemical profiling experiments.

ASSOCIATED CONTENT

Supporting Information

Supplemental methods, LC-MS/MS analysis parameters, Compound Discoverer processing parameters, LipiDex processing parameters, simulating LC-MS ion flux, modeling ion accumulation, modeling quadrupole transmission profiles, modeling MS/MS fragmentation efficiency, modeling spectral noise, simulated LC-MS/MS acquisition and lipid identification, dynamic range of lipid identification, simulated performance of method parameter sets, and dynamic range of lipids in four tissues. Raw data are available on Chorus (Project ID 1559). The simulation software is available at https://github.com/coongroup.

AUTHOR INFORMATION

Corresponding Author

*Phone: 608-263-1718. E-mail: [email protected].

Notes

The authors declare no competing financial interests.

ACKNOWLEDGMENTS

We gratefully acknowledge support from the National Institutes of Health Grant P41 GM108538 (awarded to J.J.C.) and the Morgridge Institute for Research Metabolism Theme. We also acknowledge support from the Great Lakes Bioenergy Research Center, U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research under Award Numbers DE-SC0018409 and DE-FC02-07ER64494. Additionally, we thank the Pagliarini Lab for generating the HAP1 cell and yeast lipid extracts and Vanessa Linke for generating the mouse cecum and plasma lipid extracts.

REFERENCES

(1) Aksenov, A. A.; Da Silva, R.; Knight, R.; Lopes, N. P.; Dorrestein, P. C. Global chemical analysis of biology by mass spectrometry. Nat. Rev. Chem. 2017, 1.

(2) Walther, T. C.; Mann, M. Mass spectrometry–based proteomics in cell biology. J. Cell Biol. 2010, 190 (4), 491–500.

(3) Hebert, A. S.; Richards, A. L.; Bailey, D. J.; Ulbrich, A.; Coughlin, E. E.; Westphall, M. S.; Coon, J. J. The One Hour Yeast Proteome. Mol. Cell. Proteomics 2014, 13 (1), 339–347.

(4) Cajka, T.; Fiehn, O. Comprehensive analysis of lipids in biological systems by liquid chromatography-mass spectrometry. TrAC - Trends Anal. Chem. 2014, 61, 192–206.

(5) Blaženović, I.; Kind, T.; Sa, M. R.; Ji, J.; Vaniya, A.; Wancewicz, B.; Roberts, B. S.; Torbasinovic, H.; Lee, T.; Mehta, S. S.; Showalter, M. R.; Song, H.; Kwok, J. F.; Jahn, D.; Kim, J.; Fiehn, O. Structure Annotation of All Mass Spectra in Untargeted Metabolomics. Anal. Chem. 2019, 91 (3), 2155-2162.

(6) Piazza, I.; Kochanowski, K.; Cappelletti, V.; Fuhrer, T.; Noor, E.; Sauer, U.; Picotti, P. A Map of Protein-Metabolite Interactions Reveals Principles of Chemical Communication. Cell 2018, 172 (1–2), 358–372.e23.

(7) Stefely, J. A.; Kwiecien, N. W.; Freiberger, E. C.; Richards, A. L.; Jochem, A.; Rush, M. J. P.; Ulbrich, A.; Robinson, K. P.; Hutchins, P. D.; Veling, M. T.; Guo, X.; Kemmerer, Z. A.; Connors, K. J.; Trujillo, E. A.; Sokol, J.; Marx, H.; Westphall, M. S.; Hebert, A. S.; Pagliarini, D. J.; Coon, J. J. Mitochondrial protein functions elucidated by multi-omic mass spectrometry profiling. Nat. Biotechnol. 2016, 34 (11), 1191–1197.

(8) Hecht, E. S.; Oberg, A. L.; Muddiman, D. C. Optimizing Mass Spectrometry Analyses: A Tailored Review on the Utility of Design of Experiments. J. Am. Soc. Mass Spectrom. 2016, 27 (5), 767–785.

(9) Kelstrup, C. D.; Jersie-Christensen, R. R.; Batth, T. S.; Arrey, T. N.; Kuehn, A.; Kellmann, M.; Olsen, J. V. Rapid and deep proteomes by faster sequencing on a benchtop quadrupole ultra-high-field Orbitrap mass spectrometer. J. Proteome Res. 2014, 13 (12), 6187–6195.

(10) Yamada, T.; Uchikata, T.; Sakamoto, S.; Yokoi, Y.; Fukusaki, E.; Bamba, T. J. Development of a lipid profiling system using reverse-phase liquid chromatography coupled to high-resolution mass spectrometry with rapid polarity switching and an automated lipid identification software. J. Chromatogr. A 2013, 1292, 211–218.

(11) Ruzicka, J.; Mchale, K. J.; Peake, D. A. Data Acquisition Parameters Optimization of Quadrupole Orbitrap for Global Lipidomics on LC-MS / MS Time Frame; 2014.

(12) Kalli, A.; Smith, G. T.; Sweredoski, M. J.; Hess, S. Evaluation and Optimization of Mass Spectrometric Settings during Data-dependent Acquisition Mode: Focus on LTQ-Orbitrap Mass Analyzers. J. Proteome Res. 2013, 12 (7), 3071–3086.

(13) Randall, S. M.; Cardasis, H. L.; Muddiman, D. C. Factorial experimental designs elucidate significant variables affecting data acquisition on a quadrupole Orbitrap mass spectrometer. J. Am. Soc. Mass Spectrom. 2013, 24 (10), 1501–1512.

(14) Riter, L. S.; Vitek, O.; Gooding, K. M.; Hodge, B. D.; Julian, R. K. Statistical design of experiments as a tool in mass spectrometry. J. Mass Spectrom. 2005, 40 (5), 565–579.

(15) Sun, B.; Kovatch, J. R.; Badiong, A.; Merbouh, N. Optimization and Modeling of Quadrupole Orbitrap Parameters for Sensitive Analysis toward Single-Cell Proteomics. J. Proteome Res. 2017, 16 (10), 3711–3721.

(16) Huffman, G.; Specht, H.; Chen, A. T.; Slavov, N. DO-MS: Data-Driven Optimization of Mass Spectrometry Methods. J. Proteome Res. 2019, 18 (6), 2493-2500.

(17) Jeong, L. N.; Sajulga, R.; Forte, S. G.; Stoll, D. R.; Rutan, S. C. Simulation of elution profiles in liquid chromatography-I: Gradient elution conditions, and with mismatched injection and mobile phase solvents. J. Chromatogr. A 2016, 1457, 41–49.

(18) Bowden, J. A.; Ulmer, C. Z.; Jones, C. M.; Koelmel, J. P.; Yost, R. A. NIST lipidomics workflow questionnaire: an assessment of community-wide methodologies and perspectives. Metabolomics 2018, 14 (5), 53.

(19) Bowden, J. A.; Heckert, A.; Ulmer, C. Z.; Jones, C. M.; Koelmel, J. P.; Abdullah, L.; Ahonen, L.; Alnouti, Y.; Armando, A. M.; Asara, J. M.; Bamba, T.; Barr, J. R.; Bergquist, J.; Borchers, C. H.; Brandsma, J.; Breitkopf, S. B.; Cajka, T.; Cazenave-Gassiot, A.; Checa, A.; Cinel, M. A.; Colas, R. A.; Cremers, S.; Dennis, E. A.; Evans, J. E.; Fauland, A.; Fiehn, O.; Gardner, M. S.; Garrett, T. J.; Gotlinger, K. H.; Han, J.; Huang, Y.; Neo, A. H.; Hyötyläinen, T.; Izumi, Y.; Jiang, H.; Jiang, H.; Jiang, J.; Kachman, M.; Kiyonami, R.; Klavins, K.; Klose, C.; Köfeler, H. C.; Kolmert, J.; Koal, T.; Koster, G.; Kuklenyik, Z.; Kurland, I. J.; Leadley, M.; Lin, K.; Maddipati, K. R.; McDougall, D.; Meikle, P. J.; Mellett, N. A.; Monnin, C.; Moseley, M. A.; Nandakumar, R.; Oresic, M.; Patterson, R.; Peake, D.; Pierce, J. S.; Post, M.; Postle, A. D.; Pugh, R.; Qiu, Y.; Quehenberger, O.; Ramrup, P.; Rees, J.; Rembiesa, B.; Reynaud, D.; Roth, M. R.; Sales, S.; Schuhmann, K.; Schwartzman, M. L.; Serhan, C. N.; Shevchenko, A.; Somerville, S. E.; St. John-Williams, L.; Surma, M. A.; Takeda, H.; Thakare, R.; Thompson, J. W.; Torta, F.; Triebl, A.; Trötzmüller, M.; Ubhayasekera, S. J. K.; Vuckovic, D.; Weir, J. M.; Welti, R.; Wenk, M. R.; Wheelock, C. E.; Yao, L.; Yuan, M.; Zhao, X. H.; Zhou, S. Harmonizing lipidomics: NIST interlaboratory comparison exercise for lipidomics using SRM 1950–Metabolites in Frozen Human Plasma. J. Lipid Res. 2017, 58 (12), 2275–2288.

(20) Hutchins, P. D.; Russell, J. D.; Coon, J. J. LipiDex: An Integrated Software Package for High-Confidence Lipid Identification. Cell Syst. 2018, 6 (5), 621–625.e5.

(21) Chambers, M. C.; MacLean, B.; Burke, R.; Amodei, D.; Ruderman, D. L.; Neumann, S.; Gatto, L.; Fischer, B.; Pratt, B.; Egertson, J.; Hoff, K.; Kessner, D.; Tasman, N.; Shulman, N.; Frewen, B.; Baker, T. A.; Brusniak, M. Y.; Paulse, C.; Creasy, D.; Flashner, L.; Kani, K.; Moulding, C.; Seymour, S. L.; Nuwaysir, L. M.; Lefebvre, B.; Kuhlmann, F.; Roark, J.; Rainer, P.; Detlev, S.; Hemenway, T.; Huhmer, A.; Langridge, J.; Connolly, B.; Chadick, T.; Holly, K.; Eckels, J.; Deutsch, E. W.; Moritz, R. L.; Katz, J. E.; Agus, D. B.; MacCoss, M.; Tabb, D. L.; Mallick, P. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 2012, 30 (10), 918–920.

(22) Michalski, A.; Cox, J.; Mann, M. More than 100,000 detectable peptide species elute in single shotgun proteomics runs but the majority is inaccessible to data-dependent LC-MS/MS. J. Proteome Res. 2011, 10 (4), 1785–1793.

(23) Bailey, D. J.; McDevitt, M. T.; Westphall, M. S.; Pagliarini, D. J.; Coon, J. J. Intelligent Data Acquisition Blends Targeted and Discovery Methods. J. Proteome Res. 2014, 13 (4), 2152–2161.

(24) Hebert, A. S.; Thöing, C.; Riley, N. M.; Kwiecien, N. W.; Shiskova, E.; Huguet, R.; Cardasis, H. L.; Kuehn, A.; Eliuk, S.; Zabrouskov, V.; Westphall, M. S.; McAlister, G. C.; Coon, J. J. Improved Precursor Characterization for Data-Dependent Mass Spectrometry. Anal. Chem. 2018, 90 (3), 2333–2340.

(25) Goldfarb, D.; Wang, W.; Major, M. B. MSAcquisitionSimulator: data-dependent acquisition simulator for LC-MS shotgun proteomics. Bioinformatics 2016, 32 (8), 1269–1271.

(26) Smith, R.; Prince, J. T. JAMSS: proteomics mass spectrometry simulation in Java. Bioinformatics 2015, 31 (5), 791–793.

(27) Bielow, C.; Aiche, S.; Andreotti, S.; Reinert, K. MSSimulator: Simulation of Mass Spectrometry Data. J. Proteome Res. 2011, 10 (7), 2922–2929.

(28) Noyce, A. B.; Smith, R.; Dalgleish, J.; Taylor, R. M.; Erb, K. C.; Okuda, N.; Prince, J. T. Mspire-Simulator: LC-MS shotgun proteomic simulator for creating realistic gold standard data. J. Proteome Res. 2013, 12 (12), 5742–5749.

(29) Schulz-Trieglaff, O.; Pfeifer, N.; Gröpl, C.; Kohlbacher, O.; Reinert, K. LC-MSsim – a simulation software for liquid chromatography mass spectrometry data. BMC Bioinformatics 2008, 9.

(30) Lam, H.; Deutsch, E. W.; Eddes, J. S.; Eng, J. K.; Stein, S. E.; Aebersold, R. Building Consensus Spectral Libraries for Peptide Identification in Proteomics. Nat. Methods 2008, 5 (10), 873–875.

(31) Moini, M.; Jones, B. L.; Rogers, R. M.; Jiang, L. Sodium trifluoroacetate as a tune/calibration compound for positive- and negative-ion electrospray ionization mass spectrometry in the mass range of 100–4000 Da. J. Am. Soc. Mass Spectrom. 1998, 305 (98).

(32) Remes, P. M.; Senko, M. W. Methods for predictive automatic gain control for hybrid mass spectrometers. U.S. Patent 9202681B2, 2015.

(33) Murphy, R. C.; Fiedler, J.; Hevko, J. Analysis of nonvolatile lipids by mass spectrometry. Chem. Rev. 2001, 101 (2), 479–526.

(34) Hartler, J.; Triebl, A.; Ziegl, A.; Trötzmüller, M.; Rechberger, G. N.; Zeleznik, O. A.; Zierler, K. A.; Torta, F.; Cazenave-Gassiot, A.; Wenk, M. R.; Fauland, A.; Wheelock, C. E.; Armando, A. M.; Quehenberger, O.; Zhang, Q.; Wakelam, M. J. O.; Haemmerle, G.; Spener, F.; Köfeler, H. C.; Thallinger, G. G. Deciphering lipid structures based on platform-independent decision rules. Nat. Methods 2017, 14 (12), 1171–1174.

(35) Neumann, S.; Böcker, S. Computational mass spectrometry for metabolomics: identification of metabolites and small molecules. Anal. Bioanal. Chem. 2010, 398 (7–8), 2779–2788.

(36) Makarov, A.; Denisov, E.; Lange, O.; Horning, S. Dynamic range of mass accuracy in LTQ Orbitrap hybrid mass spectrometer. J. Am. Soc. Mass Spectrom. 2006, 17 (7), 977–982.

(37) Makarov, A.; Denisov, E. Dynamics of ions of intact proteins in the Orbitrap mass analyzer. J. Am. Soc. Mass Spectrom. 2009, 20 (8), 1486–1495.

(38) Kind, T.; Liu, K.-H.; Lee, D. Y.; DeFelice, B.; Meissen, J. K.; Fiehn, O. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nat. Methods 2013, 10 (8), 755–758.

(39) Koelmel, J. P.; Kroeger, N. M.; Ulmer, C. Z.; Bowden, J. A.; Patterson, R. E.; Cochran, J. A.; Beecher, C. W. W.; Garrett, T. J.; Yost, R. A. LipidMatch: an automated workflow for rule-based lipid identification using untargeted high-resolution tandem mass spectrometry data. BMC Bioinformatics 2017, 18 (1), 331.

(40) Haupt, R. L.; Haupt, S. E. Practical Genetic Algorithms Second Edition; Wiley: New York, 2004.

(41) Makarov, A.; Kholomeev, A. Mass spectrometer power sources with polarity switching. U.S. Patent 9058964B2, 2008.

(42) Danne-Rasche, N.; Coman, C.; Ahrends, R. Nano-LC/NSI MS Refines Lipidomics by Enhancing Lipid Coverage, Measurement Sensitivity, and Linear Dynamic Range. Anal. Chem. 2018, 90 (13), 8093–8101.

(43) Koelmel, J. P.; Kroeger, N. M.; Gill, E. L.; Ulmer, C. Z.; Bowden, J. A.; Patterson, R. E.; Yost, R. A.; Garrett, T. J. Expanding Lipidome Coverage Using LC-MS/MS Data-Dependent Acquisition with Automated Exclusion List Generation. J. Am. Soc. Mass Spectrom. 2017, 28 (5), 908–917.

(44) Broeckling, C. D.; Hoyes, E.; Richardson, K.; Brown, J. M.; Prenni, J. E. Comprehensive Tandem-Mass-Spectrometry Coverage of Complex Samples Enabled by Data-Set-Dependent Acquisition. Anal. Chem. 2018, 90 (13), 8020–8027.

(45) Bailey, D. J.; Rose, C. M.; McAlister, G. C.; Brumbaugh, J.; Yu, P.; Wenger, C. D.; Westphall, M. S.; Thomson, J. A.; Coon, J. J. Instant spectral assignment for advanced decision tree-driven mass spectrometry. Proc. Natl. Acad. Sci. 2012, 109 (22), 8411–8416.


Figure 1. Simulating MS Acquisition. (A) Fast polarity switching data-dependent scan cycle used for lipidomics data acquisition. (B) Overview of seed information and processing steps for in silico LC-MS/MS simulation. (C) Q-Exactive HF schematic and instrument performance characteristics modelled.


Figure 2. Instrument Performance Simulation and Validation. (A) Lipid MS/MS injection times extracted from LC-MS/MS analysis of HAP1 cell extracts using various MS/MS isolation width and AGC target values. (B) Correlation of simulated and experimental MS/MS injection times from LC-MS/MS analysis of HAP1 cell extracts. (C) Experimental quadrupole transmission peaks for m/z 524 vertically offset for visual clarity. (D) Simulated quadrupole transmission peaks for m/z 524 vertically offset for visual clarity. (E) Simulated quadrupole transmission peak for PC 18:1/18:1 [M+H]+ isotope cluster. (F) Error of simulated precursor ion fraction across varying MS/MS isolation width settings for MS/MS analysis of negative Calmix ions. (G) MS/MS noise band intensity for negative Calmix ions measured using varied MS/MS resolution settings. (H) Correlation of simulated and experimental MS/MS noise band intensities from LC-MS/MS analysis of HAP1 cell extracts.


Figure 3. Modeling Lipid Spectral Matches. (A) Spectral match score for lipid reference standard MS/MS across various MS/MS signal-to-noise ratios. (B) Number of lipid spectral matches from lipid reference standard MS/MS mixed in silico at varying ratios with isobaric lipid MS/MS.


Figure 4. Simulation Accuracy for Varied Method Parameters. Lipid spectral matches, identified features, and unique lipid species identified for (A) experimental and (B) simulated LC-MS/MS acquisition of HAP1 cell extracts across varied MS/MS maximum injection time and top N values. Lipid spectral matches, identified features, and unique lipid species identified for (C) experimental and (D) simulated LC-MS/MS acquisition of HAP1 cell extracts across varied MS/MS isolation width and AGC target values.


Figure 5. Genetic Algorithm for Rapid Method Optimization. (A) Overview of genetic algorithm for selection of optimal method parameter sets. (B) Simulated number of unique HAP1 lipid identifications and (C) rank for the optimal method parameter set returned by the genetic algorithm and by a random parameter search of equal iterations. Number of simulated unique lipid identifications and the optimal method parameter set returned by the genetic algorithm for simulated LC-MS/MS analysis of (D) HAP1 cells, (E) mouse cecum, (F) mouse plasma, and (G) yeast cells.


Figure 6. Theoretical MS Improvements for Increased Lipid Identification. (A) Maximum peak depth for all identifiable lipids (blue) and lipids identified using optimal parameter set (yellow) for 0.25 min segments of simulated LC-MS/MS acquisition of HAP1 cell extracts. (B) Simulated unique HAP1 lipid identifications using reduced polarity switching times and improved ion flux.
