Article
LipidMS: an R package for lipid annotation in untargeted liquid chromatography-data independent acquisition-mass spectrometry lipidomics
Maria Isabel Alcoriza-Balaguer, Juan Carlos García-Cañaveras, Adrian Lopez, Isabel Conde, Oscar Juan, Julian Carretero, and Agustín Lahoz
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b03409 • Publication Date (Web): 30 Nov 2018
Analytical Chemistry
LipidMS: an R package for lipid annotation in untargeted liquid chromatography-data independent acquisition-mass spectrometry lipidomics
María Isabel Alcoriza-Balaguer1,#, Juan Carlos García-Cañaveras1,#, Adrián López1, Isabel Conde2, Oscar Juan1,3, Julián Carretero4, Agustín Lahoz1,*
1 Biomarkers and Precision Medicine Unit and Analytical Unit, Instituto de Investigación Sanitaria Fundación Hospital La Fe, Valencia 46026, Spain.
2 Hepatology Unit, Department of Digestive Medicine, Hospital Universitari i Politècnic La Fe, Valencia 46026, Spain.
3 Department of Medical Oncology, Hospital Universitari i Politècnic La Fe, Valencia 46026, Spain.
4 Department of Physiology, University of Valencia, Burjassot 4100, Spain
# These authors contributed equally
* To whom correspondence should be addressed. Agustín Lahoz. E-mail: [email protected]. Biomarkers and Precision Medicine Unit, Analytical Unit, Instituto de Investigación Sanitaria Fundación Hospital La Fe, Av. Fernando Abril Martorell 106, Valencia 46026, Spain. Tel: 961246652, Fax: 961246620
Abstract
High-resolution LC-MS untargeted lipidomics using data-independent acquisition (DIA) has the potential to increase lipidome coverage, as it enables the continuous and unbiased acquisition of all eluting ions. However, the loss of the link between precursor and product ions, combined with the high dimensionality of DIA data sets, hinders accurate feature annotation. Here, we present LipidMS, an R package aimed at confidently identifying lipid species in untargeted LC-DIA-MS. To this end, LipidMS combines a coelution score, which links precursor and fragment ions, with fragmentation and intensity rules. Depending on the MS evidence reached by the identification function survey, LipidMS provides three levels of structural annotation: i) “subclass level”, e.g., PG(34:1); ii) “fatty acyl level”, e.g., PG(16:0_18:1); and iii) “fatty acyl position level”, e.g., PG(16:0/18:1). The comparison of LipidMS with freely available data-dependent acquisition (DDA) and DIA identification tools showed that LipidMS provides significantly more accurate and structurally informative lipid identifications. Finally, to exemplify the utility of LipidMS, we investigated the lipidomic serum profile of patients diagnosed with non-alcoholic steatohepatitis (NASH), the progressive form of non-alcoholic fatty liver disease, a disorder characterized by strong lipid dysregulation. As previously published, a significant decrease in lyso- and phosphatidylcholines and cholesterol esters and an increase in phosphatidylethanolamines were observed in NASH patients. Remarkably, LipidMS allowed the identification of a new set of lipids that may be used for NASH diagnosis. Altogether, LipidMS has been validated as a tool to assist lipid identification in the LC-DIA-MS untargeted analysis of complex biological samples.
Keywords: lipidomics, mass spectrometry, data-independent acquisition, lipid annotation, r-package, non-alcoholic steatohepatitis.
Lipidomics can be understood as the systems-level analysis of lipids and their interacting partners.1 More concretely, from an analytical point of view, it can be defined as the determination of the complete set of lipids (lipidome) present in a given biological sample (e.g. cell, tissue, biofluid, organism). Lipids are a heterogeneous group of metabolites involved in many biological functions as intermediates or products in signalling pathways, structural components of cell membranes and energy storage sources, among others.1 Alterations in general lipid profiles and in particular lipid species have been identified in many diseases including cancer,2,3 non-alcoholic fatty liver disease,4,5 diabetes,6 heart disease,7 and neurological diseases.8 From a quantitative point of view, lipids represent 60-70% of all detected and identified metabolites in the human serum metabolome9 and 20% of the human urine metabolome.10 According to the LIPID MAPS Consortium, lipids are classified into eight classes.11,12 In general, lipids can be described as a combination of various building blocks, usually a core structure that defines their class (e.g., glycerol, sphingoid bases, and cholesterol) and subclass (e.g. polar head groups of phospholipids such as phosphocholine and phosphoethanolamine) and a variable number of fatty acyl chains (FA) attached to that core structure13 (Figure S1). As a result of the different structural arrangements of the FA on the core structures, isobaric lipids (e.g. PC(18:1/18:1) vs. PC(18:0/18:2)) and isomeric lipids (e.g. PC(16:0/20:4) vs. PC(20:4/16:0)) can be found, which hinders their unambiguous identification.
Liquid chromatography (LC) coupled to mass spectrometry (MS) is a powerful tool that enables the comprehensive lipid characterization of biological samples.14 Lipid identification in untargeted MS-based lipidomics usually relies on the combined acquisition of full MS, which provides information about the nominal mass and formula of the lipids, and MS/MS data, which allow the identification of the building blocks that compose them.13,14 The most common procedure for the acquisition of MS/MS spectra is data-dependent acquisition (DDA), in which ions (lipids) of interest are isolated and subsequently fragmented to obtain their corresponding MS/MS spectra.15 MS data-independent acquisition (DIA) is an alternative to DDA in which no ion isolation is performed and all the ions that elute at a given time are fragmented and detected jointly; thus, MS/MS information is obtained for all the eluting compounds. However, the management of DIA data sets is not trivial, and it is even more complicated in the case of lipids: in addition to the coelution of parent and fragment ions, their building block nature generates a number of fragments that are common to several lipid species and that are usually not well resolved chromatographically (Figure S2). On top of that, the lack of a comprehensive collection of purified, well-characterized lipid standards forces lipid identification to be based on the combination of MS and MS/MS data with the only additional support of known fragmentation rules.16 A number of software tools have been developed for the identification of lipids using DDA: MS-DIAL,17 Greazy,18 LipiDex,19 LDA,20 or the LipidBlast in silico database16 searched via the NIST MS Search program. However, only a few freely available tools are designed for DIA-based lipid identification, among them MS-DIAL,17 Lipid-Pro21 and LipidMatch,22 with MS-DIAL being the most widely used (based on the number of citations reported by Google Scholar).
Here, we present LipidMS, an R package (https://CRAN.R-project.org/package=LipidMS) for lipid annotation in LC-DIA-MS. LipidMS calculates a precursor and fragment coelution score (PFCS) for those ions present in a predefined retention time (tR) window, and then applies a set of fragmentation and fragment intensity rules to annotate lipids. Moreover, LipidMS accepts as input either .csv files, for already pre-processed data sets, or .mzXML files, a common format for MS data; thus, it is compatible with multiple mass spectrometer vendors. To assess LipidMS performance, it was first used to process LC-DIA-MS data from two test samples (i.e., a standard sample and a pooled human serum sample). These samples were prepared by adding a mixture of 50 representative lipid standards and were then analysed using two mass spectrometers (i.e., Agilent Q-ToF 6550 and Waters Synapt G2-Si Q-ToF). LipidMS was also compared with existing DDA and DIA tools.17 Finally, to exemplify the utility of the package in a biological context, LipidMS was applied to the lipidomic analysis of serum samples from patients diagnosed with non-alcoholic steatohepatitis (NASH), which is the progressive form of non-alcoholic fatty liver disease (NAFLD), a disorder characterized by strong lipid dysregulation. NAFLD and NASH have been extensively studied by metabolomics and lipidomics approaches, and specific lipid patterns have been proposed as diagnostic and prognostic biomarker signatures.4,23,24 Not only do our results confirm previously published lipid-related markers, but they also provide a new set of lipids that are now proposed as a lipid-based NASH biomarker signature.
Experimental section
Other experimental details about chemicals, lipidome extraction, labelling techniques, LC-MS settings and data processing parameters are provided in the Supporting Information.
LipidMS processing workflow
LipidMS was developed in the R programming environment25 and is available via CRAN (https://CRAN.R-project.org/package=LipidMS). LipidMS includes dedicated functions for MS data processing, lipid identification, data import, export of lipid annotations, database customization and the creation of inclusion lists for targeted MS analysis (Table S1).
Format requirement for lipid annotation functions.
LipidMS identification functions require two data inputs: i) a peak table for MS1 and one or two peak tables for MS2, depending on the number of collision energies used, and ii) one raw data table for MS1 and one or two raw data tables for MS2, depending on the number of collision energies used. The peak tables are mandatory and are used for identification, while the raw data tables are optional and are only used for the calculation of the PFCS. If the raw data tables are not used, the association between parent and fragment ions will be based exclusively on tR windows. Both types of tables are obtained from mzXML files when the dataProcessing function is employed. The peak tables must contain deisotoped and tR-aligned peaks. Formally, they have to be stored as data frames containing, at least, the following columns: m/z, tR (in seconds), intensity/area and peak identification (PeakID column). The raw data tables provide scan-by-scan information for each MS or MS/MS data file and have to contain the following columns: m/z, tR (in seconds), intensity/area, peakID and scan number. These tables can be easily obtained by performing data processing with LipidMS, although other approaches can also be used. Data acquired in positive and negative electrospray ionization (ESI) modes have to be provided separately, as lipid identification functions apply specific rules for each polarity. For further details the reader is referred to the package manual (https://CRAN.R-project.org/package=LipidMS).
Data conversion
Lipid identification functions within LipidMS require a separate peak list for each collision energy used (e.g. MS1, MS2low and MS2high) as input, whereas peak picking tools usually handle only MS1 as input. Therefore, it is mandatory to convert complex DIA-MS data into a format that can be used for peak picking. To convert raw data into mzXML, MSConvert software (ProteoWizard 3.0.10800 64 bit)26 can be used. The procedure to extract each collision energy file from the raw data is instrument dependent. Here, two independent Q-ToF platforms have been used (i.e., a Waters Synapt G2-Si Q-ToF and an Agilent Q-ToF 6550). For the Waters instrument, the raw archive file contains three different data acquisition functions (i.e. MS1, MS2 and lockspray). Lockspray files must be removed, and the other functions have to be separated by collision energy and then converted into .mzXML files. For the Agilent instrument, the .d raw data archive is directly converted into a single mzXML file and subsequently separated into independent files, one per collision energy, using the LipidMS sepByCE function. Figure 1 shows the recommended data processing workflow for LipidMS.
Peak detection and alignment
Data pre-processing (i.e. peak picking, deisotoping and alignment) can be performed using either free GUI software packages such as MZmine,27 XCMS,28 enviPick (https://CRAN.R-project.org/package=enviPick) or commercial software packages such as Progenesis QI or MassHunter Workstation. LipidMS includes a function that takes advantage of enviPick for peak picking and of CAMERA29 for alignment and deisotoping, and its use is strongly recommended for data processing. Moreover, the LipidMS dataProcessing function is the easiest way to obtain the data inputs required to use the PFCS as a complement to tR windows for the association of parents and fragments.
Lipid identification
LipidMS allows the efficient annotation of lipids within a wide range of concentrations (Figure S3). However, as a general rule, the use of saturated signals for lipid identification should be avoided, as saturation deteriorates both mass accuracy and peak shape, thus hampering feature annotation. Lipid identification is performed separately for data from positive and negative ESI modes through the idPOS and idNEG functions, respectively. Alternatively, specific lipid classes can be identified individually by using class-defined functions (Table S1). The implementation of LipidMS within a lipidomics workflow is depicted in Figure S4. The DIA data used to test LipidMS performance and an example script can be found at GitHub (https://github.com/maialba3/LipidMS-data-v1.0).
Samples included in the study
Test Samples
Two test samples were used to evaluate the performance of LipidMS. These samples were prepared by spiking a mixture containing 50 lipid standards into a blank sample or a pooled human serum sample (Sigma-Aldrich, Madrid, Spain). These lipid standards were selected according to their biological relevance, their representativeness of lipid classes, and their analytical relevance; to this end, isobaric/isomeric species were also included (Table S2).
Serum samples from patients with NAFLD
Patients diagnosed with NAFLD at the Liver Transplantation and Hepatology Unit at the Hospital La Fe (Valencia) were enrolled in this study. NAFLD diagnosis was performed by histological examination of liver biopsy specimens. NAFLD was assessed by using the NAFLD activity score (NAS).30 A total of 20 patients with a NAS ≥ 5, which strongly correlates with NASH, were selected. Additionally, 14 serum samples from healthy donors with similar demographic characteristics from the Biobank at IIS-La Fe were selected as the control group. All the samples were obtained after receiving informed consent. The study was approved by the Institutional Ethics Committee.
Results and discussion
Rationale behind LipidMS
LipidMS has been developed in the R programming language to serve as an easy-to-use tool, highly adaptable to the end user, for assisting lipid annotation in untargeted LC-DIA-MS lipidomics. The building block nature of the majority of lipids enables the establishment of generic structure-derived fragmentation rules that can be used for MS-based identification and structure elucidation. This strategy has been satisfactorily implemented for lipid identification in both DDA and DIA approaches.16,20,22 However, most of the current methods rely on the most intense fragments to accomplish lipid identification, which can generate false positives due to the poor selectivity of these ions when coelution is present. In reverse phase chromatography, lipid elution depends on both the lipid class and the FA composition; thus, each lipid class usually elutes within a narrow tR window. As a result, many common fragments, such as those corresponding to head groups, are poorly resolved chromatographically (Figure S2), which strongly affects their selectivity for lipid annotation. This issue is particularly relevant when complex biological samples are analysed. To overcome these drawbacks, lipid annotation in LipidMS is based on combining two complementary approaches. First, to modulate the stringency in the association of parent ions with coeluting fragment ions, a PFCS is calculated for all the MS/MS ions present in a predefined tR window around the parent ion. The PFCS is formally defined as a Pearson correlation coefficient calculated from the peak shape (distribution of intensities over elution time) of parent and fragment ions, and it can be used to test the similarity between those ion chromatograms. This approach has been successfully applied to the analysis of MS data in the field of metabolomics.31 Second, and most importantly, LipidMS takes advantage of fragmentation and fragment intensity rules. The latter are defined based on the relation between the intensities of different fragment ions and are used to elucidate the position of the different FA on the lipid backbone structure. Both fragmentation and intensity rules have been manually curated by using publicly available spectral information (i.e. LipidMaps,32 Metlin,33 LipidBlast,16 HMDB34) and in-house generated MS/MS spectra for DDA and DIA in two different MS/MS platforms (Waters Synapt G2-Si Q-ToF and Agilent Q-ToF 6550). In the curation of the fragmentation rules, the use of highly intense fragments common to several lipid classes has been avoided when possible, and specific well-characterized fragments and adducts have been selected instead. The specific fragments selected as well as the preferred acquisition mode (i.e., ESI+ and ESI-) for each lipid class are summarized in Tables S2-S4. Additionally, the experimental data supporting the selection of the fragmentation rules used by LipidMS are represented in Figures S5-S20.
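The coelution idea described above can be illustrated with a minimal, language-agnostic sketch, written here in Python rather than R. It is not LipidMS code: the function names `pearson` and `coeluting_fragments` and the toy intensity traces are hypothetical, and the traces are assumed to be sampled on the same scans. Only the definition of the score as a Pearson correlation of peak shapes and the default threshold of 0.8 are taken from the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long intensity traces."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("traces must have the same length (at least 2 scans)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat trace carries no peak-shape information
    return cov / sqrt(var_x * var_y)

def coeluting_fragments(parent_trace, fragment_traces, threshold=0.8):
    """Keep the fragments whose score against the parent reaches the threshold."""
    scores = {name: pearson(parent_trace, trace)
              for name, trace in fragment_traces.items()}
    return {name: score for name, score in scores.items() if score >= threshold}

# Toy chromatograms (made-up intensities over seven consecutive scans).
parent = [0, 10, 50, 100, 50, 10, 0]
fragments = {
    "coeluting": [0, 2, 11, 19, 10, 2, 0],     # same shape, lower intensity
    "shifted": [50, 100, 50, 10, 0, 0, 0],     # apex two scans earlier
}
kept = coeluting_fragments(parent, fragments)
```

In this toy case only the fragment that shares the parent peak shape is retained, while the fragment whose apex is shifted correlates negatively with the parent and is discarded, even though both fall inside the same tR window.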
Lipid coverage and building block database customization
As previously mentioned, most lipids can be defined by a backbone structure, which defines the lipid class and subclass, and a number of acyl residues attached to that core structure. Thanks to these features, a lipid database can be built by defining both the lipid core and the set of acyl chains to be incorporated.13 In LipidMS the acyl residues are specified in the building block database (bbDB), where an entity (e.g., FA(16:0)) can be used as a specific candidate (i.e., FA(16:0)) or as the fatty acyl radical of a number of more complex lipids (e.g., PL, GL, SM). By default the bbDB includes 30 fatty acids, 4 sphingoid bases, and 3 bile acids (Table S5), which were selected based on their biological relevance.12 LipidMS arranges those chemical entities to build up a query database (QDB), which is eventually used to interrogate the MS data. The arrangement of the 37 entities included in the default bbDB covers 22 lipid classes and results in 2502 potential molecular formulas and more than 53000 individual lipids. The lipidome coverage provided by LipidMS can be easily modified by varying the chemical entities provided in the bbDB through the createLipidDB function. For instance, odd-chain fatty acyls such as FA(19:0) can be included, which would be used as potential candidates or as part of more complex lipids (e.g., PC(19:0_19:0) or TG(19:0_19:0_19:0)). Additionally, the repertoire of lipids included in the bbDB used to build the QDB can also be exported to be used elsewhere as a library or a target inclusion list (createLipidDB).
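To make the building block arrangement more concrete, the following minimal sketch (in Python, not LipidMS code) enumerates the unordered chain combinations for a class with two acyl chains and groups them by their summed composition. The four-entry `FATTY_ACYLS` list is a hypothetical stand-in for the bbDB, and `enumerate_species` is an illustrative name, not a package function.

```python
from itertools import combinations_with_replacement

# Hypothetical, tiny stand-in for a building block database.
# Each fatty acyl is described as (carbons, double bonds).
FATTY_ACYLS = [(16, 0), (18, 0), (18, 1), (18, 2)]

def enumerate_species(chains, n_chains):
    """Group unordered chain combinations by their summed (carbons, double bonds)."""
    by_sum = {}
    for combo in combinations_with_replacement(sorted(chains), n_chains):
        total = (sum(c for c, _ in combo), sum(d for _, d in combo))
        by_sum.setdefault(total, []).append(combo)
    return by_sum

# Two chains per lipid, as in a diacyl phospholipid.
species = enumerate_species(FATTY_ACYLS, n_chains=2)
```

With these four chains, ten chain combinations collapse into nine sum compositions: the 36:2 composition is shared by the 18:0_18:2 and 18:1_18:1 combinations, which is the kind of ambiguity that MS/MS evidence is needed to resolve.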
LipidMS annotation workflow
LipidMS contains 31 functions aimed at annotating 22 lipid classes using either positive or negative ESI modes (Table S1). To exemplify the LipidMS annotation workflow, the annotation procedure for PG(16:0/18:1) is described in Figure 2. Overall, the following steps (internal functions, indicated in italics) are executed within each identification function survey for lipid annotation (e.g. idPGneg):
i) Based on the set of chemical entities included in the bbDB (Table S5) and on the ionization properties selected for each lipid class (Table S6), a target ion list (QDB) is generated by LipidMS. This list is subsequently used to interrogate the full MS data within a defined tR window and mass error tolerance (findCandidates). These parameters can be easily set by the user. At this step, putatively annotated lipids are identified based on the lipid class, and the number of carbons and double bonds is determined. This level of survey is not reported by LipidMS by default, as we considered it non-informative. However, this information can be easily recovered through the findCandidates function or the class identification functions (e.g. idPGneg).
ii) The coeluting fragment ions for each putatively annotated lipid are selected based on the defined tR window. Optionally, a PFCS is then calculated for each parent-fragment ion pair used for lipid identification, and only those above a previously defined threshold are retained. To minimize false positives, a value of 10 seconds for the tR window and a PFCS value of 0.8 are set by default. However, these values can be easily changed by the user (coelutingFrags).
iii) Based on the established fragmentation rules (Tables S3-S4) and on a default mass error of 10 ppm, a survey of informative fragment ions of the lipid class (e.g. head groups) is performed among the coeluting fragments extracted in step (ii) (checkClass). The mass error used in each survey can be modified by the user (argument ppm_products).
iv) Then, the same procedure is applied to search for fragment ions informative of the fatty acyl components (chainFrags).
v) Based on the proposed fatty acyl components, combinations that sum up to the expected total number of carbons and double bonds determined in step (i) are searched in the MS/MS data (combineChains).
vi) Once the fatty acyl components have been determined, intensity rules, which are based on the relative intensity ratios between the fragments, are applied to elucidate the position of those chains (checkIntensityRules). For further details regarding intensity rules see Tables S3 and S4 and previously published data.19
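Two of the steps listed above, the mass error check used when matching ions (steps i and iii) and the search for chain combinations that add up to the composition of the precursor (step v), can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the package implementation: `within_ppm` and `combine_chains` are hypothetical names, the m/z values are made up, and only the 10 ppm default is taken from the text.

```python
from itertools import combinations_with_replacement

def within_ppm(observed_mz, theoretical_mz, ppm=10.0):
    """True if the relative mass error is within the given tolerance in ppm."""
    return abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6 <= ppm

def combine_chains(found_chains, total_carbons, total_double_bonds, n_chains=2):
    """Unordered chain combinations matching the precursor sum composition.

    found_chains: (carbons, double bonds) pairs supported by chain fragments.
    """
    return [
        combo
        for combo in combinations_with_replacement(sorted(found_chains), n_chains)
        if sum(c for c, _ in combo) == total_carbons
        and sum(d for _, d in combo) == total_double_bonds
    ]

# Made-up m/z values, for illustration only.
close_enough = within_ppm(700.0050, 700.0000)   # about 7.1 ppm
too_far = within_ppm(700.0100, 700.0000)        # about 14.3 ppm

# Chains for which fragments were found, and a precursor annotated as 34:1.
chains_seen = [(16, 0), (18, 0), (18, 1)]
proposals = combine_chains(chains_seen, total_carbons=34, total_double_bonds=1)
```

Here only the 16:0 plus 18:1 pair is consistent with a 34:1 precursor, which corresponds to the fatty acyl level of annotation; assigning the position of each chain would additionally require the intensity rules of step (vi).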
Depending on the MS structural evidence reached by each annotation survey, LipidMS provides different levels of structural information:20,35 i) “subclass level”, where class-specific fragments (e.g. head groups of phospholipids) are used to determine the subclass and the precursor ion is used to calculate the total number of carbons and double bonds of the chains. At this level, LipidMS cannot differentiate which fatty acids are linked to the backbone, and a sum of several isobaric/isomeric compounds is proposed (e.g. PG(34:1)); ii) “fatty acyl level” (FA level), where the composition of the constituent chains is assigned based on chain-specific fragments but no positional information is given (e.g. PG(16:0_18:1)); and iii) “fatty acyl position level” (FA position level), where the specific position of each chain is elucidated through fragment intensity ratios (e.g. PG(16:0/18:1)).
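The three annotation levels differ only in how much of the chain information is written into the lipid name. The short Python sketch below (illustrative only; `format_annotation` and the level keywords are hypothetical, not LipidMS identifiers) shows how the same two chains yield the three notations used in the text.

```python
def format_annotation(lipid_class, chains, level):
    """Build a lipid name at one of three structural annotation levels.

    chains: list of (carbons, double bonds), in positional order if known.
    level: 'subclass', 'fa' or 'fa_position' (hypothetical keywords).
    """
    if level == "subclass":
        carbons = sum(c for c, _ in chains)
        double_bonds = sum(d for _, d in chains)
        return f"{lipid_class}({carbons}:{double_bonds})"
    labels = [f"{c}:{d}" for c, d in chains]
    if level == "fa":
        return lipid_class + "(" + "_".join(labels) + ")"   # chains known, positions unknown
    if level == "fa_position":
        return lipid_class + "(" + "/".join(labels) + ")"   # chains and positions known
    raise ValueError(f"unknown annotation level: {level}")

chains = [(16, 0), (18, 1)]
names = [format_annotation("PG", chains, level)
         for level in ("subclass", "fa", "fa_position")]
```

The underscore separator indicates that the chain order carries no positional meaning, whereas the slash separator is reserved for annotations in which the position of each chain has been assigned.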
As a result of the execution of a lipid identification function (idPOS or idNEG), two separate R objects, which can be easily saved as tables, are generated (i.e., ‘results peak table’ and ‘annotated peak table’). On the one hand, the ‘results peak table’ contains the following information for each annotated lipid: i) feature identity, annotated as lipid class, total number of carbons, double bonds and fatty acid composition; ii) peak properties, including m/z, tR, peak intensity and peakID information; and iii) identification criteria used, reporting information about the adduct/s detected, m/z error, structural annotation level, and the mean PFCS value. On the other hand, the ‘annotated peak table’ links the original MS1 data with the ‘results peak table’, providing the following information for each feature: m/z, tR, peak intensity, peakID, all the possible identities ranked by annotation level, ion adducts and the mean value of the PFCS used in each lipid identification. Further information about the fragments that support each identification can be explored using class-specific identification functions (e.g. idPGneg).
Among the extra functions incorporated in LipidMS, two deserve further explanation due to their utility: i) the getInclusionList function, which builds a list of all annotated lipids with the following information: formula, tR in seconds, monoisotopic neutral mass, and lipid identity. This table may be used to apply the DIA-based identities to automate targeted peak picking in multiple samples containing only MS data or to prioritize ion fragmentation in DDA-based approaches (Figure S4); and ii) the searchIsotopes function, which allows the identification of compound isotopes when labelled compounds are used as tracers (e.g., U-13C-glucose or U-13C-glutamine). Here, LipidMS uses a control sample, where the tracer is not present, to generate a target inclusion list of lipids and their corresponding tR. This list is subsequently used to search for isotopes at each tR using the raw data generated in the presence of the tracer. Thus, lipid isotope distributions can be obtained (Figure S21). To test the utility of the searchIsotopes function, A549 cells were incubated in parallel with either U-12C-D-glucose or U-13C-D-glucose. Labelling incorporation into palmitic acid was used as an example, showing that LipidMS can effectively assess 13C patterns when labelled compounds are used (Table S7). However, it should be noted that further improvements have to be implemented to take full advantage of the identification capabilities of LipidMS when using 13C-labelled samples.
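As a hedged illustration of what an isotope search has to look for, the sketch below lists the expected m/z values of the 13C isotopologues of an ion, from M+0 to the fully labelled species. It is written in Python and does not reproduce the searchIsotopes function: `isotopologue_mzs` is a hypothetical name, and the palmitate [M-H]- value of approximately 255.2330 is an assumption used only as an example. The spacing between isotopologues is the 13C-12C mass difference (about 1.0033548 Da) divided by the charge.

```python
C13_C12_DELTA = 1.0033548  # mass difference between 13C and 12C, in Da

def isotopologue_mzs(monoisotopic_mz, n_carbons, charge=1):
    """Expected m/z for isotopologues carrying 0..n_carbons 13C atoms."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    step = C13_C12_DELTA / abs(charge)
    return [monoisotopic_mz + n * step for n in range(n_carbons + 1)]

# Assumed example: deprotonated palmitic acid (16 carbons), m/z approx. 255.2330.
mzs = isotopologue_mzs(255.2330, n_carbons=16)
```

Each of these target values would then be searched at the tR recorded for the unlabelled lipid, within a mass tolerance, to build the isotope distribution.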
Performance evaluation of LipidMS
As a first step to test the performance of LipidMS, a mixture of 50 representative lipid standards comprising several lipid classes (Table S2) was used to prepare a standard test sample and to fortify a pooled human serum sample. These two test samples were subsequently analyzed in both positive and negative ESI modes in a Q-ToF mass spectrometer (Agilent Q-ToF 6550). LipidMS was able to identify 49 standards at the subclass level in the standard test sample, of which 42 reached the maximum level of annotation possible for each class (i.e., FA and FA position levels), while for the serum test sample, 47 lipid standards were identified at the subclass level and 45 of them at FA and FA position levels when possible (Tables 1-2 and S8-S12).
Once the reliability of LipidMS was established, we compared it with already available tools. MS-DIAL17 was selected as the reference software since it is one of the most valuable and cited tools used for lipid identification in both DDA and DIA modes. MS-DIAL employs a combination of mass spectral deconvolution, spectral matching and LipidBlast (an in silico library with broad lipid coverage) for lipid annotation. LipidMS identified a higher number of lipid standards in both test samples than MS-DIAL, independently of the acquisition mode (Tables 1-2). Accordingly, LipidMS also reported a higher number of total identified lipids in the untargeted analysis of the pooled human serum sample (Table 2). Interestingly, although the highest number of identifications was reported by LipidMS, MS-DIAL applied to DIA data also provided a higher number of identifications than MS-DIAL applied to DDA data, which highlights the importance of using DIA approaches. The number of false positive identifications was the only parameter in which DDA slightly outperformed DIA-based approaches in our comparison (Table 1). However, even in that aspect, LipidMS proved superior to MS-DIAL when applied to DIA samples. We would like to remark that MS-DIAL only reports two levels of identification: “annotated”, based on MS data, or “identified”, based on both MS and MS/MS data. However, no detailed information about the actual level of structural evidence is reported, and the highest level of annotation that can be achieved is the FA level. Compared to MS-DIAL, LipidMS provides a more detailed report of the level of structural evidence that supports the identification and, thanks to the implementation of fragment intensity rules, a higher level of structural information can be reached (i.e., FA position level). Thus, LipidMS significantly outperformed MS-DIAL in the level of structural information reached in each standard identification (Tables S8 and S11).
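The footnote of Table 1 states that the identification counts were compared with a χ2 test, without specifying the variant. As a hedged illustration, the sketch below applies a Pearson χ2 test for a 2x2 table (one degree of freedom, no continuity correction, which is an assumption) to the total identified standards reported in Table 1 for LipidMS (49/50) and MS-DIAL in DDA mode (30/41). The function name is hypothetical.

```python
import math


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square (1 d.f., no continuity correction) for two proportions."""
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts from Table 1: LipidMS 49/50 standards, MS-DIAL (DDA) 30/41 standards.
chi2, p = chi_square_2x2(49, 50, 30, 41)
```

With small expected counts, an exact test or a continuity correction may be preferable; the sketch only shows the basic calculation.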
Ideally, further comparisons with other commonly used DIA methods such as LipidMatch22 or Lipid-Pro21 should have been performed. However, LipidMatch only supports Thermo (Q Exactive) files for DIA, while in Lipid-Pro fragmentation rules have to be manually provided for each lipid class, which was found to be very time-consuming. Moreover, we did not find a way to fully implement the rules employed by LipidMS in Lipid-Pro.
Finally, to prove that LipidMS can be used with DIA data obtained from multiple platforms, we compared the results obtained for the two test samples analyzed on two different Q-ToF instruments (i.e., a Waters Synapt G2-Si Q-ToF and an Agilent Q-ToF 6550). No significant differences were observed in terms of the number of lipid standards identified in both test samples (Tables 1-2, S8, S11, S13-S15), where 98% and 92% coincidence was achieved, respectively. Furthermore, a similar lipidomic characterization in terms of the type of lipid classes and the level of structural information reached was observed for both instruments (Figure S22). Altogether, these results proved that LipidMS performance is not dependent on the analytical platform used. However, its suitability for other mass analyzers (e.g., Orbitrap) or other vendors could be further confirmed once the package is used by the MS-based lipidomics community.
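The text does not define how the percentage of coincidence between instruments was computed. One plausible reading, sketched below as an assumption, is the share of standards that receive the same outcome (identified or not identified) on both platforms. The function name and the toy identifiers are hypothetical and unrelated to the reported values.

```python
def percent_coincidence(ids_a, ids_b, all_standards):
    """Percentage of standards with the same outcome (found or not) on both platforms."""
    agree = sum((s in ids_a) == (s in ids_b) for s in all_standards)
    return 100.0 * agree / len(all_standards)


# Toy example with ten hypothetical standards.
standards = [f"std{i}" for i in range(1, 11)]
platform_a = set(standards) - {"std10"}           # misses one standard
platform_b = set(standards) - {"std9", "std10"}   # misses two standards
coincidence = percent_coincidence(platform_a, platform_b, standards)
```

A stricter variant could also require the same annotation level on both instruments.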
Application of LipidMS in the LC-DIA-MS analysis of NAFLD
NAFLD is now the most common liver disorder in the developed world, affecting up to a third of individuals. However, diagnosis is usually based on imaging tests, and liver biopsy is required for disease confirmation and staging.36 Therefore, finding new non-invasive NAFLD diagnosis and prognosis biomarkers has aroused much interest. A considerable number of studies have relied on metabolomics or lipidomics for metabolite biomarker discovery.4,23,24,37 Here, LipidMS was applied to the LC-DIA-MS untargeted analysis of serum samples from patients diagnosed with NASH and from healthy donors. The baseline characteristics of the patients enrolled in the study are summarized in Table S16. The groups were similar with respect to gender, age, body mass index, fasting blood sugar, and hepatic synthetic functions. A pooled sample was generated by mixing equal amounts of each sample and used for lipid identification based on DIA-MS/MS. Combining both positive and negative ionization modes, 258 lipids were identified in the pooled sample and then extracted from the rest of the samples based on their accurate m/z and tR. Principal component analysis showed a clear separation between control and NASH groups (Figure 3A), suggesting differences in their underlying lipidomic profiles. In total, 22 lipids were significantly altered between control and NASH patients (p-value ≤ 0.05 and |log2 fold change| ≥ 1) (Figure 3B). Moreover, when analyzing generic trends based on the sum of the intensities of the lipids belonging to a given class, a significant decrease in PC, LPC, and CE and an increase in PE were observed for NASH patients (Figure 3C). These observations are in agreement with previously published data suggesting that these lipid species could play a role in disease progression.4,23,24 Furthermore, LipidMS was also able to identify some specific lipids that have been previously proposed as NAFLD or NASH biomarkers (e.g., PE(16:0/22:6), PE(18:0/22:6), PC(16:0/20:4) and TG(54:5), among others)4,23,24 (Figure 3D). Interestingly, LipidMS also identified a set of new potential biomarkers of NASH (Table S17). However, this lipidomic signature should be further confirmed in a larger cohort of NASH patients. Overall, our results confirmed previously published data and validated LipidMS for DIA data analysis in untargeted LC-MS lipidomic approaches involving complex biological samples.
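The two selection criteria quoted above (p-value ≤ 0.05 and |log2 fold change| ≥ 1) and the class-level sums of intensities can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the fold change is taken as the ratio of group means, the lipid class is read from the text preceding the first parenthesis of the lipid name, and all function names and intensity values are hypothetical.

```python
import math
from collections import defaultdict


def log2_fold_change(case_values, control_values):
    """log2 of the ratio of mean intensities (case over control)."""
    mean_case = sum(case_values) / len(case_values)
    mean_control = sum(control_values) / len(control_values)
    return math.log2(mean_case / mean_control)


def is_significant(p_value, log2_fc, p_max=0.05, min_abs_log2_fc=1.0):
    """Apply the two thresholds quoted in the text."""
    return p_value <= p_max and abs(log2_fc) >= min_abs_log2_fc


def class_totals(sample):
    """Sum lipid intensities per class; the class is the text before '('."""
    totals = defaultdict(float)
    for lipid, intensity in sample.items():
        totals[lipid.split("(")[0]] += intensity
    return dict(totals)


# Hypothetical intensities for one lipid in three NASH and three control sera.
fc = log2_fold_change([400.0, 420.0, 380.0], [100.0, 90.0, 110.0])
flagged = is_significant(p_value=0.01, log2_fc=fc)

# Hypothetical lipid profile of a single sample.
totals = class_totals({"PC(16:0/20:4)": 50.0, "PC(34:1)": 30.0, "PE(16:0/22:6)": 20.0})
```

The p-value passed to is_significant would come from the statistical test applied to each lipid, after correction for multiple testing.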
Conclusions
A new freely available method for the analysis of DIA data sets in LC-MS untargeted lipidomics, namely LipidMS, has been developed. The new method combines curated fragmentation and intensity rules with a parent and fragment coelution score, which is calculated in predefined retention time windows, for the reliable identification of lipids. LipidMS provides wide lipid coverage and is easily customizable thanks to the use of the R environment.25 Compared to existing DDA and DIA tools (MS-DIAL), LipidMS detected a significantly higher number of lipids in the analysis of two test samples (standard and human serum samples). Moreover, LipidMS provides a detailed description of the level of structural information achieved for each identified lipid and, thanks to the fragment and intensity rules implemented in LipidMS, a higher level of structural information can be reached (FA position level, compared to FA composition, which is the highest level reached by other tools). The independence of the data analysis from the platform and its reproducibility were also demonstrated by comparing the results obtained on two independent Q-ToF analytical platforms (Waters Synapt G2-Si Q-ToF and Agilent Q-ToF 6550). The usefulness of LipidMS was further demonstrated when it was applied to the analysis of real clinical samples, namely NASH serum samples, where not only were previously identified lipid patterns corroborated, but a new set of biomarkers was also proposed. Altogether, LipidMS has been validated as a tool to assist lipid identification in the LC-DIA-MS untargeted analysis of complex biological samples.
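The parent and fragment coelution score mentioned above can be illustrated with a short Python sketch. It assumes, as a simplification that this section does not confirm, that the score is the Pearson correlation between the parent and fragment intensity traces over the scans that fall inside a retention time window. The function names, the window width and the intensity values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat trace carries no coelution information
    return sxy / math.sqrt(sxx * syy)


def coelution_score(parent, fragment, rt_center, rt_window):
    """Correlate parent and fragment traces (dict: tR in s -> intensity)
    over the scans within rt_window seconds of rt_center."""
    rts = sorted(rt for rt in parent if abs(rt - rt_center) <= rt_window)
    if len(rts) < 3:
        return 0.0  # too few scans to judge peak shape
    parent_trace = [parent[rt] for rt in rts]
    fragment_trace = [fragment.get(rt, 0.0) for rt in rts]
    return pearson(parent_trace, fragment_trace)


# Hypothetical extracted-ion traces around tR = 285 s.
parent = {283: 10.0, 284: 50.0, 285: 100.0, 286: 50.0, 287: 10.0}
coeluting = {rt: 0.3 * i for rt, i in parent.items()}
shifted = {283: 0.0, 284: 1.0, 285: 5.0, 286: 20.0, 287: 40.0}

score_coeluting = coelution_score(parent, coeluting, rt_center=285, rt_window=3)
score_shifted = coelution_score(parent, shifted, rt_center=285, rt_window=3)
```

A fragment whose trace follows the parent peak scores close to 1, whereas a fragment that elutes later scores much lower and would fall below a threshold such as the PFCS ≥ 0.8 used in Table 1.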
References
(1) Wenk, M. R. The emerging field of lipidomics. Nat Rev Drug Discov 2005, 4, 594-610.
(2) Hilvo, M.; Denkert, C.; Lehtinen, L.; Muller, B.; Brockmoller, S.; Seppanen-Laakso, T.; Budczies, J.; Bucher, E.; Yetukuri, L.; Castillo, S.; Berg, E.; Nygren, H.; Sysi-Aho, M.; Griffin, J. L.; Fiehn, O.; Loibl, S.; Richter-Ehrenstein, C.; Radke, C.; Hyotylainen, T.; Kallioniemi, O., et al. Novel theranostic opportunities offered by characterization of altered membrane lipid metabolism in breast cancer progression. Cancer Res 2011, 71, 3236-3245.
(3) Patterson, A. D.; Maurhofer, O.; Beyoglu, D.; Lanz, C.; Krausz, K. W.; Pabst, T.; Gonzalez, F. J.; Dufour, J. F.; Idle, J. R. Aberrant lipid metabolism in hepatocellular carcinoma revealed by plasma metabolomics and lipid profiling. Cancer Res 2011, 71, 6590-6600.
(4) Puri, P.; Baillie, R. A.; Wiest, M. M.; Mirshahi, F.; Choudhury, J.; Cheung, O.; Sargeant, C.; Contos, M. J.; Sanyal, A. J. A lipidomic analysis of nonalcoholic fatty liver disease. Hepatology 2007, 46, 1081-1090.
(5) Garcia-Canaveras, J. C.; Peris-Diaz, M. D.; Alcoriza-Balaguer, M. I.; Cerdan-Calero, M.; Donato, M. T.; Lahoz, A. A lipidomic cell-based assay for studying drug-induced phospholipidosis and steatosis. Electrophoresis 2017, 38, 2331-2340.
(6) Rhee, E. P.; Cheng, S.; Larson, M. G.; Walford, G. A.; Lewis, G. D.; McCabe, E.; Yang, E.; Farrell, L.; Fox, C. S.; O'Donnell, C. J.; Carr, S. A.; Vasan, R. S.; Florez, J. C.; Clish, C. B.; Wang, T. J.; Gerszten, R. E. Lipid profiling identifies a triacylglycerol signature of insulin resistance and improves diabetes prediction in humans. J Clin Invest 2011, 121, 1402-1411.
(7) Meikle, P. J.; Wong, G.; Tsorotes, D.; Barlow, C. K.; Weir, J. M.; Christopher, M. J.; MacIntosh, G. L.; Goudey, B.; Stern, L.; Kowalczyk, A.; Haviv, I.; White, A. J.; Dart, A. M.; Duffy, S. J.; Jennings, G. L.; Kingwell, B. A. Plasma lipidomic analysis of stable and unstable coronary artery disease. Arterioscler Thromb Vasc Biol 2011, 31, 2723-2732.
(8) Han, X.; Rozen, S.; Boyle, S. H.; Hellegers, C.; Cheng, H.; Burke, J. R.; Welsh-Bohmer, K. A.; Doraiswamy, P. M.; Kaddurah-Daouk, R. Metabolomics in early Alzheimer's disease: identification of altered plasma sphingolipidome using shotgun lipidomics. PLoS One 2011, 6, e21643.
(9) Psychogios, N.; Hau, D. D.; Peng, J.; Guo, A. C.; Mandal, R.; Bouatra, S.; Sinelnikov, I.; Krishnamurthy, R.; Eisner, R.; Gautam, B.; Young, N.; Xia, J.; Knox, C.; Dong, E.; Huang, P.; Hollander, Z.; Pedersen, T. L.; Smith, S. R.; Bamforth, F.; Greiner, R., et al. The human serum metabolome. PLoS One 2011, 6, e16957.
(10) Bouatra, S.; Aziat, F.; Mandal, R.; Guo, A. C.; Wilson, M. R.; Knox, C.; Bjorndahl, T. C.; Krishnamurthy, R.; Saleem, F.; Liu, P.; Dame, Z. T.; Poelzer, J.; Huynh, J.; Yallou, F. S.; Psychogios, N.; Dong, E.; Bogumil, R.; Roehring, C.; Wishart, D. S. The human urine metabolome. PLoS One 2013, 8, e73076.
(11) Fahy, E.; Subramaniam, S.; Brown, H. A.; Glass, C. K.; Merrill, A. H., Jr.; Murphy, R. C.; Raetz, C. R.; Russell, D. W.; Seyama, Y.; Shaw, W.; Shimizu, T.; Spener, F.; van Meer, G.; VanNieuwenhze, M. S.; White, S. H.; Witztum, J. L.; Dennis, E. A. A comprehensive classification system for lipids. J Lipid Res 2005, 46, 839-861.
(12) Fahy, E.; Subramaniam, S.; Murphy, R. C.; Nishijima, M.; Raetz, C. R.; Shimizu, T.; Spener, F.; van Meer, G.; Wakelam, M. J.; Dennis, E. A. Update of the LIPID MAPS comprehensive classification system for lipids. J Lipid Res 2009, 50 Suppl, S9-14.
(13) Han, X.; Yang, K.; Gross, R. W. Multi-dimensional mass spectrometry-based shotgun lipidomics and novel strategies for lipidomic analyses. Mass Spectrom Rev 2012, 31, 134-178.
(14) Cajka, T.; Fiehn, O. Comprehensive analysis of lipids in biological systems by liquid chromatography-mass spectrometry. Trends Analyt Chem 2014, 61, 192-206.
(15) Zhu, X.; Chen, Y.; Subramanian, R. Comparison of information-dependent acquisition, SWATH, and MS(All) techniques in metabolite identification study employing ultrahigh-performance liquid chromatography-quadrupole time-of-flight mass spectrometry. Anal Chem 2014, 86, 1202-1209.
(16) Kind, T.; Liu, K. H.; Lee, D. Y.; DeFelice, B.; Meissen, J. K.; Fiehn, O. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nat Methods 2013, 10, 755-758.
(17) Tsugawa, H.; Cajka, T.; Kind, T.; Ma, Y.; Higgins, B.; Ikeda, K.; Kanazawa, M.; VanderGheynst, J.; Fiehn, O.; Arita, M. MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nat Methods 2015, 12, 523-526.
(18) Kochen, M. A.; Chambers, M. C.; Holman, J. D.; Nesvizhskii, A. I.; Weintraub, S. T.; Belisle, J. T.; Islam, M. N.; Griss, J.; Tabb, D. L. Greazy: Open-Source Software for Automated Phospholipid Tandem Mass Spectrometry Identification. Anal Chem 2016, 88, 5733-5741.
(19) Hutchins PD, R. J., Coon JJ. LipiDex: An Integrated Software Package for High-Confidence Lipid Identification. Cell Systems 2018, 6, 621-625.
(20) Hartler J, T. A., Ziegl A, Trötzmüller M, Rechberger GN, Zeleznik OA, Zierler KA, Torta F, Cazenave-Gassiot A, Wenk MR, Fauland A, Wheelock CE, Armando AM, Quehenberger O, Zhang Q, Wakelam MJO, Haemmerle G, Spener F, Köfeler HC, Thallinger GG. Deciphering lipid structures based on platform-independent decision rules. Nature Methods 2017, 14, 1171-1174.
(21) Ahmed, Z.; Mayr, M.; Zeeshan, S.; Dandekar, T.; Mueller, M. J.; Fekete, A. Lipid-Pro: a computational lipid identification solution for untargeted lipidomics on data-independent acquisition tandem mass spectrometry platforms. Bioinformatics 2015, 31, 1150-1153.
(22) Koelmel JP, K. N., Ulmer CZ, Bowden JA, Patterson RE, Cochran JA, Beecher CWW, Garrett TJ, Yost RA. LipidMatch: an automated workflow for rule-based lipid identification using untargeted high-resolution tandem mass spectrometry data. BMC Bioinformatics 2017, 18, 112.
(23) Kavya Anjani, M. L., Nataliya Sokolovska, Christine Poitou, Judith Aron-Wisnewsky, Jean-Luc Bouillot, Philippe Lesnik, Pierre Bedossa, Anatol Kontush, Karine Clement, Isabelle Dugail, Joan Tordjman. Circulating phospholipid profiling identifies portal contribution to NASH signature in obesity. Journal of Hepatology 2015, 62, 905-912.
(24) Puri P, W. M., Cheung O, Mirshahi F, Sargeant C, Min HK, Contos MJ, Sterling RK, Fuchs M, Zhou H, Watkins SM, Sanyal AJ. The Plasma Lipidomic Signature of Nonalcoholic Steatohepatitis. Hepatology 2009, 50, 1827-1838.
(25) R Core Team. R Foundation for Statistical Computing, 2016.
(26) R Core Team. R: A language and environment for statistical computing. 2008.
(27) Pluskal, T.; Castillo, S.; Villar-Briones, A.; Oresic, M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 2010, 11, 395.
(28) Smith, C. A.; Want, E. J.; O'Maille, G.; Abagyan, R.; Siuzdak, G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Analytical chemistry 2006, 78, 779-787.
(29) Kuhl, C.; Tautenhahn, R.; Bottcher, C.; Larson, T. R.; Neumann, S. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Analytical chemistry 2012, 84, 283-289.
(30) Brunt EM, K. D., Wilson LA, Belt P, Neuschwander-Tetri BA; NASH Clinical Research Network (CRN). Nonalcoholic fatty liver disease (NAFLD) activity score and the histopathologic diagnosis in NAFLD: distinct clinicopathologic meanings. Hepatology 2011, 53, 810-820.
(31) Hao Li, Y. C., Yuan Guo, Fangfang Chen, and Zheng-Jiang Zhu. MetDIA: Targeted Metabolite Extraction of Multiplexed MS/MS Spectra Generated by Data-Independent Acquisition. Analytical chemistry 2016, 88, 8757-8764.
(32) Fahy, E.; Sud, M.; Cotter, D.; Subramaniam, S. LIPID MAPS online tools for lipid research. Nucleic Acids Res 2007, 35, W606-612.
(33) Smith, C. A.; O'Maille, G.; Want, E. J.; Qin, C.; Trauger, S. A.; Brandon, T. R.; Custodio, D. E.; Abagyan, R.; Siuzdak, G. METLIN: a metabolite mass spectral database. Ther Drug Monit 2005, 27, 747-751.
(34) Wishart, D. S.; Jewison, T.; Guo, A. C.; Wilson, M.; Knox, C.; Liu, Y.; Djoumbou, Y.; Mandal, R.; Aziat, F.; Dong, E.; Bouatra, S.; Sinelnikov, I.; Arndt, D.; Xia, J.; Liu, P.; Yallou, F.; Bjorndahl, T.; Perez-Pineiro, R.; Eisner, R.; Allen, F., et al. HMDB 3.0--The Human Metabolome Database in 2013. Nucleic Acids Res 2013, 41, D801-807.
(35) Yepy Hardi Rustam, a. G. E. R. Analytical Challenges and Recent Advances in Mass Spectrometry Based Lipidomics. Analytical chemistry 2018, 90, 374-397.
(36) Younossi, Z.; Anstee, Q. M.; Marietti, M.; Hardy, T.; Henry, L.; Eslam, M.; George, J.; Bugianesi, E. Global burden of NAFLD and NASH: trends, predictions, risk factors and prevention. Nature reviews. Gastroenterology & hepatology 2018, 15, 11-20.
(37) Garcia-Canaveras, J. C.; Donato, M. T.; Castell, J. V.; Lahoz, A. A comprehensive untargeted metabonomic analysis of human steatotic liver tissue by RP and HILIC chromatography coupled to mass spectrometry reveals important metabolic alterations. Journal of proteome research 2011, 10, 4825-4834.
Acknowledgements
This work has been supported by the European Regional Development Fund (FEDER) Institute of Health Carlos III of the Spanish Ministry of Economy and Competitiveness (PI14/0026 and PI17/01282).
Figures
Figure 1. Simplified diagram of LipidMS operations.
Figure 2. Flow diagram of lipid annotation in LipidMS. The steps for the identification of 747.5177 m/z with a tR of 285 seconds are shown as an example.
Figure 3. Lipidome alterations in the serum of NASH patients. (A) Principal component analysis scores plot for the control and NASH samples; (B) Volcano plot of the 258 lipids annotated by LipidMS and coloured by lipid class, significant differential abundance for lipid species was assigned to p value 1.5. (C) Boxplots showing significant changes in lipid classes; (D) Boxplots showing significant changes for lipids that have been previously reported as NASH biomarkers detected by LipidMS. Mann-Whitney U tests were used to calculate statistical significance, and p values were corrected using the Benjamini-Hochberg procedure. *, p value < 0.05; **, p value < 0.01; ***, p value < 0.001
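The caption above mentions that p values were corrected with the Benjamini-Hochberg procedure. As an illustration of that step, the following Python sketch computes Benjamini-Hochberg adjusted p-values from a list of raw p-values. The function name and the example values are hypothetical, and the sketch is not taken from the LipidMS package or from the authors' analysis scripts.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five lipids.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(raw)
```

Lipids whose adjusted p-value falls below the chosen threshold would then be reported as significantly altered.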
Tables
Table 1. Summary of the lipid standards identified in the test sample using the Agilent Q-ToF 6550.
Class(1)             Possible levels of structural annotation (PFCS ≥ 0.8)   LipidMS (DIA)   MS-DIAL (DDA)   MS-DIAL (DIA)
FA (16)              Subclass Level(3)                                       16              12              9
CE (1)               Subclass Level                                          1               0               1
                     Fatty Acyl Level(3)                                     0               0               0
LPL (1)              Subclass Level                                          0               0               0
                     Fatty Acyl Level(3)                                     1               1               1
PL (11)              Subclass Level                                          0               0               0
                     Fatty Acyl Level                                        4               9               11
                     Fatty Acyl Position Level(3)                            6               0               0
Cer (2)              Fatty Acyl Level(3)                                     2               2               2
SM (1)               Subclass Level                                          0               1               1
                     Fatty Acyl Level(3)                                     1               0               0
Glycerolipids (9)    Fatty Acyl Level                                        2               5               6
                     Fatty Acyl Position Level(3)                            7               0               0
Bile acids (9)(2)    Subclass Level                                          9               -               -
Total identified standards                                                   49/50           30/41***        31/41***
Total identified standards at max. annotation level                          42/50           29/41**         29/41***
Total number of false positives(4)                                           9               4               23
(1) denotes the total number of lipids per class, (2) MS-DIAL does not support bile acids identification, (3) in bold the maximum level of structural annotation reached in each lipid class, (4) false positive identities are annotated based on molecular ion and characteristic lipid fragment, specific identities are listed in Table S10. Statistical p-value was calculated by χ2 test * p