Mass Spectral Similarity Networking and Gas-Phase Fragmentation


Article

Mass Spectral Similarity Networking and Gas-Phase Fragmentation Reactions in the Structural Analysis of Flavonoid Glycoconjugates Alan Cesar Pilon, Haiwei Gu, Daniel Raftery, Vanderlan Da Silva Bolzani, Norberto Peporine Lopes, Ian Castro-Gamboa, and Fausto Carnevale Neto Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b05479 • Publication Date (Web): 17 Jul 2019 Downloaded from pubs.acs.org on July 18, 2019




Analytical Chemistry

Mass Spectral Similarity Networking and Gas-Phase Fragmentation Reactions in the Structural Analysis of Flavonoid Glycoconjugates

Alan Cesar Pilon†‡, Haiwei Gu§,∥, Daniel Raftery§,⊥, Vanderlan da Silva Bolzani†, Norberto Peporine Lopes‡, Ian Castro-Gamboa† and Fausto Carnevale Neto*†‡§
† Núcleo de Bioensaios, Biossíntese e Ecofisiologia de Produtos Naturais (NuBBE), Departamento de Química Orgânica, Instituto de Química, Universidade Estadual Paulista (UNESP), Araraquara, 14800-900, São Paulo, Brazil.
‡ Núcleo de Pesquisa em Produtos Naturais e Sintéticos (NPPNS), Departamento de Física e Química, Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, 14040-903, São Paulo, Brazil.
§ Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, 850 Republican St., Seattle, WA 98109, United States.
∥ Jiangxi Key Laboratory for Mass Spectrometry and Instrumentation, East China Institute of Technology, Nanchang, Jiangxi Province 330013, China.
⊥ Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States.



Corresponding Author

Fausto Carnevale Neto, PhD. Northwest Metabolomics Research Center Department of Anesthesiology and Pain Medicine University of Washington 850 Republican St. Seattle, WA 98109 Tel: +1 (206) 685-4753 Fax: 206-616-4819 Email: [email protected]

1 ACS Paragon Plus Environment


ABSTRACT
Flavonoids represent an important class of natural products with a central role in plant physiology and human health. Their accurate annotation using untargeted mass spectrometry analysis still relies on differentiating similar chemical scaffolds through spectral matching to reference library spectra. In this work, we combined molecular network analysis with rules for fragmentation reactions and chemotaxonomy to enhance the annotation of similar flavonoid glycoconjugates. The molecular network topology progressively propagated the chemical functionalization of the flavonoids according to collision-induced dissociation (CID) reactions, following these chemical attributes: aglycone nature, saccharide type and number, and presence of methoxy substituents. This structure-based distribution across the spectral networks revealed the chemical composition of flavonoids at the intra- and inter-species levels, and guided the putative assignment of sixty-four isomers and isobars in the Chrysobalanaceae plant species, most of which are not accurately annotated by automated untargeted MS2 matching. These proof-of-concept results demonstrated how molecular networking progressively grouped structurally related molecules according to their product ion scans, abundances and ratios. The approach can be extrapolated to other classes of metabolites sharing similar structures and diagnostic fragments from tandem mass spectrometry.

Keywords: flavonoid O-glycoconjugates, Chrysobalanaceae, molecular networking, fragmentation reaction, tandem mass spectrometry, natural products


INTRODUCTION
Flavonoids constitute one of the main classes of natural products in plants, with more than 9,000 known structures.1,2 They are responsible for several events that regulate adaptation and survival in plants, including the attraction of pollinators and microorganisms, and protection against UV radiation, pathogens and herbivores.1,3–5 As important phytochemicals, flavonoids are an integral part of human and animal diets, and their effects on health promotion and their role in the food industry are of great interest.6–8 Flavonoids mostly contain a C6-C3-C6 benzo-γ-pyrone structural core, along with various substitutions catalyzed by isomerases, reductases, hydroxylases, methylases, and glycosyltransferases that produce a variety of compounds with the same elemental composition.9,10 In plants, they primarily occur in the form of glycoconjugates, i.e., conjugated with one to several saccharides at the oxygen atom of an -OH group (O-glycosidic bond) or at a carbon atom (C-glycosidic bond) of the aglycone skeleton.11 Consequently, there is a critical need for analytical methods that can elucidate these structurally similar organic compounds in complex biological matrices and plant sources.
Liquid chromatography combined with electrospray ionization mass spectrometry (HPLC-ESI-MS) has become the most important technique for the identification and structural characterization of flavonoids.12 It provides accurate masses of detected ions with the use of high-resolution instruments, and structural information derived from fragmentation reactions using tandem mass spectrometry (MS/MS or MS2) on instruments such as the quadrupole time-of-flight (QTOF) mass spectrometer.13 Recently, innovative MSn-based methods have been developed for the untargeted profiling of flavonoids in complex natural sources, providing key fragments for both aglycones and glycans.14–19 According to these systematic approaches, the dissociative behavior of flavonoids can reveal the nature of the aglycones, the glycosylation type and position, the structure/linkage of their sugar conjugates, and methoxy/methyl substituents.10,11,20,21 Unfortunately, despite the numerous well-established methods for the accurate identification of flavonoids, no single procedure is capable of providing rapid and unambiguous elucidation of all isomers from comprehensive untargeted profiling.17 Most approaches rely on time-consuming manual evaluation of the characteristic neutral losses and relative abundances of common fragments. The increased sharing of experimental MS/MS data and the growing number of spectral databases, such as FlavonoidSearch,22 METLIN,23,24 MassBank,25 MASST,26 NuBBEDB,27 Sumner/Bruker,26 and ReSpect,28 have promoted the development of various bioinformatics approaches that assist the interpretation of large MS/MS datasets.29–32 Global Natural Products Social Molecular Networking (GNPS) is a particularly efficient tool for processing and annotating MS2 data through spectral similarity algorithms.33–41 The spectral network (molecular network, MN) concept is based on the organization and visualization of tandem MS information via spectral similarity (homologous MS2 fragmentations).35,38 Structurally related compounds usually share similar MS/MS spectra, and GNPS groups similar spectra in a network-based format, allowing the visual exploration of identical and analogous molecules. Chemical annotation via GNPS arises from direct spectral matching between the MS/MS spectra and the available databases, and from visual inspection of the mass differences between closely related structures, highlighted by their spectral likeness within the network. For example, a mass difference of 15 u between two precursor ions suggests an additional CH3 moiety, and mass differences of 162, 146 or 132 u may refer to the neutral loss of hexose, deoxyhexose or pentose.16,38 Neither annotation method, spectral matching with reference libraries or network propagation of m/z differences, fully takes into account the relative abundances and ratios of the product ions, which limits the ability of GNPS to distinguish correctly between chemical substructures.
This is critical for the annotation of flavonoid derivatives and their widespread isomerism.17 In this study, we propose the incorporation of multiple fragmentation rules and chemotaxonomy into the spectral similarity networks (MS/MS data) for the assignment of glycoconjugated flavonoids present in six plant species of the family Chrysobalanaceae, known for its high distribution and diversity of flavonoids.42 This strategy was built on characterizing how product-ion values, abundances and ratios guide the structural organization of the spectral network, with the goal of uncovering the identities of structurally related compounds, as illustrated in Scheme 1. The rationale can be extended to any other class of metabolites that shows a broad fragmentation profile in tandem mass spectrometry, such as alkaloids,43–45 polyphenols,46,47 lipids,48–50 and others.51–53
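The mass-difference reading described above (15 u for CH3; 162, 146 or 132 u for hexose, deoxyhexose or pentose) can be written as a small lookup. The following Python sketch is illustrative only: the function name, the 0.5 u tolerance and the example m/z values are assumptions and are not part of the GNPS workflow.

```python
# Nominal mass differences (u) and the interpretations quoted in the text.
NOMINAL_SHIFTS = {
    15: "CH3 (methyl)",
    132: "pentose",
    146: "deoxyhexose",
    162: "hexose",
}


def explain_mass_difference(mz_a, mz_b, tolerance=0.5):
    """Return the label matching |mz_a - mz_b|, or None if no shift matches.

    The tolerance (in u) is an assumed value for nominal-mass data.
    """
    delta = abs(mz_a - mz_b)
    for shift, label in NOMINAL_SHIFTS.items():
        if abs(delta - shift) <= tolerance:
            return label
    return None


if __name__ == "__main__":
    # Hypothetical pair of connected nodes that differ by 162 u.
    print(explain_mass_difference(463.0, 301.0))
```

Such a lookup only labels the mass difference; as noted above, it says nothing about product-ion abundances or ratios, which is the gap the present study addresses.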

EXPERIMENTAL METHODS


Plant material. Six Chrysobalanaceae plant species were collected in different regions of São Paulo State, Brazil, during development of the NuBBE extracts library, as part of the interdisciplinary BIOTA/SP program (FAPESP No. 2003/02176-7).54,55 Vouchers were deposited at the “Herbarium Maria Eneyda P. K. Fidalgo”, Instituto Botânico de São Paulo, São Paulo, Brazil. Table S-1 lists the plant species, their collection locations and the voucher codes. For more information on extraction procedures and sample preparation for LC-MS analysis, see the Supporting Information.
General Experimental Procedures. The HPLC-ESI-MS/MS analyses were carried out using an Agilent 1200 SL LC system coupled to an Agilent 6520 Q-TOF mass spectrometer (Agilent Technologies, Santa Clara, CA, USA) equipped with an electrospray ionization (ESI) source. The separation conditions were adapted from a previously reported LC-MS profiling of semi-polar metabolites in the methanol extract of tomatoes56 using the software HPLC calculator v. 3.0.57 Separation was performed on an Agilent Eclipse XDB-C18 column (100 mm × 3 mm, 1.8 μm), with an H2O:CH3CN solvent system containing 0.5% acetic acid in both solvents. Gradient elution was performed as follows: 3% CH3CN for 1 min; 3% to 9% over 13 min; 9% to 35% over 18 min; 35% to 100% over 7 min; and 100% CH3CN for 6 min, for a total run time of 45 min. The flow rate was 0.7 mL/min, the injection volume was 10 µL, and the sample concentration was 15.0 mg.mL-1. The ESI conditions were as follows: Agilent Jet Stream electrospray ion source in positive and negative ionization modes; voltage 3.5 kV; cone voltage 40 V; source temperature 120 °C; desolvation temperature 250 °C; cone gas flow 20 L.h-1; desolvation gas flow 600 L.h-1. Nitrogen was used as the drying and collision gas; the collision energy was 15, 30 or 45 eV ((+)-ESI) and 15, 25 or 35 eV ((-)-ESI); the MS scan rate was 1.03 spectra.s-1 and the MS/MS scan rate was 1.05 spectra.s-1 across the range m/z 100–1000. Two precursor ions were selected per cycle (relative threshold > 1000 counts), excluding ions previously detected in blank (solvent) injections. The MS was calibrated using purine (2 μM) and Agilent’s HP-921 calibration solution. An exclusion list containing signals from triplicate blank injections was made to avoid the MS2 analysis of background ions. Ten LC samples were subjected to automated HPLC-ESI-QTOF/MS analysis in both positive and negative ionization modes at different collision-induced dissociation (CID) energies: 15, 30 and 45 eV for positive ESI, and 15, 25 and 35 eV for negative ESI.
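As a check on the gradient program given above, the segment durations (1 + 13 + 18 + 7 + 6 min) add up to the stated 45 min run. The Python sketch below encodes the breakpoints and returns the CH3CN percentage at a given time. It assumes linear ramps between breakpoints, which the text does not state explicitly, and the function name is illustrative.

```python
from bisect import bisect_right

# (time in min, % CH3CN) breakpoints taken from the gradient description.
GRADIENT = [
    (0.0, 3.0),     # start of the 1 min isocratic hold at 3%
    (1.0, 3.0),     # end of the hold
    (14.0, 9.0),    # 3% to 9% over 13 min
    (32.0, 35.0),   # 9% to 35% over 18 min
    (39.0, 100.0),  # 35% to 100% over 7 min
    (45.0, 100.0),  # 100% held for 6 min
]


def percent_acetonitrile(t_min):
    """Return % CH3CN at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-45 min run")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:  # exactly at the end of the run
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 7.5, 14, 23, 32, 39, 45):
        print(t, percent_acetonitrile(t))
```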


Data annotation. HPLC-ESI-MS/MS data were processed using MassHunter Workstation software ver. B.06.00 (Agilent Technologies, Palo Alto, CA, USA). Tandem mass spectrometry molecular networks were created using the GNPS platform (http://gnps.ucsd.edu).36,37 The LC-MS2 data for each sample were converted into two new mzXML files using MSConvert freeware.58 The files were subjected to the Spectral Networks algorithm (GNPS). The data were filtered by removing all MS2 peaks within ±17 u of the precursor m/z. MS2 spectra were window-filtered by keeping only the top 6 peaks in each ±50 u window throughout the spectrum. The data were then clustered with MSCluster, with a parent mass tolerance of 2.0 u and an MS2 fragment ion tolerance of 0.5 u, to create consensus spectra (see https://ccmsucsd.github.io/GNPSDocumentation/networking). The network was created by filtering the edges to have a cosine score above 0.5 and more than 4 matched peaks. The spectra in the network were searched against the GNPS spectral libraries (GNPS - https://gnps.ucsd.edu/ProteoSAFe/libraries.jsp), using the same settings as for the input data.37 Network visualization was performed in Cytoscape 2.8.3 and 3.4.36,59 To avoid misinterpretation of HPLC contaminants, blank (mobile phase) injections were uploaded as a distinct sample group in the GNPS workflow and excluded from the networks. Node colors were mapped to the source files of the MS2 data, and the edge thickness attribute was defined to reflect cosine similarity scores, with thicker lines indicating higher similarity.34,36,59 LC-MS2 data were deposited in the MassIVE Public GNPS data set (http://massive.ucsd.edu, MSV000082164). All networks can be accessed online (links in the supplementary material).
RESULTS AND DISCUSSION
The analyses of ten samples prepared from the stems and/or leaves of six plant species at different ESI modes and CID energies (CE) resulted in more than 35,000 MS2 spectra across sixty experiments, as shown in Table 1.
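The figure of sixty experiments follows from the acquisition design: ten samples, two ionization modes and three collision energies per mode (10 × 2 × 3 = 60). A minimal Python enumeration is sketched below; the sample labels are hypothetical placeholders, since the actual sample names are given in Table 1 and the Supporting Information.

```python
# Hypothetical labels for the ten LC samples (real names are in Table 1).
SAMPLES = [f"sample_{i:02d}" for i in range(1, 11)]

# Collision energies (eV) used in each ionization mode, as stated in the text.
COLLISION_ENERGIES_EV = {
    "positive": (15, 30, 45),
    "negative": (15, 25, 35),
}


def list_experiments():
    """Enumerate every (sample, ESI mode, collision energy) combination."""
    return [
        (sample, mode, energy)
        for sample in SAMPLES
        for mode, energies in COLLISION_ENERGIES_EV.items()
        for energy in energies
    ]


if __name__ == "__main__":
    print(len(list_experiments()))
```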
The base-peak chromatograms are depicted in Fig. S-1. Prior to generating the MN, all MS/MS spectra within a defined precursor ion mass tolerance (PIMT) were merged into a single consensus MS/MS spectrum, called a node (Fig. 1A).60 The m/z values and the respective intensities of all the product ions that compose the node were converted into vectors. Molecular networking calculates the cosine scores (similarity) between the vectors, and groups similar nodes into networks (clusters) connected by their spectral relatedness (edges). Since similar fragmentation often implies related structures, the spectral network represents groups of related molecules (nodes), where each node corresponds to a single chemical species, connected by chemical similarities (edges), i.e., neutral losses and common fragments (Fig. 1B).37 Accordingly, six molecular networks (MNs) were calculated, each using the data acquired at the same ionization mode and CE. Two additional MNs were also created using all the CEs for the (+)-ESI and (-)-ESI modes, respectively. The resulting molecular networks revealed the clustering of various classes of metabolites, including flavonoid glycoconjugates, as shown in Figs. S-2 to S-9. Moreover, they showed how different collision energies and ionization modes guided their structural arrangement along the MNs. Considering the flavonoid glycoconjugates, the clusters formed in the (+)-ESI experiments were grouped mostly by neutral losses, while in (-)-ESI they were more strongly influenced by the relative abundances and ratios of their product ions. MNs using all collision energies generated larger flavonoid glycoconjugate clusters, with 77 and 82 nodes for positive and negative ESI modes, respectively, as shown in Table 1. This result suggests that increasing the number of MS2 spectra merged into each consensus MS2 spectrum improves the quality of the spectral information displayed in each node, and the structural organization of the MN based on spectral similarity. In contrast, the cluster obtained using a CID energy of 15 eV in (+)-ESI mode was shaped largely by limited fragmentation information, mostly labile cleavages of the saccharide subunits across all samples, which grouped different glycosylated metabolites, not only flavonoids. Since MSCluster does not use the retention time in the pipeline,37 the nodes can group positional isomers and in-source fragments that show similar parent masses and MS2 data, especially for flavonoids with several saccharides such as di- and tri-O-glycosides.
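The steps described above (peak filtering, conversion of product ions into vectors, cosine scoring and edge filtering) can be illustrated with a simplified Python sketch. It is not the GNPS or MSCluster implementation: the scoring used by GNPS handles intensity scaling and precursor mass shifts differently, and the greedy peak matching, function names and default values here are assumptions chosen to mirror the parameters quoted in the Data annotation section (±17 u precursor window, top 6 peaks per ±50 u window, 0.5 u fragment tolerance, cosine above 0.5 and more than 4 matched peaks).

```python
import math


def remove_precursor_window(peaks, precursor_mz, window=17.0):
    """Drop (m/z, intensity) peaks lying within +/- window u of the precursor."""
    return [(mz, inten) for mz, inten in peaks if abs(mz - precursor_mz) > window]


def window_filter(peaks, top_n=6, window=50.0):
    """Keep a peak only if fewer than top_n stronger peaks lie within +/- window u."""
    kept = []
    for mz, inten in peaks:
        stronger = sum(
            1
            for other_mz, other_inten in peaks
            if abs(other_mz - mz) <= window and other_inten > inten
        )
        if stronger < top_n:
            kept.append((mz, inten))
    return kept


def cosine_score(spec_a, spec_b, tol=0.5):
    """Return (cosine, matched peak count) using greedy one-to-one peak matching."""
    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # Candidate pairs within the fragment tolerance, strongest products first.
    pairs = sorted(
        (
            (ia * ib, i, j)
            for i, (ma, ia) in enumerate(spec_a)
            for j, (mb, ib) in enumerate(spec_b)
            if abs(ma - mb) <= tol
        ),
        reverse=True,
    )
    used_a, used_b = set(), set()
    dot, matched = 0.0, 0
    for product, i, j in pairs:
        if i in used_a or j in used_b:
            continue  # each peak may be matched only once
        used_a.add(i)
        used_b.add(j)
        dot += product
        matched += 1
    return dot / (norm_a * norm_b), matched


def is_edge(spec_a, spec_b, min_cosine=0.5, min_matched=4):
    """Edge rule as worded in the text: cosine above 0.5 and more than 4 matches."""
    score, matched = cosine_score(spec_a, spec_b)
    return score > min_cosine and matched > min_matched
```

Because intensities enter the dot product directly, two spectra that share the same fragments but differ in their relative abundances receive a lower score, which is consistent with the observation that the (-)-ESI clusters were influenced by product-ion abundances and ratios.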
Despite this, GNPS calculates the mean and the standard deviation of the retention times (Rt) of all the MS/MS spectra that compose each node, which allows one to investigate the presence of different isomers within the same node. The results, plotted in Fig. 2, showed that the nodes related to the flavonoids were formed by MS scans with low Rt variation in all samples across all ionization modes, with a standard deviation (STD) < 0.5 min for over 88% of the nodes in the (+)-ESI MNs and 72% in the (-)-ESI MNs. The consensus MS2 spectra in the network were then searched against spectral libraries in GNPS. The library spectra were filtered in the same manner as the input data. A spectral similarity cutoff of 50% was applied, and the reference spectrum with the highest matching score (≥ 50%) was preferentially selected. Twenty out of 77 nodes in the flavonoid glycoconjugate cluster in (+)-ESI using all CEs, and 21 out of 82 nodes in the (-)-ESI cluster, were putatively annotated via GNPS database search. However, manual data curation showed that more than 70% of the candidates in the (+)-ESI MN and 86% in the (-)-ESI MN were misannotated, as illustrated in Table 1. GNPS spectral matching tends to focus on major fragmentation differences, such as neutral losses, and to reduce the contribution of subtler product-ion variations, such as relative abundances, to the chemical annotation.37 Nevertheless, molecular networking progressively groups compounds based on MS2 fragmentation patterns, driving the distribution of structurally related molecules according to their product ion scans, abundances and ratios. We used the self-organization of the spectral networks, prior to the spectral matching, to improve the assignment of the flavonoid glycoconjugates. Initially, the nodes were supplemented with multiple CID fragmentation rules for glycosylated flavonoids, considering neutral losses, intensity values, and the ratios of product ions such as [Y0-H]•-/Y0-, as shown in Table 2. The proposed mechanisms for the formation of the main fragments in flavonoid glycoconjugates are shown in Figs. S-10 to S-14. Chemotaxonomic metadata of Chrysobalanaceae plant species were also incorporated into the MNs to guide flavonoid annotation and substructure distribution across the spectral networks. The various layouts derived from the metadata were added to the MNs in Cytoscape and showed the differential distribution of nodes within the clusters, as demonstrated in Figs. 3A-C, 4A-C and 5A-B. The resulting MS2 clusters for the flavonoid glycoconjugates from Chrysobalanaceae showed the predominance of pentose derivatives (red color in Fig. 3C) in samples from the Hirtella genus (purple in Fig. 3B), quercetin derivatives in the Parinari species (triangular nodes in Fig. 3C, and green color in Fig. 3B), a higher content of tri-O-glycosides (larger nodes in Fig. 3C) in Licania, and the absence of myricetin derivatives in Couepia, as previously described39,40 (square nodes in Fig. 3C, and red color in Fig. 3B). Interestingly, the degree of methoxylation was associated with the number of hydroxy moieties at the catechol group of the quercetin and myricetin derivatives. MNs built on (-)-ESI experiments using CE = 15 eV and 35 eV were able to organize the flavonoid glycoconjugates according to the number of saccharides (mono-, di- or tri-), the glycan linkages (1→6 or 1→2), and the glycosylation positions (3-O, 7-O or 3,7-O), as shown in Fig. 5A and B, respectively.


The new layers of chemical information allocated in the MN topology were used to guide the assignment of unknown nodes, as exemplified in Fig. 6. The yellow node 6411 (m/z 461, Fig. 6, Table S-2) was analyzed using the layouts related to species, sugar types, glycosylation position and presence of methoxy groups. This node corresponded to a flavonoid from Couepia plant species (characterized by the absence of myricetin derivatives, and indicated as red nodes on the top right of the box, Fig. 6). It contained mono- or di-methoxy groups (border width at the middle of the right box in Fig. 6), and hexose and/or deoxyhexose (orange and blue nodes at the middle of the right box in Fig. 6) bonded at position O-3 (yellow nodes at the bottom of the right box, Fig. 6). The chemical structure of node 6411 was investigated via the gas-phase fragmentation rules of flavonoid glycoconjugates (Table 2). It showed neutral losses of 162 u and 15 u, and the ions m/z 461, 446, 299 and 283, which revealed a methoxylated kaempferol with a hexose. The 3-O position was validated by the diagnostic ion ratio 299/298 ([Y0-H]•- / Y0- > 1.6 at 35 eV), and the methoxylation at C-4' by the diagnostic ion m/z 269. Thus, node 6411 was annotated as 4'-methoxykaempferol-3-O-hexoside (Fig. S-15). The same approach led to the annotation of another unknown node, 7833 (m/z 637, Table S-2, Fig. 6), as 3’,4’-dimethoxy-myricetin-3-O-dideoxyhexosyl-(1→2)dideoxyhexoside. The structure-based propagation across the molecular network topology guided the detection of the ion ratios 345/344 and 330/329 ([Y0-H]•-/Y0-), neutral losses of 164 u and 292 u, the radical elimination of •CH3, and the diagnostic ion m/z 473 (Fig. S-16). The use of MN to propagate the structural information from untargeted MS/MS analyses led to the annotation of 64 flavonoid glycoconjugates from six species of the Chrysobalanaceae plant family, as shown in Table S-2.
MS/MS spectra for all annotated compounds are publicly available on the GNPS platform (GNPS spectral library IDs are listed in Table S-2). The location of the flavonoids across the MNs guided the unprecedented description of tri-O-glycosides in Licania species. It also revealed the distribution of methoxy substituents in flavonol derivatives, including the tentative determination of their position on the flavonoid aglycone with the assistance of gas-phase fragmentation reactions (Figs. S-15 and S-16).
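The neutral-loss arithmetic behind the assignment of node 6411 can be reproduced directly from the reported ions: 461 − 446 = 15 u (•CH3) and 461 − 299 = 162 u (hexose). The Python sketch below performs only this step. The function name and the 0.5 u tolerance are assumptions, and the ion-ratio and diagnostic-ion criteria of Table 2 are not modeled.

```python
# Neutral losses (u) used in the discussion of node 6411.
NEUTRAL_LOSSES = {
    15: "radical loss of CH3",
    162: "hexose",
}


def annotate_losses(precursor_mz, fragment_mzs, tolerance=0.5):
    """Map each fragment m/z to a neutral-loss label when precursor - fragment matches."""
    annotations = {}
    for fragment in fragment_mzs:
        delta = precursor_mz - fragment
        for loss, label in NEUTRAL_LOSSES.items():
            if abs(delta - loss) <= tolerance:
                annotations[fragment] = label
    return annotations


if __name__ == "__main__":
    # Precursor and product ions reported for node 6411.
    print(annotate_losses(461, [446, 299, 283]))
```

The ion m/z 283 is left unlabeled because it does not correspond to a single loss from the precursor in this small table; interpreting it requires the fragmentation rules summarized in Table 2.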

CONCLUSIONS
We combined MN capabilities with detailed chemical and taxonomical knowledge for the annotation of flavonoids present in complex samples of Chrysobalanaceae species. This series of chemical attributes, derived from the pre-established fragmentation behavior of authentic standards,11,16,61–64 was integrated with taxonomical data in a unified, multi-informative, and layout-driven molecular network to navigate the chemical diversity of flavonoids in a new and more intuitive manner. This approach can progressively propagate a detailed annotation of unknown metabolites within the same class through the mapping of the major gas-phase fragmentation reactions, and reveal the chemical distribution at the intra- and inter-species levels by adding layers of chemotaxonomic description. The MS2 spectra provided information about the glycosylation type and position, the nature of the aglycone and the structure/linkage of the glycan moieties. The systematic investigation of all nodes based on the interpretation of the fragmentation patterns in the gas phase, in conjunction with previous studies on flavonoid identification via MS2, attests to the potential of MN as a tool to explore the chemical diversity of structurally related molecules and to unravel their complexity. According to the literature, more than fifty flavonoids have been described so far from Chrysobalanaceae plant species, with a predominance of flavonol glycosides.42 Based on the inspection of all MN nodes related to flavonoids, sixty-four metabolites were putatively annotated in the Chrysobalanaceae plant species, most of which were flavonol 3-O-glycosides. This is also the first report of methoxy, di-O- and tri-O-glycosyl flavonols for this family. Since the results of this study are publicly available, other GNPS users may access these data in their own searches for flavonoid glycoconjugates. Moreover, the same approach can be extended to different classes of metabolites that share similar fragmentation patterns and generate diagnostic ions in tandem mass spectrometry analysis.

ASSOCIATED CONTENT
Supporting Information. Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.
Author Contributions. The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.


Notes. The authors declare no competing financial interest.

ACKNOWLEDGMENT
This work was funded by the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP; fellowship support 2010/07564-9, 2012/20613-4 and 2014/12343-2 to FCN, 2010/17935-4 and 2016/13292-8 to ACP, 2003/02176-7 and 2010/52327-5 to IC-G and VSB, and 2014/50265-3 to NPL), the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), the National Natural Science Foundation of China (No. 21365001), the Chinese National Instrumentation Program (2011YQ170067) and the University of Washington (to HG and DR). The authors thank G. A. Nagana Gowda, Northwest Metabolomics Research Center, for useful discussions and suggestions.

REFERENCES
(1) Quideau, S. Angew. Chem., Int. Ed. 2006, 45 (41), 6786–6787.
(2) Gottlieb, O. R. Micromolecular evolution, systematics and ecology, 1st ed.; Springer-Verlag: Berlin, 1982.
(3) Falcone Ferreyra, M. L.; Rius, S. P.; Casati, P. Front. Plant Sci. 2012, 3, 222.
(4) Koes, R. E.; Quattrocchio, F.; Mol, J. N. M. BioEssays 1994, 16 (2), 123–132.
(5) Silva, D. B.; Turatti, I. C. C.; Gouveia, D. R.; Ernst, M.; Teixeira, S. P.; Lopes, N. P. Sci. Rep. 2015, 4 (1), 4309.
(6) Hodgson, J. M.; Croft, K. D. Mol. Aspects Med. 2010, 31 (6), 495–502.
(7) Pietta, P. G. J. Nat. Prod. 2000, 63 (7), 1035–1042.
(8) George, V. C.; Dellaire, G.; Rupasinghe, H. P. V. J. Nutr. Biochem. 2017, 45, 1–14.
(9) Le Roy, J.; Huss, B.; Creach, A.; Hawkins, S.; Neutelings, G. Front. Plant Sci. 2016, 7, 735, DOI: 10.3389/fpls.2016.00735.
(10) Williams, C. A.; Grayer, R. J. Nat. Prod. Rep. 2004, 21 (4), 539.
(11) Qin, Y.; Gao, B.; Shi, H.; Cao, J.; Yin, C.; Lu, W.; Yu, L.; Cheng, Z. J. Pharm. Biomed. Anal. 2017, 142, 113–124.
(12) Vukics, V.; Ringer, T.; Kery, A.; Bonn, G. K.; Guttman, A. J. Chromatogr. A 2008, 1206 (1), 11–20.
(13) Aksenov, A. A.; da Silva, R.; Knight, R.; Lopes, N. P.; Dorrestein, P. C. Nat. Rev. Chem. 2017, 1 (7), 0054.
(14) Stobiecki, M.; Kachlicki, P.; Wojakowska, A.; Marczak, Ł. Phytochem. Lett. 2015, 11, 358–367.
(15) Stobiecki, M. Phytochemistry 2000, 54 (3), 237–256.
(16) Vukics, V.; Guttman, A. Mass Spectrom. Rev. 2010, 29 (1), 1–16.
(17) Abrankó, L.; Szilvássy, B. J. Mass Spectrom. 2015, 50 (1), 71–80.
(18) Cuyckens, F.; Ma, Y. L.; Pocsfalvi, G.; Claeys, M. Analusis 2000, 28 (10), 888–895.
(19) Mahmood, T.; Akhtar, N.; Khan, B. A.; Ahmad, M.; Khan, H. M. S.; Zaman, S. U. Int. J. Acad. Res. 2010, 2 (2), 121–126.
(20) Vukics, V.; Guttman, A. Mass Spectrom. Rev. 2010, 29 (1), 1–16.
(21) Cuyckens, F.; Claeys, M. J. Mass Spectrom. 2005, 40 (3), 364–372.
(22) Akimoto, N.; Ara, T.; Nakajima, D.; Suda, K.; Ikeda, C.; Takahashi, S.; Muneto, R.; Yamada, M.; Suzuki, H.; Shibata, D.; Sakurai, N. Sci. Rep. 2017, 7 (1), 1243.

(23) Smith, C. A.; O’Maille, G.; Want, E. J.; Qin, C.; Trauger, S. A.; Brandon, T. R.; Custodio, D. E.; Abagyan, R.; Siuzdak, G. Ther. Drug Monit. 2005, 27 (6), 747–751.
(24) Kite, G. C. Anal. Chem. 2018, acs.analchem.8b03613.
(25) Horai, H.; Arita, M.; Kanaya, S.; Nihei, Y.; Ikeda, T.; Suwa, K.; Ojima, Y.; Tanaka, K.; Tanaka, S.; Aoshima, K. J. Mass Spectrom. 2010, 45 (7), 703–714.
(26) Wang, M.; Jarmusch, A. K.; Vargas, F.; Aksenov, A. A.; Gauglitz, J. M.; Weldon, K.; Petras, D.; Silva, R. da; Quinn, R.; Melnik, A. V; van der Hooft, J. J. J.; Caraballo Rodríguez, A. M.; Nothias, L. F.; Aceves, C. M.; Panitchpakdi, M.; Brown, E.; Di Ottavio, F.; Sikora, N.; Elijah, E. O.; Labarta-Bajo, L.; Gentry, E. C.; Shalapour, S.; Kyle, K. E.; Puckett, S. P.; Watrous, J. D.; Carpenter, C. S.; Bouslimani, A.; Ernst, M.; Swafford, A. D.; Zúñiga, E. I.; Balunas, M. J.; Klassen, J. L.; Loomba, R.; Knight, R.; Bandeira, N.; Dorrestein, P. C. bioRxiv 2019, 591016.
(27) Pilon, A. C.; Valli, M.; Dametto, A. C.; Pinto, M. E. F.; Freire, R. T.; Castro-Gamboa, I.; Andricopulo, A. D.; Bolzani, V. S. Sci. Rep. 2017, 7 (1), 7215.
(28) Sawada, Y.; Nakabayashi, R.; Yamada, Y.; Suzuki, M.; Sato, M.; Sakata, A.; Akiyama, K.; Sakurai, T.; Matsuda, F.; Aoki, T.; Hirai, M. Y.; Saito, K. Phytochemistry 2012, 82, 38–45.
(29) Xia, J.; Sinelnikov, I. V; Han, B.; Wishart, D. S. Nucleic Acids Res. 2015, gkv380.
(30) Smith, C. A.; Want, E. J.; O’Maille, G.; Abagyan, R.; Siuzdak, G. Anal. Chem. 2006, 78 (3), 779–787.
(31) Tsugawa, H.; Cajka, T.; Kind, T.; Ma, Y.; Higgins, B.; Ikeda, K.; Kanazawa, M.; VanderGheynst, J.; Fiehn, O.; Arita, M. Nat. Methods 2015, 12, 523.
(32) Frank, A. M.; Monroe, M. E.; Shah, A. R.; Carver, J. J.; Bandeira, N.; Moore, R. J.; Anderson, G. A.; Smith, R. D.; Pevzner, P. A. Nat. Methods 2011, 8, 587.
(33) Moree, W. J.; Yang, J. Y.; Zhao, X.; Liu, W. T.; Aparicio, M.; Atencio, L.; Ballesteros, J.; Sánchez, J.; Gavilán, R. G.; Gutiérrez, M.; Dorrestein, P. C. J. Chem. Ecol. 2013, 39 (7), 1045–1054.
(34) de Oliveira, G.; Carnevale Neto, F.; Demarque, D.; de Sousa Pereira-Junior, J.; Sampaio Peixoto Filho, R.; de Melo, S.; da Silva Almeida, J.; Lopes, J.; Lopes, N. Planta Med. 2016, 83 (07), 636–646.
(35) Olivon, F.; Allard, P.-M.; Koval, A.; Righi, D.; Genta-Jouve, G.; Neyts, J.; Apel, C.; Pannecouque, C.; Nothias, L.-F.; Cachet, X.; Marcourt, L.; Roussi, F.; Katanaev, V. L.; Touboul, D.; Wolfender, J.-L.; Litaudon, M. ACS Chem. Biol. 2017, 12 (10), 2644–2651.
(36) Watrous, J.; Roach, P.; Alexandrov, T.; Heath, B. S.; Yang, J. Y.; Kersten, R. D.; van der Voort, M.; Pogliano, K.; Gross, H.; Raaijmakers, J. M. Proc. Natl. Acad. Sci. U. S. A. 2012, 109 (26), E1743–E1752.
(37) Wang, M.; Carver, J. J.; Phelan, V.; Sanchez, L. M.; Garg, N.; Peng, Y.; Nguyen, D. D.; Watrous, J.; Kapono, C. A.; Luzzatto-Knaan, T.; Porto, C.; Bouslimani, A.; Melnik, A. V; Meehan, M. J.; Liu, W.-T.; Crüsemann, M.; Boudreau, P. D.; Esquenazi, E.; Sandoval-Calderón, M.; Kersten, R. D.; Pace, L. A.; Quinn, R. A.; Duncan, K. R.; Hsu, C.-C.; Floros, D. J.; Gavilan, R. G.; Kleigrewe, K.; Northen, T.; Dutton, R. J.; Parrot, D.; Carlson, E. E.; Aigle, B.; Michelsen, C. F.; Jelsbak, L.; Sohlenkamp, C.; Pevzner, P.; Edlund, A.; McLean, J.; Piel, J.; Murphy, B. T.; Gerwick, L.; Liaw, C.-C.; Yang, Y.-L.; Humpf, H.-U.; Maansson, M.; Keyzers, R. A.; Sims, A. C.; Johnson, A. R.; Sidebottom, A. M.; Sedio, B. E.; Klitgaard, A.; Larson, C. B.; Boya P, C. A.; Torres-Mendoza, D.; Gonzalez, D. J.; Silva, D. B.; Marques, L. M.; Demarque, D. P.; Pociute, E.; O’Neill, E. C.; Briand, E.; Helfrich, E. J. N.; Granatosky, E. A.; Glukhov, E.; Ryffel, F.; Houson, H.; Mohimani, H.; Kharbush, J. J.; Zeng, Y.; Vorholt, J. A.; Kurita, K. L.; Charusanti, P.; McPhail, K. L.; Nielsen, K. F.; Vuong, L.; Elfeki, M.; Traxler, M. F.; Engene, N.; Koyama, N.; Vining, O. B.; Baric, R.; Silva, R. R.; Mascuch, S. J.; Tomasi, S.; Jenkins, S.; Macherla, V.; Hoffman, T.; Agarwal, V.; Williams, P. G.; Dai, J.; Neupane, R.; Gurr, J.; Rodríguez, A. M. C.; Lamsa, A.; Zhang, C.; Dorrestein, K.; Duggan, B. M.; Almaliti, J.; Allard, P.-M.; Phapale, P.; Nothias, L.-F.; Alexandrov, T.; Litaudon, M.; Wolfender, J.-L.; Kyle, J. E.; Metz, T. O.; Peryea, T.; Nguyen, D.-T.; VanLeer, D.; Shinn, P.; Jadhav, A.; Müller, R.; Waters, K. M.; Shi, W.; Liu, X.; Zhang, L.; Knight, R.; Jensen, P. R.; Palsson, B. Ø.; Pogliano, K.; Linington, R. G.; Gutiérrez, M.; Lopes, N. P.; Gerwick, W. H.; Moore, B. S.; Dorrestein, P. C.; Bandeira, N. Nat. Biotechnol. 2016, 34 (8), 828–837.
(38) Yang, J. Y.; Sanchez, L. M.; Rath, C. M.; Liu, X.; Boudreau, P. D.; Bruns, N.; Glukhov, E.; Wodtke, A.; de Felicio, R.; Fenner, A. J. Nat. Prod. 2013, 76 (9), 1686–1699.
(39) Allard, P.-M. M.; Péresse, T.; Bisson, J.; Gindro, K.; Marcourt, L.; Pham, V. C.; Roussi, F.; Litaudon, M.; Wolfender, J.-L. L. Anal. Chem. 2016, 88 (6), 3317–3323.
(40) Bouslimani, A.; Melnik, A. V.; Xu, Z.; Amir, A.; da Silva, R. R.; Wang, M.; Bandeira, N.;

12 ACS Paragon Plus Environment

Page 13 of 24 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Analytical Chemistry

(41) (42) (43) (44) (45) (46) (47) (48) (49) (50) (51) (52) (53) (54) (55) (56) (57) (58) (59) (60) (61) (62) (63) (64)

Alexandrov, T.; Knight, R.; Dorrestein, P. C. Proc. Natl. Acad. Sci. 2016, 113 (48), E7645–E7654. Nothias, L.-F.; Nothias-Esposito, M.; da Silva, R.; Wang, M.; Protsyuk, I.; Zhang, Z.; Sarvepalli, A.; Leyssen, P.; Touboul, D.; Costa, J.; Paolini, J.; Alexandrov, T.; Litaudon, M.; Dorrestein, P. C. J. Nat. Prod. 2018, 81 (4), 758–767. Carnevale Neto, F.; Pilon, A. C.; da Silva Bolzani, V.; Castro-Gamboa, I. Phytochem. Rev. 2013, 12 (1), 121–146. Stevigny, C.; Jiwan, J. H.; Rozenberg, R.; de Hoffmann, E.; Quetin‐Leclercq, J. Rapid Commun Mass Spectrom 2004, 18 (5), 523–528. Guaratini, T.; Feitosa, L. G. P.; Silva, D. B.; Lopes, N. P.; Lopes, J. L. C.; Vessecchi, R. J. Mass Spectrom. 2017, 52 (9), 571–579. Cardozo, K. H. M.; Vessecchi, R.; Carvalho, V. M.; Pinto, E.; Gates, P. J.; Colepicolo, P.; Galembeck, S. E.; Lopes, N. P. Int. J. Mass Spectrom. 2008, 273 (1), 11–19. Fulcrand, H.; Mané, C.; Preys, S.; Mazerolles, G.; Bouchut, C.; Mazauric, J.-P.; Souquet, J.-M.; Meudec, E.; Li, Y.; Cole, R. B. Phytochemistry 2008, 69 (18), 3131–3138. Rodrigues, C. M.; Rinaldo, D.; dos Santos, L. C.; Montoro, P.; Piacente, S.; Pizza, C.; Hiruma‐ Lima, C. A.; Brito, A. R. M.; Vilegas, W. Rapid Commun. Mass Spectrom. 2007, 21 (12), 1907– 1914. Hsu, F.-F.; Turk, J. J. Am. Soc. Mass Spectrom. 2008, 19 (11), 1673–1680. Hsu, F.-F.; Turk, J. J. Am. Soc. Mass Spectrom. 1999, 10 (7), 600–612. Hsu, F.-F.; Turk, J. J. Am. Soc. Mass Spectrom. 2010, 21 (4), 657–669. Demarque, D. P.; Crotti, A. E. M.; Vessecchi, R.; Lopes, J. L. C.; Lopes, N. P. Nat Prod Rep 2016, 33, 432–455. Carnevale Neto, F.; Guaratini, T.; Costa‐Lotufo, L.; Colepicolo, P.; Gates, P. J.; Lopes, N. P. Rapid Commun. Mass Spectrom. 2016, 30 (13), 1540–1548. Berry, K. A. Z.; Barkley, R. M.; Berry, J. J.; Hankin, J. A.; Hoyes, E.; Brown, J. M.; Murphy, R. C. Anal. Chem. 2017, 89 (1), 916–921. Lopes, E. M. C.; Carreira, R. C.; Agripino, D. G.; Torres, L. M. B.; Cordeiro, I.; Bolzani, V. da S.; Dietrich, S. M. de C.; Young, M. C. 
M. Rev. Bras. Farmacogn. 2008, 18, 655–660. Carnevale Neto, F.; Pilon, A. C.; Selegato, D. M.; Freire, R. T.; Gu, H.; Raftery, D.; Lopes, N. P.; Castro-Gamboa, I. Front. Mol. Biosci. 2016, 3 (59), DOI: 10.3389/fmolb.2016.00059. Gómez-Romero, M.; Segura-Carretero, A.; Fernández-Gutiérrez, A. Phytochemistry 2010, 71 (16), 1848–1864. Guillarme, D.; Nguyen, D. T. T.; Rudaz, S.; Veuthey, J.-L. Eur. J. Pharm. Biopharm. 2008, 68 (2), 430–440. Kessner, D.; Chambers, M.; Burke, R.; Agus, D.; Mallick, P. Bioinformatics 2008, 24 (21), 2534– 2536. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N. S.; Wang, J. T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Genome Res 2003, 13 (11), 2498–2504. Frank, A. M.; Bandeira, N.; Shen, Z.; Tanner, S.; Briggs, S. P.; Smith, R. D.; Pevzner, P. A. J. Proteome Res. 2008, 7 (1), 113–122. Ferreres, F.; Llorach, R.; Gil‐Izquierdo, A. J Mass Spec 2004, 39 (3), 312–321. Yang, W.; Qiao, X.; Bo, T.; Wang, Q.; Guo, D.; Ye, M. Rapid Commun. Mass Spectrom. 2014, 28 (4), 385–395. Ablajan, K.; Abliz, Z.; Shang, X.-Y.; He, J.-M.; Zhang, R.-P.; Shi, J.-G. J. Mass Spectrom. 2006, 41 (3), 352–360. Ma, Y. L.; Li, Q. M.; Van den Heuvel, H.; Claeys, M. Rapid Commun Mass Spec 1997, 11 (12), 1357–1364.


FIGURE CAPTIONS

Scheme 1. Workflow demonstrating the combination of MN and gas-phase fragmentation reactions for the annotation of flavonoid-O-glycoconjugates. In (1), the flavonoid content of six plant species of the Chrysobalanaceae family was investigated by LC-ESI-QTOF MS/MS analysis. The data were clustered using the GNPS platform (right side), in parallel with the systematic annotation of the flavonoids according to their main MS/MS dissociative reactions (neutral losses, relative abundance, ratio and diagnostic ions; left side). In (2), the diagnostic product ions (left side) and the chemotaxonomic information of the Chrysobalanaceae species were incorporated into the MN topology to propagate the annotation of flavonoids. In (3), the resulting layout-driven MN, integrating the taxonomical and chemical attributes, guided the rapid and accurate annotation of flavonoid glycoconjugates.

Figure 1. Process of node and cluster formation by the GNPS platform. (A) The degree of similarity between intra- or inter-sample spectra is evaluated by the MScluster algorithm, and similar spectra are grouped into a consensus spectrum. (B) Clusters are formed when a pre-established degree of similarity between consensus spectra is reached. The edges represent the degree of similarity between the nodes.

Figure 2. Dispersion of the retention time mean (RTmean) and standard deviation error (STDerror) of the flavonoid nodes across the molecular networks. (A) and (C) show the RTmean in MNs using (+) and (-) ESI modes, respectively; (B) and (D) show the STDerror using (+) and (-) ESI modes, respectively. Red circles indicate the flavonoids; blue diamonds indicate other detected metabolites.

Figure 3. MN applied to flavonoid glycoconjugates from six Chrysobalanaceae species using LC-(+)ESI-MS/MS data (CE = 15, 30 and 45 eV). (A) Layout representing the collision energy used in the MS/MS experiments as node size, the precursor ion as the text within each node, and the degree of similarity among the nodes as edge thickness. (B) Layout correlating the aglycone nature (shape), type and number of attached sugars (color and size), and methoxylation degree (border width). (C) Layout representing the distribution of flavonoids across four Chrysobalanaceae genera.

Figure 4. MN applied to glycosylated flavonoids from six Chrysobalanaceae species using LC-(-)ESI-MS/MS data (CE = 15, 25 and 35 eV). (A) Layout representing the collision energy used in the MS/MS experiments as node size, the precursor ion as the text within each node, and the degree of similarity among the nodes as edge thickness. (B) Layout correlating the aglycone nature (shape), type and number of attached sugars (color and size), and methoxylation degree (border width). (C) Layout representing the distribution of flavonoids across four Chrysobalanaceae genera.

Figure 5. MN applied to glycosylated flavonoids from six Chrysobalanaceae species using LC-(-)ESI-MS/MS data acquired at two different collision energies: (A) 15 eV, for the determination of the glycan linkage (1→6 represented by a red lozenge and 1→2 by a blue hexagon); (B) 35 eV, for the determination of the glycosylation positions (3-O yellow, 7-O blue, or 3,7-O green).

Figure 6. Propagation of flavonoid glycoconjugate annotation using gas-phase fragmentation reactions and molecular networking. Nodes 6411 (m/z 461, yellow node) and 7833 (m/z 637, yellow node) were annotated based on the characteristics of the neighboring nodes (red nodes, center of the figure) and of the other layouts (upper right: species; middle right: types of glycosides; lower right: number and position of glycosides), together with spectral analysis of neutral losses, diagnostic ions, ion intensities and ion ratios (the red spectra on the left and the yellow spectra at the top and bottom correspond to the red and yellow nodes, respectively).
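The cluster-formation step described in the caption of Figure 1 can be illustrated with a minimal sketch. It assumes that each node is a consensus spectrum given as (m/z, intensity) pairs and that two nodes are linked when a similarity score reaches a threshold. The plain binned cosine used here is a simplified stand-in for the spectral scoring performed by GNPS, and the function names, the 1 u bin width, the 0.7 threshold and the three example spectra are hypothetical values chosen only for illustration.

```python
import math
from itertools import combinations


def binned(spectrum, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    bins = {}
    for mz, intensity in spectrum:
        key = int(mz // bin_width)
        bins[key] = bins.get(key, 0.0) + intensity
    return bins


def cosine(spec_a, spec_b, bin_width=1.0):
    """Plain cosine similarity between two binned spectra (0 = unrelated, 1 = identical)."""
    a, b = binned(spec_a, bin_width), binned(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_edges(nodes, threshold=0.7):
    """Return (node_a, node_b, score) for every pair scoring at or above the threshold."""
    edges = []
    for (id_a, spec_a), (id_b, spec_b) in combinations(sorted(nodes.items()), 2):
        score = cosine(spec_a, spec_b)
        if score >= threshold:
            edges.append((id_a, id_b, round(score, 3)))
    return edges


# Hypothetical consensus spectra: n1 and n2 share their product ions, n3 does not.
nodes = {
    "n1": [(303.05, 100.0), (465.10, 40.0)],
    "n2": [(303.05, 90.0), (465.10, 50.0)],
    "n3": [(153.02, 100.0), (229.05, 30.0)],
}
edges = build_edges(nodes)
print(edges)
```

In this toy example only n1 and n2 are connected, and the stored score plays the role of the edge thickness used in the layouts of Figures 3 and 4.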


Table 1. Molecular networks generated using MS2 data from six species of Chrysobalanaceae. Each collision energy created one MN using ESI positive and negative modes.

                             Collision energy in ESI (+)         Collision energy in ESI (-)
Attributes                 15 eV   30 eV   45 eV  All CE       15 eV   25 eV   35 eV  All CE
Total MS2 Spectra*          4084    6390    1036   13594        5175    4648    3870    8956
Total Nodes*                 350     404     290     617         321     335     321     584
Flavonoid MS2 Spectra       3007     987     684    1820        1248     915     728    2210
Flavonoid Nodes              232      74      55      77          71      78      65      82
GNPS Candidates**         20 (26% of cluster – 77 nodes)      21 (25% of cluster – 82 nodes)
GNPS Misannotation        70% (14 out of 20 candidates)       86% (18 out of 21 candidates)

*excluding self-loop nodes; **spectral matching for flavonoids using GNPS databases (https://gnps.ucsd.edu/ProteoSAFe/libraries.jsp).
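The percentages in the last two rows of Table 1 follow directly from the counts in the table, as the short sketch below recomputes. The variable and function names are hypothetical; the counts are those of the "All CE" networks. Note that the ESI (-) candidate share evaluates to about 25.6%, which the table reports as 25%.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# (count, reference total) pairs copied from Table 1, "All CE" networks.
gnps_candidates = {"ESI(+)": (20, 77), "ESI(-)": (21, 82)}    # candidates vs flavonoid nodes
gnps_misannotated = {"ESI(+)": (14, 20), "ESI(-)": (18, 21)}  # misannotated vs candidates

for mode in ("ESI(+)", "ESI(-)"):
    share = percent(*gnps_candidates[mode])
    wrong = percent(*gnps_misannotated[mode])
    print(f"{mode}: candidates = {share:.1f}% of flavonoid nodes, misannotated = {wrong:.1f}%")
```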


Table 2. Key product ions from tandem mass spectrometry used for the determination of isomeric flavonoid glycoconjugates. All proposed neutral losses, diagnostic ions, ion intensities and ratios were based on the MS2 flavonoid literature.10-15,18-24,58-61

Flavonol Type          Diagnostic Product Ion          Additional Information
sugar type             [M-sugar+H]+, [M-sugar-H]-      -132, -146, -162 u*
number of sugar        Yi+ (i = 1-3), Yi-              mono = -132, -146, or -162 u
                                                       di = 2 x (-132, -146, or -162 u)
                                                       tri = 3 x
position of sugar      [Yi-H]-•/Yi-                    O-3 = ratio 3-4 (int.**)
                                                       O-7 = ratio 1 (int.**)
di-O or O-di           di-O                            Y0 < Y1
                       O-di                            Y0 > Y1
3,7-di-O               [Yi-2H]-
interglycosidic bond   int. of Yi-                     (1 → 2) = 13-89% (base peak)
                                                       (1 → 6) < 13% (base peak)
                       Yi*+                            (1 → 6) = internal glucose residue
methoxy substituent    [M-CH3+H]+•                     -15 u*
                       [M-CH4+H]+                      -16 u*
                       [M-CH3OH+H]+                    -32 u*
aglycone               0,2A+, 1,3A+, 0,2B+             Retro-Diels-Alder reactions

* neutral loss; ** intensity.
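Several of the rules in Table 2 can be written as simple checks, sketched below under stated assumptions. The neutral-loss values and the intensity thresholds are taken from the table; the sugar-class names (pentose, deoxyhexose, hexose) are the usual assignments for losses of 132, 146 and 162 u and do not appear in the table. The function names, the 0.5 u tolerance and the example m/z values are hypothetical, and the di-O versus O-di rule follows one reading of the flattened table rows.

```python
# Nominal neutral losses listed in Table 2; the class names are the usual
# assignments for these masses and are an assumption, not part of the table.
SUGAR_LOSSES = {132: "pentose", 146: "deoxyhexose", 162: "hexose"}


def sequential_sugar_losses(precursor_mz, fragment_mzs, tol=0.5):
    """Walk the fragments from high to low m/z and label sugar-sized steps."""
    steps = []
    current = precursor_mz
    for mz in sorted(fragment_mzs, reverse=True):
        loss = current - mz
        for nominal, sugar in SUGAR_LOSSES.items():
            if abs(loss - nominal) <= tol:
                steps.append((mz, sugar))
                current = mz  # the next loss is measured from this Y ion
                break
    return steps


def interglycosidic_linkage(y_rel_intensity):
    """Apply the Table 2 thresholds (percent of base peak) to a Yi- ion."""
    if y_rel_intensity < 13:
        return "1->6"
    if y_rel_intensity <= 89:
        return "1->2"
    return None  # outside the ranges given in Table 2


def glycosylation_pattern(y0_intensity, y1_intensity):
    """di-O-glycoside if Y0 < Y1, O-diglycoside if Y0 > Y1 (reading of Table 2)."""
    if y0_intensity < y1_intensity:
        return "di-O"
    if y0_intensity > y1_intensity:
        return "O-di"
    return None


# Hypothetical [M+H]+ precursor at m/z 611.16 with product ions at m/z 465.10 and 303.05.
example = sequential_sugar_losses(611.16, [303.05, 465.10])
print(example)
print(interglycosidic_linkage(40.0), glycosylation_pattern(100.0, 20.0))
```

The example precursor loses 146 u and then 162 u, so the sketch labels two consecutive sugar losses, which corresponds to the "di" case in the "number of sugar" row.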
