Bioactive Natural Products Prioritization Using Massive Multi


Article

Bioactive natural products prioritization using massive multi-informational molecular networks Florent Olivon, Pierre-Marie Allard, Alexey Koval, Davide Righi, Gregory Genta-Jouve, Johan Neyts, Cécile Apel, Christophe Pannecouque, Louis-Félix Nothias, Xavier CACHET, Laurence Marcourt, Fanny Roussi, Vladimir L. Katanaev, David Touboul, Jean-Luc Wolfender, and Marc Litaudon ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.7b00413 • Publication Date (Web): 22 Aug 2017 Downloaded from http://pubs.acs.org on August 22, 2017


ACS Chemical Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Bioactive natural products prioritization using massive multi-informational molecular networks

Florent Olivon,†,∆ Pierre-Marie Allard,‡,∆ Alexey Koval,§ Davide Righi,‡ Gregory Genta-Jouve,¶‖ Johan Neyts,⊥ Cécile Apel,† Christophe Pannecouque,⊥ Louis-Félix Nothias,† Xavier Cachet,# Laurence Marcourt,‡ Fanny Roussi,† Vladimir L. Katanaev,§,◊ David Touboul,† Jean-Luc Wolfender,*,‡ Marc Litaudon*,†

†Institut de Chimie des Substances Naturelles, CNRS-ICSN, UPR 2301, Université Paris-Saclay, 91198, Gif-sur-Yvette, France.
‡School of Pharmaceutical Sciences, University of Geneva, University of Lausanne, CMU – Rue Michel Servet 1, 1211 Geneva 11, Switzerland.
§Department of Pharmacology and Toxicology, University of Lausanne, CH-1005 Lausanne, Switzerland.
¶Equipe C-TAC, UMR CNRS 8638 COMETE - Université Paris Descartes, 4 avenue de l’Observatoire, 75006 Paris, France.
⊥Laboratory for Virology and Experimental Chemotherapy, Rega Institute for Medical Research, KU Leuven, Minderbroedersstraat 10, B-3000 Leuven, Belgium.
#Laboratoire de Pharmacognosie, UMR CNRS 8638 COMETE - Université Paris Descartes, 4 avenue de l’Observatoire, 75006 Paris, France.
◊School of Biomedicine, Far Eastern Federal University, Vladivostok, Russian Federation.


ABSTRACT: Natural products represent an inexhaustible source of novel therapeutic agents. Their complex and constrained three-dimensional structures endow these molecules with exceptional biological properties, thereby giving them a major role in drug discovery programs. However, the search for new bioactive metabolites is hampered by the chemical complexity of the biological matrices in which they are found. The purification of single constituents from such matrices requires so much work that it should ideally be performed only on molecules of high potential value (i.e., chemical novelty and biological activity). Recent bioinformatics approaches based on state-of-the-art mass spectrometry metabolite profiling methods are beginning to address the complex task of chemically identifying individual metabolites within complex mixtures. In parallel to these developments, however, methods providing information on the bioactivity potential of natural products prior to their isolation are still lacking, although they are of key interest for restricting isolation efforts to valuable natural products. In the present investigation, we propose an integrated analysis strategy for bioactive natural products prioritization. Our approach uses massive molecular networks embedding various informational layers (bioactivity and taxonomical data) to highlight potentially bioactive scaffolds within the chemical diversity of crude extract collections. We exemplify this workflow by targeting the isolation of predicted active and non-active metabolites from two botanical sources (Bocquillonia nervosa and Neoguillauminia cleopatra) against two biological targets (Wnt signaling pathway and chikungunya virus replication). Finally, the detection and isolation of a daphnane diterpene orthoester and four 12-deoxyphorbols inhibiting the Wnt signaling pathway and exhibiting potent antiviral activities against CHIKV are detailed. Combined with efficient metabolite annotation tools, this bioactive natural products prioritization pipeline proves to be efficient. Implementation of this approach in drug discovery programs based on natural extract screening should speed up and rationalize the isolation of bioactive natural products.


Over the last two decades, the relative lack of interest in natural substances by large pharmaceutical companies, resulting in a constant decline of FDA-approved new molecular entities, has been balanced by the dynamism of small firms.1 While the discovery of new lead compounds from natural sources has faced important challenges,2 a significant number of natural product (NP) drugs are still in development.3 NP derivatives and synthetic drugs inspired by natural pharmacophores still represent over 50% of all drugs in clinical use.3,4 Despite this success, industry is reluctant to pursue NP-based drug discovery programs, mainly because of the complexity of working with NP extracts. The study of a large number of plant extracts by high-throughput screening (HTS), followed by the isolation and characterization of bioactive constituents, is highly challenging. In addition, the bioguided purification process, which often leads to the reisolation of known structures, is laborious, time-consuming and costly. The problem of reisolating known structures is now being addressed by a number of novel bioinformatics tools that efficiently process and analyze mass spectrometry (MS) and nuclear magnetic resonance (NMR) data acquired from complex extracts, and assist the process of dereplication.5 These tools, combined with ad hoc databases, provide structural information on known compounds prior to their isolation and may even allow the annotation of previously unreported metabolites by in silico prediction.6,7 Having such data at the early stages of an investigation makes it possible to target the isolation of novel or desired scaffolds only. Nevertheless, although current dereplication tools can identify NPs upstream of the isolation process, effective strategies for the early detection of bioactive NPs within crude extracts are still crucially lacking.
In this direction, a few workflows, based on either NMR or MS metabolite profiling, have been developed to correlate the chemical composition of extracts with the results of biological tests. For example, one elegant way of highlighting bioactive metabolites directly within crude extracts is based on NMR metabolite profiling using DANS (Differential Analysis by 2D NMR Spectrometry). This approach has been used to highlight small signaling molecules differentially expressed in Caenorhabditis elegans.8 Here, the use of NMR data offers direct structural insight into the nature of the metabolites but also limits the sensitivity of the approach. In addition, this methodology is hardly applicable to large collections of NP extracts. Recently, a compound activity mapping platform, capable of predicting a biological synthetic fingerprint for individual metabolites in NP libraries, has been developed.9 This approach is based on the combination of high-content screening data with MS1 data and appears to be a promising tool for spotting bioactive compounds within mixtures and predicting their mode of action. The strategy requires the acquisition of hundreds of parameters for the phenotypic profiling of each extract using dedicated image-based screening platforms optimized for NP drug discovery, thus limiting its application to medium-throughput screens.10 Furthermore, since fragmentation data are not taken into account, the structural information acquired when analyzing extract libraries does not make use of the full capabilities of MS platforms. To further identify compounds in MS-based metabolite profiling workflows, the molecular networking (MN) approach has emerged as a powerful tool for processing tandem mass spectrometry (MS2) fragmentation data.11 The MN concept is based on the organization and visualization of tandem MS data through a spectral similarity map, revealing the presence of homologous MS2 fragmentations.
As structurally related compounds share similar fragmentation spectra, their nodes tend to gather together and create clusters of analogues. MN allows visualizing large data sets and offers the possibility to map additional information over the networks.12,13 Several strategies have recently been developed based on the combination of large sets of


integrated biological and chemical data, with the aim of detecting new compounds of therapeutic interest more efficiently. For example, to evaluate the metabolic potential of new strains of cyanobacteria, Tamagnini and colleagues14 demonstrated that combining a genetic analysis to detect strains possessing PKS and NRPS genes with the MN approach was an effective way to target the isolation of new compounds endowed with promising therapeutic properties. In this study, however, a direct correlation between bioactivity and the chemical composition of the extracts was not performed. Recently, Gerwick and colleagues also showed that combining bioassay screening results with MN was a powerful tool for the discovery of molecules possessing both a new structure and a desired biological activity.15 This approach allowed the targeted isolation of a new cytotoxic natural product, samoamide A, from a cluster representative of a novel cyanobacterial chemotype. This latter study, however, relied on a limited collection of cyanobacteria (only 10 species of the Symploca genus were investigated), and the approach required a preliminary fractionation step to be performed on the samples. The present work aims to explore and highlight the link between spectral molecular networks and the bioactivity of NP extracts. By combining structural information afforded by LC-MS2 analyses with bioassay results acquired over natural extract collections, correlations between bioactivities and specific chemical scaffolds are expected to be made prior to any fractionation step. When screening NPs in extracts for a specific bioactivity, one expects to find families of structurally related compounds exerting similar bioactivities, modulated by their minor structural differences (the structure-activity relationship concept).
Thus, constituents responsible for the activity of a group of extracts in a particular bioassay should emerge when exploring a massive MN that exhibits a vast diversity of structures and has bioactivity results and taxonomical information mapped on top of it. Herein, we propose a bioactive NP prioritization approach based on merging taxonomical and bioassay data over a massive MN acquired from an NP extract library to generate multi-informative molecular maps, which then guide the isolation of compounds of interest. To evaluate the reliability and relevance of the envisioned strategy, a set of 292 Euphorbiaceae extracts from New Caledonia was selected and screened against both the oncogenic Wnt signaling pathway and chikungunya virus (CHIKV) replication. The bioactive NP prioritization workflow led to the selection of various molecular features from two plant species (Neoguillauminia cleopatra and Bocquillonia nervosa), which were hypothesized to be active against the screened targets. The MS-guided isolation of the selected features led to the characterization of novel and known NPs, which were confirmed to have inhibitory activities in the micromolar to nanomolar range. This approach allowed the isolation and structure determination efforts to be efficiently targeted at compounds with a high bioactivity potential, without requiring additional iterative bio-guided fractionation steps.
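The spectral similarity map underlying the MN concept can be illustrated with a short sketch. The code below is not the networking algorithm used in this work; it is a simplified binned cosine score, and the bin width (0.5), the similarity threshold (0.7), the function names and the example spectra are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.5):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.5):
    """Cosine score between two binned MS2 spectra (0 = unrelated, 1 = identical)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(spectra, threshold=0.7):
    """Connect every pair of spectra whose similarity reaches the threshold."""
    names = sorted(spectra)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = cosine_similarity(spectra[first], spectra[second])
            if score >= threshold:
                edges.append((first, second, score))
    return edges


# Hypothetical spectra: node_a and node_b share two fragments, node_c shares none.
example_spectra = {
    "node_a": [(150.1, 100.0), (250.2, 40.0), (400.3, 10.0)],
    "node_b": [(150.1, 90.0), (250.2, 50.0), (428.3, 12.0)],
    "node_c": [(120.1, 80.0), (205.1, 30.0)],
}
edges = build_edges(example_spectra)
```

In this toy case only node_a and node_b are linked, which is the behavior expected for structurally related compounds sharing homologous fragmentations.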


RESULTS AND DISCUSSION

Extracts library. A unique collection of 292 extracts from 107 New Caledonian Euphorbiaceae species (93% endemic) was selected as a starting point for this study. The extracts were prepared from leaves (47%), bark or twigs (48%), whole plants (3%) and fruits (2%), and subjected to polyamide filtration prior to LC-MS2 analyses and biological evaluations. This botanical family was chosen for its large structural diversity, characterized by unique diterpenoids of various skeletons endowed with a wide range of biological activities.16-21

Bioassays. The collection was screened in a CHIKV cell-based assay and against the oncogenic Wnt signaling pathway. The first assay was selected because our previous work demonstrated that Euphorbiaceae produce potent CHIKV replication inhibitors.22-24 In contrast, the collection had not previously been screened in the Wnt signaling assay. Over-activation of this pathway is known to lead to the formation and proliferation of many cancers, including triple-negative breast cancer, which is the deadliest form of breast tumor. There are still no approved therapies focusing on this pathway, making it a promising target for drug discovery.25 All extracts were screened in a CHIKV replication inhibition assay, affording EC50 values ranging from 0.02 to 100 µg.mL-1. For the Wnt bioassay, IC50 values were determined from two replicates of five concentrations: 100, 33, 10, 3 and 0.3 µg.mL-1. During the bioassays, some extracts showed strong cytotoxicity, which prevented determination of their IC50.

Organization and annotation of chemical data. All extracts were profiled by UHPLC-HRMS using a generic linear chromatographic gradient. In parallel, automated data-dependent fragmentation data were acquired using an Orbitrap MS analyzer. The LC-MS2 data were organized as an MN to map the chemical space of the Euphorbiaceae extract library.
More than 1.8 million spectra were organized as 88,687 nodes, themselves grouped in 7,840 clusters (Figure 1A). Additionally, the MS2 spectra constituting the whole MN were annotated against an in silico fragmented database of all NP structures previously reported to occur in Euphorbiaceae (a subset of the Dictionary of Natural Products).26 The spectral comparison was made either using the parent mass as a filter for direct matches or using the variable dereplication mode to link experimental spectra to possible structural analogues, according to a previously described workflow.27

Generation of the multi-informational molecular map. All data acquired at the chemical, biological and taxonomical levels were then merged into a multi-informative molecular map constituting the core of the established strategy.

- Bioassay results mapping. The basic idea behind this prioritization process relies on exploring the networks and comparing the chemical compositions of bioactive and inactive extracts. The results of the biological screening of extracts, reported as IC50 or EC50, were first classified according to their level of activity: 0 to 5, 5 to 20, 20 to 50 and above 50 µg.mL-1, plus not determined, for the Wnt assay; and 0 to 1, 1 to 3, 3 to 10, 10 to 50 and 50 to 100 µg.mL-1, plus not determined, for the CHIKV assay. Each activity level was assigned a specific color tag (details are given in the caption of Figure 1) and this color mapping was applied to all nodes in the MN. These stratification levels were set to group at least 10 extracts within each category. The establishment of such bioactivity levels can be modulated but


a division into various activity levels appears to be more informative than a simple active/non-active categorization. The color mapping allows visualization of a few clusters clearly restricted to ions coming from bioactive extracts, suggesting common bioactive scaffolds. Various cluster classes could then be observed (see Figure 1A and Figures S52 and S53). Some clusters show a single color (for example, see Figure S50) and thus correspond to metabolite families found in extracts belonging to a single activity level. Others show various colors and thus correspond to families of metabolites found in extracts belonging to several activity levels. Clusters displaying mainly red, orange or yellow colors could thus be easily spotted within the whole MN by rapid visual inspection. In order to filter clusters of interest more precisely, subnetworks of potential Wnt pathway inhibitors and potential CHIKV inhibitors were created by restricting the original MN to nodes presenting at least two scans of an ion belonging to an extract possessing potent bioactivity (IC50 < 50 µg.mL-1 in the Wnt assay and EC50 < 10 µg.mL-1 in the CHIKV assay), and then applying a topological filter to remove nodes not having at least 4 neighbors within a distance of 4. This allowed the original number of clusters to be reduced from 7,840 to 192 (2.45% of the original MN) in the case of Wnt (Figure S52), and to 380 (4.85% of the original MN) in the case of CHIKV (Figure S54). Interestingly, some clusters were common to both subnetworks, potentially indicating metabolites active in both bioassays (Figures S48 and S50).

- Taxonomic information mapping. Additionally, these subnetworks were mapped with the taxonomical information of the extract library. Extracts were grouped at the genus or species level, as well as by plant part. Each group was assigned a specific color tag.
This additional layer was used to check whether specific clusters of nodes were restricted to given species, genera or plant parts. When similar features from a network are shared between various extracts that are all mapped with high bioactivities, it can be hypothesized that their co-occurrence makes them potential bioactive candidates. In other words, the higher the number of active extracts within a cluster, the greater the probability that these compounds are active. On the other hand, if the nodes are found in a single bioactive extract, it can be hypothesized that these compounds are possibly responsible for the observed bioactivity, as they can be differentiated from ubiquitous compounds present in the extract.
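The subnetwork filtering described above (at least two scans from potent extracts, then at least 4 neighbors within a distance of 4) can be sketched as follows. This is an illustrative reading of the procedure, not the implementation used in this study: the data structures and function names are hypothetical, scans are assumed to be summed over all potent extracts, and the topological filter is assumed to be applied once, on the already restricted network.

```python
from collections import deque


def potent_nodes(node_scans, extract_activity, cutoff, min_scans=2):
    """Keep nodes with at least `min_scans` scans coming from potent extracts.

    node_scans: {node: {extract: scan_count}}
    extract_activity: {extract: IC50 or EC50 in µg/mL, or None if not determined}
    """
    kept = set()
    for node, scans in node_scans.items():
        total = 0
        for extract, count in scans.items():
            value = extract_activity.get(extract)
            if value is not None and value < cutoff:
                total += count
        if total >= min_scans:
            kept.add(node)
    return kept


def neighbors_within(adjacency, start, max_distance, allowed):
    """Count allowed nodes reachable from `start` in at most `max_distance` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_distance:
            continue
        for nxt in adjacency.get(node, ()):
            if nxt in allowed and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return len(seen) - 1  # do not count the start node itself


def filter_subnetwork(adjacency, node_scans, extract_activity, cutoff,
                      min_neighbors=4, max_distance=4):
    """Bioactivity restriction followed by a single-pass topological filter."""
    kept = potent_nodes(node_scans, extract_activity, cutoff)
    return {
        node for node in kept
        if neighbors_within(adjacency, node, max_distance, kept) >= min_neighbors
    }
```

With the thresholds quoted in the text, `cutoff` would be 50 for the Wnt assay and 10 for the CHIKV assay; small clusters and nodes seen only in inactive extracts drop out, which is how the 7,840 clusters shrink to a few hundred.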


[Figure 1 graphic. Panel A: Massive Molecular Network (LC-MS2 profiling of the entire collection). Panel B: Bioactivity mapping - Wnt; color legend for Wnt IC50 of the extracts: 0-5, 5-20, 20-50 and > 50 µg.mL-1, and n.d. Panel C: Bioactivity mapping - CHIKV; color legend for CHIKV EC50 of the extracts: 0-1, 1-3, 3-10, 10-50 and 50-100 µg.mL-1, and n.d. Clusters MN1 and MNA are labeled. Taxonomical mapping legend: ions from Bocquillonia nervosa leaves extract, ions from Neoguillauminia cleopatra bark extract, ions from Codiaeum spp. extracts, ions from other species. Panel D: B. nervosa fraction-enriched MN1 cluster (LC-MS2 of 25 fractions; fraction mapping F11-F19) and MN-guided purification with targeted isolation of 4 compounds and their Wnt IC50 values (1: 1.14 µM; 2: 0.034 µM; 3: 27.8 µM; 4: 1.18 µM). Compound 7: CHIKV EC50 = 17.7 µM.]

Figure 1. A. Generation of a Massive Molecular Network from 292 LC-MS2 analyses of New Caledonian Euphorbiaceae extracts, B and C. Bioactivity mapping allows the detection of bioactive clusters (top) and taxonomical mapping is used for the selection of the Bocquillonia nervosa and Neoguillauminia cleopatra extracts (bottom), D. Implementation of a ‘fraction’ mapping for Molecular Networking-guided purification of targeted bioactive compounds.
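The activity levels used for the color mapping in Figure 1 can be expressed as a simple binning step. The sketch below is only an illustration: the treatment of boundary values (a value equal to an edge is placed in the lower bin) is an assumption, since the text does not specify it, and no colors are assigned because the color code is not stated here.

```python
from bisect import bisect_left

# Upper edges of the activity levels, in µg/mL, as listed in the text.
WNT_EDGES = [5, 20, 50]
WNT_LABELS = ["0-5", "5-20", "20-50", ">50"]
CHIKV_EDGES = [1, 3, 10, 50]
CHIKV_LABELS = ["0-1", "1-3", "3-10", "10-50", "50-100"]


def activity_level(value, edges, labels):
    """Return the activity-level label for an extract IC50/EC50 (None = not determined)."""
    if value is None:
        return "n.d."
    return labels[bisect_left(edges, value)]


# Extract values quoted in the text: B. nervosa leaves (Wnt IC50 = 2.6 µg/mL)
# and N. cleopatra bark (CHIKV EC50 = 1.2 µg/mL).
b_nervosa_level = activity_level(2.6, WNT_EDGES, WNT_LABELS)
n_cleopatra_level = activity_level(1.2, CHIKV_EDGES, CHIKV_LABELS)
```

Each node can then inherit the level of the extract(s) in which its ion was detected, which is what makes single-color and mixed-color clusters distinguishable by eye.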

Prioritization and targeted isolation of potential Wnt pathway inhibitors. After mapping the full MN with the Wnt assay results, generating a subnetwork (Supplementary Figure S52) as described above and applying the taxonomical mapping, a visual inspection of the filtered clusters allowed 21 clusters of potential Wnt pathway inhibitors to be


focused on (Supplementary Figure S53). Detailed investigation of these clusters revealed four clusters, named MN1 to MN4, combining various nodes mainly corresponding to features present in Bocquillonia spp. extracts, and more particularly in the leaves extract of Bocquillonia nervosa (Figure 1B and Supplementary Figure S48). Further inspection of these four clusters and their associated LC-MS2 traces revealed that ions from MN1 corresponded to sodium adducts [M+Na]+ of the same molecular species present as [M+H]+ or [M-H2O+H]+ cations in MN2-4 (Supplementary Figure S49). Among these four clusters, MN1 was the most representative of the Bocquillonia nervosa leaves EtOAc extract, given that it included most of the compounds displayed in MN2-4. Based on these observations, this extract, exhibiting an IC50 of 2.6 µg.mL-1, was selected for further chemical investigation and only ions from MN1 were targeted. To provide insight into the structural characteristics of the compounds of MN1-4, MS2 spectra of previously isolated Euphorbiaceae diterpenoids were uploaded into the massive molecular network. One standard matched ions from MN2 as an analogue (see Supplementary Figure S48),28 suggesting that molecules from this cluster potentially shared the 12-deoxyphorbol scaffold. The crude extract was subjected to normal-phase flash chromatography to yield 25 fractions of increasing polarity. The LC-MS2 profiles of all fractions were combined with the data of the initial massive MN, and mapping was done according to fraction numbers (Figure 1D). Ions from MN1 were distributed in fractions F11-F19. Based on peak intensities (Supplementary Figures S44 to S47), the four major compounds were selected for targeted isolation. The MS-guided purification of two new (2 and 3) and two known (1 and 4) 12-deoxyphorbol esters was then achieved (the structural determination of 1-4 is detailed in Supplementary S5 and S6).
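The relationship noted above between the sodium adducts of MN1 and the protonated or dehydrated cations of MN2-4 comes down to fixed m/z offsets. The following sketch shows how such adduct pairs could be recognized; the function names, the 0.01 tolerance and the neutral mass of 500.0 used as an example are assumptions for illustration, not values from this study.

```python
# Monoisotopic masses in Da (standard values).
PROTON = 1.007276        # H+ (hydrogen atom minus one electron)
SODIUM_CATION = 22.989221  # Na+ (sodium atom minus one electron)
WATER = 18.010565        # H2O

ADDUCT_SHIFTS = {
    "[M+H]+": PROTON,
    "[M+Na]+": SODIUM_CATION,
    "[M-H2O+H]+": PROTON - WATER,
}


def adduct_mz(neutral_mass, adduct):
    """m/z of a singly charged adduct for a given neutral monoisotopic mass."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def same_species(mz_a, adduct_a, mz_b, adduct_b, tolerance=0.01):
    """True if two ions point back to the same neutral mass within the tolerance."""
    mass_a = mz_a - ADDUCT_SHIFTS[adduct_a]
    mass_b = mz_b - ADDUCT_SHIFTS[adduct_b]
    return abs(mass_a - mass_b) <= tolerance


# Hypothetical neutral mass, used only to show the offsets.
sodiated = adduct_mz(500.0, "[M+Na]+")
protonated = adduct_mz(500.0, "[M+H]+")
dehydrated = adduct_mz(500.0, "[M-H2O+H]+")
```

The [M+Na]+ and [M+H]+ ions of one compound are separated by about 21.98 m/z units, and the [M+H]+ and [M-H2O+H]+ ions by about 18.01, which is the pattern that links nodes across the four clusters.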
Compounds 2-4 further allowed the annotation of nodes corresponding to their protonated forms within clusters MN2, MN3 and MN4 (Supplementary Figure S49). All MS2 spectra generated from [M+Na]+ ions of compounds 1-4 exhibited a neutral loss of 256 atomic mass units, corresponding to the loss of the C16 ester chain through a McLafferty rearrangement, followed by the loss of a molecule of water. The characteristic product ion pairs m/z 369 → 351 (scaffold 1), m/z 385 → 367 (scaffold 2), m/z 403 → 385 (scaffold 3) and m/z 353 → 335 (scaffold 4) were also decisive for spectral matching, as several 12-deoxyphorbols with different ester chains were distinguished in MN1 (Supplementary Figure S51). In addition, it should be noted that good compatibility between datasets obtained from different instruments was observed. Even though the crude extract collection and the Bocquillonia fractions were analyzed on an Orbitrap and a Q-ToF instrument, respectively, no complications hampered the generation and analysis of the merged MN. This overall compatibility between experimental data acquired using different MS systems implies that these massive MNs can serve as a common base for inter-laboratory collaborative projects.

Table 1: Wnt Signaling Inhibition in HTB-19 Cells of Compounds 1-6; Antimetabolic and Antiviral Activities of Compounds 1-7 in Vero Cells against CHIKV

Compound       Wnt IC50 a        CC50 (Vero) a     CHIKV EC50 a      SI b
1              1.14 ± 0.31       12.7 ± 12.0 c     0.13 ± 0.03       98
2              0.034 ± 0.023     4.85 ± 4.53 c     0.09 ± 0.05       54
3              27.8 ± 15.1       56.1 ± 12.0       2.14 ± 0.3        26
4              1.18 ± 0.24       30.0 ± 28.4       0.02 ± 0.001      1500
5              > 225             101 c             > 225             > 0.5
6              > 290             > 291             4.23 ± 0.58       > 69
7              nd                35.1 ± 1.44       17.7 ± 0.8        2
Chloroquine                      89 ± 28           11 ± 7            8

a Data in µM. Unless otherwise stated, values are the median ± median absolute deviation calculated from at least three independent assays.
b SI, or window for antiviral selectivity, is calculated as CC50 Vero/EC50 CHIKV. nd = not determined.
c CC50 determined microscopically: the adverse effect of compounds 1, 2 and 5 on the host cells was evaluated by means of microscopy scoring.
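As a worked check of footnote b, the selectivity index can be recomputed from the median CC50 and EC50 values of Table 1. The short sketch below uses only values listed in the table; the function and variable names are illustrative.

```python
def selectivity_index(cc50_vero, ec50_chikv):
    """SI = CC50 (Vero) / EC50 (CHIKV), both expressed in the same unit (µM)."""
    return cc50_vero / ec50_chikv


# (CC50 Vero, CHIKV EC50) median values in µM, taken from Table 1.
table_1 = {
    "1": (12.7, 0.13),
    "2": (4.85, 0.09),
    "3": (56.1, 2.14),
    "4": (30.0, 0.02),
    "7": (35.1, 17.7),
    "chloroquine": (89.0, 11.0),
}

si_values = {
    name: round(selectivity_index(cc50, ec50))
    for name, (cc50, ec50) in table_1.items()
}
```

Rounded to the nearest integer, these ratios reproduce the SI column of the table for compounds 1-4, 7 and chloroquine.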

Using a reporter gene methodology, compounds 1-6 were evaluated for their capacity to inhibit the Wnt signaling pathway in HTB-19 cells (Table 1). When compared to the IC50 values of known natural inhibitors,29 compound 2, and to a lesser extent 1 and 4, proved to be highly potent inhibitors, as evidenced by their IC50 values (0.0336 ± 0.0234, 1.14 ± 0.31 and 1.18 ± 0.24 µM, respectively). From a structure-activity relationship point of view, the presence of a 6,7-epoxy function was found to be essential for strong inhibitory activity (2 vs 3), while a 5-hydroxy group played no significant role (1 vs 4). Moreover, in order to further evaluate this approach, two major compounds, identified as aurantiamide acetate (5) and 3,3',4'-tri-O-methylellagic acid (6) (Figure S4), were isolated from fraction F18 during the purification of compound 4. These compounds appeared as members of non-prioritized clusters in the massive network, as evidenced by the color mapping (Figure 2).

[Figure 2 graphic. Color legend for extracts IC50: 0-5, 5-20, 20-50 and > 50 µg.mL-1, and n.d. Targeted bioactive cluster (MN1): compound 1, 1.14 µM; compound 3, 27.8 µM. Inactive cluster: compound 5, > 225 µM; compound 6, > 290 µM.]

Figure 2. Selected bioactive (dominant red color, left box) and inactive (dominant grey color, right box) clusters mapped with the Wnt assay results. Nodes colored in blue were associated with cytotoxic extracts. The IC50 values of compounds 1, 3, 5 and 6 are indicated in each box.

Prioritization and targeted isolation of potential CHIKV inhibitors. Similarly, potential CHIKV inhibitors were targeted by applying the CHIKV bioassay results over the full MN and generating the associated subnetworks, which allowed less than 5% of the original MN to be focused on (Supplementary Figure S53). Tigliane-type diterpenoids, including prostratin analogues, have previously been reported to exhibit strong anti-CHIKV activities.22,24,28 As expected, this CHIKV-bioactivity mapping highlighted the previously selected clusters MN1-4 associated with the B. nervosa extract (Supplementary Figure S50). Compounds 1, 2 and 4 showed extremely potent anti-CHIKV activity with EC50 values of 0.13 ± 0.03 µM (SI = 98), 0.09 ± 0.05 µM (SI = 54) and 0.02 ± 0.001 µM (SI = 1500), respectively, and represent the most potent 12-deoxyphorbols reported so far. The anti-CHIKV activity of this type of compound is not new, but it exemplifies the bioassay versatility of our approach. In addition, since several phorbol esters have previously been reported to exhibit strong anti-HIV activities,22 compounds 1-4 were also investigated for


selective antiviral activity in MT4 cells against HIV-1 (IIIB) and HIV-2 (ROD). Compounds 1, 2 and 4 exhibited extremely potent anti-HIV activities, with IC50 values in the nM or sub-nM range and selectivity indices ranging between 4,000 and 24,000 (see details in S10). In order to detect potential original CHIKV inhibitors not belonging to the deoxyphorbol series, other clusters were also investigated. One of them, MNA (Figure 1C), was found to be of particular interest. Its taxonomical mapping indicated that its constituents were mainly related to the bark extract of Neoguillauminia cleopatra, which exhibited strong anti-CHIKV activity (EC50 = 1.2 µg.mL-1). This species, the only representative of the endemic New Caledonian genus Neoguillauminia, had never been investigated before and was thus likely to yield new NPs. In order to obtain preliminary structural information on the compounds of interest, the MNA cluster was annotated by variable dereplication against an in silico fragmented spectral library taxonomically restricted to Euphorbiaceae metabolites, using a previously developed dereplication pipeline.6 This dereplication mode is a modification-tolerant spectral library search that allows spectral matches with compounds having a different parent ion mass but sharing MS2 spectral similarities. The results of this annotation step indicated the possible presence of highly oxygenated diterpenoids bearing a polyunsaturated side chain (Supplementary Figure S55). From this cluster, two ions (m/z 677.37 and m/z 695.38) were selected based on their intensities. A mass-directed isolation30 of these ions from the crude extract of N. cleopatra bark was achieved and afforded a novel daphnane diterpene orthoester (DDO), named neoguillauminin A (7), corresponding to m/z 695.3789 ([M+H]+: C40H55O10). The structural elucidation of compound 7 is detailed in Supplementary S7.
The compound at m/z 677.3686 ([M+H]+: C40H53O9) could not be isolated, but its occurrence in the cluster and the difference of 18 Da compared to compound 7 suggested a possible epoxidized analogue of neoguillauminin A at position C6-C7, as is commonly observed in the DDO series. Neoguillauminin A (7) exhibited an EC50 value of 17.7 ± 0.8 µM (SI = 2), which is far less potent than the best CHIKV inhibitors but comparable to the EC50 of the reference compound in this bioassay (chloroquine, 11 ± 7 µM, SI = 8). The low EC50 value found for the crude extract might be explained by the probable presence of more potent derivatives, possibly belonging to the selected cluster. As illustrated in the Bocquillonia nervosa example, a marked difference in biological potency can be observed among compounds belonging to a prioritized cluster (compound 2 was 800-fold more active than 3 in the Wnt assay, and compound 4 was 100-fold more active than 3 in the CHIKV assay). Thus, to avoid mistakenly concluding that no potent bioactive compound is present, it is important to purify a representative number of molecules within a given cluster. In the case of MNA, mapping the biological activities of fractions onto the networks, as proposed by Gerwick et al. (15), could have been a helpful tool to refine observations at the cluster scale and to target the isolation of compounds with stronger bioactivities.
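The mass arithmetic behind this reasoning can be checked with a short script. The following is a minimal sketch, not part of the original workflow: it computes the monoisotopic m/z of the two [M+H]+ ion formulas quoted above and compares them with the reported values. The helper names (`parse_formula`, `cation_mz`, `ppm_error`) are hypothetical, and the atomic masses are standard reference values rounded to seven or eight decimals.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858  # electron mass in u


def parse_formula(formula):
    """Parse a simple formula such as 'C40H55O10' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def cation_mz(ion_formula):
    """m/z of a singly charged cation; the formula must already include the extra H."""
    counts = parse_formula(ion_formula)
    return sum(MONO[el] * n for el, n in counts.items()) - ELECTRON


def ppm_error(observed, theoretical):
    """Relative deviation of an observed m/z from the theoretical one, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


mz_7 = cation_mz("C40H55O10")        # neoguillauminin A (7), [M+H]+
mz_analogue = cation_mz("C40H53O9")  # non-isolated analogue, [M+H]+
water = 2 * MONO["H"] + MONO["O"]    # monoisotopic mass of H2O

print(f"compound 7: calc. {mz_7:.4f}, error {ppm_error(695.3789, mz_7):+.2f} ppm")
print(f"analogue:   calc. {mz_analogue:.4f}, error {ppm_error(677.3686, mz_analogue):+.2f} ppm")
print(f"difference: {mz_7 - mz_analogue:.4f} (H2O = {water:.4f})")
```

Under these assumptions, both reported values agree with the calculated ones to well under 1 ppm, and the gap between the two ions corresponds to the formula difference H2O (about 18.0106 Da), which is the "18 Da" referred to in the text.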

CONCLUSION

We showed that the integration of chemical, biological and taxonomical information within a unified multi-informative chemical map provides a powerful tool to navigate through the chemical space of wide collections of extracts in a radically new manner. The mapping of IC50 values over a massive molecular network made up of hundreds of samples allowed the efficient selection of clusters of potential bioactive metabolites. The merging of biological data on top of
MN also offers various indirect advantages, such as the possibility of easily drawing structure-activity relationships within a chemical family by isolating representative members of a given cluster. Also, if the results of multiple bioassays were mapped over a massive MN, pan-assay interference compounds (PAINS) should be readily highlighted, since these structures would appear systematically in the bioassay-mapped MN. The proposed approach could be further improved on the technical side by developing more efficient MS-directed platforms allowing the direct isolation of minor compounds starting from grams of complex extracts. Also, a data-treatment solution establishing a better correlation between the pie chart display of a node and the actual precursor intensity of this feature in the MS1 profile should globally improve the predictivity of the approach. Implementation and development of advanced statistical correlations, such as those proposed by Cech et al. (31), on massive multi-informative MN might also further refine the prioritization process. By shifting the classical bio-guided purification process to a more focused mass-guided purification, an efficient selection and isolation of bioactive NPs becomes possible, representing a clear optimization of the isolation workload and associated costs. This bioactive prioritization approach is easily implementable and scalable. It is generic and can be adapted to a large variety of bioassays. Adoption of such a prioritization pipeline should contribute to the renewal of interest in leading drug discovery campaigns from sets of crude complex extracts of natural origin.
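The idea of mapping bioassay results onto network nodes and ranking clusters can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the extract names, IC50 values, activity threshold, node table and the use of MS2 spectral counts as pie chart weights are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical bioassay results per extract (IC50 in µg/mL).
ic50 = {"extract_A": 1.2, "extract_B": 45.0, "extract_C": 0.8}
ACTIVE_THRESHOLD = 10.0  # assumed cut-off: an extract is "active" below this value

# Hypothetical network: node id -> (cluster id, {extract: number of MS2 spectra}).
nodes = {
    1: ("c1", {"extract_A": 5, "extract_C": 3}),
    2: ("c1", {"extract_A": 2, "extract_B": 1}),
    3: ("c2", {"extract_B": 6}),
    4: ("c2", {"extract_B": 4, "extract_C": 1}),
}


def rank_clusters(nodes, ic50, threshold):
    """Rank clusters by the fraction of their spectra that come from active extracts."""
    active = defaultdict(int)
    total = defaultdict(int)
    for cluster, spectra_by_extract in nodes.values():
        for extract, n_spectra in spectra_by_extract.items():
            total[cluster] += n_spectra
            if ic50[extract] < threshold:
                active[cluster] += n_spectra
    fractions = {c: active[c] / total[c] for c in total}
    # Highest active fraction first.
    return sorted(fractions.items(), key=lambda item: item[1], reverse=True)


ranking = rank_clusters(nodes, ic50, ACTIVE_THRESHOLD)
for cluster, fraction in ranking:
    print(f"{cluster}: {fraction:.2f} of spectra from active extracts")
```

Spectral counts are only a proxy for abundance; as noted above, weighting by the precursor intensity in the MS1 profile would probably reflect the actual contribution of each extract more faithfully.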

MATERIAL AND METHODS

Data-dependent LC-HRMS2 analysis. Chromatographic separation was performed on an Acquity UHPLC system interfaced to a Q-Exactive Plus mass spectrometer (Thermo Scientific, Bremen, Germany), using a heated electrospray ionization (HESI-II) source. The LC conditions were as follows: column: Waters BEH C18 100 × 2.1 mm, 1.7 µm; mobile phase: (A) water with 0.1% formic acid; (B) acetonitrile with 0.1% formic acid; flow rate: 600 µL.min-1; injection volume: 1 µL; gradient: linear gradient of 5% to 100% B over 8 min and isocratic at 100% B for 3 min. In positive ion mode, the diisooctyl phthalate C24H38O4 [M+H]+ ion (m/z 391.28429) was used as internal lock mass. The optimized HESI-II parameters were the following: source voltage: 3.5 kV (pos); sheath gas flow rate (N2): 48 units; auxiliary gas flow rate: 11 units; spare gas flow rate: 2.0; capillary temperature: 300.00 °C (pos); S-Lens RF Level: 55. The mass analyzer was calibrated using a mixture of caffeine, methionine-arginine-phenylalanine-alanine-acetate (MRFA), sodium dodecyl sulfate, sodium taurocholate and Ultramark 1621 in an acetonitrile/methanol/water solution containing 1% formic acid by direct injection. The data-dependent MS/MS events were performed on the 4 most intense ions detected in full scan MS (Top4 experiment). The MS/MS isolation window width was 2 Da, and the normalized collision energy (NCE) was set to 35 units. In data-dependent MS/MS experiments, full scans were acquired at a resolution of 35 000 FWHM (at m/z 200) and MS/MS scans at 17 500 FWHM, both with a maximum injection time of 50 ms. After being acquired in an MS/MS scan, parent ions were placed in a dynamic exclusion list for 3.0 seconds.

Data-dependent LC-HRMS2 analysis (fractions of B. nervosa). LC-UV analyses were performed with a Dionex Ultimate 3000 RSLC system with an Accucore C18 column (2.1 × 100 mm; 2.6 µm, ThermoScientific). The mobile phase consisted of water-acetonitrile (MeCN) with 0.1% formic acid 20:80 during 5 min, then a gradient from 20:80 to 100:0 in 20 min, at a flow rate of 350 µL.min-1. A 2 min isopropanol-MeCN 50:50 washout was applied after each run. The separation temperature was maintained at 30 °C during the analysis. LC-MS analyses were achieved by coupling the LC-UV system to an Agilent 6540 quadrupole time-of-flight mass spectrometer (Agilent Technologies, Massy, France) equipped with an ESI dual source, operating in positive-ion mode. ESI conditions were set with the capillary temperature at 325 °C, source voltage at 5 kV, and a sheath gas flow rate of 69 L.min-1. The mass spectrometer was operated in full-scan mode. Mass spectra were recorded from m/z 50 to m/z 1000 at 2 GHz, leading to a mass resolution of 20,000 at m/z 922. Each MS1 scan was followed by MS/MS scans of the three most intense ions above an absolute threshold of 3000 counts (2 m/z isolation width, a collision energy of 35 eV and an active exclusion of 30 s after 3 consecutive scans on the same parent ion). A calibration solution containing two internal reference masses (purine, C5H4N4, m/z = 121.0509, and HP-921 [hexakis-(1H,1H,3H-tetrafluoropentoxy)phosphazene], C18H18O6N3P3F24, m/z = 922.0098) routinely led to mass accuracy below 2 ppm. LC-UV data were analyzed with Chromeleon software (Dionex) and MS data were analyzed with MassHunter software (Agilent).

Molecular Network Analysis. The MS2 data were converted from the .RAW (Thermo) or .d (Agilent) standard data format to the .mzXML format using the MSConvert software, part of the ProteoWizard package (32). The molecular networks were created using the online workflow at Global Natural Products Social Molecular Networking (GNPS) (http://gnps.ucsd.edu). MS2 spectra were window filtered by choosing only the top 6 peaks in the +/- 50 Da window throughout the spectrum. The data were clustered with MS-Cluster with a parent mass tolerance
of 0.1 Da and an MS2 fragment ion tolerance of 0.5 Da to create consensus spectra (33). Further, consensus spectra that contained fewer than 2 spectra were discarded. A network was then created where edges were filtered to have a cosine score above 0.7 and more than 6 matched peaks. Furthermore, edges between two nodes were kept in the network if and only if each of the nodes appeared in the other's respective top 10 most similar nodes. The library spectra were filtered in the same manner as the input data. All matches kept between network spectra and library spectra were required to have a score above 0.7 and at least 6 matched peaks. The full MS dataset is uploaded and accessible on the GNPS servers as MassIVE datasets MSV000079034 (GNPS_New_Caledonian_Euphorbiaceae_part1) and MSV000079069 (GNPS_New_Caledonian_Euphorbiaceae_part2).
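The filtering logic described above (window filtering, cosine and matched-peak thresholds, mutual top 10 rule) can be sketched in a few functions. This is an illustrative reimplementation, not the GNPS code: a plain cosine with greedy peak matching stands in for the GNPS scoring, MS-Cluster consensus building is not modeled, the top 10 ranking is assumed to be taken among edges that already pass the thresholds, and the function names and toy spectra are hypothetical.

```python
import math


def window_filter(peaks, top=6, window=50.0):
    """Keep a peak only if it ranks among the `top` most intense peaks within +/- `window` Da."""
    kept = []
    for mz, inten in peaks:
        stronger = sum(1 for m2, i2 in peaks if abs(m2 - mz) <= window and i2 > inten)
        if stronger < top:
            kept.append((mz, inten))
    return kept


def cosine_score(a, b, tol=0.5):
    """Greedy cosine between two peak lists; returns (score, number of matched peaks)."""
    norm_a = math.sqrt(sum(i * i for _, i in a))
    norm_b = math.sqrt(sum(i * i for _, i in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # Candidate peak pairs within the fragment tolerance, strongest products first.
    pairs = sorted(
        ((ia * ib, x, y)
         for x, (ma, ia) in enumerate(a)
         for y, (mb, ib) in enumerate(b)
         if abs(ma - mb) <= tol),
        reverse=True,
    )
    used_a, used_b, dot, matched = set(), set(), 0.0, 0
    for product, x, y in pairs:
        if x not in used_a and y not in used_b:  # each peak is matched at most once
            used_a.add(x)
            used_b.add(y)
            dot += product
            matched += 1
    return dot / (norm_a * norm_b), matched


def build_edges(spectra, min_cos=0.7, min_matched=6, top_k=10):
    """Return {(u, v): score} for edges passing the thresholds and the mutual top-k rule."""
    ids = list(spectra)
    scores = {}
    for idx, u in enumerate(ids):
        for v in ids[idx + 1:]:
            score, matched = cosine_score(spectra[u], spectra[v])
            if score > min_cos and matched > min_matched:
                scores[(u, v)] = score
    neighbours = {u: [] for u in ids}
    for (u, v), s in scores.items():
        neighbours[u].append((s, v))
        neighbours[v].append((s, u))
    top = {u: {v for _, v in sorted(n, reverse=True)[:top_k]} for u, n in neighbours.items()}
    # Keep an edge only if each node is in the other's top-k list.
    return {(u, v): s for (u, v), s in scores.items() if v in top[u] and u in top[v]}


# Toy spectra: b is a slightly shifted, rescaled copy of a; c shares no fragment with a or b.
a = [(100.0 + 20 * i, 10.0 + i) for i in range(8)]
b = [(mz + 0.1, 1.1 * inten) for mz, inten in a]
c = [(mz + 7.0, inten) for mz, inten in a]
spectra = {name: window_filter(peaks) for name, peaks in {"a": a, "b": b, "c": c}.items()}
print(build_edges(spectra))
```

In this toy example only the pair (a, b) is connected, because c has no fragment within 0.5 Da of any peak in the other two spectra.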

AUTHOR INFORMATION

Corresponding Authors
* E-mail: [email protected]
* E-mail: [email protected]

ORCID
Florent Olivon: 0000-0002-3662-5390
Pierre-Marie Allard: 0000-0003-3389-2191
Alexey Koval: 0000-0002-8920-4426
Davide Righi: 0000-0002-4034-5455
Grégory Genta-Jouve: 0000-0002-9239-4371
Johan Neyts: 0000-0002-0033-7514
Louis-Félix Nothias: 0000-0001-6711-6719
Xavier Cachet: 0000-0001-9150-1932
Fanny Roussi: 0000-0002-5941-9901
Vladimir Katanaev: 0000-0002-7909-5617
Marc Litaudon: 0000-0002-0877-8234

Author Contributions
∆ F.O. and P.M.A. contributed equally to this work.

Notes
The authors declare no conflict of interest.

ACKNOWLEDGMENTS

The authors are very grateful to South and North Provinces of New Caledonia that have facilitated our field investigation. This work has benefited from an “Investissement d’Avenir” grant managed by Agence Nationale de la Recherche (CEBA, ANR-10-LABX-25-01). The authors are grateful to V. Dumontet (ICSN) for his contribution to plant materials collection and identification, and to C. Collard, N. Verstraeten, K. Erven and C. Vanderheydt from the “Laboratory for Virology and Experimental Chemotherapy” at the “Rega Institute for Medical Research” in Leuven, for the evaluation of antiviral activity. JLW is thankful to the Swiss National Science Foundation for the support of metabolomics projects (SNF grants 310030E164289 and 316030-164095).

ASSOCIATED CONTENT

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI:
Plant extraction and isolation processes, bioassay procedures, supporting networks, structures and NMR spectra for compounds 1–7, structural elucidation and NMR data for compounds 1–3 and 7, anti-HIV-I and anti-HIV-II activities of compounds 1–6.

REFERENCES (1) Meunier, B. (2012) Does chemistry have a future in therapeutic innovations? Angew. Chem. Int. Ed. 51, 8702—8706. (2) Atanasov, A. G.; Waltenberger, B.; Pferschy-Wenzig, E.-M.; Linder, T.; Wawrosch, C.; Uhrin, P.; Temml, V.; Wang, L.; Schwaiger, S.; Heiss, E. H.; Rollinger, J. M.; Schuster, D.; Breuss, J. M.; Bochkov, V.; Mihovilovic, M. D.; Kopp, B.; Bauer, R.; Dirsch, V. M.; Stuppner, H. (2015) Discovery and resupply of pharmacologically active plant-derived natural products: a review. Biotechnol. Adv. 33, 1582—1614. (3) Butler, M. S.; Robertson, A. A. B.; Cooper, M. A. (2014) Natural product and natural product derived drugs in clinical trials. Nat. Prod. Rep. 31, 1612—1661. (4) Newman, D. J.; Cragg, G. M. (2016) Natural products as sources of new drugs from 1981 to 2014. J. Nat. Prod. 79, 629—661. (5) Hufsky, F.; Scheubert, K.; Bocker, S. (2014) New kids on the block: novel informatics methods for natural product discovery. Nat. Prod. Rep. 31, 807—817. (6) Allard, P.-M.; Genta-Jouve, G.; Wolfender, J.-L. (2017) Deep metabolome annotation in natural products research: towards a virtuous cycle in metabolite identification. Curr. Opin. Chem. Biol. 36, 40—49. (7) Jeffryes, J. G.; Colastani, R. L.; Elbadawi-Sidhu, M.; Kind, T.; Niehaus, T. D.; Broadbelt, L. J.; Hanson, A. D.; Fiehn, O.; Tyo, K. E. J.; Henry, C. S. (2015) MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics. J. Cheminform. 7, 44. (8) Pungaliya, C.; Srinivasan, J.; Fox, B. W.; Malik, R. U.; Ludewig, A. H.; Sternberg, P. W.; Schroeder, F. C. (2009) A shortcut to identifying small molecule signals that regulate behavior and development in Caenorhabditis elegans. Proc. Natl. Acad. Sci. 106, 7708—7713. (9) Kurita, K. L.; Glassey, E.; Linington, R. G. (2015) Integration of high-content screening and untargeted metabolomics for comprehensive functional annotation of natural product libraries. Proc. Natl. Acad. Sci. 112, 11999—12004. 
(10) Kurita, K. L.; Linington, R. G. (2015) Connecting phenotype and chemotype: high-content discovery strategies for natural products research. J. Nat. Prod. 78, 587—596. (11) Wang, M.; Carver, J. J.; Phelan, V. V.; Sanchez, L. M.; Garg, N.; Peng, Y.; Nguyen, D. D.; Watrous, J.; Kapono, C. A.; Luzzatto-Knaan, T.; Porto, C.; Bouslimani, A.; Melnik, A. V.; Meehan, M. J.; Liu, W.-T.; Crusemann, M.; Boudreau, P. D.; Esquenazi, E.; Sandoval-Calderon, M.; Kersten, R. D.; Pace, L. A.; Quinn, R. A.; Duncan, K. R.; Hsu, C.-C.; Floros, D. J.; Gavilan, R. G.; Kleigrewe, K.; Northen, T.; Dutton, R. J.; Parrot, D.; Carlson, E. E.; Aigle, B.; Michelsen, C. F.; Jelsbak, L.; Sohlenkamp, C.; Pevzner, P.; Edlund, A.; McLean, J.; Piel, J.; Murphy, B. T.; Gerwick, L.; Liaw, C.-C.; Yang, Y.-L.; Humpf, H.-U.; Maansson, M.; Keyzers, R. A.; Sims, A. C.; Johnson, A. R.; Sidebottom, A. M.; Sedio, B. E.; Klitgaard, A.; Larson, C. B.; Boya P, C. A.; Torres-Mendoza, D.; Gonzalez, D. J.; Silva, D. B.; Marques, L. M.; Demarque, D. P.; Pociute, E.; O'Neill, E. C.; Briand, E.; Helfrich, E. J. N.; Granatosky, E. A.; Glukhov, E.; Ryffel, F.; Houson, H.; Mohimani, H.; Kharbush, J. J.; Zeng, Y.; Vorholt, J. A.; Kurita, K. L.; Charusanti, P.;

McPhail, K. L.; Nielsen, K. F.; Vuong, L.; Elfeki, M.; Traxler, M. F.; Engene, N.; Koyama, N.; Vining, O. B.; Baric, R.; Silva, R. R.; Mascuch, S. J.; Tomasi, S.; Jenkins, S.; Macherla, V.; Hoffman, T.; Agarwal, V.; Williams, P. G.; Dai, J.; Neupane, R.; Gurr, J.; Rodriguez, A. M. C.; Lamsa, A.; Zhang, C.; Dorrestein, K.; Duggan, B. M.; Almaliti, J.; Allard, P.-M.; Phapale, P.; Nothias, L.-F.; Alexandrov, T.; Litaudon, M.; Wolfender, J.-L.; Kyle, J. E.; Metz, T. O.; Peryea, T.; Nguyen, D.-T.; VanLeer, D.; Shinn, P.; Jadhav, A.; Muller, R.; Waters, K. M.; Shi, W.; Liu, X.; Zhang, L.; Knight, R.; Jensen, P. R.; Palsson, B. O.; Pogliano, K.; Linington, R. G.; Gutierrez, M.; Lopes, N. P.; Gerwick, W. H.; Moore, B. S.; Dorrestein, P. C.; Bandeira, N. (2016) Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828—837. (12) Crüsemann, M.; O’Neill, E. C.; Larson, C. B.; Melnik, A. V.; Floros, D. J.; da Silva, R. R.; Jensen, P. R.; Dorrestein, P. C.; Moore, B. S. (2016) Prioritizing natural product diversity in a collection of 146 bacterial strains based on growth and extraction protocols. J. Nat. Prod. 80, 588—597. (13) Floros, D. J.; Jensen, P. R.; Dorrestein, P. C.; Koyama, N. (2016) A metabolomics guided exploration of marine natural product chemical space. Metabolomics 12, 145—156. (14) Brito, Â.; Gaifem, J.; Ramos, V.; Glukhov, E.; Dorrestein, P. C.; Gerwick, W. H.; Vasconcelos, V. M.; Mendes, M. V.; Tamagnini, P. (2015) Bioprospecting Portuguese Atlantic coast cyanobacteria for bioactive secondary metabolites reveals untapped chemodiversity. Algal Res. 9, 218—226. (15) Naman, C. B.; Rattan, R.; Nikoulina, S. E.; Lee, J.; Miller, B. W.; Moss, N. A.; Armstrong, L.; Boudreau, P. D.; Debonsi, H. M.; Valeriote, F. A.; Dorrestein, P. C.; Gerwick, W. H. 
(2017) Integrating molecular networking and biological assays to target the isolation of a cytotoxic cyclic octapeptide, samoamide A, from an American Samoan marine cyanobacterium. J. Nat. Prod. 80, 625—633. (16) Vasas, A.; Hohmann, J. (2014) Euphorbia diterpenes: isolation, structure, biological activity, and synthesis (2008–2012). Chem. Rev. 114, 8579—8612. (17) Devappa, R. K.; Makkar, H. P. S.; Becker, K. (2011) Jatropha diterpenes: a review. J. Am. Oil Chem. Soc. 88, 301—322. (18) Sabandar, C. W.; Ahmat, N.; Jaafar, F. M.; Sahidin, I. (2013) Medicinal property, phytochemistry and pharmacology of several Jatropha species (Euphorbiaceae): a review. Phytochemistry 85, 7—29. (19) Mwine, T. J.; Van Damme, P. (2011) Why do Euphorbiaceae tick as medicinal plants?: a review of Euphorbiaceae family and its medicinal features. J. Med. Plants Res. 5, 652-662. (20) Xu, J.-B.; Yue, J.-M. (2014) Recent studies on the chemical constituents of Trigonostemon plants. Org Chem Front. 1, 1225—1252. (21) Shi, Q.-W.; Su, X.-H.; Kiyota, H. (2008) Chemical and pharmacological research of the plants in genus Euphorbia. Chem. Rev. 108, 4295—4327. (22) Nothias-Scaglia, L.-F.; Pannecouque, C.; Renucci, F.; Delang, L.; Neyts, J.; Roussi, F.; Costa, J.; Leyssen, P.; Litaudon, M.; Paolini, J. (2015) Antiviral activity of diterpene esters on chikungunya virus and HIV replication. J. Nat. Prod. 78, 1277—1283. (23) Allard, P.-M.; Martin, M.-T.; Tran Huu Dau, M.-E.; Leyssen, P.; Guéritte, F.; Litaudon, M. (2012) Trigocherrin A, the first natural chlorinated daphnane diterpene orthoester from Trigonostemon cherrieri. Org. Lett. 14, 342—345. (24) Bourjot, M.; Delang, L.; Nguyen, V. H.; Neyts, J.; Guéritte, F.; Leyssen, P.; Litaudon, M. (2012) Prostratin and 12-O-tetradecanoylphorbol 13-acetate are potent and selective inhibitors of chikungunya virus replication. J. Nat. Prod. 75, 2183—2187.

(25) Anastas, J. N.; Moon, R. T. (2013) WNT signalling pathways as therapeutic targets in cancer. Nat Rev Cancer 13, 11—26. (26) http://dnp.chemnetbase.com/. (27) Allard, P.-M.; Péresse, T.; Bisson, J.; Gindro, K.; Marcourt, L.; Pham, V. C.; Roussi, F.; Litaudon, M.; Wolfender, J.-L. (2016) Integration of molecular networking and in-silico MS/MS fragmentation for natural products dereplication. Anal. Chem. 88, 3317—3323. (28) Olivon, F.; Palenzuela, H.; Girard-Valenciennes, E.; Neyts, J.; Pannecouque, C.; Roussi, F.; Grondin, I.; Leyssen, P.; Litaudon, M. (2015) Antiviral activity of flexibilane and tigliane diterpenoids from Stillingia lineata. J. Nat. Prod. 78, 1119—1128. (29) Fuentes, R. G.; Arai, M. A.; Ishibashi, M. (2015) Natural compounds with Wnt signal modulating activity. Nat. Prod. Rep. 32, 1622—1628. (30) Azzollini, A.; Favre-Godal, Q.; Zhang, J.; Marcourt, L.; Ebrahimi, S. N.; Wang, S.; Fan, P.; Lou, H.; Guillarme, D.; Queiroz, E. F.; Wolfender, J.-L. (2016) Preparative scale MS-guided isolation of bioactive compounds using high-resolution flash chromatography: antifungals from Chiloscyphus polyanthos as a case study. Planta Med. 82, 1051—1057. (31) Kellogg, J. J.; Todd, D. A.; Egan, J. M.; Raja, H. A.; Oberlies, N. H.; Kvalheim, O. M.; Cech, N. B. (2016) Biochemometrics for natural products research: comparison of data analysis approaches and application to identification of bioactive compounds. J. Nat. Prod. 79, 376—386. (32) Chambers, M. C.; Maclean, B.; Burke, R.; Amodei, D.; Ruderman, D. L.; Neumann, S.; Gatto, L.; Fischer, B.; Pratt, B.; Egertson, J.; Hoff, K.; Kessner, D.; Tasman, N.; Shulman, N.; Frewen, B.; Baker, T. A.; Brusniak, M.-Y.; Paulse, C.; Creasy, D.; Flashner, L.; Kani, K.; Moulding, C.; Seymour, S. L.; Nuwaysir, L. M.; Lefebvre, B.; Kuhlmann, F.; Roark, J.; Rainer, P.; Detlev, S.; Hemenway, T.; Huhmer, A.; Langridge, J.; Connolly, B.; Chadick, T.; Holly, K.; Eckels, J.; Deutsch, E. W.; Moritz, R. L.; Katz, J. E.; Agus, D. 
B.; MacCoss, M.; Tabb, D. L.; Mallick, P. (2012) A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918—920. (33) Frank, A. M.; Bandeira, N.; Shen, Z.; Tanner, S.; Briggs, S. P.; Smith, R. D.; Pevzner, P. A. (2008) Clustering millions of tandem mass spectra. J. Proteome Res. 7, 113—122.

[Graphic; panel labels: Natural Products Extracts Library; Massive molecular networks; Bioactivity mapping; Taxonomical mapping; Prioritized NPs family; MS-targeted isolation; Bioactive Natural Products]