Metabolic clustering analysis as a strategy for compound selection in the drug discovery pipeline for leishmaniasis. Emily G Armitage, Joanna Godzien, Imanol Peña, Ángeles López-Gonzálvez, Santiago Angulo, Ana Gradillas, Vanesa Alonso-Herranz, Julio Martín, Jose M Fiandor, Michael P. Barrett, Raquel Gabarro, and Coral Barbas. ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.8b00204 • Publication Date (Web): 19 Apr 2018
ACS Chemical Biology
Metabolic clustering analysis as a strategy for compound selection in the drug discovery pipeline for leishmaniasis
Emily G Armitage¹,²,³, Joanna Godzien¹, Imanol Peña², Ángeles López-Gonzálvez¹, Santiago Angulo¹, Ana Gradillas¹, Vanesa Alonso-Herranz¹, Julio Martín², Jose M Fiandor², Michael P. Barrett³, Raquel Gabarro² and Coral Barbas¹
1. Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Montepríncipe, Boadilla del Monte, 28668 Madrid, Spain
2. GSK I+D Diseases of the Developing World (DDW), Parque Tecnológico de Madrid, Calle de Severo Ochoa 2, 28760 Tres Cantos, Madrid, Spain
3. Wellcome Centre for Molecular Parasitology, Institute of Infection, Immunity and Inflammation, College of Medical Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8TA, UK & Glasgow Polyomics, Wolfson Wohl Cancer Research Centre, College of Medical Veterinary & Life Sciences, University of Glasgow, Glasgow G61 1QH, UK
Corresponding author: Coral Barbas. Telephone: +34 913724711; email: [email protected]; postal address: Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Montepríncipe, Boadilla del Monte, 28668 Madrid, Spain.
Abstract
A lack of viable hits, increasing resistance and limited knowledge of mode of action are hindering drug discovery for many diseases. To optimise prioritisation and accelerate the discovery process, a strategy to cluster compounds based on more than chemical structure is required. We show the power of metabolomics in comparing the effects on metabolism of 28 different candidate treatments for leishmaniasis (25 from the GSK Leishmania box, two analogues of Leishmania box series and amphotericin B as a gold standard treatment), tested in the axenic amastigote form of Leishmania donovani. Capillary electrophoresis – mass spectrometry was applied to identify the metabolic profile of Leishmania donovani, and principal components analysis was used to cluster compounds on potential mode of action, offering a medium throughput screening approach for drug selection and prioritisation. The comprehensive and sensitive nature of the data also makes the detailed effects of each compound obtainable, providing a resource to assist further mechanistic studies and the prioritisation of these compounds for the development of new anti-leishmanial drugs.
Key words: Drug discovery; drug repurposing; metabolomics; principal components analysis; Leishmania donovani; mode of action
Abbreviations: MoA – mode of action; CE-MS – capillary electrophoresis – mass spectrometry; MSI – metabolomics standards initiative; QC – quality control; PCA – principal components analysis; GC-MS – gas chromatography – mass spectrometry; LC-MS – liquid chromatography – mass spectrometry
The authors declare no conflict of interest.
1 Introduction
Due to a lack of viable hits or increasing resistance to currently available treatments, the bottleneck in research towards new therapies for many different diseases is a growing concern. Limited knowledge of the mode of action (MoA) or polypharmacological effects of existing treatments could be hindering the discovery of new compounds. Studying compound MoA is valuable for understanding how compounds could be improved, for proposing combination therapies with synergistic actions and for determining possible toxic effects. For diseases where drug repurposing is a popular approach, e.g. neglected tropical diseases, MoA studies are especially important since the compounds were not originally designed to target the new disease type. Un-targeted approaches to study MoA are useful when compounds are suspected to have polypharmacy effects beyond known targets, and for clustering compounds with the same MoA to improve selection for further in vivo studies. Metabolomics offers a valuable approach to clustering compounds on MoA [1].
Neglected tropical diseases are a prime example where resistance to current treatments is problematic, funding is limited and drug repurposing is popular. The drug discovery pipeline requires novel medium throughput screening strategies that balance breadth and depth of knowledge on drug MoA. The leishmaniases are a spectrum of neglected tropical diseases caused by protozoa of the genus Leishmania. Leishmania donovani causes one of the most severe forms, visceral leishmaniasis [2], for which existing therapeutic options are limited [3]. From a recent screening of 1.8 million compounds against the three kinetoplastid parasites most relevant to human disease (Leishmania donovani, Trypanosoma brucei and Trypanosoma cruzi), 192 non-cytotoxic active hits against Leishmania donovani were selected for inclusion in the so-called Leishmania box [2]. While some general hypotheses were generated relating to compound structure, suggesting that many of them could target kinases, proteases, cytochromes and host-pathogen interactions, the MoA of each is still unknown. Classification into MoA is an important element in analysing activity data. To optimise prioritisation and accelerate discovery, a strategy to cluster compounds based on more than chemical structure is required. Metabolomics and other hit-to-screen assays can be powerful tools in the analysis of MoA [4].
The metabolomics approach to study the MoA of compounds for drug discovery purposes has been successfully applied in many fields. For recent reviews, see Vincent and Barrett 2015 [5] for parasitology; Armitage and Southam et al. 2016 [6] for oncology; Rankin et al. 2016 [7] for cardiology; Adamski 2016 [8] for diabetes; Atzori et al. 2012 [9] for perinatology; Gennari et al. 2015 [10] for osteoporosis drug discovery; dos Santos et al. 2016 [11] for antibacterial MoA of plant derived products; Mikami et al. 2012 [12] for updates specific to MS-based metabolomics and Hoerr et al. 2016 [13] for updates specific to NMR-based metabolomics.
Studying compound MoA can be challenging, especially since it is difficult to distinguish drug effects from generic stress responses. One way to overcome this is to study many compounds in the same organism in parallel, so that responses observed in all drug-treated samples can be identified as generic stress responses rather than drug-specific effects. Different approaches can be taken to study MoA, and the ‘omic’ approaches can be particularly attractive due to their medium-high-throughput screening capabilities combined with high sensitivity and coverage. Moreover, integration of omic data with in silico network analysis is a systems pharmacology approach that can be used to identify compound MoA at multiple scales [14].
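The idea of separating generic stress responses from drug-specific effects can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `split_generic_and_specific`, the two-fold (log2 = 1) threshold and the toy fold-change values are all illustrative assumptions. A metabolite is treated as a generic response if it changes in the same direction in every treatment.

```python
import numpy as np

def split_generic_and_specific(log2_fc, threshold=1.0):
    """Separate generic from treatment-specific metabolite changes.

    log2_fc:   array (n_treatments, n_metabolites) of log2 fold changes
               of each treatment versus un-treated controls.
    threshold: absolute log2 fold change counted as a change (assumed).
    Returns (generic_mask, specific_mask):
      generic_mask[j]     True if metabolite j changes in the same
                          direction in every treatment.
      specific_mask[i, j] True if treatment i changes metabolite j and
                          that change is not generic.
    """
    log2_fc = np.asarray(log2_fc, dtype=float)
    up = log2_fc >= threshold
    down = log2_fc <= -threshold
    generic_mask = up.all(axis=0) | down.all(axis=0)
    specific_mask = (up | down) & ~generic_mask
    return generic_mask, specific_mask

# Hypothetical data: 3 treatments x 4 metabolites.
example = np.array([
    [1.5, 0.1, -2.0, 0.0],
    [1.2, 1.8, -1.1, 0.2],
    [2.0, 0.0, -1.6, -1.4],
])
generic, specific = split_generic_and_specific(example)
```

In this toy example the first and third metabolites change in every treatment and are flagged as generic, while the second and fourth are each altered by a single treatment only.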
The metabolomics approach has recently been applied to study compounds of the Malaria box, malaria being another tropical disease with similar unmet medical needs [15,16]. Metabolomics screening was applied to reveal the metabolic perturbations induced by 90 of the almost 30,000 compounds that were previously shown to selectively inhibit growth of cultured P. falciparum asexual red blood cell stages, in addition to samples treated with known anti-malarials. The key features of the medium-high throughput screen were the use of the 96-well format and the use of high-sensitivity LC-MS to reproducibly detect 460 putatively annotated metabolites from a range of metabolic pathways. Although the numbers of compounds and metabolites detected were high, the authors of that study reported significant batch effects with this experimental design. These were partially overcome by normalising treated samples to untreated controls on each plate, but systematic variation was still observed in a subset of the drug treatments. Moreover, single doses of 1 µM were studied for 5 h of exposure, irrespective of growth inhibition rates, meaning that some treatments did not elicit a metabolic response under the conditions tested.
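The per-plate normalisation mentioned above can be sketched as follows. This is an illustrative assumption about how such a correction might be implemented, not the procedure used in the cited study: each feature is divided by the median of the untreated controls on the same plate. The function name, array layout and toy intensities are hypothetical.

```python
import numpy as np

def normalise_to_plate_controls(intensities, plate_ids, is_control):
    """Divide each sample by the median untreated control on its plate.

    intensities: array (n_samples, n_features) of raw signal.
    plate_ids:   array (n_samples,) giving the plate of each sample.
    is_control:  boolean array (n_samples,), True for untreated controls.
    Returns an array of the same shape holding ratios to control.
    """
    intensities = np.asarray(intensities, dtype=float)
    plate_ids = np.asarray(plate_ids)
    is_control = np.asarray(is_control, dtype=bool)
    normalised = np.empty_like(intensities)
    for plate in np.unique(plate_ids):
        on_plate = plate_ids == plate
        controls = intensities[on_plate & is_control]
        if controls.shape[0] == 0:
            raise ValueError(f"plate {plate!r} has no untreated controls")
        # Median over the control samples, feature by feature.
        normalised[on_plate] = intensities[on_plate] / np.median(controls, axis=0)
    return normalised

# Toy example: plate "B" reads twice as high as plate "A" overall.
raw = np.array([[10.0, 4.0], [10.0, 4.0], [20.0, 4.0],
                [20.0, 8.0], [20.0, 8.0], [40.0, 8.0]])
plates = np.array(["A", "A", "A", "B", "B", "B"])
controls = np.array([True, True, False, True, True, False])
ratios = normalise_to_plate_controls(raw, plates, controls)
```

After this correction the treated samples on the two plates give the same ratios to control. As the text notes, a correction of this kind may still leave systematic variation in some treatments.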
The Leishmania box contains a total of 192 compounds [2]. In the present research, lead compounds of the Leishmania box have been screened using metabolomics. Twenty-eight compounds (27 compounds or analogues from the box in addition to amphotericin B) have been studied in Leishmania donovani axenic amastigotes, chosen as the most relevant in vitro model of human leishmaniasis. Samples of parasites exposed to each compound were prepared in parallel with un-treated control samples and analysed using an un-targeted metabolomics approach to reveal the similarities and differences in the metabolome following treatment. The dose of compound and the exposure time were chosen based on individual kill kinetics [1, 14].
We aimed to reveal clusters of compounds with similar action against the parasite metabolome, as shown previously for anti-malarials [15]. The identification of clusters allows selection of compounds for further consideration in the drug discovery pipeline. Capillary electrophoresis – mass spectrometry (CE-MS) was chosen as the analytical tool to study the metabolomes of treated parasites, combined with definitive identification of the majority of the metabolic profile screened (metabolomics standards initiative (MSI) level 1 [18]). Moreover, due to the scale of the study, analyses were performed in batches and data were integrated for processing. As this is one of the largest scale metabolomics studies employing CE-MS, important strategies were identified in data treatment that build on previously observed limitations in metabolomics and could be useful in the field beyond the scope of this research.
2 Results and discussion
2.1 Assessment of data quality and overview of entire analysis
Features directly associated with specific compounds, and therefore likely to be metabolites of the compounds themselves, were removed, and the remaining features were filtered by relative standard deviation (RSD) in the quality control (QC) samples, keeping those with RSD < 30%. After filtration, 174 features remained and the data were assessed for quality and batch effect. Supplementary figure 1S shows an overview of the analyses from three analytical batches, considering the internal standard signal, total useful signal and total number of features. As shown, certain samples had particularly low numbers of features and low total useful signal. These samples corresponded to five specific compound groups in addition to one anomalous sample from another group. Parasite numbers calculated for each sample before and after washing were consulted to confirm that these lower profiles did not result from lower parasite numbers in those samples. Trends in signal were observed in the internal standard and total useful signal. Three methods of normalisation were compared to see how data quality could be improved by removing this batch effect. Supplementary figure 2S shows the scores plots generated for the first two principal components (PCs) before and after normalisation by total useful signal, by internal standard and by locally estimated scatterplot smoothing (LOESS), a commonly used method in metabolomics. Normalisation by internal standard was deemed most appropriate since, unlike total useful signal normalisation, it did not skew the remaining samples based on the number of features present.
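The filtering and normalisation steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' processing workflow: the matrix layout (samples in rows, features in columns), the function names and the synthetic data are hypothetical. Only the 30% RSD cut-off and the comparison of internal standard with total useful signal normalisation come from the text. PCA scores are computed here by singular value decomposition of the mean-centred data with no further scaling, which is also an assumption.

```python
import numpy as np

def rsd_filter(qc, max_rsd=30.0):
    """Boolean mask of features whose RSD (%) across QC injections is below max_rsd."""
    qc = np.asarray(qc, dtype=float)
    mean = qc.mean(axis=0)
    sd = qc.std(axis=0, ddof=1)
    rsd = np.full(mean.shape, np.inf)  # features with zero mean are rejected
    np.divide(100.0 * sd, mean, out=rsd, where=mean > 0)
    return rsd < max_rsd

def normalise_by_internal_standard(data, internal_standard):
    """Divide every feature in a sample by that sample's internal standard signal."""
    data = np.asarray(data, dtype=float)
    internal_standard = np.asarray(internal_standard, dtype=float)
    return data / internal_standard[:, None]

def normalise_by_total_signal(data):
    """Divide every feature in a sample by that sample's summed signal."""
    data = np.asarray(data, dtype=float)
    return data / data.sum(axis=1, keepdims=True)

def pca_scores(data, n_components=2):
    """PCA scores from an SVD of the mean-centred data (no scaling; assumed)."""
    data = np.asarray(data, dtype=float)
    centred = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical QC injections: feature 0 is stable, feature 1 is not.
qc = np.array([[100.0, 50.0], [102.0, 10.0], [98.0, 90.0]])
keep = rsd_filter(qc)

# Hypothetical samples: two biological profiles measured with signal drift.
profile = np.array([[5.0, 3.0, 2.0], [5.0, 3.0, 2.0],
                    [8.0, 1.0, 2.0], [8.0, 1.0, 2.0]])
drift = np.array([1.0, 0.8, 1.0, 0.6])
raw = profile * drift[:, None]
internal_standard = 100.0 * drift  # the standard drifts with the samples

by_is = normalise_by_internal_standard(raw, internal_standard)
by_total = normalise_by_total_signal(raw)
scores = pca_scores(by_is)
```

In this toy case the internal standard tracks the drift exactly, so dividing by it recovers the underlying profiles and replicate samples receive identical PCA scores. Total signal normalisation instead ties each sample's values to the sum of whatever was detected in it, which may relate to the skew reported above for samples with few features.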
Batch effects are common and often unavoidable in large-scale studies [19,20]. The challenge has been addressed previously, mainly for gas chromatography – mass spectrometry (GC-MS) and liquid chromatography – mass spectrometry (LC-MS) data [21,22], although to our knowledge it has not been addressed for CE-MS based metabolomics. Moreover, CE-MS is often discounted because of its reputation for irreproducibility, particularly in migration time, although advances in technology and methodology are making CE-MS increasingly popular [23]. In our experience, careful choice of analytical method and experimental design, together with study of the raw data to find the best parameters for aligning analytical batches, makes CE-MS a robust and viable choice in multiple-batch studies, especially for ionic and polar metabolites, where the alternative would be HILIC-based LC-MS, which has more complex issues surrounding robustness and reproducibility [24].
2.2 Identification of Leishmania donovani axenic amastigote metabolic profile
Before further multivariate analysis, identification of the entire CE-MS profile was performed for un-treated parasite samples. The 174 peaks remaining after filtration and normalisation were first annotated, and of these 105 were found to be unique features. The remaining features were identified as fragments, dimers or artefacts of other metabolites present in the data and were therefore removed from all further multivariate analysis. All features passed filters for presence in un-treated parasites and as such this identified profile serves as the first complete metabolic profile of Leishmania donovani axenic amastigotes in CE-MS. A total of 36 metabolites could be definitively identified to MSI level 1, determined through analysis of authentic standards, and a further 10 were identified to level 2. A network of metabolic interactions is shown in Figure 2, where metabolites identified to MSI level 1 are highlighted in bold and KEGG enzyme numbers are shown. Yellow dots indicate metabolites that are closely related to detected metabolites but were not detected themselves. Supplementary table 2 details the experimental m/z, migration time and MSI level of identification for all 105 uniquely distinguished metabolites.
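One way to flag candidate dimers among co-migrating features is sketched below. It is an illustrative assumption, not the annotation procedure used in this work: it assumes positive ionisation with protonated ions, so that a monomer [M+H]+ at m/z x has a dimer [2M+H]+ at m/z 2x − 1.007276, and it uses hypothetical tolerances and feature values.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da

def find_candidate_dimers(features, ppm_tol=10.0, mt_tol=0.1):
    """Pair features that could be [M+H]+ and [2M+H]+ of one metabolite.

    features: list of (mz, migration_time_min) tuples.
    ppm_tol:  allowed m/z error in parts per million (assumed value).
    mt_tol:   allowed migration time difference in minutes (assumed value).
    Returns a list of (monomer_index, dimer_index) pairs.
    """
    pairs = []
    for i, (mz_i, mt_i) in enumerate(features):
        # m/z expected for the protonated dimer of feature i.
        expected = 2.0 * mz_i - PROTON_MASS
        for j, (mz_j, mt_j) in enumerate(features):
            if i == j:
                continue
            ppm_error = abs(mz_j - expected) / expected * 1e6
            if ppm_error <= ppm_tol and abs(mt_j - mt_i) <= mt_tol:
                pairs.append((i, j))
    return pairs

# Hypothetical feature list: (m/z, migration time in minutes).
features = [
    (176.1030, 12.40),  # protonated monomer
    (351.1987, 12.42),  # close to 2 * 176.1030 - 1.007276
    (351.1987, 15.00),  # same m/z but a different migration time
    (205.0972, 12.41),  # unrelated co-migrating feature
]
pairs = find_candidate_dimers(features)
```

Here only the second feature is paired with the first, because it matches the expected dimer m/z and shares the migration time. Features flagged this way would still need manual confirmation before removal.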
2.3 Metabolic clustering analysis by principal components
Multivariate analysis of data using PCA can be challenging, especially as experimental and biological complexity increases. For example, in the related metabolomics study of Malaria box compounds, the first two PCs revealed only stochastic biological/experimental variation, while usable information was embedded in PC3 onwards [15], representing a low proportion of the total variability in the model. To explore this in our data, two approaches to metabolic clustering were employed to study the similarities and differences in parasites treated with one compound compared with un-treated cells. These approaches together revealed complementary information on the likely MoA of different compound clusters that could be used to select a subset of compounds for further analysis in the drug discovery pipeline for leishmaniasis. In both cases, all features (identified, annotated or unidentified, as detailed in supplementary table 2) were included in the multivariate analyses. In the first approach, a further filter to keep only those features with RSD