Metabolic Clustering Analysis as a Strategy for Compound Selection

Apr 19, 2018 - Metabolic Clustering Analysis by Principal Components ... Therefore, although it cannot be discounted since it accounts for almost 50% ...
3 downloads 3 Views 1MB Size
Subscriber access provided by UNIV OF DURHAM

Metabolic clustering analysis as a strategy for compound selection in the drug discovery pipeline for leishmaniasis Emily G Armitage, Joanna Godzien, Imanol Peña, Ángeles López-Gonzálvez, Santiago Angulo, Ana Gradillas, Vanesa Alonso-Herranz, Julio Martín, Jose M, Fiandor, Michael P. Barrett, Raquel Gabarro, and Coral Barbas ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.8b00204 • Publication Date (Web): 19 Apr 2018 Downloaded from http://pubs.acs.org on April 19, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Chemical Biology

1

Metabolic clustering analysis as a strategy for compound selection in the drug discovery pipeline

2

for leishmaniasis

3 4

Emily G Armitage1,2,3, Joanna Godzien1, Imanol Peña2, Ángeles López-Gonzálvez1, Santiago Angulo1,

5

Ana Gradillas1, Vanesa Alonso-Herranz1, Julio Martín2, Jose M Fiandor2, Michael P. Barrett3, Raquel

6

Gabarro2 and Coral Barbas1

7 8

1. Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San

9

Pablo, Campus Montepríncipe, Boadilla del Monte, 28668 Madrid, Spain

10 11

2. GSK I+D Diseases of the Developing World (DDW), Parque Tecnológico de Madrid, Calle de Severo

12

Ochoa 2, 28760 Tres Cantos, Madrid, Spain

13 14

3. Wellcome Centre for Molecular Parasitology, Institute of Infection, Immunity and Inflammation,

15

College of Medical Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8TA, UK &

16

Glasgow Polyomics, Wolfson Wohl Cancer Research Centre, College of Medical Veterinary & Life

17

Sciences, University of Glasgow, Glasgow G61 1QH, UK

18 19

Corresponding author: Coral Barbas. Telephone - +34 913724711; email - [email protected]; postal

20

address - Centre for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad

21

CEU San Pablo, Campus Montepríncipe, Boadilla del Monte, 28668 Madrid, Spain.

22 23

Abstract

24 25 26 27 28 29 30 31 32 33 34 35

A lack of viable hits, increasing resistance and limited knowledge on mode of action is hindering drug discovery for many diseases. To optimise prioritisation and accelerate the discovery process, a strategy to cluster compounds based on more than chemical structure is required. We show the power of metabolomics in comparing effects on metabolism of 28 different candidate treatments for Leishmaniasis (25 from the GSK Leishmania box, two analogues of Leishmania box series and amphotericin B as a gold standard treatment), tested in the axenic amastigote form of Leishmania donovani. Capillary electrophoresis - mass spectrometry was applied to identify the metabolic profile of Leishmania donovani and principal components analysis was used to cluster compounds on potential mode of action, offering a medium throughput screening approach in drug selection/prioritisation. The comprehensive and sensitive nature of the data has also made detailed effects of each compound obtainable, providing a resource to assist in further mechanistic studies and prioritisation of these compounds for the development of new anti-leishmanial drugs.

36

1 ACS Paragon Plus Environment

ACS Chemical Biology 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

37

Key words: Drug discovery; drug repurposing; metabolomics; principal components analysis;

38

Leishmania donovani; mode of action

39 40

Abbreviations: MoA – mode of action; CE-MS – capillary electrophoresis – mass spectrometry; MSI –

41

metabolomics standards initiative; QC – quality control; PCA principal components analysis; gas

42

chromatography – mass spectrometry (GC-MS); liquid chromatography – mass spectrometry (LC-MS)

43 44 45

Authors declare no conflict of interest

46

2 ACS Paragon Plus Environment

Page 2 of 26

Page 3 of 26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Chemical Biology

47

1

48

Due to a lack of viable hits or increasing resistance to currently available treatments, the bottleneck

49

in research towards new therapies for many different diseases is a growing concern. Limited

50

knowledge on the mode of action (MoA) or polypharmacological effects of existing treatments could

51

be hindering the discovery of new compounds. Studying compound MoA is valuable to understand

52

how they could be improved, to propose combination therapies looking for synergistic actions and

53

also to determine possible toxic effects. For diseases where drug repurposing is a popular approach

54

e.g. for neglected tropical diseases, MoA studies are specifically important since compounds were

55

not originally designed to target the new disease type. Un-targeted approaches to study MoA are

56

useful when compounds are suspected to have polypharmacy effects beyond known targets and to

57

cluster compounds with the same MoA for improving the selection for further in vivo studies.

58

Metabolomics offers a valuable approach to clustering compounds on MoA1.

59

Neglected tropical diseases are a prime example where resistance to current treatments is

60

problematic, funding is limited and drug repurposing is popular. There is a requirement for novel

61

strategies in the drug discovery pipeline for medium throughput screening that combines a balance

62

on breadth and depth of knowledge on drug MoA. The leishmaniases are a spectrum of neglected

63

tropical diseases caused by protozoa of the genus Leishmania. Leishmania donovani provokes one of

64

the most severe forms that is visceral leishmaniasis2 and existing therapeutic options for this are

65

limited3. From a recent screening of 1.8 million compounds against the three kinetoplastid parasites

66

most relevant to human disease (Leishmania donovani, Trypanosoma brucei and Trypanosoma

67

cruzi), 192 non-cytotoxic active hits against Leishmania donovani were selected to be included in the

68

so called Leishmania box2. While some general hypotheses were generated relating to compound

69

structure, suggesting that many of them could target kinases, proteases, cytochromes and host-

70

pathogen interactions, the MoA of each is still unknown. Classification into MoA is an important

71

element in analysing activity data. To optimise prioritisation and accelerate discovery, a strategy to

72

cluster compounds based on more than chemical structure is required. Metabolomics and other hit-

73

to-screen assays can be powerful tools in the analysis of MoA4.

74

The metabolomics approach to study the MoA of compounds for drug discovery purposes has been

75

successfully applied in many fields. For recent reviews, see Vincent and Barrett 20155 for

76

parasitology; Armitage and Southam et al. 20166 for oncology; Rankin et al. 20167 for cardiology;

77

Adamski 20168 for diabetes; Atzori et al. 20129 for perinatology;

78

osteoporosis drug discovery; dos Santos et al. 201611 for antibacterial MoA of plant derived

Introduction

3 ACS Paragon Plus Environment

Gennari et al. 201510 for

ACS Chemical Biology 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

79

products; Mikami et al. 201212 for updates specific to MS-based metabolomics and Hoerr et al.

80

201613 for updates specific to NMR-based metabolomics.

81 82

Studying compound MoA can be challenging, especially since it is difficult to distinguish drug effects

83

from generic stress responses. A way to overcome this is to study many compounds in the same

84

organism in parallel so that generic stress responses can be identified in all drug treated samples

85

(therefore not drug specific). Different approaches can be taken to study MoA and the ‘omic’

86

approaches can be particularly attractive due to their medium-high-throughput screening

87

capabilities combined with high sensitivity and coverage. Moreover, integration of omic data with in

88

silico network analysis is a systems pharmacology approach that can be used to identify compound

89

MOA on a multiscale14.

90 91

The metabolomics approach has recently been applied to study compounds of the Malaria box,

92

another tropical disease with similar unmet medical needs15,16. Metabolomics screening was applied

93

to reveal the metabolic perturbations induced by 90 of the almost 30,000 compounds that were

94

previously shown to selectively inhibit growth of cultured P. falciparum asexual red blood cell stages,

95

in addition to samples treated with known anti-malarials. The key features of the medium-high

96

throughput screen were the use of the 96-well format and the use of high-sensitivity LC-MS to

97

reproducibly detect 460 putatively annotated metabolites from a range of metabolic pathways.

98

Though the number of compounds and the number of metabolites detected was high, authors of

99

this study reported significant batch effects using this experimental design that were partially

100

overcome by normalisation of treated samples to untreated controls on each plate, but systematic

101

variation was still observed in a subset of the drug treatments. Moreover, single doses of 1µM were

102

studied for 5h of exposure, irrespective of growth inhibition rates, meaning that some treatments

103

did not elicit metabolic response under the conditions tested.

104 105

The Leishmania box contains a total of 192 compounds2. In the present research, lead compounds of

106

the Leishmania box have been screened using metabolomics. . Twenty-eight compounds (27

107

compounds or analogues from the box in addition to amphotericin B) have been studied in

108

Leishmania donovani axenic amastigotes, chosen as the most relevant in vitro model of human

109

leishmaniasis. Samples of parasites exposed to each compound were prepared in parallel with un-

110

treated control samples and analysed using an un-targeted metabolomics approach to reveal the

111

similarities and differences in the metabolome following treatment. The dose of compound and

112

exposure time was chosen based on individual kill kinetics[1, 14].

4 ACS Paragon Plus Environment

Page 4 of 26

Page 5 of 26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Chemical Biology

113 114

We aimed to reveal clusters of compounds with similar action against the parasite metabolome, as

115

shown previously for anti-malarials15. The identification of clusters allows selection of compounds

116

for further consideration in the drug discovery pipeline. Capillary electrophoresis – mass

117

spectrometry (CE-MS) was chosen as the analytical tool to study the metabolomes of treated

118

parasites, combined with definitive identification of the majority of the metabolic profile screened

119

(metabolomics standards initiative (MSI) level 118). Moreover, due to the scale of the study, analyses

120

were performed in batches and data were integrated for processing. As one of the largest scale

121

metabolomics studies employing CE-MS, important strategies were identified in data treatment that

122

build on previously observed limitations in metabolomics and could be useful in the field beyond the

123

scope of this research.

124 125 126

2

Results and discussion

127 128

2.1

129

Following filtration to remove features directly associated with specific compounds and therefore

130

likely metabolites of the compounds themselves, in addition to filtration by QC RSD (keeping those

131

features with RSD < 30%), 174 features remained and data were assessed for quality and batch

132

effect. Supplementary figure 1S shows an overview of the analyses from three analytical batches

133

considering the internal standard signal, total useful signal and total number of features. As shown,

134

certain samples had particularly low numbers of features and total useful signal. These samples

135

corresponded to five specific compound groups in addition to one anomalous sample from another

136

group. Parasite numbers calculated for each sample before and after washing were consulted to

137

confirm that these lower profiles did not occur because of lower parasite number in those samples.

138

Trends in signal were observed in the internal standard and total useful signal. Three methods of

139

normalisation were performed to observe how data quality could be improved to remove this batch

140

effect. Supplementary figure 2S shows the scores plots generated for the first two PCs before and

141

after normalisation by total useful signal, internal standard and a commonly used method in

142

metabolomics – locally estimated scatterplot smoothing (LOESS). Normalisation by internal standard

143

was deemed most appropriate since it did not skew the remaining samples based on the number of

144

features present as total useful signal normalisation did.

Assessment of data quality and overview of entire analysis

145

5 ACS Paragon Plus Environment

ACS Chemical Biology 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

146

Batch effects are common and often unavoidable in large-scale studies19,20. The challenge has been

147

addressed previously, mainly for gas chromatography – mass spectrometry (GC-MS) and liquid

148

chromatography – mass spectrometry (LC-MS) data21,22, although to our knowledge has not been

149

addressed for CE-MS based metabolomics. Moreover, CE-MS is often discounted based on its

150

reputation for irreproducibility, particularly in migration time, although advancements in technology

151

and methodology are making CE-MS increasingly popular23. In our experience, careful choice of

152

analytical method, experimental design and studying of raw data to find the best parameters for

153

alignment of analytical batches, makes CE-MS a robust and viable choice in multiple-batch studies,

154

especially for ionic and polar metabolites where the alternative mechanism would be to use HILIC

155

based LC-MS, that has a deeper complexity of issues surrounding robustness and reproducibility24.

156 157

2.2

Identification of Leishmania donovani axenic amastigote metabolic profile

158 159

Before further multivariate analysis, identification of the entire CE-MS profile was performed for un-

160

treated parasite samples. The 174 peaks following filtration and normalisation were first annotated

161

and from this 105 were found to be unique features. The remaining features were identified as

162

fragments, dimers or artefacts of other metabolites present in the data and were therefore removed

163

in all further multivariate analysis. All features passed filters for presence in un-treated parasites and

164

as such this identified profile serves as the first complete metabolic profile of Leishmania donovani

165

axenic amastigotes in CE-MS. A total of 36 metabolites could be definitively identified to MSI level 1,

166

determined through analysis of authentic standards and a further 10 were identified to level 2. A

167

network showing metabolic interactions is shown in Figure 2, where MSI level 1 identified

168

metabolites are highlighted in bold. KEGG enzyme numbers are shown. Yellow dots indicate

169

metabolites closely relating detected metabolites but that were not detected themselves.

170

Supplementary table 2 details the experimental m/z, migration time and MSI level of identification

171

for all 105 uniquely distinguished metabolites.

172 173

2.3

Metabolic clustering analysis by principal components

174 175

Multivariate analysis of data using PCA can be challenging, especially when the experimental and

176

biological complexity increases. For example, in the related study of metabolomics on malaria box

177

compounds, the first two PCs revealed only stochastic biological/experimental variation, while

178

usable information was embedded in PC3 onwards15, representing a low proportion of total

179

variability in the model. To explore this in our data, two approaches of metabolic clustering were 6 ACS Paragon Plus Environment

Page 6 of 26

Page 7 of 26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Chemical Biology

180

employed to study the similarities and differences in parasites treated with one compound

181

compared with un-treated cells. These approaches together revealed complementary information

182

on the likely MoA of different compound clusters that could be used to select a subset of

183

compounds for further analysis in the drug discovery pipeline for leishmaniasis. In both cases, all

184

features (identified, annotated or unidentified, as detailed in supplementary table 2) were included

185

in the multivariate analyses. In the first approach, a further filter to keep only those features with

186

RSD