Matched Molecular Series: Measuring SAR Similarity - Journal of

Article pubs.acs.org/jcim

Matched Molecular Series: Measuring SAR Similarity
Emanuel S. R. Ehmki† and Christian Kramer*
Chemical Biology/Therapeutic Modalities, F. Hoffmann-La Roche Ltd., Roche Innovation Center Basel, Grenzacherstrasse 124, 4070 Basel, Switzerland
Supporting Information

ABSTRACT: Suggesting novel compounds to be made on the basis of similarity to a previously seen structure−activity relationship (SAR) requires a measure for SAR similarity. While SAR similarity has intuitively been used by medicinal chemists for decades, no systematic comparison of candidate similarity metrics has been published to date. With this publication, we attempt to close that gap by providing a statistical framework that allows comparison of SAR similarity metrics by their ability to rank series that provide the best activity prediction of novel substituents. This prediction is a result of a two-step process that involves (a) judging the similarity between series and (b) transferring the SAR from one series to the other. We tested several SAR similarity metrics and found that a centered RMSD (cRMSD) in combination with a linear-regression-based prediction interpolation ranks SAR profiles best. Based on that ranking we can, with a given confidence, suggest novel substituents to be tested. The superiority of the cRMSD can be explained on the basis of experimental uncertainty of affinity data and measured affinity differences. The ability to measure SAR similarity is central to applications like matched molecular series (MMS) analysis, whose applicability depends on whether there is a potential for SAR transferability between series. With the new SAR similarity metric introduced here, we show how MMS can be used in a medicinal chemistry setting as an idea generator and a semiquantitative prediction tool.



INTRODUCTION Building substituent series is a key modus operandi of structure−activity relationship (SAR) exploration in medicinal chemistry. Once a suitable attachment point is identified, series of compounds that share the same scaffold and differ in one position are synthesized to screen for the optimal substituent. Once an activity profile for a specific position starts to emerge, the question arises “Which other substituent in the given position is worth trying?”, or in the words of O’Boyle et al., “What to make next?”.1 This decision can be inspired by a plethora of computational models such as quantitative SAR/quantitative structure−property relationship (QSAR/QSPR), docking, interactive modeling, and free energy perturbation (FEP) that predict the effect of novel substituents. However, typically it is also largely influenced by the experience of the medicinal chemists, drawing on the similarity to the same substituent series in other projects. Since experience is biased by patents, publications, and SAR profile of one’s own past projects, it is tempting to support the process of finding molecular series with similar activity profiles by computational means. The key scientific question for this process, which we set out to answer in this contribution, is “How to measure SAR similarity?” Matched molecular series (MMS) is an SAR analysis concept residing between matched molecular pairs (MMP) and Free−Wilson analysis. In contrast to MMP, in MMS a series of several substituents at the same position is analyzed, yielding a kind of mini-SAR. The activity profile of series A can be compared with those of other series of the same kind, and if series B has a similar activity profile, it may be worth testing additional substituents that improved the activity on series B on the scaffold of series A. Free−Wilson analysis also generates a local SAR, but it typically only generates knowledge that is transferable within a given assay and scaffold.2,3 The MMS approach could be used as a pure idea generator for novel substituents within a given project by analyzing which other substituents have been made in projects where the same substituents were active.4 Here we propose that MMS can be extended into a semiquantitative activity prediction tool if the proper SAR similarity metrics are used. Unlike classic MMP analysis, which mostly works for continuum-type properties like logP and permeability,5−10 we posit that MMS can be used to semiquantitatively predict on-target properties: The activity profile of a series can be considered as a subpocket fingerprint, probed by different substituents. If substituents coincide with other series from another project and the series display similar activity profiles, we assume that knowledge can be transferred from one series to another.11 Considering the centrality of molecular series for SAR exploration, it is surprising that very few publications have dealt with them in a quantitative and statistical way. Topliss trees can be considered the first series-based SAR exploration scheme. In 1972, Topliss published a decision-tree-like structure that

Received: November 22, 2016. Published: May 1, 2017. © 2017 American Chemical Society

DOI: 10.1021/acs.jcim.6b00709 J. Chem. Inf. Model. 2017, 57, 1187−1196

Article

Journal of Chemical Information and Modeling

(single cuts) or linkers and scaffolds (double or triple cuts), but in the remainder of this work we consider only substituents. Activity Profile. Each compound has a pKi or pIC50 value for a given assay. If, for example, a series consists of six compounds, each compound and thus each substituent (since the compounds differ only by one substituent by definition of the series) will be associated with an activity. The set of substituents together with their activities forms the activity profile. Matched Molecular Series. Two molecular series that share the same substituents but have different constant parts, have been measured in different assays, or differ in both constant part and assay (which is the normal case) are termed matched molecular series. Similarity Metrics. We compare six different metrics for measuring the similarity between two matched molecular series x and y. Pearson Correlation Coefficient. The Pearson correlation coefficient (RP) is defined as

suggests novel substituents to be tested on the basis of activity gain/loss of previous substituents.12 It was not until 2009 that the concept of molecular series was formally taken up again,13 when Wawer and Bajorath coined the term “matched molecular series” as an extension to MMP14,15 in which three or more substituents on the same position are matched between two series.16 In a series of follow-up papers, they characterized the occurrence of series within public databases and investigated their use in SAR transfer and visualization.11,16−20 The concept of using MMS as a source of ideas for novel substituents was published by Mills et al. and O’Boyle et al. Mills et al.4 used visual inspection of correlation plots between a series of TRPA1 antagonists and literature series with the same substituents to prioritize series with similar SAR, from which ideas for new substituents were obtained. Within the MATSY algorithm, O’Boyle et al.1 formalized the concept of SAR similarity further and found series with the same ranking of substituents. Within those series, additional substituents that frequently increased the activity even more were then identified and suggested. In the following, we introduce a new approach to compare SAR similarity metrics: We postulate that SAR similarity metrics are good if they are able to identify series with similar SAR profiles among a set of matched series. Given a set of matched series with known SAR transferability, the best metric then is the one that is best able to rank the series by decreasing SAR transferability. We test six different metrics and two different SAR transfer schemas with that framework. In order to compare the different metrics, we set up a test with 108 different series/query fragments and identify the best-performing similarity metric with high statistical significance. Physicochemical arguments are given to support the observation of why certain similarity metrics are better than others.
Simulating the effect of experimental uncertainty, we identify a similarity threshold below which the activity profiles of different series can be considered as almost identical. Finally, we demonstrate how to apply the SAR similarity framework to obtain semiquantitative suggestions for novel substituents in on-target programs.

$$R_P = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where n is the length of the series (number of substituents), xi is the activity of the compound with substituent i from series x, yi is the activity of the compound with substituent i from series y, and x̅ and y̅ are the mean values of the respective series. Spearman Correlation Coefficient. The Spearman correlation coefficient (RSp) is defined as

$$R_{Sp} = \frac{\sum_{i=1}^{n} [\mathrm{rnk}(x_i) - \overline{\mathrm{rnk}(x)}]\,[\mathrm{rnk}(y_i) - \overline{\mathrm{rnk}(y)}]}{\sqrt{\sum_{i=1}^{n} [\mathrm{rnk}(x_i) - \overline{\mathrm{rnk}(x)}]^2 \, \sum_{i=1}^{n} [\mathrm{rnk}(y_i) - \overline{\mathrm{rnk}(y)}]^2}}$$

This is the same as the Pearson correlation coefficient except that the activity values are replaced by their ranks (rnk). Manhattan Distance. The Manhattan distance (MD) is defined as



METHODS Data. ChEMBL 21 (ref 21) was used to generate the substituent series. Only pKi and pIC50 data with an assay confidence score of 9 were used. Series were only generated among compounds that were measured in the same assay. Compounds with multiple values were omitted. MMS Database Setup. A database optimized to query series by the fragments contained was set up. The database was implemented in MySQL. The series were assembled using a modified and extended version of the MMP algorithm of Hussain and Rea22 available within RDKit.23 Query and Analysis Setup. Tools to query the MMS database were implemented using Python, making extensive use of the Pandas and NumPy modules.24,25 Statistical analysis was carried out using Python and R,26 and plots were made using the ggplot2 package.27 MMS Wording Clarification. Molecular Series. A molecular series consists of at least three compounds that share the same scaffold and differ at one position. Using Hussain−Rea-type fragmentation and a modified indexing algorithm for creating the series, we generated single, double, and triple cuts where the variable part has fewer than 11 heavy atoms. In principle, the variable parts can be substituents

$$\mathrm{MD} = \sum_{i=1}^{n} |x_i - y_i|$$

The MD is the distance between two points expressed as the sum of their absolute differences. MD is conceptually very similar to the mean absolute deviation (MAD), which is more frequently used in cheminformatics and computational chemistry. Conversion of MD to MAD requires only division by the number of pairs considered. Since we here always use the same number of pairs, MD is practically equivalent to MAD. Root-Mean-Square Deviation. The root-mean-square deviation (RMSD) is defined as

$$\mathrm{RMSD} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2}$$

The RMSD is similar to the MD and MAD, but it puts a stronger weight on pairs that have larger distances. In terms of the equation used, it is identical to the RMS error (RMSE). However, since we do not use it for analyzing prediction results, we prefer “deviation” over “error”.


If Δps is close to zero, the SAR is transferable for substituent s with the used transfer schema. Comparing Different SAR Similarity Metrics. SAR transfer has two aspects: (1) calculation of the SAR similarity and (2) predicting activities in a new series on the basis of the SAR profile of the old series. While the two aspects are not independent of each other, the first can be separated from the second for a deeper analysis. With a given SAR transfer schema and a given set of matched series, different SAR similarity metrics can be compared to each other by their ability to identify series where Δps is close to zero. If all available matched series are compared against each other, this becomes a ranking exercise: The best SAR similarity metric is the one that is able to best rank pairs of matched series where pairs with good SAR transferability (i.e., Δps close to zero) come first. Plots of Δps values versus similarity rank show quite some scatter that makes quantitative comparison hard (see Figure 2a). In order to enable quantitative comparison, we therefore convert the scattered values into a continuous metric by calculating σ(Δps)exp, the expanding standard deviation of Δps from the most similar series up to the series of the last rank. The principle of calculating σ(Δps)exp is illustrated in Figure 1.

Centered Manhattan Distance. The centered Manhattan distance (cMD) is defined as

$$\mathrm{cMD} = \frac{1}{n} \sum_{i=1}^{n} |(x_i - \bar{x}) - (y_i - \bar{y})|$$

The cMD is similar to the MD, but the x and y series are centered before the differences are calculated. This compensates for a different offset due to the different core, assay, or interaction of the compound with the target. Centered Root-Mean-Square Deviation. The centered root-mean-square deviation (cRMSD) is defined as

$$\mathrm{cRMSD} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} [(x_i - \bar{x}) - (y_i - \bar{y})]^2}$$
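The two centered metrics can be sketched in the same way. This is an illustrative NumPy version under the definitions given above, with hypothetical function names; it is not taken from the authors' implementation.

```python
import numpy as np


def _centered_difference(x, y):
    """Differences between the two profiles after subtracting each profile's mean."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return (x - x.mean()) - (y - y.mean())


def centered_md(x, y):
    """cMD: mean absolute difference of the mean-centered profiles."""
    return float(np.abs(_centered_difference(x, y)).mean())


def centered_rmsd(x, y):
    """cRMSD: root-mean-square difference of the mean-centered profiles."""
    return float(np.sqrt((_centered_difference(x, y) ** 2).mean()))
```

Because each profile is centered on its own mean, adding a constant to every activity of one series (for example, a different assay offset) leaves both values unchanged.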

The cRMSD is similar to the RMSD, but as in the case of cMD, the x and y series are centered before the differences are calculated. Transferring SAR between Series. We compare two different schemas for transferring SAR between series: difference to the mean (diffMean) and linear regression (linReg). Difference to the Mean. The diffMean is the simplest possible model for predicting the activity of a new compound from series x with substituent s: when two molecular series x and y are matched on substituents 1 to n, the activity of core x with a new substituent s that exists within series y (as compound y−s) but not yet within series x (as x−s) is predicted as

$$p_{x-s,\mathrm{pred}} = \bar{p}_x + \Delta p_{y-s}$$

where px̅ is the average activity of all compounds with substituents 1 to n of series x,

$$\bar{p}_x = \frac{1}{n} \sum_{i=1}^{n} p_{x,i}$$

and Δpy−s is the activity gain or loss due to substituent s on series y, calculated from the mean of substituents 1 to n on series y,

$$\Delta p_{y-s} = p_{y-s} - \frac{1}{n} \sum_{i=1}^{n} p_{y,i}$$

Figure 1. For a given query series, all matched series from a database are ranked by the similarity metric and numbered with IDs by increasing dissimilarity (here measured by cRMSD). Starting from the third series, a standard deviation σ3 can be calculated. Further standard deviations are calculated by expanding the window of series included in the calculation of the standard deviations σ4 to σn.

If a similarity metric works well, the standard deviation is small for similar series (i.e., for low ranks) and becomes larger as more and more dissimilar series are included (see Figure 2b). Plots of σ(Δps)exp versus similarity rank allow similarity metrics to be visually compared because the similarity metrics are better where σ(Δps)exp increases more slowly (see Figure 2c). In order to compare different similarity metrics, the increase can be quantified by calculating the area under the curve (AUC) of the σ(Δps)exp plot. If the same SAR transfer schema has been applied, all similarity metrics will even converge to the same final σ(Δps)n value. For each similarity metric, we thus calculate the AUC by triangulating along the x axis. It should be noted that RSp needs some special treatment to be comparable to the other metrics because it can only adopt a distinct number of values for a series of a few substituents. Use of the AUC of the σ(Δps)exp versus similarity rank curve allows different similarity metrics to be quantitatively compared because the best similarity metric for a given query series has the smallest AUC. To visualize the similarity metric comparison across

Linear Regression. In contrast to the previous schema, this approach for predicting the activity of a new compound with core x and substituent s takes into account the different slopes of series x and y. Here, px−s,pred is calculated as

$$p_{x-s,\mathrm{pred}} = a\,p_{y-s} + b$$

where py−s is the activity of compound y−s and a and b are slope and intercept of the line of best fit between the corresponding activities of compounds 1 to n from series x and y. Validating the Prediction of New Substituents (SAR Transferability). If the activity of x−s, px−s,meas, is known but has not been used so far, it can be used to assess how well the SAR of series y is transferable to series x by calculation of the difference between px−s,pred and px−s,meas:

$$\Delta p_s = p_{x-s,\mathrm{pred}} - p_{x-s,\mathrm{meas}}$$
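The two transfer schemas and the validation difference can be illustrated with a short NumPy sketch. The function names are hypothetical, and fitting the line by ordinary least squares with series x regressed on series y (via `numpy.polyfit`) is an assumption consistent with the prediction equation, not a detail confirmed by the text.

```python
import numpy as np


def diff_mean_predict(px, py, py_s):
    """diffMean: mean of series x plus the gain/loss of substituent s on series y."""
    px, py = np.asarray(px, dtype=float), np.asarray(py, dtype=float)
    delta_py_s = py_s - py.mean()
    return float(px.mean() + delta_py_s)


def lin_reg_predict(px, py, py_s):
    """linReg: fit px ~ a * py + b on the matched substituents, then apply to py_s."""
    px, py = np.asarray(px, dtype=float), np.asarray(py, dtype=float)
    a, b = np.polyfit(py, px, deg=1)  # returns slope first, then intercept
    return float(a * py_s + b)


def delta_p_s(predicted, measured):
    """Validation difference: predicted minus measured activity of x-s."""
    return predicted - measured


# Hypothetical matched profiles (pActivity values), for illustration only.
px = [6.0, 7.0, 8.0]
py = [5.0, 5.5, 6.0]
print(diff_mean_predict(px, py, py_s=6.5))
print(lin_reg_predict(px, py, py_s=6.5))
```

In this made-up example series x spans twice the activity range of series y, so linReg extrapolates a larger gain for the new substituent than diffMean, which transfers the difference to the mean unchanged.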


different series, we finally normalize them by dividing by the number of series considered, n, and the final standard deviation when all of the series are added to the pool, σ(Δps)n. Different SAR transfer schemas lead to different final σ(Δps)n values because the predicted Δps values now differ for the same pair of matched series. Combinations of SAR similarity metric and SAR transfer schema can also be compared with each other by comparison of the AUC values because a lower AUC indicates better SAR transfer. However, for the reason outlined above, here the AUC can be normalized only by the number of series considered and not by the final σ(Δps)n value.
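The ranking, the expanding standard deviation, and the normalized AUC described above can be sketched as follows. This is an illustration under stated assumptions: the sample standard deviation (ddof = 1) is used, the AUC is computed with the trapezoidal rule at unit rank spacing, and the function names are hypothetical. For correlation coefficients, where larger values mean higher similarity, the negated coefficient would have to be passed as the distance.

```python
import numpy as np


def expanding_sd(delta_ps, distances, start=3):
    """sigma(delta p_s)_exp: SD of delta p_s over the k most similar series, k = start..n.

    `distances` are dissimilarities (e.g. cRMSD); smaller means more similar.
    """
    order = np.argsort(distances, kind="stable")
    ranked = np.asarray(delta_ps, dtype=float)[order]
    return np.array([ranked[:k].std(ddof=1) for k in range(start, len(ranked) + 1)])


def normalized_auc(sd_curve, n_series, normalize_by_final_sd=True):
    """Trapezoidal AUC of the expanding-SD curve, divided by the number of series
    and, optionally, by the final SD reached when all series are included."""
    sd_curve = np.asarray(sd_curve, dtype=float)
    auc = float(((sd_curve[1:] + sd_curve[:-1]) / 2.0).sum())
    auc /= n_series
    if normalize_by_final_sd:
        auc /= sd_curve[-1]
    return auc
```

When two combinations of similarity metric and transfer schema are compared, `normalize_by_final_sd=False` corresponds to normalizing by the number of series only, as required in the text.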



RESULTS Can We Identify Series with SAR Transferability by MMS? As a first example, we use the query series H, C, Cl, CN, and NO2 and test how well we can predict the effect of replacing the substituent with OCH3 using the linReg SAR transfer schema. Overall, this series occurs 53 times in ChEMBL 21. Figure 2a shows a plot of Δps versus cRMSD for all pairwise comparisons. For this plot, all of the series containing the six listed fragments were extracted from the MMS database. Each series was used as a query once, resulting in n(n − 1) = 53·52 = 2756 comparisons. For each query series, Δps for OCH3 and the Δps values with respect to all of the other series in the pool were calculated. From this plot, one can assume that the spread of Δps is small with small cRMSD, but overplotting renders a qualitative visual inspection ambiguous. Figure 2b shows a plot of σ(Δps)exp versus cRMSD. The spread is small for small cRMSD values and then grows with larger cRMSD values. The value of σ(Δps)exp varies more strongly for small ranks and small cRMSD and then stabilizes for larger ranks. This is due to the stochastic nature of the calculation of σ(Δps)exp which is based on very few numbers only for small ranks. Figure 2c shows a comparison of all six metrics on the same query versus the corresponding σ(Δps)exp. For this specific query, cRMSD gives the best ranking, whereas the other metrics appear to be unable to enrich series with similar responses to OCH3 substitution. For this specific query, cRMSD and cMD work best, but do they outperform the other metrics in general? Statistical Comparison of SAR Similarity Metrics. We set up an MMS database as described in Methods. From the MMS database, we extracted all series containing the substituents shown in Table 1, together with their pActivity values. 
The test series were selected in order to cover different types of chemical properties (aliphatic chains, aliphatic rings, small substituents, phenyl substituents, heteroaromatics, mixed). Within each class, selection was driven by a balance between frequently occurring series and diversity in substituents. Series can partially overlap, i.e., two series with partially overlapping substituents can come from the same original publication. The mixed class is composed of the most frequent substituents. For each set of substituents, the same analysis procedure was applied as for the analysis shown in Figure 2. Within each series, each fragment was once used as query fragment. For each query fragment within each series, Δps values were then calculated using both the diffMean and linReg SAR transfer schemas against all series with the same substituents that could be extracted from the database. Overall, this yielded 18 series × 6 substituents = 108 different sets of Δps values versus similarity values for both SAR transfer schemas. For each similarity type,

Figure 2. Using H, C, Cl, CN, and NO2 as a query series to predict the effect of OCH3 substitution. (a) Δps vs cRMSD. (b) σ(Δps)exp vs cRMSD. (c) Comparison of different similarity metrics: RP (red), RSp (green), MD (light blue), RMSD (orange), cRMSD (deep blue), and cMD (purple).

we calculated the AUCs for enriching similar series, with good SAR transferability at the top of the list. Figure 3 shows a summary of the normalized AUCs across the set of 108 query series using the diffMean SAR transfer schema. Figure 4 shows a summary of the normalized AUCs across the set of 108 query series using the linReg SAR transfer schema. The exact numbers for the AUC can be found in the Supporting Information. The number of times a given similarity metric produces the lowest AUC within a given SAR transfer schema is given in Table 2. Overall, the cRMSD has the lowest AUC in 62 out of 108 comparisons for the diffMean schema. For the linReg schema, the cRMSD has the lowest AUC in 37 out of 108 cases. For the two different SAR transfer schemas, the superiority of cRMSD is highly significant according to a two-tailed binomial


Table 1. Substituent Series Used To Validate SAR Similarity Metrics against Each Other

test (p ≤ 2.2 × 10⁻¹⁶ for diffMean and 7.60 × 10⁻⁶ for linReg). This shows that cRMSD is the best SAR similarity metric for both SAR transfer schemas. For the linReg SAR transfer schema, cMD is best in 32 out of 108 cases, which is very close to the result for cRMSD. Conceptually, cMD and cRMSD are very similar, with a difference only in the weight on larger discrepancies. In order to assess whether the combination of cRMSD and diffMean or the combination of cRMSD and linReg works better, the AUC values normalized by series count only (and not by σ(Δps)n, the final standard deviation) can be compared.
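The two-tailed binomial test mentioned above can be illustrated with a small standard-library sketch. The null hypothesis used here, that each of the six metrics is equally likely to give the lowest AUC (success probability 1/6), is an assumption; the text does not state the null probability. The two-sided p-value is computed by summing the probabilities of all outcomes that are no more likely than the observed count, which is one common convention for exact binomial tests.

```python
from math import comb


def binom_two_sided(k, n, p0):
    """Exact two-sided binomial p-value for k successes in n trials under H0: p = p0."""
    pmf = [comb(n, i) * p0**i * (1.0 - p0) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1.0 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(p for p in pmf if p <= cutoff))


# Counts of lowest-AUC wins for cRMSD taken from the text (62 and 37 out of 108);
# the null probability of 1/6 is an assumption made for this illustration.
for wins in (62, 37):
    print(wins, binom_two_sided(wins, 108, 1.0 / 6.0))
```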


Figure 3. Plot of AUCs across all 108 query types for the diffMean SAR transfer schema. The order of the series is the same as in Table 1.

Figure 4. Plot of AUCs across all 108 query types for the linReg SAR transfer schema. The order of the series is the same as in Table 1.

activity profile is influenced by experimental uncertainty, we simulated the cRMSD for 10 000 identical series with six substituents that differ only by experimental uncertainty. For the experimental uncertainty, we assumed a Gaussian distribution with zero mean and a standard deviation of 0.2 log units, which is backed by recent publications as a general estimate for the uncertainty in affinity data.28,29 The R script to run these calculations can be found in the Supporting Information. Figure 5 shows the distribution of cRMSD obtained for two identical series of six substituents with an experimental uncertainty of 0.2 log units. The distribution peaks around 0.25. Additional experiments (not shown here) showed that the longer the series gets, the narrower the peak becomes at a value of 0.2√2 ≈ 0.28. The factor of √2 is needed since the

Table 2. Numbers of Tests in Which the Six Similarity Metrics Have the Lowest AUC Values for the Two Schemas

schema      RSp   RP   MD   RMSD   cMD   cRMSD
diffMean      7    6    9      9    23      62
linReg       13   18    5      8    32      37

Here the combination of cRMSD and linReg yields the lowest AUC in 101 out of 108 series. This again is highly significant, indicating that among all of the combinations of SAR similarity metrics and SAR transfer schemas analyzed here, cRMSD and linReg enable the best SAR transfer. Setting the Threshold for cRMSD. The cRMSD has a physicochemical interpretation: it equals the RMSE that would be obtained if one series was used to directly predict the activity of the other series (after correction for the offset). Since the


the only series32 that also has this substituent has a cRMSD similarity of only 0.58, which is rather dissimilar. Although MMS prediction would have worked in this case, the prediction of the effect of introducing the NHAc substituent has to be considered an extrapolation. On the basis of our analysis, the first three substituents in Table 4 would be interesting suggestions to be made for the project. While the cRMSD is slightly above the threshold, the source series can still be considered somewhat similar. The compound with the chlorine substituent could be made, but it is unlikely to have a strong impact on the activity. The compound with the methylsulfone substituent is also not likely to improve the activity. If priorities had to be set, the first three substituents would be selected for activity reasons.



DISCUSSION To the best of our knowledge, this is the first work where concepts like SAR similarity, parallel SAR, SAR transferability, and matched activity profiles are quantified. While these concepts have often been used qualitatively, to date putting a number on the degree of SAR similarity has remained elusive. The phrase “parallel” SAR indicates that SAR similarity has often been visually determined from correlation plots or some kind of network. We believe that there are other kinds of parallel SAR that cannot be identified easily from correlation plots and that there are cases where two different medicinal chemists would disagree on whether a SAR is similar. The cRMSD furthers the precision of the phenomenon of parallel SAR since it allows a hard number to be put on the problem. In the comparison of different SAR similarity metrics for the application in MMS analysis, we found that the cRMSD is the most suitable similarity metric for identifying matched series that allow SAR transferability on the basis of a similar Δps. Using the different schemas, cRMSD outperformed the other metrics tested in 62 and 37 out of 108 test cases, which is extremely significant. Its superiority can be rationalized by a number of reasons: If two parallel series A and B were to have a large spread, show evenly spaced affinities, and cover the same absolute activity range, all of the metrics examined would perform well. However, in reality the spread can be rather small (flat SAR),33 the spacing between the activities can be irregular, and there can be a huge offset in activities due to the different assays, binding sites, and other interactions that the two different scaffolds make with the protein. In addition, experimental uncertainty adds noise to the measured activities.

Figure 5. Simulated distribution of cRMSD for two series of length 5 with identical SAR profiles and an experimental uncertainty of σ = 0.2 log units.
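The simulation summarized in Figure 5 can be sketched in Python as follows. The authors provide an R script in the Supporting Information; the version below is an independent illustration with hypothetical function and parameter names. Because two series with identical SAR profiles differ only by noise, the true profile cancels in the differences and only the Gaussian measurement errors need to be drawn.

```python
import numpy as np


def simulate_crmsd(n_subst=6, sigma=0.2, n_sim=10_000, seed=0):
    """cRMSD values for pairs of identical profiles perturbed by Gaussian noise."""
    rng = np.random.default_rng(seed)
    # Independent measurement errors for series x and y (true profile cancels out).
    ex = rng.normal(0.0, sigma, size=(n_sim, n_subst))
    ey = rng.normal(0.0, sigma, size=(n_sim, n_subst))
    d = (ex - ex.mean(axis=1, keepdims=True)) - (ey - ey.mean(axis=1, keepdims=True))
    return np.sqrt((d**2).mean(axis=1))


values = simulate_crmsd()
print(np.median(values))
print(np.sqrt(np.mean(values**2)))  # compare with sigma * sqrt(2 * (n - 1) / n)
```

Under these assumptions the expected squared cRMSD is 2σ²(n − 1)/n: the factor 2 arises because each difference contains two independent errors, and (n − 1)/n because centering removes one degree of freedom. For long series this approaches σ√2, in line with the limiting value quoted in the text.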

calculations are based on differences between two measured values each of which contains uncertainty (similar to the calculation of experimental uncertainty from differences between measurements; see Kramer et al.30). As a rough threshold for MMS, we therefore suggest the use of a cRMSD of at least 0.25 since any cRMSD below this value can be completely explained by experimental uncertainty. For a given series, the best choice of the threshold can be higher than 0.25, but this depends on the specific chemistry of the series. Using the cRMSD To Prioritize Novel Substituents. Using a test series of p38 kinase inhibitors with different aromatic substituents and activities taken from Dumas et al.,31 we queried the MMS database based on ChEMBL data. The series is shown in Table 3. This series occurs 31 times in ChEMBL. Overall, 38 other substituents have been combined with the six substituents from the query, but on differing cores. Table 4 shows a selection of five other substituents selected out of those 38, together with the number of times each substituent occurs, the cRMSD value, and the value of Δps from the most similar series (according to cRMSD and diffMean). A full list with all 38 substituents and more statistics can be found in the Supporting Information. The only additional substituent for which an activity was published by Dumas et al. is the last one, NHAc. The compound with this substituent has an activity of >500 nM, which is in good agreement with the Δps of −3.07. However,

Table 3. Test Series of p38 Kinase Inhibitors, Taken from Dumas et al.31


Table 4. Selected Additional Substituents from the Model Query

If there is a small spread or some substituents yield very similar activities, the Spearman correlation coefficient will fail to detect similar series since the ranking can then easily be affected by experimental uncertainty. The Pearson correlation coefficient will also fail if there is experimental uncertainty and an overall small spread. On the contrary, both correlation coefficients may indicate a high similarity if the two series have very different slopes, i.e., if the activity ranges they cover are rather different. This is a problematic behavior since it means that if the exchange of two substituents has a very different effect on the free energy of binding, two series would still be considered similar if there is no impact on the order of the substituents in the two series. In contrast to the correlation coefficients, MD, RMSD, cMD, and cRMSD depend only on the absolute differences between the corresponding substituents of the two series. MD and RMSD are very similar both conceptually and in terms of performance, with RMSD performing a little bit better because of the higher weights that are put on larger differences. Both MD and RMSD are insensitive to the activity spread and a lot less sensitive to experimental uncertainty than correlation coefficients. If the two series have different gaps (e.g., if the activity cliffs do not agree), MD and RMSD will take large values and indicate SAR dissimilarity, which is good. However, MD and RMSD can also take huge values in the case of different offsets, which is a disadvantage. Centering compensates for the different offsets. Among cMD and cRMSD, the centered RMSD performs a little better because of the higher weights that are put on larger differences. Overall, cRMSD performs best, followed by the conceptually very similar cMD. SAR transfer requires both a similarity metric and a schema to transfer the SAR from one series to the other. In this work, we analyzed two schemas, which we termed linReg and diffMean. 
Activity prediction by the diffMean method is a simple transfer of the difference with respect to the mean of a database series to the query series for a given fragment. linReg is more sophisticated in that it uses a linear regression to fit the predicted value of a new substituent. cRMSD is the best SAR similarity metric in both cases, with the conceptually similar cMD performing only a little worse for the linReg schema. In terms of overall performance, the combination of linReg and cRMSD yields the smallest AUC values, indicating that this is

the best combination of all SAR similarity and transfer schemas tested. Since the cRMSD is based on absolute differences of measured values, the effect of experimental uncertainty can be quantified unequivocally. Simulations with a normally distributed experimental uncertainty model with σ = 0.20 show that the cRMSD of two series that would otherwise be identical approaches a sharp peak around 0.20√2 (≈0.28) as the length of the matched series increases. From these simulations, a lower threshold of the cRMSD can be set at around 0.25, where all series with values below this threshold are indistinguishable as a result of the influence of the experimental error. Those should then be prioritized for predictions. Among all matched series analyzed for this contribution, 124 880 out of 636 492 matched series had a cRMSD below 0.28, corresponding to 19.6% of the series. In practice, MMS analysis can be used as an idea generator and as a semiquantitative prediction tool. It closely resembles aspects of medicinal chemistry intuition (“This substituent worked before in a series with similar SAR”), but instead of being limited to the number of SAR series one brain can capture and store, it can draw on the entirety of series generated from in-house databases, public databases like ChEMBL, and patents. The larger the databases, the more similar series can be found. This means that as databases grow bigger and bigger, (a) more substituents can be identified that other people tried in similar chemical environments (from the ligands’ point of view) and (b) for each substituent, more similar series can be found, which will allow a more sophisticated statistical analysis among the Δps values for one substituent.
One could for example imagine that if a substituent were found 10 times among other series with a cRMSD below 0.25, the standard deviation of its Δps values would indicate whether a prediction based on the average of the 10 Δps values is an interpolation or an extrapolation. Furthermore, small standard deviations can indicate very similar protein environments. However, in our analyses based on ChEMBL21, we only rarely found a larger set of series with cRMSD below 0.25 for any query series and an additional substituent. Thus, while MMS at the moment should probably be considered more as an idea generator, we believe that as databases grow it will transition from being a prioritization tool into a semiquantitative approach for activity prediction.
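The uncertainty simulation mentioned above (σ = 0.20, cRMSD peaking near 0.20√2) can be sketched in a few lines. This is not the authors' R script from the Supporting Information; it is a minimal illustration that assumes cRMSD is the RMSD computed after subtracting each series' own mean. The function names, the uniform range used for the "true" activities, and the constant offset between the two series are all hypothetical choices made for the example.

```python
import math
import random
from statistics import fmean, stdev


def crmsd(a, b):
    """Centered RMSD (assumed definition): center each series, then take the RMSD."""
    mean_a, mean_b = fmean(a), fmean(b)
    squared = [((x - mean_a) - (y - mean_b)) ** 2 for x, y in zip(a, b)]
    return math.sqrt(fmean(squared))


def simulate_crmsd(n_substituents, sigma=0.20, n_trials=2000, seed=0):
    """Mean and spread of cRMSD for two noisy copies of the same SAR profile."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_trials):
        # Hypothetical "true" activities shared by both series (pIC50-like scale).
        profile = [rng.uniform(5.0, 9.0) for _ in range(n_substituents)]
        series_a = [v + rng.gauss(0.0, sigma) for v in profile]
        # A constant offset of 1.0 log unit is removed by the centering step.
        series_b = [v + 1.0 + rng.gauss(0.0, sigma) for v in profile]
        values.append(crmsd(series_a, series_b))
    return fmean(values), stdev(values)


for length in (3, 5, 10, 50):
    mean_value, spread = simulate_crmsd(length)
    print(f"n={length:2d}  mean cRMSD={mean_value:.3f}  sd={spread:.3f}")
```

Under this assumed definition the expected squared cRMSD is 2σ²(n − 1)/n for n matched substituents, so the mean approaches σ√2 and the spread shrinks as the series get longer, which is consistent with the sharp peak described in the text.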

DOI: 10.1021/acs.jcim.6b00709 J. Chem. Inf. Model. 2017, 57, 1187−1196


MMS as a semiquantitative prediction tool comes along with a number of opportunities. First, it will allow us to identify redundant substituents that do not add to the overall understanding of SAR. Those would be frequently occurring substituents with a low standard deviation and almost no gain in affinity. There is no point in synthesizing these compounds for activity exploration only. MMS will also allow statistical prioritization of the most information-rich substituents to be tried next. These are the ones that often occur in series with a low cRMSD but have a huge standard deviation, i.e., it is unclear how they will affect the activity. Finally, there will always be a group of rare substituents with high cRMSD, where MMS serves as a pure idea generator. Ideally, each substituent suggestion can be partnered with a predicted change in physicochemical and ADME/toxicity properties, data that at least in big pharma settings are readily available from standard QSAR models and MMP analyses. MMS can also be used as a hint to synthetic accessibility since it directly points at cases where other people have been able to put an interesting substituent in a position that can be modified by the same substituents as the current series.

We also believe that we need to evolve our understanding of the biochemical factors that lead to parallel SAR. Some people may argue, for example, that it is theoretically impossible to use an approach like MMS because every binding (sub)pocket is different and even different series that bind to the same pocket are different in the sense that the attachment vector pointing into a subpocket is always slightly different. Empirically, however, there clearly are cases of parallel SAR and cases where it has been used.4 The entire motivation for (sub)pocket comparison is based on the assumption that not only the (sub)pockets are similar but also the ligands, or at least some functional groups hitting specific pharmacophore points, that will bind to those (sub)pockets. Since the concept of parallel SAR is an integral part of successful medicinal chemistry thinking, it is tempting to develop an understanding of the structural conditions required for successful SAR transfer.

MMS as a tool is particularly attractive in ligand-based design projects where no protein crystal structures are known. If protein crystal structures are available, other methods for activity prediction such as free-energy perturbation (FEP) can be used. However, FEP is computationally rather costly. MMS predictions, in contrast, could be used as cheap ligand-based null models for FEP calculations. There even is a conceptual similarity between MMS and the currently most prominent FEP technology sold by Schrödinger, which uses the “FEPMapper”:34 both techniques summarize transformations from different starting points to the substituent whose effect on activity is to be predicted.

These days, MMP analysis has evolved into a valuable tool that is used across the pharmaceutical industry.8,9 However, it is only applicable for the prediction of physicochemical and some ADME/toxicity properties. MMP analysis generally does not help in predicting on-target properties since standard MMP cannot take into account where in the three-dimensional binding pocket substituents are pointing. MMS has the potential to overcome the deficiencies of MMP analysis, leveraging the huge amounts of compound property and affinity data that are available these days. It not only enables us to identify chemical trends but also allows the detection of activity cliffs and phenomena that originate in changes in the interaction of the ligand with a host. It is one of the few tools that can directly generate actionable knowledge from big on-target data, making it highly attractive for further use and investigation. We believe that with this paper we are helping to set one of the cornerstones for the future development of MMS analysis.
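The substituent triage discussed in this section (redundant, information-rich, or idea-only substituents, judged from the spread of Δps values across similar series) could be prototyped roughly as follows. This is a sketch, not the authors' workflow: the record layout `(substituent, cRMSD, Δps)`, the function names, and every numeric threshold except the 0.25 cRMSD cutoff quoted in the text are hypothetical, and Δps is simply treated as the per-series activity change attributed to the substituent.

```python
from collections import defaultdict
from statistics import fmean, stdev


def summarize_hits(hits, crmsd_cutoff=0.25):
    """Collect Δps values per substituent, using only series below the cRMSD cutoff."""
    grouped = defaultdict(list)
    for substituent, crmsd, delta in hits:
        if crmsd < crmsd_cutoff:
            grouped[substituent].append(delta)
    summary = {}
    for substituent, deltas in grouped.items():
        spread = stdev(deltas) if len(deltas) > 1 else None
        summary[substituent] = {"n": len(deltas), "mean": fmean(deltas), "sd": spread}
    return summary


def triage(stats, min_count=5, low_sd=0.3, high_sd=1.0, small_gain=0.2):
    """Assign a category; all thresholds are illustrative assumptions."""
    if stats is None or stats["n"] < min_count or stats["sd"] is None:
        # Too few similar series: the suggestion is an idea, not a prediction.
        return "idea only"
    if stats["sd"] <= low_sd and stats["mean"] <= small_gain:
        # Frequent, consistent, and with little or no gain in affinity.
        return "redundant"
    if stats["sd"] >= high_sd:
        # Frequent in similar series, but the outcome is unclear.
        return "information-rich"
    return "unclassified"


# Hypothetical query output: (substituent label, cRMSD of the series, Δps).
hits = (
    [("F", 0.10, d) for d in (0.0, 0.1, -0.1, 0.05, 0.1, 0.0)]
    + [("OMe", 0.20, d) for d in (-1.5, 1.2, 0.3, -0.8, 1.6, -1.1)]
    + [("CF3", 0.60, 1.0), ("CF3", 0.15, 0.9)]
)
summary = summarize_hits(hits)
for name, stats in summary.items():
    print(name, stats["n"], round(stats["mean"], 2), triage(stats))
```

In a real setting the thresholds would have to be calibrated against the experimental uncertainty of the assay data rather than chosen by hand as they are here.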



SUMMARY

We have presented a comparison of six different metrics for measuring SAR similarity, namely, the Pearson and Spearman correlation coefficients, MD, RMSD, centered MD, and centered RMSD, and two different schemas to calculate SAR transfer from one series to the other. Assuming that SARs in two series are parallel if the activity of a new substituent in series A can be predicted from series B, we defined a validation experiment with 108 different query series to compare the different metrics and schemas. Very clearly, cRMSD in combination with linear-regression-based prediction (linReg) emerged as the similarity transfer concept that is best suited to rank series by SAR transferability. This finding can be rationalized by considering the mathematical properties of the different candidate metrics, as for each of the other metrics situations can be created where they fail to identify similar SAR. Using simulations that take into account experimental uncertainty, we have identified a cRMSD threshold of at least 0.25 below which two series can be considered indistinguishable. We have also illustrated how the cRMSD can be used within an MMS query to identify and prioritize substituents for novel compounds to be made. MMS is a unique ligand-based method that leverages the knowledge stored in huge and growing bioactivity databases to suggest novel promising compounds to be made. With the present study, we intend to push MMS analysis from an idea generator concept into a semiquantitative analysis method that closely resembles medicinal chemistry thinking.
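For readers who want to experiment with the six metrics and the two transfer schemas summarized here, the following sketch shows one plausible reading of them. The definitions are assumptions rather than the paper's exact formulas: MD and RMSD are taken as averages over the matched substituents, the centered variants subtract each series' mean first, diffMean shifts the query mean by the new substituent's offset from the database mean, and linReg fits a least-squares line of query activities against database activities. All function names are hypothetical.

```python
import math
from statistics import fmean


def _centered(values):
    mean = fmean(values)
    return [v - mean for v in values]


def pearson(a, b):
    ca, cb = _centered(a), _centered(b)
    numerator = sum(x * y for x, y in zip(ca, cb))
    return numerator / math.sqrt(sum(x * x for x in ca) * sum(y * y for y in cb))


def _ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(a, b):
    return pearson(_ranks(a), _ranks(b))


def md(a, b):
    return fmean(abs(x - y) for x, y in zip(a, b))


def rmsd(a, b):
    return math.sqrt(fmean((x - y) ** 2 for x, y in zip(a, b)))


def centered_md(a, b):
    return md(_centered(a), _centered(b))


def centered_rmsd(a, b):
    return rmsd(_centered(a), _centered(b))


def predict_diff_mean(query, database, new_db_value):
    """diffMean (assumed form): transfer the offset from the database mean."""
    return fmean(query) + (new_db_value - fmean(database))


def predict_lin_reg(query, database, new_db_value):
    """linReg (assumed form): least-squares fit of query on database activities."""
    cq, cd = _centered(query), _centered(database)
    slope = sum(x * y for x, y in zip(cd, cq)) / sum(x * x for x in cd)
    return fmean(query) + slope * (new_db_value - fmean(database))


# Hypothetical activities for three shared substituents and one new one.
query, database = [6.0, 7.0, 8.0], [5.0, 5.5, 6.0]
print("Pearson:", pearson(query, database), "cRMSD:", centered_rmsd(query, database))
print("diffMean:", predict_diff_mean(query, database, 6.5))
print("linReg:", predict_lin_reg(query, database, 6.5))
```

The toy data illustrate one way a correlation coefficient can mislead: the two series are perfectly correlated but differ in slope, so their centered RMSD is nonzero and the diffMean and linReg predictions disagree.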



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00709.

AUC values for the 108 query series, R code to simulate cRMSD values, and query output for Dumas’ example query (PDF)
AUC values for the 108 query series (XLSX)
R script to simulate cRMSD for identical SAR profiles with different experimental uncertainties (TXT)
Output of Matched Series query on ChEMBL DB for Dumas et al. series (TXT)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Christian Kramer: 0000-0001-8663-5266

Present Address

†E.S.R.E.: Center for Bioinformatics, Universität Hamburg, Bundesstrasse 43, 20146 Hamburg, Germany.

Notes

The authors declare no competing financial interest.



ABBREVIATIONS

ADME, absorption, distribution, metabolism, and excretion; AUC, area under the curve; cMD, centered Manhattan distance; cRMSD, centered root-mean-square deviation; diffMean, difference to the mean; FEP, free energy perturbation; linReg, linear regression; MD, Manhattan distance; MMP, matched molecular pair; MMS, matched molecular series; RMSD, root-mean-square deviation; RP, Pearson correlation coefficient; RSp, Spearman correlation coefficient; SAR, structure−activity relationship



REFERENCES

(1) O’Boyle, N. M.; Boström, J.; Sayle, R. A.; Gill, A. Using Matched Molecular Series as a Predictive Tool to Optimize Biological Activity. J. Med. Chem. 2014, 57, 2704−2713.
(2) Kubinyi, H. Free Wilson Analysis. Theory, Applications and Its Relationship to Hansch Analysis. Quant. Struct.-Act. Relat. 1988, 7, 121−133.
(3) Patel, Y.; Gillet, V. J.; Howe, T.; Pastor, J.; Oyarzabal, J.; Willett, P. Assessment of Additive/nonadditive Effects in Structure-Activity Relationships: Implications for Iterative Drug Design. J. Med. Chem. 2008, 51, 7552−7562.
(4) Mills, J. E. J.; Brown, A. D.; Ryckmans, T.; Miller, D. C.; Skerratt, S. E.; Barker, C. M.; Bunnage, M. E. SAR Mining and Its Application to the Design of TRPA1 Antagonists. MedChemComm 2012, 3, 174−178.
(5) Hajduk, P. J.; Greer, J. A Decade of Fragment-Based Drug Design: Strategic Advances and Lessons Learned. Nat. Rev. Drug Discovery 2007, 6, 211−219.
(6) Gleeson, P.; Bravi, G.; Modi, S.; Lowe, D. ADMET Rules of Thumb II: A Comparison of the Effects of Common Substituents on a Range of ADMET Parameters. Bioorg. Med. Chem. 2009, 17, 5906−5919.
(7) Papadatos, G.; Alkarouri, M.; Gillet, V. J.; Willett, P.; Kadirkamanathan, V.; Luscombe, C. N.; Bravi, G.; Richmond, N. J.; Pickett, S. D.; Hussain, J.; et al. Lead Optimization Using Matched Molecular Pairs: Inclusion of Contextual Information for Enhanced Prediction of hERG Inhibition, Solubility, and Lipophilicity. J. Chem. Inf. Model. 2010, 50, 1872−1886.
(8) Dossetter, A. G.; Griffen, E. J.; Leach, A. G. Matched Molecular Pair Analysis in Drug Discovery. Drug Discovery Today 2013, 18, 724−731.
(9) Griffen, E.; Leach, A. G.; Robb, G. R.; Warner, D. J. Matched Molecular Pairs as a Medicinal Chemistry Tool: Miniperspective. J. Med. Chem. 2011, 54, 7739−7750.
(10) Kramer, C.; Fuchs, J. E.; Whitebread, S.; Gedeck, P.; Liedl, K. R. Matched Molecular Pair Analysis: Significance and the Impact of Experimental Uncertainty. J. Med. Chem. 2014, 57, 3786−3802.
(11) Wassermann, A. M.; Bajorath, J. A Data Mining Method to Facilitate SAR Transfer. J. Chem. Inf. Model. 2011, 51, 1857−1866.
(12) Topliss, J. G. Utilization of Operational Schemes for Analog Synthesis in Drug Design. J. Med. Chem. 1972, 15, 1006−1011.
(13) Sisay, M. T.; Peltason, L.; Bajorath, J. Structural Interpretation of Activity Cliffs Revealed by Systematic Analysis of Structure-Activity Relationships in Analog Series. J. Chem. Inf. Model. 2009, 49, 2179−2189.
(14) Sheridan, R. P. The Most Common Chemical Replacements in Drug-Like Compounds. J. Chem. Inf. Comput. Sci. 2002, 42, 103−108.
(15) Sheridan, R. P.; Hunt, P.; Culberson, J. C. Molecular Transformations as a Way of Finding and Exploiting Consistent Local QSAR. J. Chem. Inf. Model. 2006, 46, 180−192.
(16) Wawer, M.; Bajorath, J. Local Structural Changes, Global Data Views: Graphical Substructure-Activity Relationship Trailing. J. Med. Chem. 2011, 54, 2944−2951.
(17) De la Vega de León, A.; Hu, Y.; Bajorath, J. Systematic Identification of Matching Molecular Series and Mapping of Screening Hits. Mol. Inf. 2014, 33, 257−263.
(18) Gupta-Ostermann, D.; Hu, Y.; Bajorath, J. Systematic Mining of Analog Series with Related Core Structures in Multi-Target Activity Space. J. Comput.-Aided Mol. Des. 2013, 27, 665−674.
(19) Zhang, B.; Wassermann, A. M.; Vogt, M.; Bajorath, J. Systematic Assessment of Compound Series with SAR Transfer Potential. J. Chem. Inf. Model. 2012, 52, 3138−3143.
(20) Zhang, B.; Hu, Y.; Bajorath, J. SAR Transfer across Different Targets. J. Chem. Inf. Model. 2013, 53, 1589−1594.
(21) Bento, A. P.; Gaulton, A.; Hersey, A.; Bellis, L. J.; Chambers, J.; Davies, M.; Krüger, F. A.; Light, Y.; Mak, L.; McGlinchey, S.; et al. The ChEMBL Bioactivity Database: An Update. Nucleic Acids Res. 2014, 42, D1083−D1090.
(22) Hussain, J.; Rea, C. Computationally Efficient Algorithm to Identify Matched Molecular Pairs (MMPs) in Large Data Sets. J. Chem. Inf. Model. 2010, 50, 339−348.
(23) Landrum, G. A. RDKit: Open-Source Cheminformatics Software, version 2016.03. http://www.rdkit.org.
(24) McKinney, W. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference (SciPy 2010); van der Walt, S., Millman, J., Eds.; pp 51−56.
(25) Van der Walt, S.; Colbert, S. C.; Varoquaux, G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput. Sci. Eng. 2011, 13, 22−30.
(26) R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2016; http://www.R-project.org.
(27) Wickham, H. ggplot2: Elegant Graphics for Data Analysis; Springer: New York, 2009.
(28) Kalliokoski, T.; Kramer, C.; Vulpetti, A.; Gedeck, P. Comparability of Mixed IC50 Data − A Statistical Analysis. PLoS One 2013, 8, e61007.
(29) Kramer, C.; Dahl, G.; Tyrchan, C.; Ulander, J. A Comprehensive Company Database Analysis of Biological Assay Variability. Drug Discovery Today 2016, 21, 1213−1221.
(30) Kramer, C.; Kalliokoski, T.; Gedeck, P.; Vulpetti, A. The Experimental Uncertainty of Heterogeneous Public Ki Data. J. Med. Chem. 2012, 55, 5165−5173.
(31) Dumas, J.; Hatoum-Mokdad, H.; Sibley, R.; Riedl, B.; Scott, W. J.; Monahan, M. K.; Lowinger, T. B.; Brennan, C.; Natero, R.; Turner, T.; et al. 1-Phenyl-5-Pyrazolyl Ureas: Potent and Selective p38 Kinase Inhibitors. Bioorg. Med. Chem. Lett. 2000, 10, 2051−2054.
(32) Hertzog, D. L.; Al-Barazanji, K. A.; Bigham, E. C.; Bishop, M. J.; Britt, C. S.; Carlton, D. L.; Cooper, J. P.; Daniels, A. J.; Garrido, D. M.; Goetz, A. S.; et al. The Discovery and Optimization of Pyrimidinone-Containing MCH R1 Antagonists. Bioorg. Med. Chem. Lett. 2006, 16, 4723−4727.
(33) Ghosh, A.; Dimova, D.; Bajorath, J. Classification of Matching Molecular Series on the Basis of SAR Phenotypes and Structural Relationships. MedChemComm 2016, 7, 237−246.
(34) Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson, S.; Dahlgren, M. K.; Greenwood, J.; et al. Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc. 2015, 137, 2695−2703.