
Nov 7, 2017 - Analysis of a large data set of protein−ligand binding affinities (Ki) for diverse targets shows that in general ligand ... analysis, ...
Article Cite This: J. Chem. Inf. Model. XXXX, XXX, XXX-XXX

pubs.acs.org/jcim

Group Additivity in Ligand Binding Affinity: An Alternative Approach to Ligand Efficiency

Charles H. Reynolds* and Ryan C. Reynolds

Gfree Bio, LLC, 3805 Old Easton Road, Doylestown, Pennsylvania 18902, United States

ABSTRACT: Group additivity is a concept that has been successfully applied to a variety of thermochemical and kinetic properties. This includes drug discovery, where functional group additivity is often assumed in ligand binding. Ligand efficiency can be recast as a special case of group additivity where ΔG/HA is the group equivalent (HA is the number of non-hydrogen atoms in a ligand). Analysis of a large data set of protein−ligand binding affinities (Ki) for diverse targets shows that in general ligand binding is distinctly nonlinear. It is possible to create a group equivalent scheme for ligand binding, but only in the context of closely related proteins, at least with regard to size. This finding has broad implications for drug design from both experimental and computational points of view. It also offers a path forward for a more general scheme to assess the efficiency of ligand binding.



INTRODUCTION

Medicinal chemists are keenly interested in any quantity, measured or computational, that provides useful insight into the quality of compounds being developed as potential drug candidates. General concerns about molecular size inflation and the emergence of fragment-based drug design1 have been the impetus for various approaches to evaluate the “efficiency” of ligand binding.2−9 Efforts to develop a better understanding of the thermodynamics of protein−ligand interactions and the physical properties responsible for driving different thermodynamic profiles have also gained traction.10−12 In the final analysis, a key question that often confronts drug discovery scientists, whether at the lead identification, lead optimization, or even clinical phases, is how optimally a given ligand binds to a particular target. The goal of our work is to provide a simple and theoretically justified answer to that question as well as to address the role of functional group additivity in drug discovery more globally.

RESULTS AND DISCUSSION

Ligand Affinity Database. A large database was assembled to analyze ligand binding affinities for a wide variety of ligands and proteins across a range of sizes and potencies using the publicly available BindingDB database (www.bindingdb.org).13 Compounds were selected from BindingDB that had reported Ki values between 50 μM and 1 pM. This raw data set was subjected to limited filtering using MOE-calculated descriptors14 in order to remove very unusual structures. The computed log P(o/w) was limited to a range of −4 to 10, the number of basic or acidic atoms (COOH counts as two acidic atoms) was capped at 8, the total charge was constrained to range from −3 to +3, and the number of non-hydrogen atoms was capped at 80. Obviously, it would have been possible to be much more aggressive in filtering this large, diverse, and somewhat messy data set, but the criteria above do remove many of the most extreme structures. The Ki values were converted to free energies (pKi could also be used) using the standard temperature of 298 K. This database contains approximately 174 000 binding affinity data points for more than 1000 target proteins. Molecular descriptors and the distribution of free energies with respect to size were generated using MOE and VORTEX.15

The overall distribution of ΔG in kcal/mol versus the number of non-hydrogen atoms (HA) is shown graphically in Figure 1. The most important observation is that there are many very high affinity binders (better than −11.5 kcal/mol) across the full range of ligand sizes. While there may be a slight trend for less potent ligands at very small HA values, in general the data in Figure 1 show that very high affinity ligands are numerous across the full range of molecular sizes. An analysis of the distribution of ligand affinities with respect to atom size, binned by the number of non-hydrogen atoms, is given in Table 1. This is consistent with earlier work by Kuntz et al.5 and illustrates a problem that has been highlighted elsewhere,3 namely, that traditional ligand efficiency (i.e., ΔG/HA) is intrinsically very biased by molecular size (Figure 2). Restated, the best ligand efficiencies attainable for any value of HA inevitably exhibit an approximately 1/HA relationship since the maximal obtainable ΔG is more or less constant across the range of ligand sizes most relevant to small-molecule drug discovery.

The idea that thermochemical properties are often additive is a powerful concept that has been applied broadly in chemistry.16−18 One of the first and most successful examples
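The data preparation described for the ligand affinity database (selection by Ki, descriptor-based filtering, and conversion of Ki to a free energy at 298 K) can be illustrated with a minimal Python sketch. It is not the authors' workflow: the descriptors in the paper were computed with MOE, whereas here they are simply passed in as numbers. The function names, the value used for the gas constant, the relation ΔG = RT ln Ki (with Ki in mol/L), and the treatment of every limit as inclusive are all assumptions made for illustration.

```python
import math

# Assumed constants: gas constant in kcal/(mol*K) and the 298 K used in the text.
R_KCAL_PER_MOL_K = 1.987204e-3
T_STANDARD_K = 298.0


def ki_to_delta_g(ki_molar, temperature_k=T_STANDARD_K):
    """Return the binding free energy in kcal/mol as RT * ln(Ki), Ki in mol/L."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(ki_molar)


def passes_filters(ki_molar, log_p, n_acid_base_atoms, total_charge, heavy_atoms):
    """Apply the selection limits quoted in the text to precomputed descriptors.

    Treating every bound as inclusive is an assumption; the text does not say.
    """
    return (
        1e-12 <= ki_molar <= 50e-6        # Ki between 1 pM and 50 uM
        and -4.0 <= log_p <= 10.0         # computed log P(o/w)
        and n_acid_base_atoms <= 8        # basic or acidic atoms (COOH counts as two)
        and -3 <= total_charge <= 3       # total formal charge
        and heavy_atoms <= 80             # non-hydrogen atoms
    )


if __name__ == "__main__":
    for ki in (50e-6, 1e-9, 1e-12):
        print(f"Ki = {ki:.1e} M -> dG = {ki_to_delta_g(ki):.2f} kcal/mol")
```

Under these assumptions a 1 nM ligand corresponds to roughly −12.3 kcal/mol, and the 50 μM to 1 pM selection window spans roughly −5.9 to −16.4 kcal/mol.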

Received: June 21, 2017


DOI: 10.1021/acs.jcim.7b00381 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX


Journal of Chemical Information and Modeling

Figure 1. Plot of ΔG versus the number of heavy atoms (HA). The “optimal” affinity band is highlighted in the blue box.

challenge, whether using a single group equivalent, ΔG/HA, or many (e.g., Andrews), is addressing the nonlinear relationship in Figure 2.

Defining Optimal Affinity. In setting out to construct a group additive model for ligand−protein binding affinity, the first question is what to model. One might attempt to model average affinity across a wide range of ligands and targets. This was the choice made by Andrews. On the basis of the distribution of mean and median affinities for our data set, this might be doable. An alternative approach, which has been adopted here, is to define an optimal binding affinity across the range of ligand sizes as an ideal value for comparison. This allows examination of binding affinity on a per-atom basis for potent ligands, which presumably fit their respective targets very effectively. Of course, optimal is a subjective term. In this case, optimal binding affinity has been determined primarily on the basis of statistical measures of the data (e.g., mean, standard deviation, and percentiles) but also on the basis of drug discovery “common sense”, which says that in most programs single-digit nanomolar or better affinity is the goal. Examination of Table 1 shows that the definition of optimal affinity is very similar whether single-digit nanomolar, two standard deviations below the mean, or the fifth percentile is used as the criterion. There is some drift in these values between bins for the smallest and largest ligands, but overall the fifth percentile ranges from −11.6 kcal/mol for the 10−15 HA bin to −13.6 kcal/mol for the 50−55 HA bin. Additional data are provided in the Supporting Information. This analysis led to the selection of ΔG from −11.6 to −13.6 kcal/mol as the “optimal frontier” for this data set. The optimal frontier is highlighted with a blue rectangle in Figure 1. This band could be adjusted up or down in affinity or made narrower or broader, but this range encompasses affinities between approximately 0.1 and 3 nM.
While there are certainly examples of more potent ligands, in the majority of drug
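As a rough illustration of the optimal frontier and of why ΔG/HA is biased by size, the following Python sketch tests whether a free energy falls inside the −13.6 to −11.6 kcal/mol band and computes the ligand efficiency that band implies at a given heavy-atom count. The function and constant names are hypothetical, the band edges are treated as inclusive by assumption, and the sign convention follows the text (ΔG/HA, so values are negative).

```python
# Band edges taken from the text; inclusive bounds are an assumption.
OPTIMAL_DG_MOST_POTENT = -13.6   # kcal/mol, roughly 0.1 nM
OPTIMAL_DG_LEAST_POTENT = -11.6  # kcal/mol, roughly 3 nM


def on_optimal_frontier(delta_g):
    """True if delta_g (kcal/mol) lies inside the optimal affinity band."""
    return OPTIMAL_DG_MOST_POTENT <= delta_g <= OPTIMAL_DG_LEAST_POTENT


def ligand_efficiency(delta_g, heavy_atoms):
    """Traditional ligand efficiency as written in the text: dG / HA."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return delta_g / heavy_atoms


def frontier_efficiency_range(heavy_atoms):
    """Ligand efficiencies at the two edges of the band for a given HA."""
    return (
        ligand_efficiency(OPTIMAL_DG_MOST_POTENT, heavy_atoms),
        ligand_efficiency(OPTIMAL_DG_LEAST_POTENT, heavy_atoms),
    )


if __name__ == "__main__":
    # Because the band is constant in dG, the attainable dG/HA scales as 1/HA.
    for ha in (10, 20, 30, 40, 50):
        best, worst = frontier_efficiency_range(ha)
        print(f"HA = {ha:2d}: dG/HA from {best:.3f} to {worst:.3f} kcal/mol per atom")
```

Because the band is the same for every ligand size, the efficiency at its edge for a 10-atom ligand is five times that for a 50-atom ligand, which is the 1/HA dependence noted earlier for traditional ligand efficiency.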

Table 1. ΔG Statistics for the Binned Number of Non-Hydrogen Atoms HAa 60 55 50 45 40 35 30 25 20 15 10 a

≤ ≤ ≤ ≤ ≤ ≤ ≤ ≤ ≤ ≤ ≤

HA HA HA HA HA HA HA HA HA HA HA

< < < < < < < < < <