Myeloid Cell Leukemia 1 Inhibition: An in Silico Study Using Non

Jun 6, 2018 - Using the same methodology, we attempt the calculation of the dissociation free energy of the BH3 domain from PUMA protein, binding Mcl-...
0 downloads 0 Views 3MB Size
Article pubs.acs.org/JCTC

Cite This: J. Chem. Theory Comput. 2018, 14, 3890−3902

Myeloid Cell Leukemia 1 Inhibition: An in Silico Study Using Nonequilibrium Fast Double Annihilation Technology Piero Procacci* Department of Chemistry, University of Florence, Via Lastruccia No. 3, Sesto Fiorentino I-50019, Italy

Downloaded via BOSTON COLG on July 13, 2018 at 13:11:47 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.

S Supporting Information *

ABSTRACT: In this work, we compute, by means of a non-equilibrium alchemical technique (fast switching double annihilation methods, FSDAMs), the dissociation free energy for five recently discovered micromolar to sub-nanomolar inhibitors of the Myeloid cell leukemia 1 protein, a key regulator in cell survival and death, providing valuable clues in the chemical−physical determinants of Mcl-1 inhibition. Using the same methodology, we attempt the calculation of the dissociation free energy of the BH3 domain from PUMA protein, binding Mcl-1 in the α-helical state. The synthetic ligands have been parametrized using the recently released GAFF2 general force field [http://ambermd.org] by means of the automated assignment tool PrimaDORAC [Procacci, P. J. Chem. Inf. Model. 2017, 57, 1240]. As an important byproduct, this work constitutes hence one of the first and most challenging tryouts for the GAFF2 parameter set. Agreement with experimental measurements is found to be generally satisfactory, validating the GAFF2 parametrization of the ligands and foreseeing a possible role of FSDAM for industrial application in drug discovery.



INTRODUCTION The Myeloid cell luekemia 1 protein (Mcl-1) is a member of the Bcl-2 family of proteins and as such is one of the key regulators of the apoptosis process, i.e., the programmed death of unwanted cells promoting a balanced cell renewal in multicellular organism. In aberrant cells, apoptosis can be disregulated leading to abnormal cell survival and proliferation. Up-regulation of the antiapoptotic members of the Bcl-2 family of proteins (including Mcl-1) is one of the common mechanisms for abnormal cell proliferation in tumor growth. Indeed, amplification of the gene encoding the antiapoptotic Mcl-1 is one of the most common genetic aberrations in human cancer,1 and Mcl-1 overexpression in human cancers is associated with high tumor grade and poor survival.2−4 The biochemistry of the delicate balance in cell survival and death is regulated by the interactions among anti- and proapoptotic members of the Bcl-2 family, mediated by regions of a conserved sequence known as Bcl-2 homology (BH) domains. Mcl-1 acts as an antiapoptotic agent by binding and sequestering via the BH domains their proapoptotic counterparts, such as BAK, BAX, or the Bcl-2 homology 3 (BH3)-only proteins BAD and BIM.1.5 Inhibition of Mcl-1 can therefore represent a valid pharmaceutical route in cancer therapy. On the other hand, inhibition of Mcl-1 using small molecule drugs has been proven challenging, as this process competes with the extremely high affinity of the natural Mcl-1 BH substrates, involving a complex pattern of complementary and specific protein−protein interactions.5 Very recently, three seemingly unrelated (noncongeneric)6 potent peptido−mimetic inhibitors have been discovered, and © 2018 American Chemical Society

their Mcl-1 co-crystals have been structurally characterized by three different research groups, namely, compound no. 26 (PDB code 5MES7), compound S63845 (PDB code 5LOF8), and compound no. 5 (PDB code 5IF49). As summarized in Figure 1 and shown in the X-ray structures of the complexes reported in the Supporting Information, the common pattern of the ligand−protein interactions for these three compounds is based on some recurring hydrophobic and polar contacts that emulate the BH3-Mcl-1 protein−protein interactions. However, in the development of compound 26 (5MES) and of compound 5 (5IF4), a decisive improvement in potency was obtained by modifying their corresponding peptido−mimetic synthetic precursors via macrocyclization or rings fusion, i.e., by limiting the accessible conformational space of the inhibitors to structures that are supposedly close to the binding conformation. In the case of S63845 (5LOF), the rigidity was apparently imparted via strong intraligand stacking hydrophobic interactions, an expedient that was already used in the recent development of nanomolar FKBP12 inhibitors.10 In the framework of all atoms molecular dynamics with explicit solvent, in the past 2 years we have developed a new alchemical methodology, termed fast switching double annihilation method (FSDAM),11,12 based on the production of a swarm of fast non-equilibrium (NE) trajectories that are initiated from a canonical sampling of the bound state and of the solvated free ligand. The annihilation free energies of the ligand when bound to the receptor and in bulk solvent are Received: March 27, 2018 Published: June 6, 2018 3890

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation

Figure 1. Mcl-1 ligand. The net charges for the ligand are as follows: 4hw3, −1; 5iez, −1; 5mes, 0; 5if4, −1; 5lof, −1.

obtained using an unbiased estimate, firmly based on the Crooks theorem for driven NE processes,13 in the assumption that the observed annihilation work distributions can be described by a mixture of Gaussian components.14 The dissociation free energy, except for a standard state volume correction,11,15 is recovered as the difference between the annihilation free energies of the bound and free ligand. FSDAM has been successfully applied to the determination of the binding free energies in a number of cases, from host− guest16 systems and metal-ion binding,15 to ligand−protein binding.12,17 At variance with the standard alchemical theory, where the annihilation must be conducted quasi-statically (i.e., performing a series of long equilibrium simulations on each of the nonphysical alchemical states defining the ligand annihilation path18−20), in FSDAM, the NE annihilation trajectories are done at fast speed in many concurrent simulations lasting at most a few hundred of picoseconds. This makes this inherently parallel technique much less sensitive than its equilibrium counterparts to the arbitrary choice of the ligand annihilation protocol defining the alchemical path.21 Indeed, the need for the optimization of the so-called thermodynamic length,22 that is, of choosing the alchemical protocol so that the total uncertainty for the transformation is the one which has an equal contribution to the uncertainty across every point along the alchemical path, de facto limits the equilibrium approaches to computations of relative binding free energies, involving only small changes of free energies between ligands belonging to strictly congeneric series.23 In this work we apply FSDAM to the calculation of the absolute binding free energies of the three above outlined tight binding Mcl-1 inhibitors, along with two other much less potent inhibitors (indicated in Figure 1 with the PDB code, 4hw324 and 5iez,9 of the corresponding co-crystal structures). Using this important example, we show how FSDAM can be reliably used to estimate the absolute dissociation free energies of bulky ligands, with a controllable and path insensitive confidence interval straightforwardly provided by the variance of the distributions. We also apply, for the first time, FSDAM to the challenging case of protein−protein interactions described at the full atomistic level, by attempting computation of the dissociation free energy of the Mcl-1 protein with the BH3 domain from the PUMA protein in its α-helical conformation.5 The work is organized as follows. In Methods, we provide the simulation details and system parametrization and briefly describe the FSDAM theoretical background. The ligands have been parametrized using the recently released GAFF2 general force field.25 As an important byproduct, this work constitutes hence one of the first and most challenging tryouts for the

GAFF2 parameter set. In Results we report in detail the outcome of the FSDAM calculations, providing careful analysis on the nature of annihilation work distributions. In the Discussion, we collect all data to form the FSDAM prediction for the standard dissociation free energies of the five ligands and the BH3 domain, making a critical comparison to the experimental observation and evidencing the central role played by the protonation state of the bound ligand and the of so-called ligand reorganization energy26 in the potency of the inhibitors as well as in the tight binding of the BH3-Mcl-1 system. In Conclusions, we highlight the virtues and the challenges of the FSDAM technology foreseeing a possible role of the method for industrial application in drug discovery.



METHODS Simulation Details and Sample Preparations. Hereafter, the Mcl-1 ligands will be indicated using the associated PDB code for the co-crystal structure in lowercase. We shall use uppercase letters when referring to the complex. We adopted the amber99SB27 force field for the protein systems with ildn corrections.28 The amber99SB was used also for the parametrization of BH3 domain in the BH3-PUMA/Mcl-1 complex (PDB code 2ROC). Atomic type assignment and partial atomic charges on the ligands were computed using the PrimaDORAC web interface.29 PrimaDORAC computes the AM1-BCC charges30 on the AM1 optimized geometry31 of the ligand and assigns the atomic types according to the recently released GAFF2 general parametrization for organic molecules.25 The solvent was treated explicitly using the TIP3P model.32 Long-range electrostatic interactions were treated using the smooth particle mesh Ewald method,33 with an α parameter of 0.37 Å−1, a grid spacing in the direct lattice of about 1 Å, and a fourth order B-spline interpolation for the gridded charge array. As needed, the resulting net charge in the preparation of the solvated complexes and free ligands (vide infra) was neutralized using a uniform background plasma of opposite charge.34 Bond constraints were imposed to X−H bonds only, where X is a heavy atom. All other bonds were assumed to be flexible. The pressure was set to 1 atm using a Parrinello−Rahman Lagrangian for orthorhombic boxes35 with isotropic stress tensor while temperature was held constant using three Nosé−Hoover thermostats coupled to the translational degrees of freedom of the systems and to the rotational/internal motions of the solute and of the solvent. The equations of motion were integrated using a multipletime-step r-RESPA scheme36 with a potential subdivision specifically tuned for biomolecular systems in the NPT ensemble.35,37 The long-range cutoff for Lennard-Jones interactions was set to 13 Å in all cases. 3891

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation

species could be deprotonated because of the proximity of K234. Further evidence in support of this hypothesis will be provided in Results, where we show that the N-protonated ligands 5mes and 5lof bind Mcl-1 much less strongly than their N-deprotonated partners. FSDAM Calculation. The theoretical background of FSDAM has been thoroughly described elsewhere.11,12,15−17,42,43 Here we provide the technical details used in implementing this methodology for the Mcl-1 systems. Like its equilibrium counterpart (using either free energy perturbation (FEP) or thermodynamic integration), this alchemical methodology accomplishes the determination of the binding free energy by computing the difference between the decoupling free energies of the ligand in the solvated complex and in bulk solvent. Each of these two independent calculations is in turn done in two steps: (i) a replica exchange simulation with torsional tempering (REST stage) of the fully coupled ligand states, aimed at harvesting canonical (equilibrium) configurations of the systems; (ii) a transformative stage based on NE simulations (FNE stage), whereby the ligand (in the bound state and in bulk), starting from the configurations sampled in the REST stage, is rapidly annihilated and the annihilation work is computed for each NE trajectory obtaining a work distribution. REST Stage: Preparation of the Starting Fully Coupled States. In the REST simulation of the complex, where bound states are sampled, the ligand was prevented from drifting off the binding site by a loose harmonic restraint 1 potential of the kind F = 2 K (r − r0)2 , where r is the distance between the centers of mass of the ligand and of the receptor. The harmonic force constant, K ≃ 0.025 kcal mol−1 Å−2, of this restraint was chosen so as to obtain44 a ligand allowance volume Vr = (2πRT/K)3/2 corresponding to that of the standard state (1661 Å3), while the equilibrium distance r0 was set to the experimental distance between the ligand−receptor centers of masses. As noted by Zhou and Gilson, such choice of the restraint potential is equivalent to setting the concentration of the ligand to a standard value of 1 M.45 The torsional and 1−4 interactions of the whole ligand and of the protein side chains within minimal distance of 4.5 Å (computed from the experimental structures) between any chain atom and any ligand atom were scaled from 1 (target state) up to 0.2 corresponding to the hottest replica with a torsional temperature of 1500 K along a progression of eight replica states using the default ORAC scaling protocol, described in ref 46. This scaling, defining a hot region such as that in FEP-REST,23 reduces by a factor of 5 the height of all involved torsional barriers, hence boosting the conformational kinetics and allowing for a very efficient sampling of alternative poses or ligand/protein conformations in the binding site and of conformational states of the free ligand in solution. Independent quadruplets of such 8-replica REST simulations were launched in parallel, saving at regular time intervals a few hundreds of configurations of the target states in a total time of 24 ns both for the bound state and for the free ligand. As a byproduct, the REST simulations were also used to estimate the binding site volume Vsite (vide infra) from the distribution of the distances between the centers of mass of the ligand and the receptor in the bound state. FNE Stage: Annihilation Free Energies via Fast NE Simulations. In the FNE stage a swarm of fast independent ligand annihilation trajectories were started from the phase-

The preparation of the bound states for the systems was done starting from the experimental PDB Mcl-1 co-crystals (5IF4,9 5MES,7 5LOF,8 4HW3,24 5IEZ,9 and 2ROC5). In the case of the 5MES co-crystal, the Cl atom in the para position on the phenyl moiety of ligand 29 was replaced by a SO2Me group to obtain compound 26 of ref 7. From these PDB files, we first generated the all atoms Mcl-1 protein-only model (no heteroatoms) using the extension “automatic PSF builder” provided by the program vmd.38 Since all experimental measurements were done at physiological pH, we used the default protonation state for ionizable amino acids. The all atoms models of the ligands in the complexes were generated using the OpenBabel39 program on the experimental coordinates excerpted from the PDB files. These structures were fed to the PrimaDORAC program for force field assignment, specifying the ligand net charges as reported in Figure 1. The OpenBabel atom model of the ligand and the PSF generated protein structure were joined together to obtain the starting complex structure conformation. These structures, containing only the protein and the ligand with heavy atoms Cartesian coordinates corresponding to the original PDB coordinates, underwent a fast conjugate gradient minimization in vacuo with loose tolerance (0.01 kJ mol−1) to optimize the hydrogen atoms positions. Each protein−ligand system was then oriented along the inertia frame of the complex, and explicit water molecules at a density of 1 g/cm3 were added in a orthorhombic MD box whose side lengths were computed so that the minimum distance between protein or ligand atoms belonging to neighboring replicas was larger than 15 Å in any direction. After removal of overlapping water molecules, the bound state systems contained about 3000 solvent molecules. For water and box volume equilibration, a preliminary 50 ps constant pressure, constant temperature simulation was run for each of the so prepared solvated complexes. The free ligands in bulk solvent were prepared in a similar manner. MD boxes for ligands 5if4, 5mes, 5lof, 4hw3, and 5iez were all cubic and contained approximately 1000 water molecules. For the 2roc solvated α-helical BH3 domain ligand, the box was orthorhombic and the end-to-end distance between the N terminal atom of GLU1 and the C terminal atom of MET27 was kept around the experimental length of the α-helical BH3 domain using an harmonic potential with force constant K = 10 kcal mol−1 Å−2. In this fashion, FSDAM will attain the binding free energy of the α-helical conformation of the otherwise unstructured5 free BH3 domain. All simulations are done using the program ORAC.17 We conclude this section with some remarks on the protonation state of the bound ligands. For the ligands 4hw3, 5iez, 5if4, and 5lof, the carboxylic group, according to the PDB co-crystal structures, is interacting closely with the positively charged R263 and its deprotonated state in the complex is out of question. For the ligands 5mes and 5lof, there might be doubts regarding the species that actually binds Mcl-1 in physiological conditions. The secondary amminic nitrogen (pKa ≃ 10:11)40 of the ligand in the 5MES complex does not appear to be involved in intramolecular hydrogen bonds and is relatively close to the positively charged R263. These combined effects appear to be more compatible with a binding deprotonated species. In the 5LOF complex, the situation is similar. The (1−4)piperazine moiety of the ligand should be mostly singly protonated in physiological conditions with a pKa ≃ 8.5 only slightly above physiological pH.41 However, also in the 5LOF complex as in 5MES, the binding 3892

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation

Figure 2. Work distributions for the bound state (red curves) and unbound state (black curves). The histograms have been evaluated using 1770 ligand annihilation NE work values for the unbound state and 1000 ligand annihilation NE work values for the unbound state. The Anderson− n 2i − 1 Darling test A2 is defined as A2 = ∑i = 1 n [ln(Φ(wi)) + ln(1 − Φ(wn+1−i))], where Φ is the Gaussian cumulative distribution function with sample mean and variance and wi are the work values sorted in ascending order.

depend on the rate of the NE process, the estimates of eq 1 should not depend on the duration time of the NE process. In practice,14 there is a system dependent lower bound for the NE duration of the process. For NE rates exceeding this threshold, eq 1 no longer applies as the work distributions cease to be normal, typically exhibiting an asymmetric shape with a fatter left tail and negative skewness.11 This asymmetry is common to all processes involving the exit from a funnel-shaped free energy surface, such as in ligand annihilation and in the folding processes.14 As in past FSDAM studies on ligand−receptor or host− guest, the duration of each of NE independent decoupling trajectories adopted in this work is of a few hundreds of picoseconds (from 90 to 720 ps) for both the solvated bound state and the free ligand in bulk solvent. The corresponding annihilation rates, as we show in Results, are in most cases well above the threshold for all ligand−receptor systems except (quite expectedly) for the bound and unbound states of the αhelical BH3 domain where the associated work distributions are a complex mixture of many components, making, in this case, the estimate of the annihilation free energies critical. The annihilation protocol of the τ lasting NE processes was common to all ligands and stipulates that the electrostatic interactions between the ligand and the environment are linearly brought to zero at t = τ/2, while the Lennard-Jones interactions are switched off in the range τ/2 < t < τ using a soft-core Beutler potential47 regularization as λ is approaching zero. Normality of the distribution is of paramount importance in FSDAM as not only it allows the straightforward determination of the annihilation free energies via the unbiased Gaussian estimate eq 1 but also provides an a posteriori assessment of the error based on the knowledge of the mean and variance of the work collection. The mean and variance for normally distributed samples are in fact independent random variables and follow the t-statistics and χ2 distribution, respectively.48 By inverting these distributions, we can get an estimate of the

space points sampled in the REST stage. Note that these NE simulations are performed with the same Hamiltonian of the target state in REST, including the harmonic restraint potential. The total number of independent NE trajectories were 1770 for the decoupling of the ligand in the bound state and 1000 for the decoupling of the ligand in bulk solvent. The alchemical work along the alchemical path was computed as described in ref 42. From the collection of annihilation works, two normalized work histogram probabilities were computed, namely, Pb(W) and Pu(W), referring to the decoupling work distribution for the bound and free ligand, respectively. These distributions may or may not be normal. If they are normal, Crooks theorem13 (CT) implies that the corresponding annihilation free energies can be obtained according to the following unbiased11 estimate: ΔGb / u = ⟨Wb / u⟩ −

1 β σ2 2 b/u

(1)

where the indices b and u refer to the bound or unbound states and where ⟨Wb⟩ and ⟨Wu⟩ are the mean values of the ligand decoupling work in the bound state and in bulk solvent, and σb and σu are the corresponding variances. Equation 1 leads11 to the FSDAM expression for the standard dissociation free energy: ij V yz ΔG0 = ΔGb − ΔGu + RT lnjjj site zzz j V0 z k {

(2)

where V0 is the standard state volume and Vsite is connected to 1 the binding site volume. In eq 1, the σ-related energies 2 βσb / u 2 have a straightforward physical interpretation: they represent the dissipation in the NE process of the annihilation of the ligand in the bound and unbound state. Hence, the wider are the normal distributions, the more dissipative the NE annihilation process is. In principle according to Crooks theorem,13 while the moments ⟨W⟩ and σ of a Gaussian NE work distribution 3893

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation

Table 1. Mean, Variance and Standardized Third (Skewness) and Fourth (Kurtosis) Moments Obtained from the NE Annihilation Work Distributions Reported in Figure 2 μ (kcal mol−1)

liganda

± ± ± ± ± ±

4hw3 (43) 5iez (74) 5mes (110) 5if4 (88) 5lof (93) 2roc (455)

103.50 117.75 63.94 127.97 113.21 533.77

0.09 0.09 0.08 0.09 0.10 0.54

4hw3 (43) 5iez (74) 5mes (110) 5if4 (88) 5lof (93) 2roc (455)

83.31 ± 0.04 95.21 ± 0.07 39.93 ± 0.12 98.48 ± 0.06 83.06 ± 0.08 495.5 ± 0.82

σ2 (kcal2 mol−2) Bound State 12.64 ± 0.43 14.57 ± 0.50 11.30 ± 0.39 13.32 ± 0.46 16.69 ± 0.57 499.7 ± 17.14 Unbound State 1.62 ± 0.07 4.77 ± 0.21 13.97 ± 0.62 3.26 ± 0.15 6.90 ± 0.31 669.9 ± 29.91

μ3/σ3

μ4/(σ4 − 3)

−0.10 −0.12 0.07 −0.11 −0.03 0.34

± ± ± ± ± ±

0.01 0.07 0.09 0.09 0.08 0.09

−0.04 −0.22 −0.14 −0.12 −0.12 0.06

± ± ± ± ± ±

0.22 0.17 0.14 0.16 0.14 0.17

0.29 −0.02 0.23 −0.01 0.08 0.03

± ± ± ± ± ±

0.09 0.10 0.16 0.08 0.08 0.10

−0.05 −0.22 0.25 −0.15 −0.32 −0.07

± ± ± ± ± ±

0.18 0.17 0.40 0.22 0.16 0.22

a In parentheses the number of atoms of the ligand is reported. Errors for μ and σ2 correspond to the confidence intervals (α = 0.05) given in eq 3. The errors for the standardized skewness and kurtosis have been computed by bootstrapping samples of 452 out of 1770 work values for the bound states and 250 out of 1000 work values for the unbound states.

than the critical value at the level α = 0.05. We see that the normality of the work distribution functions occurs in most cases for bound and unbound states, with the exception of 5iez (A2 slightly above the critical value of 0.752 for α = 0.05) for the bound state and of 4hw3 (unbound state) and 2roc (PUMA-BH3 bound state) departing more decisively from normality. These results are confirmed by the calculation of the standardized skewness and kurtosis of the distributions, reported in Table 1. As a matter of fact, while 5iez (bound) distribution is only marginally non-normal, the 4hw3 and 2roc distributions (unbound and bound state, respectively) exhibit a statistically significant positive skewness indicating fatter right side tails. As discussed elsewhere,11,12,20,42 the fast annihilation of a ligand in a matter of a few hundreds of picoseconds is a surprisingly low dissipation process, often producing rather narrow Gaussian-shaped work distributions, even for the annihilation of bulky ligands (100 atoms) as shown in Table 1. In this regard, it has been pointed out20,42 that a “normal” standard deviation σa per atom for the annihilation of a ligand in a duration time of τ = 360 ps in standard conditions in the unbound state is of the order 0.1 kcal mol−1 or less. These values for σa are expected to be even smaller for the annihilation of the ligand in pure solvent. This hypothesis is broadly verified in Table 1, where average σa are about 0.05 (bound state) and 0.03 kcal mol−1 (unbound state) for all ligands. An exception to this trend is seen for the annihilation of 2roc in the unbound state, yielding a σa value of 0.06 kcal mol−1 significantly above the average. Bridging these observations with the DA tests shown in Figure 2 and the moments reported in Table 1, we may state that unusually large σ values are indicative of a work distribution made of a mixture of normal components, connected either to competitive alternate binding poses for the bound state or to different ligand conformations in pure solvent. In such cases, eq 2 is no longer valid and estimates based on Gaussian mixtures11,14 should be adopted. The reliability of the unbiased estimate eq 2 based on the normality assumption can be further verified by examining the 1 potential of mean force ΔGb / u(λ) = ⟨Wb / u(λ)⟩ − 2 βσb2/ u(λ) along the alchemical coordinate 0 < λ < 1, with λ = 0 and λ = 1

confidence intervals, thus providing an overall error for the Gaussian estimator of eq 1 as δ ΔGb / u = zα /2

σb / u nb / u1/2

1 ijj 2 yzz zz β jj 2 jk nb / u z{

1/2

+

σb / u 2 (3)

where nb/u is the size of sample work values and 1 − α is the confidence level, meaning that the true value of ΔG falls inside the given range of eq 3 with probability 1 − α. zα/2 is the abscissa for which the standard normal distribution with μ = 0 2 ∞ α and σ2 = 1 yields 2 = ∫ e−x /2 dx . For example for a zα /2

confidence level of 95% in eq 3, we have α = 0.05 and zα/2 = 1.96. We stress that eq 3 provides the methodological error only. Force field deficiencies can of course be responsible for unknown additional errors.



RESULTS Normality of the Fast Switching Work Distributions. In Figure 2, we report the work distribution functions obtained by annihilating the five synthetic Mcl-1 ligands, 4hw3, 5iez, 5mes, 5if4, and 5lof as well as the BH3-PUMA α-helical domain (2roc) in the bound and unbound states. The annihilation time τ was set to 360 ps for all ligands except for the bulky BH3-PUMA, where τ was 750 ps. The annihilation work distributions appear to be normal in most cases. In general, we observe a wider work distribution for the bound state, indicating that the latter, for τ = 360 ps, is a more dissipative process with respect to NE annihilation in pure solvent. This is due to the heterogeneous nature of the environment surrounding the ligand in the bound state. For the BH3-Mcl1 complex, distributions are exceptionally broad, implying that, even for τ = 750 ps, the annihilations in the bound and unbound states are still extremely dissipative. Normality of the work distributions can be assessed using the Anderson−Darling (AD) quadratic test49−51 with emphasized weight on the tails of the distribution. AD has been recently shown52 to be the most stringent normality test among many popular alternatives including the Kolmogorov− Smirnov and the Wilk−Shapiro tests. The value of A2 for the AD metric is reported in green color in Figure 2 when it is less 3894

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation

Figure 3. Upper panels: PMF for the annihilation of the ligand in bulk (left) and in the complex (right) obtained using different annihilation rates. The insets are an enlarged view of the PMF in the final stages λ ≃ 1 of the alchemical parameter λ. Lower panel: Work distributions computed at various annihilation rates for the ligand in bulk (left) and in the complex (right). The corresponding Gaussian distributions, evaluated using the first two moments of the collection of 480 and 216 FS-NE works (unbound and bound states, respectively), are shown in green color.

representing the fully coupled and fully decoupled states, respectively. We have said that, while ⟨W⟩ and σ are both a function of the rate of the NE annihilation process, the Gaussian estimate at any λ should be independent of the rate of the NE process, provided that this rate does not exceed an unknown system dependent threshold for normality deviation. In particular, for any two NE processes with mean annihilation rates 1/τ1 and 1/τ2 below such threshold and with normal work distributions along the entire λ interval, the PMFs must be identical at all λ and the ratio r=

σ 2(λ , τ2) − σ 2(λ , τ1) ⟨W (λ , τ2)⟩ − ⟨W (λ , τ1)⟩

(4)

is constant and equal to 2kBT. This is illustrated in Figure 3 (upper panels), where we show, as an example, the annihilation free energy ΔG(λ) or PMF for the unbound state and for the bound state of 5if4 for various annihilation times ranging from 90 to 720 ps (unbound state) and from 180 to 720 ps (bound state). In the same Figure 3 (bottom panels), we show the final annihilation work distributions evaluated at λ = 1 for the 5if4 decoupled ligand using various annihilation rates in bulk and in the complex. The corresponding superimposed normal distributions computed using the first two moments ⟨W(1,τ)⟩ and σ(1,τ) are also shown in the figure. In most cases, taking as reference τ1 in eq 4 for the 720 ps process, we obtain a ratio r around 2kBT at all λ so that the Gaussian assumption appears to be fully justified. This is shown for the case of the annihilation of 5if4 in bulk, in Figure 4, where the ratio r has been reported as a function of the alchemical coordinate, for τ2 = 90, τ2 = 180, and τ2 = 360 ps. Note that significant departures from the theoretical value 2kBT are seen for τ2 = 360 ps for λ < 0.8. This is due to the fact that in this case the differences Δσ2(λ)(τ2,τ1) and Δ⟨W(λ)⟩(τ2,τ1) are

Figure 4. Ratio r (eq 4) in kBT units for the annihilation of 5fi4 in bulk solvent. The reference state is provided by the NE process of duration τ1 = 720 ps. The green dotted line corresponds to the theoretical expected value of the ratio if the work distributions are normal for all λ.

small so that errors are larger. These errors are mitigated at the end of the annihilation process, since for λ > 0.8, i.e., when the dispersive−repulsive interaction ligand−solvent are brought to zero, the spread of the distribution gets larger, uncertainty decreases, and the theoretical value is recovered. Figures 3 and 4 represent a spectacular demonstration of the validity of a Crooks theorem based unidirectional estimate, eq 2, and of the Gaussian assumption. As prescribed by the Crooks and Jarzynski theorems,13,53 the invariance of the PMF to changes in the NE annihilation rate is observed at any λ for the annihilation of the ligand in bulk. For the annihilation of the ligand in the complex, the dissipation Wdiss = βσ2b/2, is 3895

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation significantly larger and a bias error54 of B = βσb2/(2nb) may be observed for the fastest annihilation process (τ = 180 ps) and for λ approaching 1 (fully decoupled state), very likely due to undetected left tails if the DA test of the work distribution. As a final remark in this subsection, we may say that further indications of the correctness of the normality hypothesis, and hence of the Gaussian estimate, come from the comparison of the Jarzynski averages ΔGb/u = ⟨exp(−βWb/u⟩, which is exact in any case when n → ∞, and of the Gaussian estimate, which is exact if the distribution is normal. For normally distributed finite samples and if the work distributions are not excessively wide, the two estimators should be close one to another,14 with the Jarzynski average being systematically higher due to the well-known bias error55,56 in the statistical evaluation of exponential averages on finite samples. As discussed in ref 54, given n, the bias error increases with increasing variance, i.e., with increasing departure from reversibility. Table S2 of the Supporting Information, where we report the Jarzynski and the Gaussian estimates for all bound and unbound annihilation free energies, fully confirms (with the only exception being 2roc), these expectations showing in general Jarzynski estimates close to, but systematically higher than, corresponding Gaussian estimates and with discrepancies increasing in the bound state, i.e., when the NE process is more dissipative. Binding Site Volume Calculation. As suggested in refs 20 and 57 in this study we have computed Vsite entering the FSDAM estimate eq 2, from the distributions of the ligand− target COM-COM vector distance. This has been accomplished by numerically evaluating the volume enclosed in a van der Waals-like surface defined by spheres with centers at the sampled COM-COM vector points and radius 0.1 Å. As such, this choice for the elusive Vsite, while apparently reasonable, still remains arbitrary to some extent. Fortunately, the logarithmic dependence of the Vsite in eq 2 significantly tames the error of the volume correction. In Table 2, we report

the volumes obtained by monitoring, in a total time of 24 ns, the ligand−target vector COM-COM distance in the equilibrium REST stage for the bound states. The measured binding site volumes imply a mean oscillation radius of the ligand COM in the biding site between 1.5 and 2.5 Å, at most spanning a solid angle of a few degrees. These volumes yield a binding site correction ranging from 2.4 to 3.0 kcal mol−1. The binding site volume determination is undoubtedly the weak point of the FSDAM approach. However, the binding site volume determination is the rather undervalued weak point of any computational approach based on the definition of “bound state”.58 These include also methodologies aimed at the determination of relative binding free energy which all implicitly (and arbitrarily) assume a constancy of the binding site volume upon transmutation of the bound ligand into another bound parent compound. In FSDAM theory, the volume represents the probability of hitting the binding site in the hypothetical inverse fast growth process, whereby the decoupled ligand is switched on in a random place in the MD box volume or in the restraint volume Vref set by the COMCOM harmonic constraint. This unknown probability defines the binding site volume as p = Vsite/Vref and, for fast growth process, should be weakly dependent on the rate of the NE process, so long as τ ≪ τon, where τon = Vref/kon is the average mean time for the ligand to bind the target at the concentration 1/Vref. Finite Size Correction for Charged System. For the charged system, in the PME Ewald treatment one must add to the electrostatic energy a term corresponding to the direct space interaction of the excess charge of the system with a uniform neutralizing background plasma. Such a term is given by34 −|∑iqi|2π/(2α2V), where V is the volume of the box, the sum is extended to all charges in the box, and α is the Ewald convergence parameter. If the ligand bears a net charge, a finite size correction must be added to account for the change in the free energy related to the annihilation of the net charge. This correction can be computed as

Table 2. Volumes Obtained from the COM-COM Distance Vector during the REST Equilibrium Stage for the Bound State and Corresponding Volume Correction Termsa ligand 4hw3 5iez 5if4 5lof 5mes 2roc

Vsite (Å3) 12.5 18.9 11.6 16.9 26.9 30.8

± ± ± ± ± ±

6.1 9.5 5.8 8.5 13.5 15.4

ΔGfs = −

kBT ln(Vsite/V0) (kcal mol−1) −2.9 −2.7 −3.0 −2.7 −2.5 −2.4

± ± ± ± ± ±

0.3 0.3 0.3 0.3 0.3 0.3

l [Q 2 − (Q + Q )2 ] o Q L2 | π o o o R R L − m } 2o o ( b ) ( u ) o 2α o V V BOX BOX ~ n

(5)

where QR and QL are the net charge on the receptor and ligand, respectively, and V(b/u) BOX are the MD box volumes of the bound and unbound states. Equation 5 applies, in principle, to 4hw3, 5iez, 5if4, 5lof, and 2roc. In practice, this correction is small (below 1 kcal in all cases; see Table S3 in the Supporting Information) and is neglected altogether in this study.

a

The errors on the volumes are arbitrarily set to 50%.

Table 3. Experimental (ΔGexp) and Calculated (ΔGcalc) Standard Dissociation Free Energies for the Mcl-1 Ligandsa ligand

ΔGexp

4hw3 5iez 5if4 5lof 5mes 2roc

8.9124 10.59 12.89 13.48 >11.67 12.05

ΔGcalc 8.1 11.6 18.0 17.1 18.1 46.1

± ± ± ± ± ±

0.9 1.2 1.1 1.2 1.2 10.9

ΔGb − ΔGu

est

± ± ± ± ± ±

G G G G G J

11.0 14.3 21.0 21.9 26.2 48.5

0.6 0.9 0.7 1.1 1.2 10.5

ΔGvol −2.9 −2.7 −3.0 −2.7 −2.5 −2.4

± ± ± ± ± ±

0.3 0.3 0.3 0.3 0.3 0.3

ΔGpKa

−2.1b −5.6c

a The entries in the “est” column indicate the estimate used for evaluating the annihilation free energies ΔGb/u with “G” and “J” standing for Gaussian and Jarzynski, respectively. ΔGvol is the volume contribution and ΔGpKa is the correction due to the protonation state of the bound ligand. All reported data are in kilocalories per mole. bΔGpKa has been computed using pKa = 8.5 for 5lof.41. cΔGpKa has been computed using pKa = 11 for 5mes.40.

3896

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation



DISCUSSION We now use the results reported in Tables 1 and 2, by applying eqs 1 and 2 in order to obtain the FSDAM standard dissociation free energies. We use the Gaussian estimate for all ligands, except for 2roc. We use the Gaussian estimate also for the bound state of 5iez, only slightly above the DA critical value, but with a correlation coefficient in the quantile− quantile plot of 0.9993 (see Figures S3 and S4 in the Supporting Information), and for the unbound state of 4hw3. The non-normality of the 4hw3 unbound distribution, which appears nonetheless very narrow, lies in the fatter right tail marked by an overall moderate positive skewness (see Table 1). Fatter right tails, while having a small impact on the variance and hence on the Gaussian estimate, are far less dangerous than fatter left tails. This is best rationalized in terms of the Jarzynski average ΔGb/u = ⟨exp(−βWb/u⟩, which depends crucially on the work values sampled in the left tail and is totally insensible to high values of the work.56 In the case of 2roc, the huge variance of the bound and unbound distributions makes the Gaussian estimate in essence unreliable (see the error on the associated variance reported in Table 1), and we hence apply the bare Jarzynski average, ΔGb/u = ⟨exp(−βWb/u⟩, in the expectation that the unavoidable bias error54 will cancel in part when evaluating the difference ΔG = ΔGb − ΔGu. In Table 3, we report the computed and experimental dissociation free energies for all six ligands. Before proceeding in the discussion, a few words are in order concerning the estimates of the dissociation free energies for 5mes and 5lof. As discussed in Methods, the ligands 5lof and 5mes in their bound state were assumed to be deprotonated on the piperazine nitrogen atoms (5lof) and on the amminic nitrogen (5mes). To gain further support for this hypothesis, we have repeated the computation for 5lof and 5mes, both in the bound and unbound states, using a ligand parametrization with the protonated amminic nitrogen in 5mes and protonated N(4) on the piperazine ring in 5lof. The protocol for the REST and FNE stage for these two additional computations was identical to that discussed in Methods. We found that Nprotonation strongly perturbs the bound state both in 5mes and 5lof, thus decisively pointing to a significant weakening of binding. Results of these additional calculations are collected and discussed in the Supporting Information. When computing the standard dissociation free energies for N-deprotonated 5mes and 5lof reported in Table 3, we evaluate by means of eq 2 the annihilation free energies of the N-deprotonated species in both the bound and unbound states, hence referring to the chemical equilibrium LR → L + R where L indicates the ligands 5mes and 5lof in their N-deprotonated states and R the Mcl-1 receptor, and with the standard dissociation free energy of the L species being related to the (L) dissociation constant as K(L) d = exp(−βΔG0 ). In solution, at physiological pH, both ligand 5mes and 5lof should actually be N-protonated. The inhibition constant is determined in both cases7,8 using competitive binding assays by adding aliquots of ligand solution to a pH buffered solution containing Mcl-1 at fixed known concentration and by determining the IC50, i.e., the ligand concentration yielding half inhibition of the receptor. As shown in the Supporting Information, if the protonation state of the binding species is different from that in solution, then the IC50 at pH = 7 should be related to the true Kd as IC50 ≃ Kd[1 + 10−(pH−pKa)], where pKa is referred to the amminic nitrogen. It then follows that the apparent

(experimental) dissociation constant at physiological pH (L) should be computed as ΔGexp = ΔG(L) − RT ln[1 + −(pH−pKa) ]. 10 Going back to Table 3, we notice that the computed dissociation free energies for the five synthetic ligands are in fair agreement with the experimental measurements. Results are indeed beyond initial expectations, given the size and complexity of the ligands and considering that the GAFF2 force field parametrization for the ligands was obtained directly using the PrimaDORAC29 assignment, with no further adjustment. FSDAM calculation correctly predicts 4hw3 as the weaker binder, with a computed value almost matching the experimental counterpart, and correctly predicts 5iez as a much less potent binder than 5lof, 5mes, and 5if4. For all of these three ligands, FSDAM appears to overestimate significantly the dissociation free energy. However, we should stress that, for 5mes, the experimental inhibition constant was found to be “less” than 3 nm, hence providing only a lower bound for the corresponding dissociation free energy. For tight binding inhibitors, the underestimation of the standard dissociation free energy is a very likely event in an ordinary fluorescence assay for the determination of IC50. In competitive binding assays such as those described in ref 7 (5mes), in ref 8 (5lof), and in ref 9 (5if4), the added inhibitor L shifts the labeled substrate S (usually a FITC-labeled BH3 peptide) from the enzyme Mcl-1. The IC 50 value is then given by 1 IC50 = K iapp + 2 RT with RT being the total concentration of the Mcl-1 receptor in the assay. The apparent inhibition constant is in turn given by Kapp = Ki/(1 + [S]/KM), where [S] i is the substrate concentration at half inhibition and KM is the known substrate−enzyme dissociation constant. The procedure for obtaining the Ki from the IC50 and the experimental setup is provided in ref 59 and available online on the very resilient and popular Web site60 entitled “The Ki Calculator for Fluorescence-Based Competitive Binding Assays”. For tight binding inhibitors such as 5if4, 5mes, and 5lof, the theoretical limit60,61 of the measured IC50 is given by RT/2 regardless of the Ki value. In standard experimental setups, this regime, the so-called “zone 3” or “zone C” in competitive inhibition,61,62 occurs whenever Ki ≪ RT. Very likely, this regime was reached in ref 7 (5mes), in ref 8 (5lof), and in ref 9 (5if4), where RT/2 in the experimental setup was in all cases significantly above the measured Ki. The true values of the dissociation free energies for these three ligands may hence be closer to the FSDAM computed values. In Figure 5 we report the PMF along the alchemical coordinate for the five synthetic ligands for which normality was assumed. At λ = 1, these PMFs yield the dissociation free energy apart from the volume and pH correction. Hence these functions provide fundamental information on the nature of binding. We can see that, with the exception of 5mes, the electrostatic interactions contribute negatively to binding, implying that these interactions stabilize more the ligand when surrounded by solvent with respect to the ligand in the binding site. For the relatively weak binders 4hw3 and 5iez, we note that the electrostatic contribution is particularly negative (≃6 kcal mol−1 in both cases). In all cases a decisive reversal in favor of binding is obtained only when the Lennard-Jones parameters are switched off, with a gain in binding free energy ranging from a minimum of ≃15.4 (4hw3) to a maximum of ≃22.5 kcal mol−1 (5mes). Following the functions of Figure 5 from the right to left (i.e., in the sense of the switch on of the 3897

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation

As discussed in refs 20 and 58, the entropy in the free energy balance of a binding process bears the dependence on the standard state and can be interpreted as a volume loss of the ligand upon binding, for volume of any kind, translationally, rotationally, and conformationally. As such, the entropy always contributes negatively to binding. Unlike the rototranslational contribution to volume loss, the conformational part has in general a very disparate and interesting behavior in synthetic drug molecules, connected to their intrinsic flexibility and to the so-called reorganization energy.20,26 Here we qualitatively assess the conformational contribution by evaluating the entropy loss in going from the accessible conformational space of the ligands in solution compared to the limited conformational space of the ligand in the bound state. This has been accomplished by performing an analysis of the ligand conformations in bulk and in the bound state according to the quality threshold clustering (QTC)63 (see the Supporting Information for more details) and defining an entropy index as :(N , Q ) = ∑i pi ln pi , where pi is the population of the ith conformational cluster. The quantity RT ln(pBC), where pBC is the probability of the binding conformation in bulk, can be related to the conformational contribution to the dissociation free energy. Such contribution is zero when the ligand has the same conformation in bulk and in the bound state and is negative otherwise. Results of this analysis are collected in Table 4. Ball and stick representative structures of the most populated cluster for the bound state and in bulk solution are reported for all five synthetic ligands in the Supporting Information. Quite surprisingly, the conformational entropy index reported in Table 4 appears to be negatively correlated to the binding free energy. The micromolar inhibitor 4hw3 exhibits minimum conformational entropy loss upon binding, with the binding conformation (BC) falling in the corresponding most populated structural cluster in bulk. On the other hand, the sub-nanomolar 5mes and 5lof binders are characterized in bulk solution by a large accessible conformational space connected to the rotatable bonds involved in the chloro-phenyl-phenethyl and phenyl-sulfone moieties in 5mes

Figure 5. Difference between the PMF of the ligands in the complex and in bulk along the alchemical coordinate. The final value ΔPMF(λ=1) represent dissociation free energy (except for a standard state volume and pH correction).

ligand), we see that the maximum affinity is reached when the ligand Lennard-Jones potential has been fully turned on and the charged on the ligand are still close to zero (with the only exception of 5mes), implying that the binding to Mcl-1 is driven mostly by hydrophobic forces. These observations can be quite straightforwardly rationalized by inspection of Figure 6 as well as 1 and of Figure S1 and Table S1 in the Supporting Information where the number and the nature of the ligand−protein contacts are reported. In general, we may roughly say that the electrostatic contribution is essentially related to the stability of the interaction between the positive ARG263 in Mcl-1 and the electron-rich ligand counterpart (the sulfone moiety in 5mes and the carboxy group in all other cases), while the Lennard-Jones contributions appears essentially related to the size of the ligand. For all five synthetic ligands, the nature of the binding appears to be driven mostly by the hydrophobic interactions.

Figure 6. Ligand-Mcl-1 atom−atom contacts as a function of the residue number for the five synthetic inhibitors. The y-scale corresponds to the computed mean number of contacts between any atom of the ligand and any atom of the side chain observed in the REST stage of the bound state. 3898

DOI: 10.1021/acs.jctc.8b00305 J. Chem. Theory Comput. 2018, 14, 3890−3902

Article

Journal of Chemical Theory and Computation Table 4. Conformational Analysis in Bulk Solutiona ligand

:(Q , N )

bound cluster

probability

4hw3 5iez 5if4 5lof 5mes

0.807851 3.34395 2.85467 4.53642 5.04099

1 1 3 none none

0.44 0.07 0.17