A Template Based Method for Conformation Generation and Scoring

1 day ago - In this paper, we report a method that uses the 3D conformation of a lead compound in complex with a protein as template to generate ...
0 downloads 0 Views 1MB Size
Subscriber access provided by UNIV AUTONOMA DE COAHUILA UADEC

Computational Chemistry

A Template Based Method for Conformation Generation and Scoring for Congeneric Series of Ligands E. Prabhu Raman J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.9b00032 • Publication Date (Web): 02 May 2019 Downloaded from http://pubs.acs.org on May 2, 2019

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

A Template Based Method for Conformation Generation and Scoring for Congeneric Series of Ligands

E. Prabhu Raman BIOVIA, Dassault Systemes, 5005 Wateridge Vista Drive, San Diego, CA 92121

Corresponding author: [email protected] ABSTRACT Physics based prediction of protein-ligand binding affinities for congeneric series of ligands in lead optimization requires their geometries as a first step. In this paper, we report a method that uses the 3D conformation of a lead compound in complex with a protein as template to generate conformations of a series of related analog compounds. The method uses the Maximal Common Substructure (MCS) computed between lead and analog ligands to assign coordinates for the atoms shared between the ligands. For the differing atoms, a conformation generation procedure is implemented that results in a diversity of conformations. Thus generated conformations are sorted using a score based on Molecular Mechanics and Generalized Born with Solvent Accessible Surface Area contribution (MMGBSA) method. The accuracy of the generated conformations is tested retrospectively using a cross-validation approach applied to four datasets obtained from the Drug Design Data Resource (D3R) by measuring the RMSD of the top scored conformation with respect to the crystallographic pose. The scoring ability of the method is independently assessed using data for the same protein targets to test the rank ordering ability and separating active and inactive ligands. We tested the effect of protein flexibility during structural optimization, and scoring approaches with and without strain energies. Retrospective validation on datasets comprising 4 targets show that the method outperforms random selection for all targets and outperforms a molecular weight based null model in 3 out of 4 targets in separating active and inactive compounds. Therefore, the presented method is expected to be of utility in lead optimization for rapidly screening analog ligands and generating initial conformations for use in more detailed physics-based binding affinity prediction methods.

INTRODUCTION Small molecule Lead Optimization (LO) is the most expensive and time-consuming of the different stages involved in preclinical drug discovery and design.1 LO involves chemically

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

modifying a lead compound to improve binding affinity and a host of other drug-like properties including specificity, bioavailability, solubility and ADMET properties. While affinity is just one of the many properties, its accurate and efficient estimation in response to a changing chemotype provides opportunities to optimize the other properties while monitoring affinity. Towards this goal, physics-based methods have seen increasing adoption over the past decade particularly for binding affinity optimization.2, 3 Unlike knowledge based methods, physics-based methods do not require target-specific experimental training data and offer physical insights that could be used for rational optimization. The chemical space exploration needed to achieve the desired improvements in drug-like properties may require screening a large number of compounds. Thus, it is desirable to have efficient methods that can screen tens to hundreds of thousands of ligands. Among physics-based methods, free energy methods3, 4 offer an accurate route to calculating relative binding affinities, which is desired in LO, but the computational cost of tens of nanoseconds of MD simulation is formidable for large sets of compounds, even with the current advanced state of CPU and GPU computing. It is thus desirable to have methods that can screen large numbers of analogs in the early stages of LO to allow for a relatively larger exploration of the chemical space around the lead compound. A smaller list of compounds shortlisted this way using an approximate method could subsequently be subject to more accurate but time-consuming free energy methods for prioritization with higher accuracy. To enable such a two-step physics-based workflow for congeneric virtual screening, there is another requirement, which is described as follows. For computation intensive free energy methods, it is critical that the initial protein-ligand complex geometry be accurate. This is especially the case for relative binding free energy calculations involving a few tens of nanoseconds of sampling or less, which assume that the overall binding mode of the ligands involved does not change significantly. This assumption enables shorter simulation times, and when true can result in accurate predictions. However, it requires that the starting binding mode be similar or kinetically accessible to the thermodynamically important state. One approach to obtaining initial geometries is docking, which is suited to the problem as it often identifies the correct binding mode,5, 6 and is automatable for highthroughput applications. However, as observed in recently conducted Drug Design Data Resource (D3R) open docking challenges, it is not very uncommon for docking methods to fail in identifying the crystallographic pose as the top scoring pose among many generated for each ligand.5, 6 This poses a challenge to obtaining predictive results. For LO applications that involve screening of congeneric series involving an invariant scaffold, one possible alternative approach to obtaining initial geometries is to use the crystallographic binding mode of the lead compound where available as a template. In this paper, we describe such a method, which we term Generate Analog Conformations (GAC). Our approach detailed in this paper circumvents the docking problem and uses information from experimentally known binding mode of one ligand in the series to generate 3D conformations of analogous ligands. The method detects the set of common atoms between lead and analog ligands using the Maximal Common Substructure (MCS) method.7 The

ACS Paragon Plus Environment

Page 2 of 30

Page 3 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

coordinates for the common set in the analog ligand are directly assigned from the lead ligand; for the different set a limited conformational search is implemented (see below). Such a minimalistic approach was chosen keeping in mind small chemical changes that characterize congeneric series. While being template based, our approach is distinct from other template based methods that use 3D structural alignment to generate geometries8, 9 that have shown promise in recent D3R challenges in generating accurate geometries.5, 6 We validate the accuracy of the generated conformations by comparing them with crystallographic geometries in cross-docking calculations using data from D3R. Then, using the binding score that is obtained as a byproduct of the conformation sorting procedure, we test the ability of the method to rank order compounds and separate active ligands from inactive ones in congeneric sub-datasets constructed from datasets collected from D3R. The results show that the template based conformation generation method is sufficiently accurate when 65% or more of the analog ligand is shared with the lead. Secondly, affinity scoring applied to the generated poses outperforms random selection and Molecular Weight (MW) based null model in separating active compounds from inactive ones, and therefore be of utility in lead optimization. METHODS Protein structure preparation: Protein structures were downloaded from the Protein Data Bank (PDB).10 Crystallographic waters and co-solvents were removed, and the ligand was retained. The structure was then prepared using the “Prepare Protein” protocol in BIOVIA Discovery Studio.11 Missing loop regions of less than or equal to 8 residues were initially built using MODELER12 and further refined using the LOOPER method.13 Longer loops were not built; the disconnected residues in such loops were capped with neutral patches. For atoms with multiple occupancy, the first one was chosen; this arbitrary choice ended up having an effect on the interpretation of results which is discussed later in this paper. The protein preparation protocol assigns amino acid protonation states at pH of 7.4 using Generalized Born electrostatics model, assuming a protein dielectric constant of 10 and ionic strength of 0.145 M, as detailed previously.14 Upon preparation, for each dataset, all protein structures were aligned using the “Superimpose Proteins” tool in Discovery Studio. Ligand preparation: Ligands obtained from D3R were in SMILES format. A “rule based” method was used to assign a single protonation state for the ligands, followed by assignment of 3D coordinates. And although the presented method is capable of handling molecules with chiral centers, chiral ligands were not included in order to keep comparison with experimental affinity data unambiguous. The “Prepare Ligands” protocol in Discovery Studio was used to accomplish this task. Analog structure generation: In a lead optimization project, there typically are one or more protein-ligand co-crystal structures available. We term the ligands in these structures as “lead” in the following discussion. The goal of the analog structure generation step is to obtain the 3D conformations of ligands similar to the lead compounds, which we refer to as “analogs”. For each analog compound, the structure generation method involves six steps:

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 30

(1) MCS generation – The maximal common substructure (MCS) method is used to obtain the common substructure between the analog and each of the lead compounds. The details of the MCS algorithm are described elsewhere.7 Hydrogens are removed prior to MCS computation. The overlap between the analog and lead compound is quantified by the fraction of common atoms defined as 𝑁𝑀𝐶𝑆

𝐹𝐶 = 𝑁𝑎𝑛𝑎𝑙𝑜𝑔

(1)

Where 𝑁𝑀𝐶𝑆 and 𝑁𝑎𝑛𝑎𝑙𝑜𝑔 are the number of heavy atoms in the MCS and the analog ligand, respectively. Thus, this metric quantifies the fraction of the analog ligand that exactly matches the lead. For the subsequent steps, the lead compound with the largest 𝐹𝐶 is selected as that would provide the best possible template for conformation generation. The MCS also contains the atom-atom mapping between the lead and the analog. At this stage, if the MCS is found to contain atoms that differ in chirality between the chosen lead and the analog, a minimal number of smallest possible groups are removed from the mapping that would enable the structure of the generated analog to have the correct chirality. (2) Assign coordinates of common atoms – The coordinates of atoms in the common set are assigned exactly as the corresponding lead atom coordinates. (3) Generate coordinates of different atoms – The coordinates of atoms that do not belong to the MCS are generated using internal coordinates. Internal coordinates (IC) for each such atom are calculated from a single 3D conformation of the analog that is independently generated. At this stage, the geometry of the analog compound is fully obtained. For certain molecules involving rings, it is possible that the IC based conformation generation may result in incorrect chirality; such molecules are rejected by obtaining the chirality for each stereo center in the molecule and comparing it against the same from the aforementioned newly generated 3D conformation. (4) Selective diverse conformation generation – To introduce diversity in the coordinates of non-MCS atoms, the generated conformation is subjected to a “Diverse Conformation Generator” method that uses the CAESAR algorithm,15 with the core MCS atoms being constrained. Constraints on MCS atoms within 2-bonds connectivity distance from any non-MCS atoms are removed to allow for additional conformational flexibility. The constraints ensure that the binding mode of the MCS region is preserved. A maximum of 15 conformations are generated. A diverse subset is selected from the generated conformations such that no two conformations have pairwise RMSD less than 1Å, which is computed only for the non-constrained atoms. After this step, Hydrogens are added to the molecule. The protein structure was not used in this step. (5) Binding Energy based ranking – The conformations are merged with the protein and a binding energy is calculated for each using the MM-GBSA method which uses the Generalized Born (GB) module16 implemented in CHARMm.17 Conformations are sorted in the increasing order of binding energy, from which the top 10 are chosen. Since the goal of this step is to efficiently screen out high energy conformations, the faster GB variant that does not involve the molecular volume term was used. Other details of the GB calculation used in this step are the same as described in the End point binding energy scoring section below. (6) In-situ minimization and re-ranking – The 10 selected conformations are subject to an in-situ energy minimization with a distance-dependent dielectric to treat electrostatic

ACS Paragon Plus Environment

Page 5 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

interactions. In one variant of the method, the protein is held rigid, whereas in another the binding site residues are unconstrained. The flexible binding site residues are chosen based on proximity to the lead compound in the initial geometry; residues with any atom within 3Å of the lead compound are designated as flexible. This way the same set of atoms is chosen to be flexible for all analog ligands, so as to not have the trends in computed binding energies be influenced by the choice of the flexible region. The minimized structure is subject to a single point MM-GBSA calculation that uses the more accurate GBMV module13,14 implemented in CHARMm.16-19 Details of the GBMV parameters used are described in the End point binding energy scoring section below. For each conformation, the total energy is calculated for the protein-ligand complex, the free protein and the free ligand. The scoring approach is described next. The combined structure generation and scoring protocol is implemented in BIOVIA Discovery Studio as a protocol named “Generate Analog Conformations”. End point binding energy scoring: The energy components computed in the previous step are combined using the following two expressions to obtain the final binding energy score. In one approach, a binding energy is calculated for each conformation and the minimum of the binding energy over all conformations is termed as the binding energy score (BE). 𝐵𝐸 = 𝑚𝑖𝑛𝑟{𝐸𝑃𝐿(𝒓) ― 𝐸𝑃(𝒓) ― 𝐸𝐿(𝒓)}

(2)

Where, 𝐸𝑋(𝒓) is the total energy of a system X in conformation r. X can be either the protein-ligand complex PL, the free protein P or the free ligand L. The Energy of each molecule is a sum of intra-molecular terms, the Generalized Born energy and the solventaccessible surface area energy computed as follows. 𝐸(𝒓) = 𝐸𝑀𝑀(𝒓) + 𝐸𝐺𝐵(𝒓) + 𝐸𝑆𝐴(𝒓)

(3)

Non-bonded interactions were switched off smoothly in the range 10−12Å. The key parameters used in the Generalized Born Molecular Volume (GBMV) solvation energy calculation are as follows. The dielectric constant used for the solute and solvent were 2 and 80, respectively. The van der Waals radii were used as input for computing SASA and molecular volume. The salt concentration was set to 0 mM. The term corresponding to the solvent accessible surface area (SASA) is computed as follows. 𝐸𝑆𝐴(𝒓) = 𝛼 𝑆𝐴𝑆𝐴 + 𝛽 (4) where α = 0.00542 kcal mol-1 Å-2 and β = 0.92 kcal mol-1. An alternative approach was tested for scoring that was aimed at accounting for reorganization energy change upon binding. In this approach, the minimum energy among the different conformations was calculated separately for each molecule P, L and PL. Accordingly, the Rescored Binding Energy (RBE) is calculated as

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

𝑅𝐵𝐸 = 𝑚𝑖𝑛𝑟{𝐸𝑃𝐿(𝒓)} ― 𝑚𝑖𝑛𝑟{𝐸𝑃(𝒓)} ― 𝑚𝑖𝑛𝑟{𝐸𝐿(𝒓)}

Page 6 of 30

(5)

By choosing the minimum energy in each ensemble, RBE does not involve cancelling of the intra molecular energies as in BE and therefore is aimed at including the reorganization energy in the free P and L states. The approach described in Eqn. 2 above is implemented in BIOVIA Discovery Studio as a protocol named “Generate Analog Conformations”. Validation metrics: Root mean square deviation (RMSD) was used to monitor the structural deviation between the predicted and crystallographic geometries. The RMSD was computed over all heavy atoms including both the common and different atoms, and accounted for topological symmetry. For validation of scoring capability, the Area Under Curve (AUC) of the Receiver Operator Characteristic (ROC) curve was calculated. In addition, Predictive Index (PI) was computed to judge the directional correctness quantitatively. PI is defined as follows.20 PI =

∑j > i∑iwijCij

(6)

∑j > i∑iwij

Where, the sum in the numerator and denominator is over all unique pairs of ligands (i,j) in a dataset. Each pair has an associated weight defined as (7) wij = |E(j) ― E(i)| E(i) and P(i) correspond to experimental and predicted binding affinities of ligand i, respectively. The directional correctness for each pair of ligands (i,j) is computed as [E(j) ― E(i)] 1, if[P(j) ― P(i)] > 0 (8) Cij = ―1, if[E(j) ― E(i)] < 0

{

[P(j) ― P(i)]

}

0 , if [P(j) ― P(i)] = 0

PI quantifies the directional correctness of predictions in a dataset weighted by the difference in experimental values, and ranges from -1 to 1. Docking: Two docking programs namely LibDock21 and CDOCKER22 were used to compare against the performance of the presented method. LibDock uses the alignment of physicochemical features computed for protein binding sites (“hotspots”) with the corresponding properties of ligands groups to guide docking. By contrast the program CDOCKER uses a force-field representation and molecular dynamics (MD) based simulated annealing method to obtain docked poses. Default parameters as implemented in BIOVIA Discovery Studio were used for the docking calculations. For LibDock, key parameters used are as follows. A total of 100 polar and apolar hotspots were calculated for the protein. A RMSD tolerance of 0.25Å was used to select structurally distinct conformations. A maximum of 1000 starting conformations were used for posing. For CDOCKER, key parameters used are as follows. 10 random initial conformations were generated using high temperature (1000K) MD. For each conformation, 10 random rotations were performed for posing. The poses were subject to 2000 steps of heating MD to a target temperature of 700K, followed by simulated annealing MD cooling over 5000

ACS Paragon Plus Environment

Page 7 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

steps to a target temperature of 300K. The binding site was defined as a sphere that encompasses all atoms of the ligand that the protein was co-crystallized with. For a given cross-docking calculation, a single top ranked pose was obtained for each method and its RMSD computed against the corresponding crystal geometry. RESULTS AND DISCUSSION Dataset: Our primary goal in this study was to test the ability of the Generate Analog Conformations (GAC) method in generating geometries of congeneric series of ligands given the geometry of one member in the series as a template. A secondary goal was to test the scoring ability of the method to reproduce experimental trends in binding affinity in congeneric sets. To achieve this two-fold validation goal, we selected 4 datasets (Table 1) with structure and binding affinity data that are publicly available in the Drug Design Data Resource (D3R, https://drugdesigndata.org), which is a result of a community-wide data sharing initiative involving industrial and academic groups.5, 6 This resource also contains datasets collected through the Community Structure Activity Resource (CSAR) initiative.23 These datasets contain structure and affinity data for a large number of ligands, some of which fit the description of “congeneric” sets such that the ligands involve variations of groups on a common scaffold as typically is the case in LO. Our choice of the datasets among those available was motivated by the availability of a number of crystal structures for congeneric ligands to provide sufficient data to validate the structure generation method. Another factor was the availability of both active and inactive compounds in the dataset so as to test the enrichment ability of the scoring method. In particular, we chose 4 datasets that had at least 30 ligands and more than 6 inactive compounds. For all sets except HSP90, inactive compounds were already identified in the datasets. For HSP90, an affinity cutoff of -6 kcal/mol was chosen to identify inactive molecules. Table 1. Dataset used for retrospective validation. The protein target, number of ligands in the dataset, number of active ligands, source of data and the complexes with solved crystal structures are listed. The number in parentheses in the second column indicates the total number of ligands in the dataset, some of which were not processed by the method due to lower than 50% similarity with any of the template ligands. Protein Ligands Active Source PDB IDs CDK2 33 (101) 24 CSAR 4EK4 4EK5 4FKG 4FKI 4EK6 4FKL 4EK8 3SW4 3SW7 4FKO 4FKP 4FKQ 4FKR 4FKS 4FKT 4FKU 4FKV 4FKW Urokinase 36 (39) 28 Abbott 4FUD 4FUE 4FU7 4FU8 4FU9 4FUB 4FUC CHK1 95 (95) 69 Abbott 4FSM 4FSW 4FT5 4FSN 4FSZ 4FT7 4FT0 4FT9 4FTA 4FTC 4FSQ 4FSR 4FST HSP90 148 (172) 113 AbbVie 4YKR 4YKQ 4YKT 4YKW 4YKX 4YKZ 4YKY 4YKU

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Analog structure generation: The primary goal of our study was to validate the presented template based protein-ligand 3D structure generation method in the context of lead optimization. Since the subsequent scoring for binding affinity involves many approximations, it is important to assess the accuracy of the generated conformations independent of binding affinities. To this end, we tested our method’s ability to reproduce crystallographic geometries of protein-ligand complexes. A leave-one-out cross docking approach was undertaken where, for each target one protein structure and the cognate lead ligand were used as a template to generate the geometries of the remainder of the ligands, referred to here as “test ligands”. The RMSD of the top scoring geometry for these test ligands was computed with respect to their crystallographic geometries, which were obtained from the alignment of all protein structures described above. For a protein with M crystal structures, this test repeated M times would result in a maximum of M x (M-1) RMSD values. Some tests involved template ligands that belonged to a different chemical series than the test ligands where the fraction of common atoms, Fc would be less than 0.5, thus making those ligands outside the scope of the method. A combined total of 177 RMSD values were obtained from the cross docking tests performed for the four proteins. We tested the sensitivity of the results for two variations of the method: (i) Rigid vs. flexible local residues, and (ii) BE and RBE used for scoring (see the Methods section). Keeping the protein rigid during in-situ minimization (step 6 of the analog structure generation algorithm) results in a significant speed increase for the calculations, but this approach would not model induced-fit effects. On the other hand, a structure minimization would not cross conformational barriers that may exist between relevant conformational states involved in induced fit binding. Therefore, the benefit of including protein flexibility especially in ranking congeneric series of ligands is not a-priori apparent, which motivated our testing of its effect on the datasets. Secondly, in many applications of MM-GBSA based end-point scoring, the unbound ensemble is not modeled separately, which is motivated by the prospect of benefiting from error cancellation. In practice this is implemented either using Eqn. 2 or a similar approach involving computing averages24 instead of choosing the minimum energy. In such approaches the conformational ensemble obtained in the complexed state serves as the only source of conformations, which results in the exclusion of any changes in intramolecular bonded energies during binding. In this study, we modeled this effect not by explicitly modeling unbound states, but by choosing the minimum energy conformation out of many in the bound state, which is implemented in Eqn. 6. The method variants resulted in a total of 4 combinations: Rigid-BE, Rigid-RBE, Flex-BE, and Flex-RBE, where Rigid/Flex refers to exclusion/inclusion of local protein flexibility, and BE/RBE refers to the scoring approach (see the Methods section). For the four datasets combined, the number of predictions with RMSD > 2Å were 45, 45, 47, and 43 out of 177 for Rigid-BE, Rigid-RBE, Flex-BE, and Flex-RBE, respectively. For RMSD > 3Å, the corresponding numbers were 25, 27, 26, and 24. Thus, the prediction quality of the four different methods is similar.

ACS Paragon Plus Environment

Page 8 of 30

Page 9 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Figure 1. Probability distribution of RMSD computed for the top scoring conformation with respect to crystal. The three sets of results correspond to restricting the dataset to different values of similarity between the template ligand and the analog ligand, quantified by the fraction of common atoms, Fc. A significant number of predictions diverge from the crystal geometry. The RMSD distribution for the Flex-RBE set is depicted in the leftmost panel of Figure 1. Given the fact that our method relies on chemical similarity to the lead compound for structure generation, it is relevant to observe the accuracy of the generated top pose as a function of similarity. Therefore, we restricted the RMSD comparison to Fc ≥ 0.6 and Fc ≥ 0.65; these results are shown in the center and right panels of Figure 1. The number of structures with a large deviation from the crystal pose significantly drops when the Fc floor is raised from 0.5 to 0.65. For example, for the Flex RBE set, the number of RMSD values > 3Å reduces from 24/177 to 1/97 and number of RMSD values > 2Å reduces from 43/177 to 10/97 (these cases are discussed in greater detail below). The results for Rigid BE, Rigid RBE and Flex BE follow the same trend (data not shown). Figure S1 in the Supporting Information shows the same data plotted in Figure 1 (right panel), but separately for each of the four targets. Figure 2a shows the RMSD as a function of Fc for the Flex RBE variant for the four datasets. The results for the other three methods are similar and not shown here. This analysis reinforces the observation made here and shows the dependence of reliability of pose prediction on the similarity between the lead and analog compounds quantified by the fraction of common atoms, Fc. Based on this analysis, the applicability domain of the method is in the FC ≥ 0.65 range. Figure 2b shows the RMSD distribution of all conformations generated including the top scoring one. The deviation from experimental pose increases with decreasing Fc as expected. However, a comparison of panels a and b of Figure 2 shows that the scoring function deprioritizes high RMSD conformations and assigns favorable scores to ones that are < 2Å deviation from the experimental pose, in most cases.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 2. RMSD (Å) computed for the generated conformations with respect to crystal, as a function of the fraction of common atoms Fc. Panel a shows the RMSD distribution for the top ranked pose only. Panel b shows the same for all conformations generated by the method. Values are color coded based on the protein dataset.

Figure 3. Two examples showing the lead and analog ligands chemical structures, and the conformations generated. Panel a shows the conformation generation for ligand 3SW7 using 3SW4 as template and b shows conformation generation of ligand 4FKV using 4FKQ as template. The image on the extreme right in both rows shows the overlay of the top scoring conformation with the crystal geometry.

ACS Paragon Plus Environment

Page 10 of 30

Page 11 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

We now compare the above results with those obtained from molecular docking using the LibDock21 and CDOCKER22 programs. LibDock is a fast docking engine that uses the alignment of polar and apolar interaction sites mapped on the receptor with ligand functionalities as described in detail elsewhere.21 CDOCKER by contrast uses a force-field based potential energy grid and simulated annealing molecular dynamics (MD) simulations to compute poses, also detailed previously.22 The choice of distinct docking approaches to compare the performance of the template based method helps to set a benchmark that is independent of the docking approach. The cross-docking calculations described above were performed with the docking programs with the same set of protein structures and ligands. The proteins were held rigid during docking and for this reason we compare them against the template based method using a rigid receptor; though the latter results were found not to be sensitive to modeling protein flexibility. Below, we compare the results obtained with the docking programs for the cross dockings which fall into the applicability domain of the template-based method, i.e. for cases where Fc > 0.65. For the Rigid BE set, the template based approach resulted in cross docking with RMSD values greater than 2Å and 3Å for 12/97 and 1/97 cases, respectively. The corresponding rates of large RMSDs were much higher for both LibDock and CDOCKER based cross dockings. For LibDock, the number of RMSD values above 2Å and 3Å were 47/97 and 32/97, respectively. For CDOCKER the number of RMSD values above 2Å and 3Å were 39/97 and 28/97, respectively. Thus CDOCKER has a higher cross-docking success rate than LibDock. However, while the docking methods generate accurate poses in the majority of cases, there are a large number of cases where poses significantly deviate from experiment. Figure S2 in the Supporting Information plots the RMSD as a function of Fc (analogous to Figure 2a) for LibDock and CDOCKER, which reinforces the analysis above. Figure S2 also shows that the number of high RMSD values increases with decreasing Fc; though unlike the template based method the high RMSD values are not exclusively restricted to the low similarity regime. This observation suggests that docking can be more accurate when using a protein structure that was co-crystallized with a ligand that is similar to the one being docked. While the template based approach benefits from both having the ligand template and a holo-like protein structure, docking only benefits from the latter. It is thus evident that for Fc > 0.65, there is a clear benefit to using the template based method to generate reliable geometries. These findings are consistent with other recent studies that used a template based approaches and outperformed docking when using similar template.25, 26 We now further elaborate on certain key features and limitations of the method. Being a template based structure generation method, it avoids the de-novo docking problem, which has a greater potential to lead to incorrect binding modes. The GAC method takes a conservative approach to conformation generation where the conformational search is limited to the chemical groups that are different between the analog and the lead compound. This aspect is illustrated in Figure 3a where the lead (3SW4) and analog (3SW7) ligands differ by a nitro and an amine group. The nitro group being in the meta position in 3SW7 can map in two unique ways with the 3SW4 ligand, which are detected by the method resulting two generated conformations as shown in the 3rd column. A comparison of the top scoring conformation shows the overlap between the 3SW7 crystal geometry and the conformation generated by the method showing that the GBMV-based scoring correctly

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ranks the experimentally observed conformer among the two generated. Another example illustrated in Figure 3b exemplifies a scenario where the chemical similarity between the lead 4FKQ and analog 4FKV is much lower at 𝐹𝐶 = 0.69, and therefore many of the conformations generated are very different from each other, with the diversity largely arising from the parts that are uncommon between two ligands. The right most image in Figure 3b shows the comparison of the top ranked conformation with the crystal geometry (in ball and stick), with an RMSD of 1.49Å, showing that the score correctly ranks the pose that is closest to the experimental one. These two examples thus illustrate two features of the method that include finding alternative mappings, and conformational sampling of the analog chemical group that differs from the lead ligand. In the RMSDs computed for the dataset described above, although a majority of the geometries in the FC < 0.65 range still result in low RMSDs, a significant number of values > 3Å show lack of reliable prediction for dissimilar ligands. We now analyze the 10 out of 97 predictions that involve FC ≥ 0.65 and RMSD > 2Å as these cases would be within the applicability domain of the method. Nine of these are from the CDK2 dataset and one from the CHK1 dataset. In the following discussion, we refer to each prediction using PDB IDs of the lead and analog ligands. For the 9 CDK2 cases, RMSDs range from 2.06 to 2.80Å. Interestingly, for 4EK8-4FKL, 3SW7-4FKL, and 4FKO-4FKL, 3SW7-4EK8, FC = 1, and so the relatively high RMSDs of 2.4-2.8 are unexpected. The chemical structures of these are depicted in Figure 4a. The 4EK8-4FKL, 3SW7-4FKL, and 4FKO-4FKL cases involve the 4FKL ligand, which is a substructure of the template lead ligands 4EK8, 3SW7 and 4FKO, and is significantly smaller. For the 4FKL ligand, this leads to a full substructure match and therefore FC=1. This applies to 3SW4-4FKL also for which 𝐹𝐶 = 0.93. The template lead ligand in all three cases has an additional substituted phenyl ring, which apparently leads to a significantly different binding mode. We computed a score 𝐹𝐶∗ which quantifies the overlap of the lead compound with the MCS. This metric, which is analogous to 𝐹𝐶 is defined as 𝑁𝑀𝐶𝑆

(7) 𝐹𝐶∗ = 𝑁𝑙𝑒𝑎𝑑 𝐹𝐶∗ quantifies the extent of the lead compound that is covered by the MCS. For the predictions discussed here, 𝐹𝐶∗ is computed as 0.61, 0.54, 0.58, and 0.57 for 4EK8-4FKL, 3SW7-4FKL, 4FKO-4FKL, and 3SW4-4FKL, respectively. Thus, the relatively high RMSD is explained due to an insufficient overlap of the MCS with the lead compound. The use of 𝐹𝐶 to determine the applicability of the method to a given analog ligand is appropriate because the metric directly quantifies the fraction of the ligand for which the structure is not determined (1-𝐹𝐶); such a direct relation does not exist with 𝐹𝐶∗ , which relates to the lead compound instead. While it may be possible to impose a threshold based on 𝐹𝐶∗ , it would reduce the applicability domain of the method, because of which this additional restriction was not imposed. Besides, for the few cases where the fraction of the lead ligand covered is low, the RMSDs are still below 3Å.

ACS Paragon Plus Environment

Page 12 of 30

Page 13 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Figure 4. Illustration of the cases for which deviation from crystal geometry is greater than 2Å. Panel a shows the ligands discussed. Panel b shows the overlay of the alternate crystal geometries of ligand 4EK8 in ball and stick representation, and the generated conformation in stick representation. Panel c shows the conformation generation of ligand 4FKT using 4FKP as template. The protein surface is depicted in yellow.

The deviations for the other 6 predictions in the CDK2 dataset are due to ambiguities in the crystal structures. RMSDs in the range 2.40-2.63Å for 3SW4-4EK8, 3SW7-4EK8, 4FKO-4EK8, and 4EK8-3SW7 are a result of alternate crystallographic geometries for the 4EK8 ligand that place the nitro group differently owing to the rotation of the phenyl ring. Figure 4b shows the two crystallographic geometries of 4EK8 ligand in ball and stick representation. The predicted geometry shown in stick representation overlaps with one of the alternate crystal structures (cyan), but not with the one arbitrarily chosen as reference for RMSD computation. The results for all four of the cases can be explained by the alternate crystallographic conformations. Another ambiguity in the crystal structures explains the deviations involved in the results for the 4FKT ligand, observed in 4FKP4FKT, 4FKR-4FKT predictions, for which the RMSDs are 2.51 Å and 2.06 Å, respectively. The method generates 10 conformations that place the flexible amino-alkyl group in the 4FKT ligand in different locations as shown in Figure 4c. Analysis of the B-factors of the atoms associated with the 4FKT ligand shows the flexible nature of the amino alkyl group; the average B-factor of the 5 heavy atoms comprising the group is 54.8, whereas the average value for the remainder of the ligand is only 23.1. Therefore, this discrepancy is expected. The 2.06 Å RMSD for 4FKR-4FKT case is explained by the same reason. Thus, these deviations are not due to a limitation in the method. Finally, for the one case in the CHK1 dataset namely, 4FT7-4FSW, the RMSD is 4.17Å. This pair has an 𝐹𝐶 exactly equal to 0.65, and the two ligands involved belong to entirely different chemical series.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Being a template based method conformational sampling is adopted only for the differing regions. As described earlier, this approach avoids sampling of groups that are common between the lead compound and the analog. In certain cases, this approach may prove to be restrictively conservative and potentially miss generating relevant binding modes. One example is 3SW7-4EK8 (Figure 4a), which share the nitro group. The generated conformation for the 4EK8 ligand will not explore the alternate location of the nitro group since it is shared with the lead compound. In this particular example, the 4EK8 ligand was found with two occupancies representing the rotation of the nitro-phenyl ring, one of which matches the generated geometry. However, one could envision cases where the template based approach may miss the true conformation. Thus, while the conservative approach to conformer generation may be accurate as shown in the above discussions, potential inaccuracies may arise from the lack of conformational sampling of the shared groups. In summary, this analysis shows that when there is a modest amount of similarity between the lead ligand and those in the congeneric series, such that at least 65% of analog ligands have atoms common with the lead ligand, the method generates accurate geometries. Binding Affinity Scoring To test the ability of the method to score congeneric ligands in accordance with binding affinity, data from the same set of four target proteins was used as above. Table 1 lists the proteins in the dataset with the number of ligands with affinity data and the number designated as active in each set. Only a subset of the ligands (in parenthesis in Table 1, column 2) was processed in the calculations since a number of ligands in each set were far too different from the template ligands with a solved crystal structure, thus putting them out of the scope of the method (𝐹𝐶 < 0.5). In total, 312 ligands across the four targets were processed by the method for which results are discussed below. Since there are multiple co-crystal structures present in each set (Table 1, rightmost column) that could potentially be used as template for structure generation and scoring, all of them were used in separate runs. For each template protein-ligand co-crystal structure, the subset of the ligands chosen to be scored was based on the 𝐹𝐶 > 0.5 criterion. Each such test is representative of a lead optimization scenario where only one co-crystal structure is available, and virtual screening of analogs related to the lead compound crystallized with the protein is desired. On the other hand, often times in projects there are multiple cocrystal structures available, in which case predictions could be combined from using different protein structures. Here, we analyze both of these scenarios. Table 2. Prediction metrics for the 39 congeneric datasets. Area Under Curve (AUC) and Predictive Index (PI) are computed and reported using the four different calculation methods. For comparison, the AUC and PI are calculated for the Molecular Weight based null model. The cells containing AUC values that are above 0.5 are colored in shades of green, whereas below 0.5 in shades of red. The cells containing PI values that are above 0 are colored in shades of green, whereas below 0 in shades of red.

ACS Paragon Plus Environment

Page 14 of 30

Page 15 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Table 2 Footnote: nLigands indicates the total number of ligands in each subset. nActive indicated the number of ligands designated as Active. Scores were calculated for 39 congeneric sets, which are composed of 13, 7, 11, and 7 sets for CHK1, Urokinase, CDK2 and HSP90, respectively. The number of compounds in the sets ranged from 5 to 82, with the average set size being 28.5. Table 2 lists the Area Under Curve (AUC) and the Predictive Index (PI) metrics computed for each set. The AUC can

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

be used to assess the ability of the method to separate active compounds from inactive ones. In addition, since these sets are composed of analogous compounds, it is reasonable to expect the approximate scoring method used here to rank order ligands in accordance with binding affinity. To test these predictions, we used the Predictive Index (PI) metric as described in Methods to judge the quality of rank ordering. The PI metric ranges from -1 to 1 and is a measure of directional correctness of pairwise affinity predictions with respect to experiment weighted by the experimental difference in affinity. A detailed analysis of the AUC and PI values computed for each set is presented in Table 2. The AUC and PI computed using the null model and the four different variations of the method are tabulated for all the sets and are color-coded. Predictions better or worse than random are depicted in green or red, respectively. Of 39 congeneric sets, the Rigid BE, Rigid RBE, Flex BE and Flex RBE methods outperform the random model when comparing AUC (PI) in 35 (35), 36 (37), 33 (34), and 37 (36) sets, respectively. When comparing AUC (PI) against the MW-based null model, 28 (26), 30 (29), 24 (25), 27 (27) sets outperform using the Rigid BE, Rigid RBE, Flex BE and Flex RBE methods, respectively. We also compared the predictions using the Kendall Tau metric as reported in multiple studies.5, 6 When comparing Kendall Tau for the 39 datasets, the Rigid BE, Rigid RBE, Flex BE and Flex RBE methods outperform the random model for 35, 36, 34, and 35 datasets, respectively. When comparing against the MW-based null model, 24, 26, 25, 26 sets outperform using the Rigid BE, Rigid RBE, Flex BE and Flex RBE methods, respectively. Table S1 in the Supporting Information reports the individual Kendall Tau values for the datasets and color codes them analogous to Table 2. The qualitative similarity of the comparisons obtained from PI and Kendall Tau, show that the results are not very sensitive to the choice of rank ordering metric. To summarize the predictions and observe target dependence, we computed the weighted average of these metrics over all sets for each target and present them in Figure 5. The average AUC values range from 0.60 to 0.98 and are dataset dependent. Judged by the weighted average of AUC (Figure 5, left panel), all of the four methods on all four datasets perform better than random model and the MWbased null model. Similarly, as judged by PI (Figure 5, right panel), all comparisons with the exception of one (Flex BE for HSP90) perform better than the null model. While this comparison involves averaged metrics, the direct comparison across the 39 datasets shows that the method almost always outperforms random selection and in 62-77% of cases outperforms the MW based null model. Thus, based on these analyses the method shows potential for prioritizing congeneric design ideas. Secondly, the performance of the four methods is similar; no single method outperforms others across all 4 datasets. However, within the flexible and rigid classes, the RBE metric yields marginally better prediction than BE.

ACS Paragon Plus Environment

Page 16 of 30

Page 17 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Figure 5: Weighted averages of AUC (left) and PI (right) calculated over congeneric datasets for each target. Predictions from the Random and Null model are also included for comparison. Next, we test the second scenario where predictions from multiple template structures, when available could be combined. It is not uncommon in LO projects to have more than one chemical series of compounds as drug candidates. Such a scenario can be modeled by the use of multiple template structures evaluated here. In the following analysis, we combine predictions obtained from the different template co-crystal structures. The score associated with the highest value of Fc was chosen for each ligand. This was done to maximize the overlap in ligand similarity and therefore increase accuracy of pose predictions. Additionally, this would result in the best matching protein structure and therefore the most reliable score. Combination of the predictions this way results in the sets becoming larger: CHK1, Urokinase, CDK2 and HSP90 to 95, 36, 33, and 148 ligands, respectively. However, this also leads to multiple chemical series of compounds being present in the same set, which makes the set challenging to predict. The previous tests described earlier involved trends in affinity prediction for closely related analogous compounds, which led to better opportunities for error cancellation. This may not be the case here with different scaffolds present in the same scoring set. Thus, it is reasonable to expect worse predictive ability for the heterogeneous datasets considered here.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 6: Area under the ROC curve (AUC, left) and Predictive Index (PI, right) calculated for the four datasets combined from predictions obtained from multiple template structures. Error bars depict 95% confidence interval obtained from 5000-fold bootstrap sampling. Predictions from the Random and Null model are also included for comparison. The left panel in Figure 6 shows the AUC computed using the four approaches. The AUC values resulting from the predictions range from 0.64 to 0.98 and are dataset dependent. Also shown are the predictions from the MW-based null and random model. To judge the statistical significance of the results, 5000 bootstrapping samples were generated by random selection of data points and 95% confidence interval error bars are depicted. As with the predictions in the individual sets the results for the combined sets are target dependent, but significantly better than random for all four methods and proteins. For Urokinase, CDK2 and HSP90, all four methods outperform the MW-based null model. For Urokinase, the difference is statistically significant; for CDK2, 3 out of 4 methods are significantly better. For HSP90, the flexible protein-based predictions are statistically separated from the MW-based null model, but this is not the case for the rigid-protein based predictions. Unlike these three target proteins, for CHK1, there is not a significant difference between our prediction and the null model. Therefore, while the scoring from all four variants is significantly better than random, this does not hold true for the MWbased null model. Thus, in heterogeneous datasets that are comprised of many different congeneric series of compounds, the method performs better than the null model in 3 out of 4 targets and as good in one, when assessed in ability to separate active from inactive compounds. However, when assessing the method by quantitatively rank ordering ligands by affinity, the performance is not better than the MW-based null model. In Figure 6, the right panel shows the Predictive Index (PI) obtained for the four methods on all four datasets. While the predictions are significantly better than random, they are statistically similar when compared to the null model for Urokinase, CDK2 and HSP90. The worse performance as judged by PI in these heterogeneous datasets is not surprising for an approximate scoring

ACS Paragon Plus Environment

Page 18 of 30

Page 19 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

method as used in due to the presence of multiple ligand series which does not allow for error cancelation and inclusion of noise in predictions, as noted above. Based on this analysis it can be concluded that the rapid scoring which is obtained as a byproduct of the conformation generation procedure is expected to be of utility in choosing active compounds from the pool of compound ideas with a modest level of additional enrichment as compared to the intuitive MW-based null model. More detailed calculations that incorporate the physics of protein-ligand interactions such as explicit solvent and dynamics were not undertaken in this study, which was aimed at establishing a baseline level of predictability. The reliable geometry generation validated in this work suggests that these can serve as a starting point for implementing detailed calculations including free energy methods for accurate binding affinity estimation. CONCLUSIONS The presented Generate Analog Conformations (GAC) method is designed to be used in lead optimization for the purpose of generating 3D geometries of congeneric series of compounds that share a scaffold with a lead compound with determined geometry in complex with the protein. The accuracy of the generated conformations as described above implies that the method can be used for accurate pose generation as long as 65% of the analog structure is shared with the lead compound. It was shown that within this applicability domain, the method outperforms docking. Additionally, the score calculated during the process of conformer generation can be of utility in selecting active compounds from a set of compound ideas. Validation performed in this study using four datasets shows that for closely related congeneric compound datasets, the method almost always outperforms random selection and in 62 to 77% of such datasets, it outperforms an intuitive molecular weight-based null model. Thus it expected that our method can be useful in shortlisting congeneric design ideas, and the resultant geometries can be used as initial conformation for compute intensive free energy methods to obtain more accurate affinity predictions. ACKNOWLEDGEMENTS The author thanks Lisa Yan, Hugues-Olivier Bertrand and Eric Hurd for a critical review of the manuscript, and Tedman Ehlers for stimulating discussions. AVAILABILITY The method as described in this manuscript is implemented as a protocol named “Generate Analog Conformations” in BIOVIA Discovery Studio package version 2018 or later. For further details, please see: https://www.3dsbiovia.com/products/collaborativescience/biovia-discovery-studio/ SUPPORTING INFORMATION Supporting material include the following files: (1) Protein structures in mol2 format, downloaded from PDB and prepared using BIOVIA Discovery Studio and used in the calculations. (2) SDF files containing the crystallographic conformation of lead ligands used as template for structure generation and validation. (3) SDF files containing the set of

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ligands for binding affinity validation. This material is available free of charge via the Internet at http://pubs.acs.org. REFERENCES 1. Paul, S. M.; Mytelka, D. S.; Dunwiddie, C. T.; Persinger, C. C.; Munos, B. H.; Lindborg, S. R.; Schacht, A. L., How to Improve R&D Productivity: The Pharmaceutical Industry's Grand Challenge. Nat Rev Drug Discov 2010, 9, 203-214. 2. De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A., Role of Molecular Dynamics and Related Methods in Drug Discovery. J Med Chem 2016, 59, 4035-4061. 3. Cournia, Z.; Allen, B.; Sherman, W., Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations. J Chem Inf Model 2017, 57, 2911-2937. 4. Knight, J. L.; Brooks, C. L., 3rd, Lambda-Dynamics Free Energy Simulation Methods. J Comput Chem 2009, 30, 1692-1700. 5. Gaieb, Z.; Liu, S.; Gathiaka, S.; Chiu, M.; Yang, H.; Shao, C.; Feher, V. A.; Walters, W. P.; Kuhn, B.; Rudolph, M. G.; Burley, S. K.; Gilson, M. K.; Amaro, R. E., D3R Grand Challenge 2: Blind Prediction of Protein-Ligand Poses, Affinity Rankings, and Relative Binding Free Energies. J Comput Aided Mol Des 2018, 32, 1-20. 6. Gathiaka, S.; Liu, S.; Chiu, M.; Yang, H.; Stuckey, J. A.; Kang, Y. N.; Delproposto, J.; Kubish, G.; Dunbar, J. B., Jr.; Carlson, H. A.; Burley, S. K.; Walters, W. P.; Amaro, R. E.; Feher, V. A.; Gilson, M. K., D3r Grand Challenge 2015: Evaluation of Protein-Ligand Pose and Affinity Predictions. J Comput Aided Mol Des 2016, 30, 651-668. 7. Hassan, M.; Brown, R. D.; Varma-O'brien, S.; Rogers, D., Cheminformatics Analysis and Learning in a Data Pipelining Environment. Mol Divers 2006, 10, 283299. 8. Kumar, A.; Zhang, K. Y. J., A Cross Docking Pipeline for Improving Pose Prediction and Virtual Screening Performance. J Comput Aided Mol Des 2018, 32, 163173. 9. Duan, R.; Xu, X.; Zou, X., Lessons Learned from Participating in D3R 2016 Grand Challenge 2: Compounds Targeting the Farnesoid X Receptor. J Comput Aided Mol Des 2018, 32, 103-111. 10. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E., The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235-242. 11. BIOVIA Discovery Studio Modeling Environment, Release 2018; BIOVIA Dassault Systemes: San Diego, CA. 12. Eswar, N.; Webb, B.; Marti-Renom, M. A.; Madhusudhan, M. S.; Eramian, D.; Shen, M. Y.; Pieper, U.; Sali, A. Comparative Protein Structure Modeling Using Modeller. In Curr Protoc Bioinformatics; John Wiley & Sons. Inc.: 2006; Vol. 15, pp 5.6.1-5.6.30. 13. Spassov, V. Z.; Flook, P. K.; Yan, L., LOOPER: A Molecular Mechanics-Based Algorithm for Protein Loop Prediction. Protein Eng Des Sel 2008, 21, 91-100. 14. Spassov, V. Z.; Yan, L., A Fast and Accurate Computational Approach to Protein Ionization. Protein Sci 2008, 17, 1955-1970.

ACS Paragon Plus Environment

Page 20 of 30

Page 21 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

15. Li, J.; Ehlers, T.; Sutter, J.; Varma-O'brien, S.; Kirchmair, J., CAESAR: A New Conformer Generation Algorithm Based on Recursive Buildup and Local Rotational Symmetry Consideration. J Chem Inf Model 2007, 47, 1923-1932. 16. Dominy, B. N.; Brooks, C. L., 3rd, Methodology for Protein-Ligand Binding Studies: Application to a Model for Drug Resistance, the HIV/FIV Protease System. Proteins 1999, 36, 318-331. 17. Brooks, B. R.; Brooks III, C. L.; MacKerell Jr., A. D.; Nilsson, L.; Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; Caflisch, A.; Caves, L.; Cui, Q.; Dinner, A. R.; Feig, M.; Fischer, S.; Gao, J.; Hodoscek, M.; Im, W.; Kuczera, K.; Lazaridis, T.; Ma, J.; Ovchinnikov, V.; Paci, E.; Pastor, R. W.; Post, C. B.; Pu, J. Z.; Schaefer, M.; Tidor, B.; Venable, R. M.; Woodcock, H. L.; Wu, X.; Yang, W.; York, D. M.; Karplus, M., CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. 2009, 30, 15451614. 18. Im, W.; Lee, M. S.; Brooks, C. L., 3rd, Generalized Born Model with a Simple Smoothing Function. J Comput Chem 2003, 24, 1691-1702. 19. Lee, M. S.; Feig, M.; Salsbury, F. R.; Brooks, C. L., III, New Analytic Approximation to the Standard Molecular Volume Definition and Its Application to Generalized Born Calculations. J. Comput. Chem. 2003, 24, 1348-1356. 20. Pearlman, D. A.; Charifson, P. S., Are Free Energy Calculations Useful in Practice? A Comparison with Rapid Scoring Functions for the P38 MAP Kinase Protein System. J. Med. Chem. 2001, 44, 3417-3423. 21. Rao, S. N.; Head, M. S.; Kulkarni, A.; LaLonde, J. M., Validation Studies of the SiteDirected Docking Program LibDock. J Chem Inf Model 2007, 47, 2159-2171. 22. Wu, G.; Robertson, D. H.; Brooks, C. L., 3rd; Vieth, M., Detailed Analysis of GridBased Molecular Docking: A Case Study of CDOCKER-a Charmm-Based Md Docking Algorithm. J Comput Chem 2003, 24, 1549-1562. 23. Carlson, H. A., Lessons Learned over Four Benchmark Exercises from the Community Structure-Activity Resource. J Chem Inf Model 2016, 56, 951-954. 24. Brown, S. P.; Muchmore, S. W., Large-Scale Application of High-Throughput Molecular Mechanics with Poisson-Boltzmann Surface Area for Routine PhysicsBased Scoring of Protein-Ligand Complexes. J Med Chem 2009, 52, 3159-3165. 25. Xu, X.; Ma, Z.; Duan, R.; Zou, X., Predicting Protein-Ligand Binding Modes for CELPP and GC3: Workflows and Insight. J Comput Aided Mol Des 2019, 33, 367-374. 26. Ignatov, M.; Liu, C.; Alekseenko, A.; Sun, Z.; Padhorny, D.; Kotelnikov, S.; Kazennov, A.; Grebenkin, I.; Kholodov, Y.; Kolosvari, I.; Perez, A.; Dill, K.; Kozakov, D., Monte Carlo on the Manifold and MD Refinement for Binding Pose Prediction of Protein-Ligand Complexes: 2017 D3R Grand Challenge. J Comput Aided Mol Des 2019, 33, 119-127.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

TOC Image: Caption: The Generate Analog Conformations method uses protein-lead co-crystal structure as template to generate binding poses for a related congeneric series of compounds

ACS Paragon Plus Environment

Page 22 of 30

Page 23 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Table 1. Dataset used for retrospective validation. The protein target, number of ligands in the dataset, number of active ligands, source of data and the complexes with solved crystal structures are listed. The number in parentheses in the second column indicates the total number of ligands in the dataset, some of which were not processed by the method due to lower than 50% similarity with any of the template ligands. Protein CDK2

Ligands 33 (101)

Active 24

Source CSAR

Urokinase

36 (39)

28

Abbott

CHK1

95 (137)

69

Abbott

HSP90

148 (192)

113

AbbVie

PDB IDs 4EK4 4EK5 4FKG 4FKI 4EK6 4FKL 4EK8 3SW4 3SW7 4FKO 4FKP 4FKQ 4FKR 4FKS 4FKT 4FKU 4FKV 4FKW 4FUD 4FUE 4FU7 4FU8 4FU9 4FUB 4FUC 4FSM 4FSW 4FT5 4FSN 4FSZ 4FT7 4FT0 4FT9 4FTA 4FTC 4FSQ 4FSR 4FST 4YKR 4YKQ 4YKT 4YKW 4YKX 4YKZ 4YKY 4YKU

Table 2. Prediction metrics for the 39 congeneric datasets. Area Under Curve (AUC) and Predictive Index (PI) are computed and reported using the four different calculation methods. For comparison, the AUC and PI are calculated for the Molecular Weight based null model. The cells containing AUC values that are above 0.5 are colored in shades of green, whereas below 0.5 in shades of red. The cells containing PI values that are above 0 are colored in shades of green, whereas below 0 in shades of red.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 30

AUC Target CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 CHK1 Urokinase Urokinase Urokinase Urokinase Urokinase Urokinase Urokinase CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 CDK2 HSP90 HSP90 HSP90 HSP90 HSP90 HSP90 HSP90 HSP90

PDB

nLigands

nActive

Null

4FSM

21

15

4FSW

29

4FT5

34

4FSN

PI

0.37

Rigid BE 0.61

Rigid RBE 0.66

Flex BE 0.53

Flex RBE 0.68

19

0.57

0.62

0.65

0.52

22

0.73

0.57

0.54

0.63

20

14

0.38

0.61

0.62

4FSZ

31

21

0.66

0.65

4FT7

35

23

0.72

0.54

4FT0

47

34

0.79

4FT9

22

12

0.66

4FTA

34

22

Null -0.13

Rigid BE 0.31

Rigid RBE 0.45

Flex BE 0.29

Flex RBE 0.55

0.55

0.33

0.15

0.12

-0.04

-0.04

0.49

0.14

0.17

0.14

0.10

0.02

0.54

0.64

-0.14

-0.04

-0.04

0.02

0.18

0.67

0.66

0.69

0.52

0.34

0.31

0.29

0.29

0.49

0.66

0.57

0.16

0.33

0.35

0.18

0.38

0.80

0.80

0.73

0.76

0.61

0.58

0.52

0.49

0.49

0.50

0.49

0.64

0.57

0.21

0.21

0.24

0.21

0.13

0.73

0.58

0.53

0.64

0.61

0.14

0.36

0.39

0.22

0.31

4FTC

21

10

0.65

0.75

0.66

0.65

0.65

0.31

0.58

0.41

0.25

0.25

4FSQ

25

16

0.43

0.70

0.76

0.73

0.75

-0.06

0.48

0.58

0.47

0.55

4FSR

21

15

0.37

0.62

0.76

0.48

0.63

-0.13

0.39

0.61

0.13

0.50

4FST

17

12

0.32

0.67

0.67

0.38

0.52

-0.07

0.39

0.43

-0.02

0.12

4FUD

26

19

0.82

0.83

0.75

0.88

0.93

0.54

0.34

0.28

0.48

0.55

4FUE

30

23

0.84

0.99

1.00

0.98

0.99

0.55

0.77

0.79

0.78

0.77

4FU7

26

19

0.81

1.00

1.00

0.98

0.98

0.49

0.60

0.58

0.52

0.59

4FU8

26

19

0.81

0.95

0.95

0.97

0.98

0.49

0.49

0.50

0.52

0.53

4FU9

36

28

0.85

0.98

0.98

0.98

0.98

0.64

0.45

0.49

0.66

0.71

4FUB

30

23

0.84

0.97

0.97

0.98

0.99

0.55

0.74

0.71

0.69

0.74

4FUC

35

28

0.85

0.98

0.98

0.98

0.98

0.63

0.60

0.64

0.67

0.74

4EK4

5

4

0.00

1.00

1.00

1.00

1.00

-0.67

0.73

0.44

0.73

0.44

4EK5

7

6

0.00

0.17

0.33

0.00

0.00

-0.55

0.31

0.24

-0.04

0.01

4FKG

7

6

0.00

1.00

1.00

1.00

1.00

-0.55

0.39

0.39

0.44

0.44

4FKI

8

7

0.14

0.71

0.71

0.14

0.71

-0.05

0.04

0.06

0.13

-0.13

4EK6

8

7

0.14

0.86

0.86

0.57

0.71

-0.05

0.19

0.19

0.03

0.06

4FKV

15

9

0.52

0.48

0.57

0.41

0.52

0.51

0.49

0.57

0.29

0.39

4FKU

12

7

0.66

0.66

0.83

0.89

0.94

0.67

0.68

0.88

0.79

0.94

4FKW

12

7

0.66

0.74

0.89

0.63

0.91

0.67

0.76

0.89

0.63

0.91

4FKP

11

5

0.70

0.87

0.90

0.93

0.97

0.48

0.73

0.76

0.82

0.91

4FKT

11

5

0.70

0.43

0.57

0.40

0.63

0.48

-0.15

0.06

-0.25

0.19

4FKQ

11

6

0.77

0.87

0.93

0.83

0.83

0.69

0.68

0.85

0.63

0.65

3SW7

10

8

0.88

1.00

1.00

1.00

1.00

0.76

0.66

0.61

0.74

0.69

4YKR

82

65

0.69

0.56

0.55

0.58

0.61

0.51

0.31

0.32

0.17

0.27

4YKQ

82

65

0.68

0.62

0.63

0.63

0.62

0.51

0.44

0.43

0.24

0.35

4YKT

69

54

0.72

0.64

0.61

0.70

0.60

0.46

0.47

0.44

0.56

0.47

4YKW

51

39

0.67

0.78

0.79

0.75

0.79

0.27

0.48

0.55

0.34

0.48

4YKX

43

26

0.35

0.52

0.54

0.60

0.61

-0.22

-0.12

-0.15

0.14

-0.04

4YKZ

45

28

0.35

0.48

0.46

0.46

0.49

-0.21

-0.15

-0.17

-0.21

-0.14

4YKY

44

27

0.35

0.54

0.63

0.55

0.59

-0.20

-0.01

0.05

-0.08

0.01

4YKU

40

31

0.70

0.64

0.66

0.71

0.76

0.44

0.44

0.47

0.51

0.60

ACS Paragon Plus Environment

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Probability

Page 25 of 30

Journal of Chemical Information and Modeling

Fc ≥ 0.5

Fc ≥ 0.6

Fc ≥ 0.65

RMSD (Å)

RMSD (Å)

RMSD (Å)

Figure 1. Probability distribution of RMSD computed for the top scoring conformation with respect to crystal. The three sets of results correspond to restricting the dataset to different values of similarity between the template ligand and the analog ligand, quantified by the fraction of common atoms, Fc.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

a

b

10

10

8

CHK1

8

6

CDK2

6

4

HSP90

RMSD

RMSD

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 30

2

4 2

Urokinase

0 0.4

0.5

0.6

0.7

0.8

0.9

1.0

0 0.4

0.5

0.6

0.7

0.8

0.9

Fc

Fc

Figure 2. RMSD (Å) computed for the generated conformations with respect to crystal, as a function of the fraction of common atoms Fc. Panel a shows the RMSD distribution for the top ranked pose only. Panel b shows the same for all conformations generated by the method. Values are color coded based on the protein dataset.

ACS Paragon Plus Environment

1.0

Page 27 of 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

a

Lead: 3SW4

Analog:3SW7

3SW7: conformations

3SW7: comparison with crystal

Lead: 4FKQ

Analog: 4FKV

4FKV: conformations

4FKV: comparison with crystal

b

Figure 3. Two examples showing the lead and analog ligands chemical structures, and the conformations generated. Panel a shows the conformation generation for ligand 3SW7 using 3SW4 as template and b shows conformation generation of ligand 4FKV using 4FKQ as template. The image on the extreme right in both rows shows the overlay of the top scoring conformation with the crystal geometry.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 30

a

4FKL 4EK8

3SW7

4FKO

c

b

4EK8 (template:3SW4)

4FKP

4FKT

4FKT

Figure 4. Illustration of the cases for which deviation from crystal geometry is greater than 2Å. Panel a shows the ligands discussed. Panel b shows the overlay of the alternate crystal geometries of ligand 4EK8 in ball and stick representation, and the generated conformation in stick representation. Panel c shows the conformation generation of ligand 4FKT using 4FKP as template. The protein surface is depicted in yellow.

ACS Paragon Plus Environment

Page 29 of 30

1.0 1.0

0.6

0.5

0.4

0.0

PI

AUC

0.8

CHK1

0.2

Urokinase

CDK2

HSP90

-0.5

0.0 CHK1

Urokinase

CDK2

HSP90

-1.0

Figure 5: Weighted averages of AUC (left) and PI (right) calculated over congeneric datasets for each target. Predictions from the Random and Null model are also included for comparison.

1.0

1.0

0.8

AUC

0.5

0.6 0.4

PI

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

0.0

CHK1

0.2

Urokinase

CDK2

HSP90

-0.5

0.0

CHK1

Urokinase

CDK2

HSP90 -1.0

Figure 6: Area under the ROC curve (AUC, left) and Predictive Index (PI, right) calculated for the four datasets combined from predictions obtained from multiple template structures. Error bars depict 95% confidence interval obtained from 5000-fold bootstrap sampling. Predictions from the Random and Null model are also included for comparison.

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

TOC Image:

Caption: The Generate Analog Conformations method uses protein-lead co-crystal structure as template to generate binding poses for a related congeneric series of compounds

ACS Paragon Plus Environment

Page 30 of 30