Novel Consensus Docking Strategy to Improve Ligand Pose

Jul 25, 2018 - ... determine the best program combinations to improve the docking success rate. Using the complexes from PDBbind as a benchmark data s...
2 downloads 0 Views 1MB Size
Subscriber access provided by Kaohsiung Medical University

Computational Biochemistry

A Novel Consensus Docking Strategy to Improve the Ligand Pose Prediction Xiaodong Ren, Yu-Sheng Shi, Yan Zhang, Bin Liu, Li-Hong Zhang, Yu-Bo Peng, and Rui Zeng J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00329 • Publication Date (Web): 25 Jul 2018 Downloaded from http://pubs.acs.org on July 26, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

A Novel Consensus Docking Strategy to Improve the Ligand Pose Prediction Xiaodong Ren,† Yu-Sheng Shi,*,‡ Yan Zhang,*,¶ Bin Liu,¶ Li-Hong Zhang,¶ Yu-Bo Peng¶ and Rui Zeng§ †

Department of Pharmacy, Guizhou Provincial People's Hospital, Guiyang, 550002, P.R. China



Key Laboratory of Biotechnology and Bioresources Utilization, Educational of Minister, College of Life Science, Dalian Nationalities University, Dalian 116600, P.R. China ¶

Jiamusi College, Heilongjiang University of Chinese Medicine, Jiamusi 154007, P.R. China

§

College of Pharmacy, Southwest University for Nationalities, Chengdu, 610041, P.R. China

Corresponding Author. *E-mail: [email protected] and [email protected]

ABSTRACT:

Molecular docking, which mainly includes pose prediction and binding affinity calculation, has become an important tool for assisting structure-based drug design. Correctly predicting the ligand binding pose to a protein target enables the estimation of binding free energy using various tools. Previous studies have shown that the consensus method can be used to improve the docking performance in respect of compound scoring and pose prediction. In this report, a novel consensus docking strategy was proposed, which uses a dynamic benchmark dataset selection to determine the best program combinations to improve the docking success rate. Using the complexes from PDBbind as a benchmark dataset, a 4.9 % enhancement in success rate was achieved compared with the best program.

■ INTRODUCTION Molecule docking is a computational approach that attempts to predict the binding pose between a protein receptor and a ligand by evaluating and ranking the binding affinity of different poses, which has become an important tool in assisting structure-based drug design in a fast and cost-efficient manner. Accurate prediction of the binding pose of a ligand to the target is an important prerequisite to investigate the interaction and estimate binding free energy using other methods such as Linear Interaction Energy,1 Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA),2 Molecular Mechanics/Generalized Born Surface Area (MM/GBSA),3 Free Energy Perturbation (FEP),4 Thermodynamic Integration5 and Mining Minima method.6 During the past two decades, a variety of docking programs have been developed for both commercial and academic use, such as GOLD,7 Glide,8, 9 Surex,10 FlexX,11 AutoDock,12 AutoDock Vina,13 PLANTS,14 rDock.15 However, prediction of the correct binding mode of a ligand into a protein target is still far from being perfect. Although some programs may outperform others on the whole, or on specific targets, each program has its strengths and limitations.16, 17 The consensus method in docking that takes advantage of results from multiple docking programs has been widely used recently. The main rationale behind this method is that although individual programs 1 / 14 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 19

may generate some misleading results in random, cancellation of the random errors by summation of multiple results can improve the predictions.18, 19 A variety of consensus scoring methods have been developed for compound scoring.18, 20-24 Moreover, consensus scoring was combined with machine-learning techniques in a recent report by Wildman et al.25 While a great deal of effort has been focused on consensus scoring, a few studies have evaluated the possibility of combining the results of different docking methods to obtain reliable ligand binding poses. Rognan et al.26 took advantage of three docking tools (Gold, FlexX, Dock) by their post-processing program ConsDock, which was tested on 100 protein-ligand complexes, and showed that the ConsDock outperforms single docking program with respect to the docking accuracy of the top-ranked pose. Wolf et al.27 combined two docking programs AutoDock and FlexX into the single AutoxX protocol, which raised the number of binding poses below an RMSD of 2.5 Å by 10. Plewczynski et al.28 developed a consensus docking method called VoteDock based on seven docking programs (mainly commercial) for prediction of protein-ligand interactions. After a test on 1300 protein-ligand pairs, VoteDock was shown to be able to correctly predict approximately 20% of pairs more than docking methods on average, and over 10% of pairs more than the best program. Tuccinardi et al.29 tested the reliability of a consensus docking protocol by combining ten different docking procedures and highlighted that consensus docking is able to predict the ligand binding pose better than the single programs. However, the present consensus docking methods for pose prediction have some drawbacks: (1) A relatively small benchmark datasets were tested; (2) Some methods are complicated in the procedure; (3) The methods are not easy to apply to new targets. Herein, we presented a novel strategy to maximize the pose prediction performance for a given set of docking programs.

■ METHODS Dataset. The benchmark dataset used in this study was adapted from the refined set of PDBbind v2017, which contains a total of 4154 protein-ligand complexes with high-resolution crystal structures and experimental binding affinity data30. The proteins containing binding metal were identified by an inhouse script and excluded, as most of the programs used in this study are not able to handle the metal atoms in the binding site. For the resulting 3555 complexes, 20 complexes which failed in docking because of various technical problems (e.g., Autodock has a torsion limit of 32) were also excluded. Protein and ligand preparation. For the proteins, all of the metal atoms and water molecules were removed and the resulting proteins processed by the UCSF Chimera31 "DockPrep" utility, which repaired truncated sidechains, added hydrogens and assigned the AMBER ff14SB partial charge to the proteins. For each ligand, a set of low-energy conformers was generated by Balloon 1.6.532 using MMFF94 force field and the one with the lowest energy was selected for use, which has different conformation or coordinates with the crystal ligand. The selected conformer of each ligand was used as the initial input coordinates for docking, while the crystal ligand was only used to calculate the root mean square deviation (RMSD). The partial charges of each ligand conformer were calculated using the AM1-BCC charge method implemented in Antechamber.33 Further ligand format conversion was conducted for each program if required. Molecular Docking. For the selection of docking programs applicable for this study, some rules were applied: (1) All programs are open-source or free to academic users; (2) They are either developed or 2 / 14 ACS Paragon Plus Environment

Page 3 of 19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

still popular in recent years; (3) The programs can be run in command-line mode. Based on this, 11 programs including AutoDock, UCSF DOCK, GalaxyDock, LeDock, MDock, PLANTS, PSOVina, QuickVina, rDock, AutoDock Vina, and Smina were used in this study. For the parameters of each program, while the default parameters were utilized as much as possible, some parameters were adjusted to make sure that docking with each program can finish within 10 minutes. The binding site was determined based on 5.0 Å padding of crystal ligand unless otherwise stated. The internal default scoring function of each program was used unless otherwise stated. AutoDock 4.2.6.12 Protein and ligand files were converted to pdbqt format using the prepare_receptor4.py and prepare_ligand4.py provided in the AutoDockTools script, respectively. These pdbqt files were also used in the docking programs of AutoDock Vina, PSOVina, QuickVina and Smina. Lamarckian Genetic Algorithm (LGA) method was used for pose sampling. The population size and number of generations were set to 150 and 27,000, respectively. The number of energy evaluations was set to 250,000 and docking run set to 20. Vina family. The same default parameters (exhaustiveness = 8, num_modes = 9 and energy_range = 3 kcal/mol) were used for the programs of AutoDock Vina 1.1.2,13 PSOVina 1.0,34 QuickVina 235 and Smina 1.1.2.36 GalaxyDock 2.37-39 The “PreDock” protocol was used for all docking and GalaxyDock BP2 score used for scoring. LeDock 1.0.40 The protein files were prepared by LePro. The clustering RMSD was set to 1.0 Å and the number of binding poses set to 20. MDock 1.2.41 The clash_potential_penalty was set to 10.0 kcal/mol and the grid_spacing set to 0.3 Å. 8.0 Å grid_box_size was set for automatic creation of the grid box using MDock. The number of cycles for minimization was set to 3 to fully optimize the ligand orientation. PLANTS 1.2.14 Protein and ligand files were processed with SPORES 1.3 in complete mode to ensure their compatibility with PLANTS. The binding site space was a sphere defined from the center of mass of the crystal ligand with 5.0 Å padding, and search_speed was set to speed1. rDock 2013.1.15 The program rbcavity was used to generate the binding cavity based on the crystal ligand with a radius of 6.0 Å, and the standard docking protocol was used as previously published.15 UCSF DOCK 6.8.42 The files were prepared by the DOCK accessory programs unless otherwise stated. The solvent accessible surface of each receptor was calculated using the program DMS with a probe radius of 1.4 Å after deleting hydrogen atoms. The negative binding site space was then generated by sphgen_cpp, and the spheres within 8.0 Å of the crystal ligand were retained for docking. The grid files were generated with the grid_spacing set to 0.3 Å, and box dimensions were set to encompass the sphere set with an 8.0 Å padding. The ligands were treated as flexible based on the protocol in Rizzo et al.42 For a few complexes that failed in docking because of high ligand internal energy under the default protocol, the score cutoff parameter was changed from 100.0 to 1000.0 kcal/mol.42 Representative pose (REPPose) determination. After docking, the top-scoring poses from all programs were collected and used to determine the REPPose. For each ligand with n poses, the total 3 / 14 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 19

RMSD (tRMSD) for each pose with the other poses were calculated, and the REPPose was defined as the pose with the minimum tRMSD. The method to find the REPPose in this study is similar to the previously reported concept of 3DScore, which is equal to the arithmetical mean of RMSD values of different poses.28 The RMSD between two structures was calculated using the Hungarian (symmetrycorrected) heavy-atom RMSD method which is implemented in the UCSF DOCK6.43 Program combination determination. For the given n top-scoring poses generated from n programs, there are m non-repetitive program combinations (COMBs). As shown in the Equation 1, m equals to all of the combinations of taking i poses from n at a time without repetition. i starts from 3 because a REPPose can only be determined by at least 3 poses. In our case, a total of 1981 COMBs were obtained from 11 poses (one pose from each program).

  =    (1) 



As same as each program's pose, the REPPose for each COMB of all the complexes has a corresponding success rate and mean RMSD. With the 1981 REPPoses corresponding to all COMBs and for all complexes in hand, we were able to sort the COMBs by different metrics. In this study, we sorted the COMBs by both the success rate (a successful docking was defined as the RMSD between the docking pose and the native pose less than 2.0 Å) and mean RMSD, which enabled us to get the corresponding "best" COMB (the first entry of the sorted COMBs, which has either highest success rate or lowest mean RMSD), termed as COMBBestr and COMBBestm, respectively. In addition, we termed COMBAll as the combinations of all 11 programs. For each complex, there is a pose corresponding to COMBBestr, COMBBestm and COMBAll, which were termed as REPBestr, REPBestm and REPAll, respectively. Determining an appropriate dataset size for COMB determination. Although with the whole dataset we can get the “best” COMB for the whole dataset, we were more interested in getting the COMB information that performs well but based on a smaller dataset size, which is computationally less expensive. To explore if we can use a library of the N complexes to determine the "best" COMB, we studied the relationship of N value to the success rate. The procedure is shown and explained in Figure 1. The binding pocket size of each target was calculated using the POVME 3.0,44 with the InclusionBox set based on a 5.0 Å padding of the crystal ligand.

4 / 14 ACS Paragon Plus Environment

Page 5 of 19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Figure 1. Procedure to calculate the success rate using a library of N complexes for COMB determination. For each complex, the binding pocket size of the protein was used to retrieve a library of N complexes (not including itself) by both the near and random method. The near method means that the complex library has the smallest difference in pocket size with the studied complex, while the random method means randomly selecting N complexes from the rest complexes. Based on the N complex library, COMBBestr and COMBBestm can be obtained, leading to the pose of REPBestr and REPBestm for the studied complex. After repeating this procedure for all the complexes, a final success rate can be calculated for an N value.

■ RESULTS AND DISCUSSION Performance of the programs and REPPoses. A total of 3535 complexes were successfully docked by all programs. After collecting all the top-scoring poses and calculating their RMSD to the crystal ligand, the success rate and mean RMSD for each program were obtained. The top-scoring poses were also used to conduct the representative pose analysis (REPAnalysis), which yielded the poses corresponding to REPAll, REPBestr, and REPBestm. Those REPPoses for all complexes can be treated as poses from a new approach and its own success rate and mean RMSD were also calculated. The cumulative distribution of RMSD is shown in Figure 2, which revealed the fraction of the complexes with the RMSD value of ligand poses lower than the predefined thresholds. The success rates of all 11 docking programs and the REPPoses are illustrated in Table 1. As shown in Table 1, the success rates for all the programs and the REPPoses are in the following order: REPBestr (58.5%) > REPBestm (58.3%) > REPAll (54.7%) > PLANTS (53.6%) > Smina (49.1%) > LeDock (48.7%) > AutoDock Vina (48.5%) > PSOVina (47.2%) > QuickVina (47.0%) > rDock (41.5%) > UCSF DOCK (37.9%) > AutoDock (37.0%) > GalaxyDock (30.6%) > MDock (22.1%). It was found that REPAll, which denotes the REPPose of all program's poses, is better than the best program (PLANTS). The REPBestr and REPBestm, which denote the best performance that the COMBs can achieve, were increased about 5% in success rate than PLANTS. The programs for REPBestr (COMBBestr) are AutoDock, UCSF DOCK, LeDock, PLANTS, rDock and AutoDock Vina, 5 / 14 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 19

while for REPBestm (COMBBestm) are AutoDock, LeDock, PLANTS, rDock, Smina. The retrospective results indicate that if we used the COMBBestr and COMBBestm for a docking task, we could achieve enhanced performance, compared with the best program. However, what are the COMBBestr and COMBBestm for a new target or a different set of programs? We will address the issue in the following study. It can also be found that although the AutoDock Vina and other three branches showed close performance in terms of success rate, they were not selected into COMBBestr or COMBBestm simultaneously, indicating that the REPAnalysis procedure is able to exclude similar algorithms. Actually, all the 1981 COMBs can be treated as the "COMBAll" for that subset of programs, and we compared the success rate of each COMB with the corresponding best program (the program with highest success rate within that COMB). Out of the 1981 COMBs, only 1512 COMBs have a success rate equal or higher than the best program, while the success rate of the other 469 COMBs are lower than the best program. This result indicates that the success rate of overall REPAll (54.7%) higher than the best program (53.6%), and the same phenomenon that was observed in other studies26-29 can only be considered as an event that occurs with high probability, not always. Combining poses from casually selected programs to predict the binding pose without a benchmark does not necessarily achieve a higher success rate.

Figure 2. Cumulative distribution of RMSD for all programs and the REPPoses.

Performance of different binding pocket size groups. Previously it was found that there were relationships between the success rate and the ligand characteristics such as molecular weight and the number of the rotatable bonds.17, 28 In this study, we tried to analyze the success rate of a group of complexes with similar characteristics of the targets. We envisioned that the binding pocket size would be an important characteristic to group the complexes, as generally, a bigger pocket usually can accommodate a ligand with a higher molecular weight and more rotatable bonds. Therefore, we calculated the binding pocket size of each target by using a 5.0 Å padding of the crystal ligand as the calculating box by the program of POVME 3.0.44 The minimum and maximum of the binding pocket size are 11 Å3 (PDB 1PB8) and 6933 Å3 (PDB 3IQU), respectively. We divided the whole dataset into 7 groups with 400 Å3 as the interval. The success rates of each group are also shown in Table 1. There is an obvious trend for all programs that the success rates decrease as the binding 6 / 14 ACS Paragon Plus Environment

Page 7 of 19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

pocket size increases. It can be found that different programs perform differently in each group. Although the PLANTS performs best in 6 out of 7 groups, the second highest success rates in different groups were achieved by different programs. In each group, the REPBestr and REPBestm exhibited an increased success rate compared with the best program. The COMBBestr and COMBBestm for the whole dataset and each group are shown in Table S1(Supporting Information). Not surprisingly, the COMBBestr and COMBBestm varied in each group. It is also interesting to notice that the two programs with the lowest success rate (MDock and GalaxyDock) also contribute to achieving the best success rate in some groups. Table 1. Success ratea of all complexes and the different groups divided by binding pocket size. Poses

All

11-400

401-800

801-1200

1201-1600

1601-2000

2001-2400

2400-6933

(809)b

(733)

(828)

(567)

(291)

(140)

(167)

AutoDock 0.370 0.607 0.454 0.331 0.215 0.196 0.121 0.084 AutoDock Vina 0.485 0.682 0.554 0.432 0.416 0.337 0.243 0.180 GalaxyDock 0.306 0.539 0.359 0.255 0.175 0.165 0.114 0.042 LeDock 0.487 0.703 0.563 0.424 0.383 0.320 0.364 0.162 MDock 0.221 0.447 0.281 0.167 0.072 0.072 0.057 0.024 PLANTS 0.536 0.728 0.644 0.490 0.418 0.388 0.329 0.186 PSOVina 0.472 0.681 0.555 0.417 0.362 0.323 0.257 0.174 QuickVina 0.470 0.675 0.538 0.426 0.390 0.316 0.221 0.156 rDock 0.415 0.627 0.512 0.354 0.286 0.268 0.236 0.114 Smina 0.491 0.690 0.551 0.450 0.416 0.340 0.271 0.174 UCSF DOCK 0.379 0.601 0.460 0.329 0.240 0.206 0.179 0.132 REPAll 0.547 0.732 0.622 0.508 0.469 0.399 0.321 0.216 REPBestm 0.583 0.774 0.690 0.546 0.490 0.440 0.414 0.228 REPBestr 0.585 0.778 0.694 0.546 0.497 0.457 0.443 0.257 a The red color indicates the highest and the blue color indicates the second highest success rate in each group. b

The range above the parentheses is the binding pocket size range, while the number in the parentheses is the count of the group.

Determining an appropriate dataset size for COMB determination. The above work showed that the pose of REPBestr and REPBestm generated from the REPAnalysis have the possibility of increasing the success rate. The different COMBBestr and COMBBestr obtained for different binding pocket size groups aroused our thinking regarding what is the “best” COMB for a specific target? Although it is hard to tell the “best” COMB for only a single protein-ligand complex, statistically seeking a higher success rate for a group of complexes by using a single program or consensus docking method is still an ongoing goal in docking. COMB may achieve this goal. To get the COMBBest for a given set of programs, the benchmark work is inevitable. However, when one tries to dock a library of compounds to a target using a set of programs, within which some programs are not studied in this report (e.g., commercial programs and newly developed programs), it is undesirable to generate the COMBBest using the whole dataset (3535 complexes or more) for benchmark. It is necessary to develop a general strategy that can be used to determine the COMBBest 7 / 14 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 19

for a target with less computation cost, and more importantly, to make the COMBBest closer to the “best” for the studied target. Therefore, we explored whether a small library of complexes can be used to generate the COMBBest, which finally leads to an enhanced success rate. A library of N complexes were selected based on the near and random method for each complex to generate their COMBBest and REPBest, and the procedure was repeated for the whole dataset to calculate the success rate for N value (see METHODS). N was set from 100 to 1500 with increments of 100. The relationship between success rate and N value is shown in Figure 3 and Table S2 (Supporting Information). As is shown, there is a rough initial raise of success rate for each group when N