LEADS-PEP: A Benchmark Data Set for ... - ACS Publications

Dec 14, 2015 - LEADS-PEP benchmark data set sorted by peptide length (res, number of residues). For each peptide several physicochemical properties ...
Article pubs.acs.org/jcim

LEADS-PEP: A Benchmark Data Set for Assessment of Peptide Docking Performance

Alexander Sebastian Hauser and Björn Windshügel*

Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Schnackenburgallee 114, 22525 Hamburg, Germany

ABSTRACT: With increasing interest in peptide-based therapeutics, the application of computational approaches such as peptide docking has also gained more and more attention. In order to assess the suitability of docking programs for peptide placement and to support the development of peptide-specific docking tools, an independently constructed benchmark data set is urgently needed. Here we present the LEADS-PEP benchmark data set for assessing peptide docking performance. Using a rational and unbiased workflow, 53 protein−peptide complexes with peptide lengths ranging from 3 to 12 residues were selected. The data set is publicly accessible at www.leads-x.org. In a second step we evaluated several small molecule docking programs for their potential to reproduce peptide conformations as present in LEADS-PEP. While most tested programs were capable of generating native-like binding modes of small peptides, only Surflex-Dock and AutoDock Vina performed reasonably well for peptides consisting of more than five residues. Rescoring of docking poses with the scoring functions ChemPLP, ChemScore, and ASP further increased the number of top-ranked near-native conformations. Our results suggest that small molecule docking programs are a good and fast alternative to specialized peptide docking programs.



INTRODUCTION

Protein−peptide interactions are involved in numerous cellular processes and are estimated to account for up to 40% of all interactions within the cell.1 It is therefore not surprising that the development of peptide-based therapeutics has gained increased interest in the pharmaceutical industry in recent years, and this interest is expected to grow further.2−4 Between 2009 and 2013, peptides represented 10% of all drug approvals, and several of these therapeutics are first-in-class drugs, such as boceprevir and telaprevir, both targeting the hepatitis C virus.5 To date, more than 100 peptide-based drugs have reached the pharmaceutical market and many more are currently being investigated in clinical trials.3

Computational chemistry techniques have been shown to support the drug discovery process for small molecules, for example by virtual screening.6 The adaptation of molecular modeling and docking methods for the prediction of peptide binding modes is currently under intensive development and evaluation.7,8 Peptide docking is particularly challenging due to the large number of rotatable bonds and the resulting high flexibility of the molecule. On the other hand, peptides possess unique properties, such as structural hierarchy, physical restrictions, and simplicity, that can be exploited to improve protein−peptide docking strategies.7

So far, only a few programs specifically designed for peptide docking have been developed. An example is Rosetta FlexPepDock, for which several protocols are available that performed well in reproducing peptide conformations of different protein−peptide X-ray crystal structures.9,10 Another approach is DynaDock, which has been shown to

perform well across a data set of 15 protein−peptide complexes.11 In addition to specialized tools, small molecule docking programs have also been tested for peptide docking. AutoDock has been shown to dock very short peptides (2−4 aa length) with reasonable accuracy.12 Very recently, a modified version of the docking program Glide performed as accurately as the Rosetta FlexPepDock ab initio protocol while being over 100 times faster.13

So far, all reports on peptide docking performance suffer from a lack of comparability, as the test sets used for assessment are not publicly accessible. Moreover, it cannot be excluded that these data sets are biased toward a specific tool.7 Therefore, an independently constructed and publicly available benchmark data set of protein−peptide complexes is urgently needed in order to compare available docking programs and to support their further development.

For small molecules, several benchmark data sets for docking and virtual screening exist. In order to evaluate the potential of docking programs to reproduce binding modes as determined by X-ray crystallography, the Astex Diverse Set comprising 85 high quality protein−ligand X-ray crystal structures can be utilized.14 For evaluation of virtual screening performance the Directory of Useful Decoys (DUD) is a popular benchmark data set.15,16 An alternative for virtual screening assessment is provided by the Demanding Evaluation Kits for Objective In silico Screening (DEKOIS).17

Received: April 24, 2015


DOI: 10.1021/acs.jcim.5b00234 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX


In this study we present LEADS-PEP, the first representative of the Lessons for Efficiency Assessment of Docking and Scoring (LEADS) collection. LEADS-PEP is a publicly available benchmark data set that enables the evaluation of docking programs for their potential to reproduce peptide binding modes and to compare different methods and parameters. The collection consists of 53 protein−peptide complexes that have been prepared using a rational and unbiased workflow. In a second step we have utilized the data set for a detailed evaluation of several popular small molecule docking programs and scoring functions, most of which have not been considered for peptide docking so far.

EXPERIMENTAL SECTION

Benchmark Data Set Generation. For generation of the LEADS-PEP data set, a selection process was developed (Figure S1). At first, the Protein Data Bank (PDB)18 was queried for peptide-bound protein X-ray crystal structures with the following constraints: (i) the structure does not contain any DNA or RNA, (ii) it includes experimental data, (iii) the structure contains between two and four chains, (iv) at least one chain is between 2 and 15 amino acids long, (v) the resolution is <2.0 Å, and (vi) the Rfree is <0.3. The query returned 1376 PDB entries (as of 29/05/2015), which were downloaded. Each structure was split into its protein and peptide chains. Peptide chains were further filtered for structures that do not include any hetero atoms and are not covalently linked to the protein chain. Complexes containing hetero atoms within 4 Å of the interface between protein and peptide were removed from the set. Subsequently, PROCHECK19 was used to analyze the residue-by-residue geometry and stereochemical quality of the complexes. Structures containing atoms in close distance (30% of the peptide residues have less than three van der Waals contacts to the protein) and/or crystallization artifacts were excluded. For most peptide lengths (3−12 residues) between five and six complexes were chosen for the data set. It was further attempted to include a broad set of peptides with different characteristics such as acidic, basic, hydrophobic, hydrophilic, or aromatic entities. The final peptide docking benchmark data set contains 53 complexes.

Docking Programs and Scoring Functions. Within this study, we utilized the docking programs GOLD, Surflex-Dock, AutoDock, and AutoDock Vina for the evaluation of their potential to reproduce cocrystallized peptide binding modes. Surflex-Dock21 (version 2.706.13302) is included in SYBYL-X 2.1.1 (Certara L.P., St. Louis, MO, USA). GOLD22 (version 5.2.2) was licensed from the Cambridge Crystallographic Data Centre, Cambridge, UK. AutoDock23 (version 4.2.5.1) and AutoDock Vina24 (version 1.1.2) are open source software available from the Scripps Research Institute. In addition to GOLD’s default scoring function ChemPLP, all other implemented scoring functions (ASP, ChemScore, and GoldScore) were also investigated.

Peptide and Protein Preparation. In order to prevent any bias by using coordinates present in the protein−peptide X-ray crystal structures, all peptides were generated in a linear conformation (backbone torsion angles of 180°) within SYBYL-X 2.1.1 with charged termini and minimized using the Powell method with default settings. It was ensured that the coordinates of the linearized peptides do not align with coordinates of the peptide binding site. Protein structures were prepared using Protonate3D within MOE. All water molecules were removed.

Docking Tools. Surflex-Dock. The protomol file for each complex was built based on all residues within 5 Å of the cocrystallized peptide using a threshold of 0.01 and a bloat of 0. For docking with standard accuracy (SA) the density of search (spindense) was set to 6.0, the number of spins per alignment (nspin) to 12, and the number of additional starting conformations per molecule (multistart) to 6. For high accuracy (HA) docking the following settings were used: spindense 9.0, nspin 24, multistart 12. The Surflex-Dock “Total_Score” was used as the native scoring function.25

AutoDock. AutoDockTools within MGLTools (version 1.5.6) were utilized in order to generate PDBQT format files of the receptor and ligand. Grid maps were calculated with AutoGrid. The grid box was defined based on the cocrystallized ligand using a Python script within MGLTools. Grid dimensions were increased in all six directions by 13 points (4.9 Å). All dockings were performed using the Lamarckian genetic algorithm with the maximum number of energy evaluations set to 2 500 000 (SA) or 25 000 000 (HA). As AutoDock does not handle ligands with more than 32 torsion angles, a recompiled version allowing up to 64 torsion angles was used for larger peptides.

AutoDock Vina. Grid dimensions were adopted from the preparation for AutoDock. The exhaustiveness was set to either 8 (SA) or 100 (HA).

GOLD. The docking site was defined by all residues within 5 Å distance to the cocrystallized peptide. For each available scoring function (ChemScore,26 ChemPLP,27 ASP,28 and GoldScore29) a separate docking was performed. The early termination option was switched off.

Pose Selection and RMSD Calculation. For all docking scenarios the number of docking runs was set to 20. As a measure of the peptide docking accuracy, the root-mean-square deviation (RMSD) for backbone atoms (N, CA, C) was calculated using shell and SPL scripts. In order to evaluate external scoring functions, docking poses were rescored utilizing the rescoring option implemented in GOLD. All four scoring functions (ASP, ChemPLP, ChemScore, GoldScore) were tested with default settings. Only the nonminimized poses were analyzed. Figures with molecular representations were prepared using VMD30 and POV-Ray (www.povray.org).
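As an illustration of the backbone RMSD used as accuracy measure in this study, the following minimal Python sketch computes the deviation over N, CA, and C atoms between a docked pose and the cocrystallized peptide. It is not the shell/SPL implementation used here; the function name `backbone_rmsd`, the dictionary layout of the coordinates, and the example coordinates are assumptions made for illustration. No superposition is performed, because docked and reference peptides are assumed to share the coordinate frame of the receptor.

```python
import math

BACKBONE_ATOMS = ("N", "CA", "C")  # atom names included in the backbone RMSD


def backbone_rmsd(pose, reference):
    """Return the RMSD (in Å) over the backbone atoms of the reference.

    Both arguments map (residue_index, atom_name) -> (x, y, z).
    This layout is an assumption of this sketch, not a format from the study.
    """
    keys = [k for k in reference if k[1] in BACKBONE_ATOMS]
    if not keys:
        raise ValueError("reference contains no backbone atoms")
    missing = [k for k in keys if k not in pose]
    if missing:
        raise ValueError(f"pose lacks backbone atoms: {missing}")
    # Sum of squared coordinate differences over all backbone atoms.
    squared = sum(
        (p - r) ** 2
        for k in keys
        for p, r in zip(pose[k], reference[k])
    )
    return math.sqrt(squared / len(keys))


# Hypothetical coordinates for a single residue, for illustration only.
reference = {(1, "N"): (0.0, 0.0, 0.0), (1, "CA"): (1.5, 0.0, 0.0),
             (1, "C"): (2.0, 1.4, 0.0), (1, "CB"): (2.1, -1.2, 0.0)}
# Docked pose: every atom shifted by 1 Å along z.
pose = {key: (x, y, z + 1.0) for key, (x, y, z) in reference.items()}

rmsd = backbone_rmsd(pose, reference)
```

The side-chain atom (CB) is present in both structures but ignored, so only the three backbone atoms contribute to the result.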



RESULTS

Benchmark Data Set. We set up a workflow resulting in an unbiased selection of protein−peptide complexes with great structural and functional diversity on the basis of all publicly available X-ray crystal structures. For both proteins and


Table 1. Overview of Peptides Included in the LEADS-PEP Benchmark Data Set^a

res  PDB   sequence      heavy atoms  rot bonds  ring count  H-bond acc  H-bond don  MW    log P
3    1B9J  KLK           27           15         0           9           11          390   0.2
3    2OY2  IAG           18           7          0           7           5           259   −0.4
3    3GQ1  WLF           34           11         3           8           6           465   3.5
3    3BS4  NIF           28           11         1           9           7           392   0.1
3    2OXW  IAG           18           7          0           7           5           259   −0.4
3    2B6N  APT           20           6          1           8           5           287   −1.5
4    1TW6  AVPI          28           9          1           9           5           399   1.0
4    3VQG  VTLV          30           12         0           10          7           431   1.0
4    1UOP  GFEP          32           11         2           11          5           447   −0.5
4    4C2C  AVPA          25           7          1           9           5           356   −0.4
4    4J44  AIAV          26           10         0           9           6           372   0.6
5    2HPL  DDLYG         41           18         1           16          8           580   −1.3
5    2V3S  GRFQV         43           19         1           16          14          607   −1.5
5    3NFK  GETRL         40           19         0           17          13          575   −2.3
5    1NVR  ASVSA         30           14         0           13          9           433   −3.3
5    4V3I  DLTRP         42           18         1           17          12          601   −1.9
5    3T6R  ARTKQ         42           22         0           18          18          605   −4.1
6    1SVZ  PQFSLW        56           21         4           17          11          777   0.9
6    3D1E  GQLGLF        45           20         1           15          10          634   −0.2
6    3IDG  ALDKWD        53           23         2           19          12          746   −0.5
6    3LNY  EQVSAV        44           21         0           18          11          631   −2.9
6    4NNM  YPTSII        49           21         2           16          10          693   0.2
6    4Q6H  VQDTRL        51           25         0           21          16          731   −2.8
7    3MMG  ETVRFQS       61           30         1           24          18          866   −3.7
7    3Q47  NPISDVD       53           23         1           22          11          757   −4.0
7    3UPV  PTVEEVD       55           24         1           22          9           785   −1.9
7    4QBR  ARTKQTA       54           28         0           23          21          777   −5.5
7    3NJG  PQIINRP       59           24         2           22          16          838   −2.0
8    1ELW  GPTIEEVD      60           27         1           24          10          856   −2.7
8    3CH8  PQPVDSWV      66           24         4           23          12          926   −1.2
8    4WLB  SLLKKLLD      65           35         0           22          17          930   0.7
8    1OU8  GAANDENY      60           27         1           26          15          851   −6.4
8    1N7F  ATVRTYSC      62           31         1           24          19          901   −3.6
9    3OBQ  PTPSAPVPL     62           20         4           21          9           878   −1.2
9    4BTB  PPPPPPPPP     64           8          9           19          2           892   0.0
9    2W0Z  APPPRPPKP     68           19         6           23          13          958   −1.8
9    4N7H  EAPPSYAEV     68           27         3           25          11          960   −2.3
9    2QAB  KILHRLLQD     80           40         1           29          22          1136  −0.9
10   1H6W  SLNYIIKVKE    85           44         1           29          22          1207  −0.4
10   3BRL  ATSAKATQTD    69           32         0           30          21          993   −8.6
10   1NTV  NFDNPVYRKT    89           40         3           33          25          1254  −4.6
10   4DS1  YAESGIQTDL    77           38         1           30          17          1094  −4.1
10   2O02  GLLDALDLAS    69           33         0           26          13          985   −1.3
11   1N12  SDVAFRGNLLD   85           40         1           33          21          1205  −3.9
11   2XFX  VGYPKVKEEML   90           44         2           30          19          1293  0.3
11   3BFW  DSTITIRGYVR   90           45         1           35          27          1281  −3.6
11   4EIK  SLARRPLPPLP   86           33         4           30          20          1219  −0.6
11   3DS1  ITFEDLLDYYGP  96           44         3           32          16          1345  1.1
12   4J8S  RRLPIFNRISVS  103          49         2           38          32          1461  −2.5
12   2W10  PPPRPTAPKPLL  91           31         6           30          17          1286  −0.4
12   3JZO  LTFEHYWAQLTS  107          48         5           36          22          1495  −1.7
12   4DGY  QLINTNGSWHIN  99           46         3           38          26          1397  −7.2
12   2B9H  RRNLKGLNLNLH  102          51         1           40          34          1451  −5.7

^a LEADS-PEP benchmark data set sorted by peptide length (res, number of residues). For each peptide several physicochemical properties (calculated within MOE) are listed (acc, acceptor; don, donor).
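The drug-likeness of the listed peptides can be checked against Lipinski’s “Rule-of-Five” directly from the values in Table 1. The following Python sketch is illustrative only: it uses the commonly cited thresholds (molecular weight ≤ 500, log P ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) and assumes that the acceptor and donor counts calculated within MOE can stand in for Lipinski's counts, which may not hold exactly. The function name `rule_of_five_violations` is hypothetical.

```python
def rule_of_five_violations(mw, log_p, h_acc, h_don):
    """Count violations of the commonly cited Rule-of-Five thresholds."""
    checks = (
        mw > 500,      # molecular weight above 500
        log_p > 5,     # log P above 5
        h_don > 5,     # more than 5 hydrogen-bond donors
        h_acc > 10,    # more than 10 hydrogen-bond acceptors
    )
    return sum(checks)


# (PDB, sequence, MW, log P, acceptors, donors) copied from Table 1.
table1_rows = [
    ("2OY2", "IAG", 259, -0.4, 7, 5),
    ("1B9J", "KLK", 390, 0.2, 9, 11),
    ("1NVR", "ASVSA", 433, -3.3, 13, 9),
    ("2HPL", "DDLYG", 580, -1.3, 16, 8),
]

violations = {
    pdb: rule_of_five_violations(mw, log_p, acc, don)
    for pdb, _seq, mw, log_p, acc, don in table1_rows
}
```

Under these assumptions the two tripeptides show at most one violation, whereas both pentapeptides show two or more.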

peptides several quality measures (e.g., stereochemical properties) were considered for the selection. Complexes containing heteroatoms (e.g., buffer molecules) in close distance to the peptides and/or peptides with missing atoms were discarded. LEADS-PEP includes proteins of not more than 30% sequence identity. For generation of the current release we mainly


Figure 1. Overview of CPU time required by tested docking approaches for each peptide. Calculations were performed on an Intel Xeon E5-2620 CPU at 2.00 GHz. Abbreviations: SA, standard accuracy settings; HA, high accuracy settings.

concentrated on peptides adopting turn or coil conformations. Ten peptides contain secondary structures. The percentage of residues with secondary structure in these peptides ranges between 33 and 82%. More detailed information on the workflow is shown in Figure S1. The outcome of our selection procedure was a data set comprising 53 high-resolution protein−peptide complexes with peptides composed of 3 to 12 residues and having between 7 and 51 rotatable bonds. Table 1 provides an overview of the data set along with some molecular properties of the peptides. Only peptides possessing between 2 and 4 residues revealed drug-like properties as defined by Lipinski’s “Rule-of-Five”,31 which is often used as a probability criterion in drug discovery to estimate oral bioavailability. All peptides possessing more than four residues featured several “Rule-of-Five” violations. In order to ensure a neutral starting structure, all peptides to be docked were generated as extended conformations (φ/ω torsion angles adopting 180°), and it was ensured that the atomic coordinates do not overlap with the binding sites.

Evaluation of Small Molecule Docking Programs. In a second step the LEADS-PEP benchmark data set was utilized for a detailed analysis of the peptide docking performance of several popular docking tools, namely AutoDock, AutoDock Vina (hereafter termed Vina), Surflex-Dock (hereafter termed Surflex), and GOLD. None of these programs has been specifically designed for handling peptides or other highly flexible ligands. Settings of the programs were not significantly changed compared to those usually used for small molecule docking. In particular, this included a limited number of docking runs (20).

First of all, we analyzed the CPU time required by each program (Figure 1 and Table S1). In general, the computing time increased with peptide length, and the time difference between the shortest and longest peptide reached up to 2 orders of magnitude for the same program. With a median CPU time of 5.6 min, GOLD with ChemPLP (CP) emerged as the fastest program, closely followed by Vina using standard accuracy (SA) settings (5.9 min). However, it should be noted that Vina is parallelized and was run using 8 threads in this study. Computing times for peptides using GOLD with the scoring functions ChemScore (CS, 8.1 min) and ASP (8.2 min) were slightly higher. GoldScore (GS) proved to be the slowest GOLD scoring option (27.1 min). Using SurflexSA the computing time for a peptide was approximately 13 min. With high accuracy (HA) settings the computing time increased to 42.4 min per peptide, which is only slightly longer than for AutoDockSA (40.9 min). AutoDock with high accuracy settings required by far the most computing time per peptide (419.2 min).

The standard measure for assessing the accuracy of redocking performance is the root-mean-square deviation (RMSD) between the docked pose and the experimentally determined conformation. Here, a docking pose was considered a near-native conformation if its backbone RMSD was ≤2.5 Å.7 At first, we investigated the RMSD of top-scored docking poses. An overview of the deviation from the experimentally determined peptide coordinates for all programs tested on the LEADS-PEP data set is given in Table 2. Considering the median RMSD over the whole benchmark data set for the top-ranked pose, GOLD with the GS scoring function proved to be the most accurate docking approach (4.5 Å), closely followed by SurflexSA (4.8 Å), GOLD:CS (4.9 Å), and SurflexHA (5.0 Å). For all other tested docking approaches the median RMSD was above 6 Å. Most programs were capable of reconstructing conformations of shorter peptides (3−4 residues) quite accurately, while longer peptides often caused problems (Table 2). For 4 peptides all docking approaches correctly reproduced the experimentally determined binding modes (1B9J, 1TW6, 4C2C, 4J44), while for 18 others all programs failed to identify a native-like conformation. Using the number of near-native poses as assessment criterion, SurflexSA performed best with 38% of the 53 top-ranked docking poses adopting a near-native conformation. The program not only correctly placed drug-like short peptides (7 out of 11) but also successfully reproduced


near-native conformations of several longer peptides, including two peptides comprising 11 residues. Only three other approaches reached a success rate of 30% (SurflexHA, GOLD:CS, GOLD:GS). GOLD:CP ranked third and reached a 23% success rate. AutoDockSA and both Vina approaches were capable of reproducing 19% of the 53 peptides correctly. GOLD:ASP (15%) and AutoDockHA (17%) showed the worst performance. Only VinaHA, Surflex using standard and high accuracy settings, and GOLD:GS were capable of identifying native conformations of peptides containing 10 or more residues.

Application of high accuracy settings for AutoDock, Vina, and Surflex did not result in improved overall performance. The median RMSD over the whole benchmark data set using AutoDockHA was almost identical to that of the approach using SA settings, and the number of near-native conformations even dropped slightly. For VinaHA and SurflexHA the median RMSD was marginally higher compared to results obtained for standard accuracy settings. The number of near-native poses was identical for both Vina settings but declined by four when using SurflexHA instead of SurflexSA. The number of peptide conformations reproduced correctly by both SA and HA settings was 6 for AutoDock, 8 for Vina, and 13 for Surflex. For 2XFX the application of VinaHA and SurflexHA resulted in near-native conformations of docking poses, while both programs failed to produce accurate peptide conformations when using SA settings. In the case of 3UPV, 3NJG, and 4DS1, SurflexHA outperformed the same program using standard accuracy settings. For a number of protein−peptide complexes, docking programs with standard accuracy settings produced near-native poses but failed to identify a correct pose among the top-ranked conformations when used with HA settings. Two such instances occurred when using Vina (2OY2, 3LNY), four in the case of AutoDock (2OXW, 2HPL, 3D1E, 4BTB), and seven when applying Surflex (2HPL, 3NFK, 3IDG, 4NNM, 1OU8, 2W0Z, and 1H6W).

Figure 2 shows selected examples of near-native peptide conformations produced by different docking programs. VinaSA docked the largely solvent-exposed 3-mer peptide of 3BS4 correctly, and the backbone RMSD compared to the X-ray crystal structure was just 0.6 Å (Figure 2A). Only the N-terminus of the peptide was not correctly placed, resulting in a larger deviation of the asparagine side chain. Although the backbone RMSD of the 2HPL pentapeptide docked with AutoDockSA is reasonably low (1.7 Å), the position of the N-terminal residue was less accurate (RMSD = 4.0 Å) (Figure 2B). Nevertheless, the docking pose completely reproduced the intermolecular hydrogen bond interaction pattern, and only the intramolecular hydrogen bond shared between the aspartate side chain and glycine backbone was shifted toward the tyrosine backbone nitrogen (data not shown). The heptapeptide of 3MMG docked using VinaHA showed different orientations of both terminal residues. Since VinaHA placed four out of five central residues with high accuracy, the overall backbone RMSD was 1.2 Å (Figure 2C). However, it failed to reproduce all hydrogen bonds between protein and peptide. With the exception of both terminal residues, SurflexSA positioned the nonamer peptide of 2W0Z very accurately (RMSD = 1.3 Å; Figure 2D). GOLD:GS reproduced the 1H6W peptide binding mode for seven of the ten residues with high accuracy (Figure 2E). Only the N-terminal amino acids showed larger deviations from the X-ray crystal structure, and the backbone RMSD for the whole peptide is 1.1 Å. The conformation for the 2XFX peptide generated by SurflexHA differed only by 1.4 Å from the X-ray crystal structure (Figure 2F). Coordinates of the N- and C-terminus matched the crystal structure well, but positions of some largely solvent-exposed residues showed larger deviations

Table 2. Peptide Docking Performance as Measured by Best Scored Binding Modes^a

^a Backbone RMSD of the best scored poses are shown in a gradient color code. Highly accurate poses (10.0 Å) is highlighted in dark red. Abbreviations: res, residues; SA, standard accuracy; HA, high accuracy; ASP, Astex Statistical Potential; CP, ChemPLP; CS, ChemScore; GS, GoldScore.


Figure 2. Selected examples of accurately reproduced peptide binding modes. (A) 3BS4 (peptide length: 3 aa, method VinaSA), (B) 2HPL (5 aa, AutoDockSA), (C) 3MMG (7 aa, VinaHA), (D) 2W0Z (9 aa, SurflexSA), (E) 1H6W (10 aa, GOLD:GoldScore), (F) 2XFX (11 aa, SurflexHA). For 2XFX all side chain atoms except for lysine were removed for clarity. Proteins are shown as surface (carbon atoms in gray), peptides as capped sticks (cocrystallized carbon atoms in orange, docked carbon atoms in green).

resulting in a completely different orientation of the lysine side chain. Binding modes of inaccurately docked poses showed shifting along the backbone, large reorientations of the N- and/or C-terminal region, or even completely inverted peptides compared to the X-ray crystal structure conformation (Figure 3, left panel).

In a next step we investigated whether other peptide conformations within the set of 20 poses generated for each peptide agree better with the reference structure. Thus, for each docking approach the peptide pose with the lowest RMSD to the cocrystallized conformation was extracted. Compared to the top-ranked poses, the median RMSD for the best pose set was significantly lower (Table 3). The drop in RMSD varied between 1.5 and 5.0 Å. Four approaches (VinaHA, SurflexSA, SurflexHA, and GOLD:GS) achieved a median RMSD ≤ 2.5 Å. For all docking programs, the number of near-native poses increased for the best pose set compared to the set containing top-ranked poses (Table 3). SurflexSA performed best and identified 29 near-native poses. Results for VinaHA, SurflexHA, and GOLD:GS were almost equally good, with 28 correctly reproduced peptide conformations each. VinaSA and GOLD:CS were able to identify 20 near-native poses. GOLD with either ASP or CP placed 17 peptides correctly. Finally, both AutoDock approaches yielded the lowest number of near-native poses (SA 11, HA 12).

Furthermore, we investigated the actual number of near-native conformations within the set of 20 docking poses (Table 4). Despite similar overall performance in terms of median RMSD (best pose) and number of near-native occurrences over the whole benchmark data set, results for VinaHA, GOLD:GS, and Surflex (both settings) differed considerably. Application of VinaHA resulted in 101 near-native poses (9.5% of the set of 1060 poses) and GOLD:GS produced 183 (17.3%). SurflexSA and SurflexHA generated 340 (32.1%) and 317 (29.9%) near-native conformations, respectively.
For six peptides both SurflexSA and SurflexHA achieved the maximum number of near-native poses. In case of SurflexSA, this included not only short (3−4 res, 1B9J, 3BS4, 4C2C), but also long peptides (10−11 res, 1H6W, 1N12, 3BFW). SurflexHA generated 20 near-native poses for four short (1B9J, 3BS4, 1TW6, 4C2C), one medium-sized (3MMG), and one long (1N12) peptide.
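The distinction between the top-ranked pose, the best sampled pose, and the number of near-native poses per peptide can be made concrete with a short Python sketch. The pose list, scores, and RMSD values below are invented for illustration, and the function name `summarize_poses` is hypothetical. The sketch assumes by default that higher scores are better; this does not hold for every program, since AutoDock and Vina report binding energies for which lower values are better, hence the `higher_is_better` switch.

```python
NEAR_NATIVE_CUTOFF = 2.5  # backbone RMSD threshold in Å, as used in this study


def summarize_poses(poses, higher_is_better=True):
    """Summarize one docking run given (score, backbone_rmsd) pairs."""
    if not poses:
        raise ValueError("no poses given")
    pick = max if higher_is_better else min
    top_ranked = pick(poses, key=lambda p: p[0])     # best score
    best_sampled = min(poses, key=lambda p: p[1])    # lowest RMSD
    n_near_native = sum(1 for _, rmsd in poses if rmsd <= NEAR_NATIVE_CUTOFF)
    return {
        "top_ranked_rmsd": top_ranked[1],
        "best_sampled_rmsd": best_sampled[1],
        "near_native_count": n_near_native,
        "top_ranked_is_near_native": top_ranked[1] <= NEAR_NATIVE_CUTOFF,
    }


# Invented (score, RMSD in Å) pairs; a real run in this study has 20 poses.
example_poses = [(61.2, 6.8), (58.9, 1.4), (57.0, 2.5), (55.3, 9.1)]
summary = summarize_poses(example_poses)
```

In this invented run two poses are near-native, but the scoring function ranks a non-native pose first, which is the situation that rescoring aims to correct.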

Figure 3. Identification of near-native docking poses using rescoring. (A) 1UOP (peptide length: 4 aa, method: VinaHA and ChemPLP). (B) 1SVZ (6 aa, VinaHA and ChemPLP). (C) 1OU8 (8 aa, SurflexHA and ASP). (D) 4DGY (12 aa, VinaHA and ChemPLP). Proteins are shown as surface (carbon atoms in gray), peptides as capped sticks (cocrystallized carbon atoms in orange, best-scored pose green, best rescored pose cyan).

GOLD:GS achieved the maximum number of near-native poses only for drug-like peptides (1B9J, 2OY2, 4C2C, 4J44). For either AutoDock or Vina (both settings) the maximum number


of near-native poses per peptide did not exceed 17 (AutoDockHA, 2OY2) or 11 (VinaHA, 4J44), respectively. In several cases VinaHA, SurflexSA, SurflexHA, and GOLD:GS generated only a few (1−3) near-native conformations within the set of 20 docking poses (Table 4). For example, SurflexSA produced up to three near-native conformations for six peptides, of which one was also top-ranked (2W0Z). Application of SurflexHA resulted in seven peptides with few near-native docking poses, and for two peptides (3NJG, 1ELW) SurflexHA top-ranked these conformations. In more than a third (19) of the 53 peptides VinaHA produced three or fewer near-native poses among the 20 peptide conformations. Only in four cases (3BS4, 4C2C, 4NNM, 2XFX) was the top-ranked conformation also a near-native pose. GOLD:GS produced up to 3 near-native conformations for several peptides (15), with 4 of them top-ranked by GoldScore (3MMG, 4N7H, 1H6W, 3BRL).

Utilization of Rescoring for Improving Peptide Docking Results. The clear discrepancy between the number of top-ranked near-native poses and existing low RMSD conformations indicated a substantial potential for improving the outcome of peptide docking scenarios for several programs. One option is to re-evaluate the docking poses using other scoring functions. The docking program GOLD provides the opportunity to rescore preexisting docking poses using its internal scoring functions. In order to evaluate whether utilization of this functionality may narrow the RMSD gap between the top-ranked and best pose sets, all docking poses generated during our evaluation were rescored with CP, CS, GS, and ASP, respectively. Table 5 provides a summary of the best rescoring option over the whole data set. Most programs did not show any improvement of the overall median RMSD, and when using GOLD with the scoring functions ASP and CS the performance even declined. Only for VinaHA (3.5 Å) did the median RMSD drop significantly. However, much more important than the overall RMSD improvement is the capability to actually increase the number of near-native poses within the top-ranked rescored conformations. Table 6 shows the comparison of near-native conformations of top-scored, best, and top-rescored poses. Results for GOLD:CP did not improve when using rescoring, and for AutoDockSA the number of near-native top-rescored poses declined compared to the top-ranked poses. For all other approaches the number of near-native poses increased by between 6 and 110%. In combination with ASP, CP, or CS, VinaHA showed the best overall improvement, and all three docking/rescoring combinations resulted in 21 near-native conformations, compared to only 10 considering the top-ranked pose. In combination with CP the additional near-native poses comprised peptides composed of 4 (Figure 3A), 5, 6 (Figure 3B), 7, and 8 as well as 11 and 12 (Figure 3D) residues. The performance of SurflexHA in combination with ASP rescoring also improved significantly (38%), resulting in 22 near-native conformations. Additional peptides with near-native conformations are composed of five, six, eight (Figure 3C), and ten residues. In the case of SurflexSA, identified as the most performant tool considering the best scored set, rescoring increased the number of near-native conformations by 10% to 22 using either CP or CS. All other docking/rescoring approaches resulted in fewer than 20 near-native conformations (Table 6).

Although rescoring turned out to significantly improve the overall redocking performance for several programs, in a few cases the top-rescored poses adopted non-native conformations while the best scored docking pose fulfilled the 2.5 Å criterion. Such unwanted events occurred for example when using VinaHA with CP or CS rescoring (2XFX, top-scored pose: 2.0 Å, top-

Table 3. Peptide Docking Performance as Measured by Best Sampled Binding Modes^a

^a Lowest backbone RMSD values from each docking run are shown in a gradient color code. Highly accurate poses (10.0 Å) is highlighted in dark red. Abbreviations: res, residues; SA, standard accuracy; HA, high accuracy; ASP, Astex Statistical Potential; CP, ChemPLP; CS, ChemScore; GS, GoldScore.


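Table 4. Number of Near-Native Poses Generated by Each Docking Approach^a

| res | PDB | AutoDock SA | AutoDock HA | Vina SA | Vina HA | Surflex SA | Surflex HA | GOLD:ASP | GOLD:CP | GOLD:CS | GOLD:GS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 1B9J | 6 | 14 | 3 | 9 | 20 | 20 | 20 | 20 | 20 | 20 |
| 3 | 2OY2 | 8 | 17 | 1 | 4 | 2 | 0 | 20 | 20 | 20 | 20 |
| 3 | 3GQ1 | 3 | 13 | 6 | 10 | 13 | 16 | 5 | 15 | 17 | 15 |
| 3 | 3BS4 | 0 | 2 | 3 | 3 | 20 | 20 | 6 | 11 | 14 | 9 |
| 3 | 2OXW | 5 | 7 | 2 | 2 | 1 | 0 | 9 | 4 | 18 | 1 |
| 3 | 2B6N | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 3 | 3 |
| 4 | 1TW6 | 14 | 16 | 4 | 6 | 19 | 20 | 10 | 14 | 8 | 19 |
| 4 | 3VQG | 0 | 0 | 5 | 6 | 10 | 19 | 1 | 0 | 1 | 4 |
| 4 | 1UOP | 2 | 4 | 0 | 2 | 7 | 0 | 1 | 0 | 0 | 5 |
| 4 | 4C2C | 5 | 13 | 3 | 3 | 20 | 20 | 8 | 10 | 13 | 20 |
| 4 | 4J44 | 10 | 8 | 6 | 11 | 11 | 4 | 14 | 17 | 16 | 20 |
| 5 | 2HPL | 2 | 1 | 2 | 3 | 9 | 2 | 0 | 0 | 1 | 0 |
| 5 | 2V3S | 0 | 0 | 4 | 2 | 0 | 0 | 3 | 6 | 4 | 6 |
| 5 | 3NFK | 0 | 0 | 0 | 2 | 10 | 9 | 1 | 0 | 0 | 2 |
| 5 | 1NVR | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 |
| 5 | 4V3I | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 5 | 3 |
| 5 | 3T6R | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 6 | 4 | 7 |
| 6 | 1SVZ | 0 | 0 | 0 | 3 | 3 | 3 | 0 | 0 | 0 | 0 |
| 6 | 3D1E | 1 | 0 | 0 | 0 | 5 | 4 | 0 | 1 | 1 | 0 |
| 6 | 3IDG | 0 | 0 | 0 | 1 | 17 | 0 | 0 | 0 | 0 | 0 |
| 6 | 3LNY | 0 | 0 | 3 | 2 | 0 | 0 | 3 | 3 | 6 | 1 |
| 6 | 4NNM | 0 | 0 | 1 | 3 | 19 | 18 | 1 | 1 | 2 | 8 |
| 6 | 4Q6H | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 2 | 0 |
| 7 | 3MMG | 0 | 0 | 0 | 4 | 17 | 20 | 0 | 0 | 0 | 1 |
| 7 | 3Q47 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 1 |
| 7 | 3UPV | 0 | 0 | 2 | 2 | 8 | 5 | 0 | 0 | 0 | 0 |
| 7 | 4QBR | 0 | 0 | 2 | 6 | 18 | 13 | 0 | 0 | 0 | 7 |
| 7 | 3NJG | 0 | 1 | 1 | 5 | 6 | 3 | 0 | 0 | 2 | 2 |
| 8 | 1ELW | 0 | 0 | 0 | 0 | 12 | 1 | 0 | 0 | 0 | 3 |
| 8 | 3CH8 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |
| 8 | 4WLB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 1OU8 | 0 | 0 | 0 | 0 | 16 | 12 | 0 | 0 | 0 | 0 |
| 8 | 1N7F | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 3OBQ | 0 | 0 | 0 | 0 | 14 | 13 | 0 | 2 | 0 | 1 |
| 9 | 4BTB | 1 | 0 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 2W0Z | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 9 | 4N7H | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 9 | 2QAB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | 1H6W | 0 | 0 | 0 | 0 | 20 | 16 | 0 | 0 | 0 | 1 |
| 10 | 3BRL | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| 10 | 1NTV | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | 4DS1 | 0 | 0 | 0 | 0 | 1 | 11 | 0 | 0 | 0 | 1 |
| 10 | 2O02 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 11 | 1N12 | 0 | 0 | 0 | 0 | 20 | 20 | 0 | 0 | 0 | 1 |
| 11 | 2XFX | 0 | 0 | 0 | 1 | 0 | 15 | 0 | 0 | 0 | 0 |
| 11 | 3BFW | 0 | 0 | 0 | 0 | 20 | 19 | 0 | 0 | 0 | 0 |
| 11 | 4EIK | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 11 | 3DS1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | 4J8S | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | 2W10 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 12 | 3JZO | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | 4DGY | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | 2B9H | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| sum | | 57 | 97 | 54 | 101 | 340 | 317 | 106 | 134 | 158 | 183 |

^a Abbreviations: res, residues; SA, standard accuracy; HA, high accuracy; ASP, Astex Statistical Potential; CP, ChemPLP; CS, ChemScore; GS, GoldScore.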
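As an illustration of how counts such as those in Table 4 could be derived, the following minimal Python sketch computes a backbone RMSD between a docked pose and the crystallographic peptide and counts the poses that meet the 2.5 Å criterion. The function names, the use of an in-place RMSD without superposition, and the inclusive cutoff are assumptions made for this example; they are not taken from the workflow described in the article.

```python
import math

NEAR_NATIVE_CUTOFF = 2.5  # Å, near-native criterion quoted in the text


def backbone_rmsd(pose, reference):
    """RMSD over matched backbone atoms, computed in place (no refitting).

    Both arguments are equally long lists of (x, y, z) tuples in the same
    atom order. Skipping superposition is an assumption of this sketch.
    """
    if len(pose) != len(reference) or not pose:
        raise ValueError("pose and reference need the same non-zero length")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(pose, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(pose))


def count_near_native(rmsds, cutoff=NEAR_NATIVE_CUTOFF):
    """Number of poses whose RMSD is at or below the cutoff (inclusive here)."""
    return sum(1 for value in rmsds if value <= cutoff)


# Hypothetical three-atom example: the pose is shifted by 1 Å along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
pose = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(backbone_rmsd(pose, reference))

# Hypothetical RMSD values for the poses of one docking run.
print(count_near_native([0.8, 2.5, 2.6, 7.1]))
```

Summing such per-peptide counts over all complexes would give a column total of the kind shown in the last row of the table.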

rescored pose 3.8 Å), SurflexSA with CS (1OU8, 1.7 vs 2.8 Å; 2W0Z, 1.3 vs 4.3 Å), SurflexHA with ASP (4QBR, 1.2 vs 12.3 Å), or GOLD:GS with ASP (3BS4, 0.9 vs 5.4 Å; 4QBR, 1.9 vs 11.5 Å).
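The cases listed above share one pattern: the top-scored pose meets the 2.5 Å criterion while the top-rescored pose does not. A minimal Python sketch of that check, using the RMSD pairs quoted in the text, might look as follows. The function name and the inclusive treatment of the cutoff are assumptions of this example.

```python
CUTOFF = 2.5  # Å, near-native criterion quoted in the text

# (docking/rescoring approach, PDB ID, top-scored RMSD, top-rescored RMSD),
# values as quoted in the text above.
CASES = [
    ("VinaHA with CP or CS", "2XFX", 2.0, 3.8),
    ("SurflexSA with CS", "1OU8", 1.7, 2.8),
    ("SurflexSA with CS", "2W0Z", 1.3, 4.3),
    ("SurflexHA with ASP", "4QBR", 1.2, 12.3),
    ("GOLD:GS with ASP", "3BS4", 0.9, 5.4),
    ("GOLD:GS with ASP", "4QBR", 1.9, 11.5),
]


def is_rescoring_regression(top_scored, top_rescored, cutoff=CUTOFF):
    """True if rescoring replaced a near-native top pose with a non-native one."""
    return top_scored <= cutoff < top_rescored


regressions = [case for case in CASES if is_rescoring_regression(case[2], case[3])]
for approach, pdb_id, before, after in regressions:
    print(f"{approach}: {pdb_id} {before} Å -> {after} Å")
```

All six quoted cases satisfy the condition, since each top-scored RMSD is at most 2.0 Å and each top-rescored RMSD is at least 2.8 Å.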


docking. We restricted our analysis to docking/rescoring approaches with a success rate of at least 33%. The data set for analysis included VinaHA:CP, SurflexSA:CS, SurflexHA:ASP, and GOLD:GS:ASP. First, we evaluated whether an increasing number of rotatable bonds (see also Table 1) has negative effects on the docking performance (Figure 4A). This was true for all programs; in particular, the performance of GOLD:GS:ASP was highly dependent on the number of rotatable bonds. We also evaluated the impact of the peptide conformation in the bound state on the docking performance. For this purpose the ratios between the maximum Cα−Cα distances of bound and linearized peptides were determined. A low ratio indicates a more folded peptide while a high ratio reveals a linear conformation. Within the LEADS-PEP data set 4DGY (bound/linear ratio: 0.13), 4J8S (0.42), and 3DS1 (0.43) possess the lowest elongation, while several peptides are fully stretched when in complex with their target protein (4C2C, 1.02; 3OBQ, 1.09; 2W0Z, 1.26). Except for VinaHA:CP, we observed a correlation between elongation and RMSD for both Surflex approaches as well as GOLD (Figure 4B). There was also a clear trend for these three approaches when investigating the number of intramolecular hydrogen bonds in the peptides. These are expected to occur in peptides with a more condensed conformation. Within LEADS-PEP, peptides contain between zero and nine such hydrogen bonds. As shown in Figure 4C, an increase in intramolecular hydrogen bonds correlated with a loss in docking accuracy for Surflex and GOLD. Recently, it has been suggested that the presence of free charged side chains strongly correlates with docking success.13 In our data set, the majority of peptides possess no (36) or a single (14) free charged side chain. Of the four best docking/scoring options, only the performance of VinaHA:ASP proved to be strongly dependent on the number of free charged side chains (Figure 4D).
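The elongation measure and the correlation analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Cα coordinates are hypothetical, the way the linearized reference conformation is generated is not specified here and is simply taken as given input, and the p values reported in Figure 4 are not reproduced. All function names are invented for this example.

```python
import math
from itertools import combinations


def max_ca_distance(ca_coords):
    """Largest pairwise distance between Cα atoms, given (x, y, z) tuples."""
    return max(math.dist(a, b) for a, b in combinations(ca_coords, 2))


def elongation_ratio(bound_ca, linear_ca):
    """Ratio of maximum Cα-Cα distances: bound conformation over linearized one."""
    return max_ca_distance(bound_ca) / max_ca_distance(linear_ca)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical three-residue peptide: straight reference vs. bent bound state.
linear = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
print(elongation_ratio(bound, linear))

# Hypothetical per-peptide values: a property (x) against backbone RMSD (y).
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In practice, one property value and one RMSD per complex would be passed to `pearson_r` for each docking/rescoring combination. Ratios above 1, as reported for some complexes, simply mean that the bound peptide spans a larger distance than the linearized reference used.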

Table 5. Overview of Best Docking/Rescoring Combinations^a



DISCUSSION

For the determination of binding modes of small molecules at their protein target, or for the identification of bioactive molecules from a set of active and nonactive compounds, current docking programs and scoring functions have reached an acceptable performance.33,34 However, docking of highly flexible peptides remains a computational challenge, and only a few specific peptide docking programs have been developed so far. Up to now, no standards for an intermethod comparison of docking and scoring performance have been established, as has been done for small molecule docking and screening.14−16 To evaluate their tools, developers of peptide docking programs have often used self-constructed benchmark data sets; a biased selection therefore cannot be completely excluded, and the prepared structures are usually not available to other researchers. Reconstruction of these data sets is error-prone, as the preparation procedure of the proteins (e.g., protonation states, amide and histidine side chain corrections, inclusion of water molecules) may differ. Therefore, we created a benchmark data set for peptide docking with several advantages: (a) LEADS-PEP is publicly available at www.leads-x.org. (b) It is not biased toward any docking program. (c) It is ready to use, as fully prepared protein and peptide structures are provided. Our collection contains 53 high-resolution protein−peptide complexes with peptide lengths ranging from 3 to 12 residues. The selection process was guided by quality and sequential diversity of the structures using an objective and reproducible workflow. Starting from a large set

^a Backbone RMSD values of the best docking/scoring combinations are shown in a gradient color code. Highly accurate poses (10.0 Å) is highlighted in dark red. Abbreviations: res, residues; SA, standard accuracy; HA, high accuracy; ASP, Astex Statistical Potential; CP, ChemPLP; CS, ChemScore; GS, GoldScore.

Molecular Properties Determining Peptide Docking Success. In a final step we investigated which molecular properties of the peptides influence the success of peptide


Table 6. Overview of Near-Native Conformations Obtained with Different Approaches and Best Rescoring Options^a

| | AutoDock SA | AutoDock HA | Vina SA | Vina HA | Surflex SA | Surflex HA | GOLD:ASP | GOLD:CP | GOLD:CS | GOLD:GS |
|---|---|---|---|---|---|---|---|---|---|---|
| best score | 10 | 9 | 10 | 10 | 20 | 16 | 8 | 12 | 16 | 16 |
| best pose | 11 | 12 | 20 | 28 | 29 | 28 | 17 | 17 | 20 | 28 |
| rescoring | 9 | 11 | 12 | 21 | 22 | 22 | 12 | 12 | 17 | 18 |
| best rescoring options | ASP, CS | CS | CS, CP, GS | CP, CS, ASP | CS, CP | ASP | CS | CP | GS | ASP, CS |

^a Number of near-native conformations obtained for the top-ranked docking pose (best score), the pose with the lowest RMSD (best pose), and the top-rescored pose (rescoring). Best rescoring options are sorted by overall median RMSD. Abbreviations: SA, standard accuracy; HA, high accuracy; ASP, Astex Statistical Potential; CP, ChemPLP; CS, ChemScore; GS, GoldScore.
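The relative changes quoted in the text can be recomputed from the "best score" and "rescoring" rows of Table 6. The short Python sketch below does this; it also assumes, for illustration only, that the success rate is the fraction of the 53 complexes with a near-native top-rescored pose. The variable and function names are made up for this example.

```python
# Near-native counts from Table 6: (top-ranked pose, top-rescored pose).
TABLE6 = {
    "AutoDockSA": (10, 9),
    "AutoDockHA": (9, 11),
    "VinaSA": (10, 12),
    "VinaHA": (10, 21),
    "SurflexSA": (20, 22),
    "SurflexHA": (16, 22),
    "GOLD:ASP": (8, 12),
    "GOLD:CP": (12, 12),
    "GOLD:CS": (16, 17),
    "GOLD:GS": (16, 18),
}
N_COMPLEXES = 53  # size of the LEADS-PEP data set


def percent_change(before, after):
    """Relative change in percent when going from `before` to `after`."""
    return 100.0 * (after - before) / before


changes = {name: percent_change(b, a) for name, (b, a) in TABLE6.items()}
gains = [value for value in changes.values() if value > 0]
print(min(gains), max(gains))  # smallest and largest improvement

# Assumption: success rate = near-native top-rescored poses / 53 complexes.
selected = [name for name, (_, a) in TABLE6.items() if a / N_COMPLEXES >= 0.33]
print(selected)
```

Under this reading, the improvements span 6.25% (GOLD:CS) to 110% (VinaHA), AutoDockSA drops by 10%, GOLD:CP is unchanged, and the approaches reaching a 33% success rate are VinaHA, SurflexSA, SurflexHA, and GOLD:GS.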

Figure 4. Evaluation of factors affecting performance for the best performing docking/rescoring options. (A) Number of rotatable bonds (VinaHA:CP, Pearson’s correlation coefficient r = 0.31, two-tailed p value = 0.02*; SurflexSA:CS, r = 0.41, p = 0.003**; SurflexHA:ASP, r = 0.37, p = 0.007**; GOLD:GS:ASP, r = 0.51, p = 0.0001***). (B) Ratio between maximum Cα−Cα distances of bound and linear peptides (VinaHA:CP, r = 0.02, p = 0.89; SurflexSA:CS, r = 0.40, p = 0.003**; SurflexHA:ASP, r = 0.41, p = 0.002**; GOLD:GS:ASP, r = 0.40, p = 0.003**). (C) Number of intramolecular hydrogen bonds (VinaHA:CP, r = 0.10, p = 0.46; SurflexSA:CS, r = 0.48, p = 0.0003***; SurflexHA:ASP, r = 0.54, p =