Subscriber access provided by George Mason University Libraries & VIVA (Virtual Library of Virginia)
Article
Indirect Readout in Protein-peptide Recognition: A Different Story from Classical Biomolecular Recognition Hua Yu, Peng Zhou, Maolin Deng, and Zhicai Shang J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/ci5000246 • Publication Date (Web): 07 Jul 2014 Downloaded from http://pubs.acs.org on July 9, 2014
Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.
Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Page 1 of 43
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Information and Modeling
Indirect Readout in Protein-peptide Recognition: A Different Story from Classical Biomolecular Recognition Hua Yu,† Peng Zhou,‡ Maolin Deng,*,§ Zhicai Shang*,† †
‡
Department of Chemistry, Zhejiang University, Hangzhou 310027, China
Center of Bioinformatics (COBI), School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China §
Department of Mechanics, Zhejiang University, Hangzhou 310027, China
* Corresponding author. For Zhicai Shang: E-mail:
[email protected] * Corresponding author. For Maolin Deng: E-mail:
[email protected] ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
ABSTRACT: Protein-peptide interactions are prevalent and play essential roles in many living activities. Peptides recognize their protein partners by direct non-bonded interactions and indirect adjustment of conformations. Although processes of protein-peptide recognition have been comprehensively studied in both sequences and structures recently, flexibility of peptides and the configuration entropy penalty in recognition did not get enough attention. In this study, 20 protein-peptide complexes and their corresponding unbound peptides were investigated by molecular dynamics simulations. Energy analysis revealed that configurational entropy penalty introduced by restriction of the degrees of freedom of peptides in indirect readout process of protein-peptide recognition is significant. Configurational entropy penalty has become the main content of the indirect readout energy in protein-peptide recognition instead of deformation energy which is the main source of the indirect readout energy in classical biomolecular recognition phenomena, such as protein-DNA binding. These results provide us a better understanding of protein-peptide recognition and give us some implications in peptide ligand design.
1.
INTRODUCTION Protein-peptide interactions play important roles in many cellular processes, such as
signal transduction, apoptotic, immune system and other pathways.1-3 Hence, they are also attractive targets for drug design. Recent years, lots of studies have been done to investigate the sequence and structural basis of protein-peptide binding process.4-8 As
ACS Paragon Plus Environment
Page 2 of 43
Page 3 of 43
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Information and Modeling
knowing more about protein-peptide interactions, lots of researchers have tried to design peptide ligands though rational designs that have shown potent inhibition in vitro or in vivo are still rare, which drives us to do further exploration.9-10 Traditionally, the classical biomolecular recognition can be decomposed into two processes in terms of thermodynamics. First, the two partners change their conformations to the one adopted in the complex, which is called indirect readout. Then, the two right conformations bind through direct non-bonded interactions, which is called direct readout. Both processes contribute to the total binding free energy change. A lot of researches have been done by Akinori Sarai’s11-16 group and other researchers17 to evaluate the direct and indirect readout energies and their relative contributions to the sequence specificity of the protein-DNA and drug-DNA complexes. In their reports, indirect readout energy comes from the conformational energy difference which is called deformation energy between the bound and the unbound DNA conformations. However, situations may be different in protein-peptide indirect readout for the great flexibility of peptides. In 2007, Chang and co-workers reported that ligand configurational entropy loss upon protein-ligand binding is neglected or underestimated for a long time by analyzing the association of amprenavir with HIV protease.18 In 2009, Killian researched the binding of Tsg101 ubiquitin E2 variant domain with an HIV-derived PTAP nonapeptide and noted that the configurational entropy changes on binding play a role as important as non-bonded interactions.19 Thus, entropy penalty (−∆) introduced by the restriction of degrees of freedom of the bound peptide may be large and cannot be
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
ignored in protein-peptide recognition. However, almost all present studies on entropy cost focus on one specific complex just like mentioned above. We lack a systematic investigation on many different protein-peptide complexes. Hence, in this article we addressed this problem via molecular dynamics simulations on 20 protein-peptide complexes with various peptide structures, various binding modes and various cellular activities selected from the Protein Data Bank (PDB)20. Present analysis revealed that peptide configurational entropy penalty (−∆) is significant and becomes the main source of the indirect readout energy in protein-peptide recognition. This suggests us to pay more attention to the flexibility of peptides whenever predicting binding affinity or designing peptide ligands, and so on. 2.
MATERIALS AND METHODS 2.1. The Dataset. All of the protein-peptide complexes21-38 were obtained from the
PDB. We selected manually the complexes that have different structures of peptides (Table 1), different binding modes and various cellular activities with available dissociation constants (Kd) (Table 2). All peptides consist of natural amino acids only. The statistical numbers of various amino acids present in peptides of the dataset were summarized in Figure S1. Proline is overrepresented because of the selection of proline-rich peptides. There were two kinds of objects in our simulations. One was the protein-peptide complex. The other was the unbound peptide obtained from the corresponding complex. Here, for complexes that have multiple molecular models, we uniformly chose the first one. In theory, no matter which model we use, the simulation
ACS Paragon Plus Environment
Page 4 of 43
Page 5 of 43
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Information and Modeling
result will be same as various NMR models are very likely the actual ensemble of conformations in solution39. However, a certain degree of deviation among energy results of different models always exists because of the limited time scale and randomness of the dynamics simulations in practice (1H3H, 1H3H-model-3, 1H3H-model-18, 2KHH, 2KHH-model-12 and 2KHH-model-20 in Table S2). Though the energy variation didn’t influence the main conclusion of this paper significantly, it did have a little effect on the correlation analysis in Figure 4 below. A detailed analysis of this kind of influence was shown in Figure S2. 2.2. Molecular Dynamics Simulations. All of the molecular dynamics (MD) simulations were performed using GROMACS (ver.4.5.4) software package40 with the Amber0341 force field and the TIP3P42 water model. Hydrogen atoms present in the original coordinate files were ignored and then added according to the description in the Amber03 force field using “pdb2gmx” tool available in GROMACS. Default protonation states were used for all residues. Using the “editconf” tool, each protein-peptide complex\unbound peptide was placed in the center of a dodecahedron box. Distance between the solute and the box was 12 Å. Then the box with the solute inside was immersed into TIP3P water using the GROMACS utility program “genbox”. In each system, proper numbers of Na+ or Cl- were added to keep neutralization using “genion”. This step was followed by steepest descent energy minimization until the maximum force was smaller than 100 KJ mol-1 nm-1 on any atom. Afterwards, each system was equilibrated in two stages. First, a 1 ns simulation was performed under a constant
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
volume ensemble (NVT) with a harmonic position restraint applied on heavy atoms of the solute. Second, a 1 ns simulation was performed under a constant pressure ensemble (NPT) without any restraint. All production simulations were conducted for 60 ns under NPT ensemble. For all the simulations, the default linear constraint solver (LINCS)43 algorithm was used to constrain all bonds containing hydrogen atoms. Periodic boundary conditions were applied in all three directions. Berendsen’s44 coupling algorithm with constants of 0.1 ps and 1.0 ps was used to maintain temperature (300 K) and pressure (1 bar), respectively. Solute, solvent and the counter ions were coupled separately for temperature coupling. The particle-mesh-Ewald (PME)45-46 method was used to calculate electrostatic interactions. Van der Waals interactions were treated using a cutoff of 14 Å. Time step was set to 2 fs. Every 1 ps snapshot was collected for the whole simulation of 60 ns. 2.3. Calculation of the Direct and Indirect Readout Energetic Components Involved in Protein-peptide Recognition. The same Amber03 force field and the trajectory of the last 20 ns were used to calculate energies for all the simulations. Any necessary analysis tool was available in GROMACS (ver.4.5.4). 2.3.1. Direct Readout Energy. The direct readout energy (∆ ) was estimated as the sum of the non-bonded interaction energy ( ) and the solvation free energy (∆ ) in the direct readout process. ∆ = + ∆ where,
ACS Paragon Plus Environment
(1)
Page 6 of 43
Page 7 of 43
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Information and Modeling
= 〈 〉 + 〈 〉
(2)
where 〈⋯ 〉 denotes the average result for a set of structures of a period simulation trajectory. The non-bonded interaction energy contains electrostatic interaction energy ( ) and van der Waals interaction energy ( ) between peptide and its protein partner. Both can be obtained using “G-energy” tool within GROMACS. ∆ = − !"#$ − "#%
(3)
where, = 〈& 〉 + 〈 & 〉
(4)
We used GBSA47-48 implemented in GROMACS (ver.4.5.4) to get the value of , where the polar free energy of solvation, & , was calculated from generalized Born equation while the non-polar term, & , was calculated using a constant of 2.25936 KJ mol-1 nm-2 for the value of surface tension with SA algorithms. Here, trajectories of protein and peptide needed for calculations of !"#$ and "#% were extracted from the trajectory of the corresponding complex. Note that the entropic term of solvent was included here in the solvation free energy and hence in the direct readout energy. Therefore, the direct readout energy is a different concept from enthalpy change of protein-peptide binding. 2.3.2. Indirect Readout Energy. The indirect readout energy ( ∆ ) in protein-peptide recognition was estimated as peptide configurational entropy penalty (−∆) which is only one part of the total entropy penalty of protein-peptide recognition. ∆ = −∆ = −'〈&& ()$% 〉 − 〈&& )$()$% 〉*
ACS Paragon Plus Environment
(5)
Journal of Chemical Information and Modeling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 8 of 43
where && the absolute peptide configurational entropy, was computed based on the , Quasiharmonic approach implemented in GROMACS. It describes the entropy associated with fluctuations of the 3N-6 internal coordinates of the molecule and contains both vibrational entropy and conformational entropy. This term is the reflection of flexibility. In this study, we mainly used the entropy difference of the same peptide in bound and unbound states. Though it was reported that Quasiharmonic approach overestimates the absolute configurational entropy of the flexible systems,49-51 the entropy difference of the same molecule will be relatively reliable as the errors tend to be canceled.18,49
&& ()
%$and
&& )$()
%$are
the
absolute
peptide
configurational entropies in bound and unbound states, respectively. The simulation temperature T = 300K was used here. 2.3.3. Binding Free Energy. Apparently, the final binding free energy of protein-peptide recognition is the sum of the direct readout energy and the indirect readout energy. ∆ + = ∆ + ∆
(6)
2.3.4. Deformation Energy. Deformation energy, ∆ -. , was estimated as the total energy change of peptide from the unbound to bound state. ∆ -. = 〈 && ()$% 〉 − 〈 && )$()$% 〉
(7)
in which, 〈 && 〉, the average total energy of peptide, was coarsely obtained from the trajectory which only contains peptide with all other molecules and ions removed. Perhaps, this will lead to an overestimation of the potential energy, especially for the
ACS Paragon Plus Environment
Page 9 of 43
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Information and Modeling
bound peptide without considering influences of both protein and water molecules. As kinetic and potential energies of the peptide constitute && , as && = && / 01 + && &0 0
(8)
the deformation energy will be overestimated to some degree. The value of 〈 && 〉 was directly obtained via “g_energy” tool in Gromacs. 2.3.5. Z-value and W-value. In order to measure how strongly peptide configurational entropy penalty influences binding free energy, we defined a Z-value as Z-value = 2
∆345647 ∆3647
2
(9)
It’s also called entropy effect in the context below. A bigger Z-value means a bigger adverse effect that peptide configurational entropy penalty makes to the binding free energy, while a smaller one means the opposite. Similarly, we introduced a W-value to estimate how much difference deformation energy of peptide can make to the binding process. ∆869:;7