Multimodal Structural Distribution of the p53 C-Terminal Domain upon


Biomolecular Systems

Multimodal Structural Distribution of the p53 C-terminal Domain Upon Binding to S100B via a Generalised Ensemble Method: From Disorder to Extra-Disorder Shinji Iida, Takeshi Kawabata, Kota Kasahara, Haruki Nakamura, and Junichi Higo J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b01042 • Publication Date (Web): 11 Mar 2019 Downloaded from http://pubs.acs.org on March 18, 2019



Journal of Chemical Theory and Computation

Multimodal Structural Distribution of the p53 C-terminal Domain Upon Binding to S100B via a Generalised Ensemble Method: From Disorder to Extra-Disorder

Shinji Iida†, Takeshi Kawabata†, Kota Kasahara§, Haruki Nakamura†, Junichi Higo#,*

† Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan.
§ College of Life Sciences, Ritsumeikan University, Noji-higashi 1-1-1, Kusatsu, Shiga 525-8577, Japan.
# Graduate School of Simulation Studies, University of Hyogo, 7-1-28 Minatojimaminamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan.

ABSTRACT: Intrinsically disordered regions (IDRs) of a protein bind flexibly when recognising a partner molecule. Moreover, it is recognised that binding of IDRs to a partner molecule is accompanied by folding, with a variety of bound conformations often being allowed in formation of the complex. In this study, we investigated a fragment of the disordered p53 C-terminal domain (CTDf) that interacts with one of its partner molecules, S100B, as a representative IDR. Although the 3D structure of CTDf in complex with S100B has been previously reported, the specific interactions remained controversial. To clarify these interactions, we performed generalised ensemble molecular dynamics (MD) simulations (virtual-system coupled multicanonical MD, termed V-McMD), which enable effective conformational sampling beyond that provided by conventional MD. These simulations generated a multimodal structural distribution for our system including CTDf and S100B, indicating that CTDf forms a variety of complex structures upon binding to S100B. We confirmed that our results are consistent with chemical shift perturbations and nuclear Overhauser effects that were observed in previous studies. Furthermore, we calculated the conformational entropy of CTDf in bound and isolated (free) states. Comparison of these CTDf entropies indicated that the disordered CTDf shows a further increase in conformational diversity upon binding to S100B. Such entropy gain upon binding may be an important feature of complex formation for IDRs.

INTRODUCTION

Intrinsically disordered regions (IDRs) in proteins are known to be ubiquitous in eukaryotic cells. Moreover, although the structure-function relationship constitutes an important paradigm in protein science, increasing knowledge of IDRs has necessitated re-definition of the paradigm. Currently, it is thought that IDRs recognise their partner molecules via a flexible binding mechanism, by which IDRs function as hubs in protein-protein interaction networks1–3. IDRs have been shown to fold upon binding to partner molecules, which is referred to as ‘coupled folding and binding’ or ‘disorder-to-order transition’, and they often form an ambiguous complex structure termed a ‘fuzzy complex’4–7. Such interactions of IDRs with their binding partners have been widely investigated through experiments, simulations, or their combination8–13.

A well-known IDR-partner complex is that of the C-terminal domain (CTD) of the tumour suppressor protein p53 and S100B. S100B inhibits p53-dependent transcriptional activation, with the inhibition dependent upon intracellular calcium concentration14. S100B is also used as a diagnostic marker of cancer15. Calcium binding promotes conformational changes of S100B, thereby exposing hydrophobic residues of S100B to CTD14. Notably, this CTD interacts with numerous partner protein molecules, with several CTD-partner complex structures having been determined experimentally. In particular, CTD binds to four partners via the same amino acid region in CTD14,16–18. However, on the binding surface of each of the four partner molecules, CTD forms a different secondary structure: an α-helix on S100B (Protein Data Bank19,20 (PDB) ID: 1DT7)14, a β-sheet on Sir2 (PDB ID: 1MA3)18, and coiled structures on CBP (PDB ID: 1JSP)17 and on cyclin A (PDB ID: 1H26)16. Specifically, the structure of the CTD-S100B complex, which was determined using nuclear magnetic resonance (NMR) spectroscopy, comprises a dimer of the heterodimer (i.e., a dimer of CTD-S100B; see Figure S1A)14.

In turn, dynamic and structural properties of an isolated CTD peptide in solution were investigated using computer simulations. These studies, including one of our previous studies21, demonstrated that CTD in isolation possessed the ability to form
helical conformations21–26. This suggested that pre-formed helical conformations were populated in isolation, which may provide support for the conformational selection/population shift model27,28; however, the induced folding model could not be refuted27,28. Moreover, interactions of CTD in complex with S100B were studied via Monte Carlo (MC) or molecular dynamics (MD) simulations in previous computational studies22–25,29–31. Chen et al. indicated that non-helical CTD conformations existed in the global minimum of the complex state23. The other computational studies also demonstrated that the helical conformation of CTD was not stable in the complex22,24,25,29,30; i.e., instability of helical bound conformations was observed for various force fields. Furthermore, McDowell et al. noted that the intermolecular nuclear Overhauser effects (NOEs) for this complex, which were used to determine the dimer of heterodimer structure14, are relatively weak29. These earlier studies imply that the heterodimer of CTD-S100B possibly forms a structurally heterogeneous complex. However, the specific interactions of CTD with S100B remain controversial.

In the present study, we aimed to reveal and characterise interactions of the CTD fragment (CTDf) upon binding to S100B. To enable effective conformational sampling beyond that provided by conventional MD, we utilised a generalised ensemble method, virtual-system coupled multicanonical MD (V-McMD) simulation32. By performing all-atom V-McMD simulations using an explicit solvent model, we obtained a thermally equilibrated structural ensemble consisting of a variety of structures of the S100B-CTDf complex, from which we suggest that S100B-CTDf forms a heterogeneous complex. We also found that contact probabilities of the S100B residues with CTD were comparable with previously reported chemical shift perturbations33 and that our structural ensemble satisfies well the NOE-derived distance limits established in the structure determination study14. Finally, a conformational distribution of bound-state CTDf was compared with that of free-state CTDf (i.e., in isolation).

MATERIALS AND METHODS

CTDf-S100B System

In this study, residue numbers for S100B (residues 0–91) were the same as those in the PDB file (PDB ID: 1DT7), whereas those for CTD (residues 367–388) followed the UniProt numbering. As reported in an NMR study14, S100B with two calcium ions forms a homodimer in solution, and a CTD peptide binds to each S100B, so that the dimer of heterodimer (denoted as [CTD-S100B]2) is formed (Figure S1A). We constructed a simulation system comprising one of the heterodimers along with ions and water molecules. The heterodimer was derived from the first model in the PDB file. The residues of CTD from S367 to S376 were eliminated from the CTD peptide of the model (S367–E388), resulting in a CTD fragment (T377–E388), subsequently referred to as CTDf. The N- and C-termini of S100B and CTDf were, in turn, capped by
acetyl and amine groups, respectively. The capped CTDf and S100B were immersed in a periodic cube whose volume was (70.03 Å)³. The box was filled with water molecules and sodium and chloride ions. The ions were introduced to neutralise the net charge of the system and to set the salt concentration to 0.15 M. The final simulation system was composed of 33,451 atoms (1466 atoms for S100B, 226 for CTDf, 31,707 for water (10,569 molecules), 2 calcium ions, and 29 sodium and 21 chloride ions). Hereafter, this system is referred to as the ‘CTDf-S100B system’. A single heterodimer taken from the dimer of the heterodimer was assumed to be sufficient for simulation because the binding interface of CTDf in the single copy is separated from the dimer interface and no cooperative binding of two CTDs to the S100B dimer has been reported. Additionally, as the structure of S100B in the present system was restrained around the original structure in the heterodimer by distance restraints, which are described in a subsequent section, the single copy would approximate the original one.

The initial atom positions in our simulation system were generated as follows. Energy minimisation was performed to remove structural distortions in the system. Then, a constant-pressure and constant-temperature (NPT) simulation at 1 atm and 300 K was performed to equilibrate the volume of the box, after which the volume was (68.93 Å)³. Starting from the last snapshot of the NPT simulation, constant-volume (NVT) simulations at 650 K were carried out (180 runs; 7.2 ns for each run; time step = 2.0 fs). The NVT simulations generated 180 structures (final snapshots), in which CTDf was
mostly unstructured and dissociated from S100B (Figure S2A). During these simulations, distance restraints were applied to S100B to ensure that S100B did not unfold (see Figure S1B and ‘Intra-S100B and Steric Restraints’ in Supporting Information). These 180 structures were used as the initial structures in V-McMD simulations. Although four of the 180 initial structures were relatively similar to the NMR complex (i.e., CTDf was not dissociated from S100B) (Figure S2B), the overall features of the resulting structural ensemble were not influenced by the trajectories from these four initial structures (see ‘Influence of Initial Conformations on the Structural Ensemble’ in Supporting Information). Then, 180 production runs of V-McMD were performed, totalling 6.48 μs (simulation time step 2.0 fs; 36 ns for each run).

The computer program myPresto/omegagene (ver. 0.38)34 was used for the high-temperature MD and V-McMD simulations, and myPresto/psygene-G35 was used for the NPT simulation. All of the simulations used the SHAKE method36 to fix covalent bond lengths related to hydrogens, the zero-dipole summation method37 to compute long-range electrostatic interactions, the velocity scaling method38 to control temperature, and an Amber-based hybrid force field (mixing parameter ω = 0.8)39 for proteins as used in our previous study21. Force-field parameters for chloride and sodium ions40,41 and the TIP3P water model42 were also used. V-McMD runs were performed on the TSUBAME2.5 and TSUBAME3.0 supercomputers at the Tokyo Institute of Technology,
and each of the runs was submitted to a CPU (Intel Xeon X5670) and a GPU (NVIDIA Tesla K20X and P100).

Virtual-System-Coupled Multicanonical MD (V-McMD)

In this section, a brief background of V-McMD is provided. The theoretical background for V-McMD is explained in the ‘V-McMD section’ of the Supporting Information, and the sampling procedure in this study is given in ‘Actual Sampling Procedure of V-McMD’ of the Supporting Information. The multicanonical algorithm was first implemented for MC simulation (McMC) of a simple physical system43. This method was, in turn, applied to small peptide systems44,45. The algorithm was subsequently employed in molecular dynamics (MD) simulation (McMD)46–48. Then, to increase sampling efficiency further, trivial trajectory parallelisation (TTP-McMD) was developed49–51 and applied to a system consisting of an intrinsically disordered segment and its partner protein52. Recently, virtual-system coupled McMD (V-McMD)32,53 has been developed, in which a virtual system is introduced so that bottlenecks in the structural space can be passed readily. These studies indicate an essential characteristic of the multicanonical algorithm: structural sampling is enhanced by a modified potential energy, generating a multicanonical ensemble that can be converted to an accurate canonical ensemble at an arbitrary temperature.


Structural Ensemble

V-McMD samples a variety of structures (snapshots) in a wide potential-energy range (e.g., a wide temperature range) (see Figure 1), yielding a multicanonical ensemble, $Q_{\mathrm{V-McMD}}$. By a reweighting technique21,32, the ensemble $Q_{\mathrm{V-McMD}}$ is converted to a canonical ensemble (i.e., structural ensemble) at any temperature $T_0$, $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(T_0)$, where $T_0$ was set to 300 or 600 K in this study. Finally, each structure in the ensemble is weighted by an existence probability, $P_c(E_R, T_0)$. See ‘Ensemble Reweighting’ of the Supporting Information for details. Figure 1 shows the obtained flat energy distributions $\{P_{\mathrm{vmc}}(E_R, v_i);\ i = 1, \ldots, 7\}$, which indicate equilibration of the multicanonical ensemble with respect to $E_R$.

Figure 1. Seven energy distributions $\{P_{\mathrm{vmc}}(E_R, v_i);\ i = 1, \ldots, 7\}$ generated by the V-McMD production runs. Each distribution has a single zone $\{z_i\}$ (Table S1). Two canonical distributions $P_c(E_R, 300\,\mathrm{K})$ and $P_c(E_R, 600\,\mathrm{K})$ are also illustrated. The canonical distributions weighted the multicanonical ensemble, leading to two structural ensembles $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(T_0)$ ($T_0$ = 300 K and 600 K). See Eq. S23 for derivation of the canonical distribution $P_c(E_R, T_0)$.

The present study focused on four ensembles: $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$, $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(600\,\mathrm{K})$, $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$, and $Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$. The bound-state ensemble $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ consisted of the CTDf conformations extracted from $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}$. We noted that all the conformations in $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ were located on the surface of S100B, as shown later. The free-state ensemble $Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ was composed of conformations for an isolated CTDf in solution; this ensemble was obtained in our previous study21.

Intra-S100B and Steric Restraints

Intra-S100B distance restraints were applied to S100B, thereby preventing S100B from unfolding during V-McMD simulations (Figure S1B). Without the restraints, S100B would unfold during the V-McMD simulations because V-McMD explores a broad potential-energy (temperature) range. Such a case is beyond the scope of the present study. The forms of the restraints were the same as those used in our previous studies52,54. It should be noted that we did not apply restraints to CTDf; thus, CTDf was completely flexible during the simulations. A steric potential, $E_{\mathrm{st}}$, was also introduced to hinder CTDf from approaching the homodimer interface of S100B (Figure S1C). For further details, see ‘Intra-S100B and Steric Restraints’ in the Supporting Information. The intra-S100B restraints were weak, as defined in the Supporting Information; thus, the S100B conformation fluctuated around the native structure during the V-McMD simulation. In theory, the effect of the restraints decreases when the system is fluctuating in a low-energy range (room-temperature range) because the conformational deviation of S100B from the native structure is small. However, we do not claim that the motions of S100B in the low-energy range are completely realistic. The effect of the restraints may be evaluated in future studies by utilising another sampling method free from the restraints.

Principal Component Analysis (PCA)

PCA is used to visualise the structural distribution of a complicated system expressed in a high-dimensional space by mapping the distribution onto a low-dimensional space (‘PC space’)21,52,55. Here, PCA was applied to the three ensembles: $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$, $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$, and $Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$. Eigenvectors and eigenvalues were obtained from diagonalisation of a variance-covariance matrix, which was constructed from C-alpha atom pair distances of the sampled structures in each of the three ensembles or in the combined ensemble of $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ and $Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$. We arranged the eigenvectors in descending order of eigenvalues; the eigenvector corresponding to the largest eigenvalue was referred to as the ‘first PC axis’ (or PC1) and the eigenvector for the second largest eigenvalue as the ‘second PC axis’ (or PC2). PCA was performed using the Python library scikit-learn56. Then, a low-dimensional structural space was constructed using these PC axes. For further details, see ‘Principal Component Analysis (PCA)’ of the Supporting Information.
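The procedure above can be illustrated with a minimal NumPy sketch. It is not the scikit-learn workflow used in the study: the function names, the array layout, and the optional per-snapshot `weights` argument are assumptions introduced here for illustration only.

```python
import numpy as np

def pair_distance_features(coords):
    """Convert C-alpha coordinates (n_snapshots, n_atoms, 3) into
    per-snapshot vectors of all C-alpha pair distances."""
    i, j = np.triu_indices(coords.shape[1], k=1)
    return np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=-1)

def pca(features, weights=None):
    """Diagonalise the variance-covariance matrix of the features.

    Returns eigenvalues in descending order, the matching eigenvectors
    (columns; column 0 is PC1), and the projections of every snapshot.
    `weights` (hypothetical) allows statistically weighted snapshots.
    """
    w = np.ones(len(features)) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    centred = features - w @ features            # subtract the weighted mean
    cov = (centred * w[:, None]).T @ centred     # variance-covariance matrix
    evals, evecs = np.linalg.eigh(cov)           # eigh returns ascending order
    order = np.argsort(evals)[::-1]              # reorder to descending
    evals, evecs = evals[order], evecs[:, order]
    return evals, evecs, centred @ evecs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.normal(size=(200, 12, 3))          # synthetic stand-in for a trajectory
    evals, evecs, proj = pca(pair_distance_features(toy))
    ratios = evals / evals.sum()                 # contribution ratio of each PC axis
    print("cumulative ratio of PC1-PC4:", ratios[:4].sum())
```

The contribution ratio of a PC axis is its eigenvalue divided by the sum of all eigenvalues, so the cumulative ratio of the first four axes is the sum of the first four entries of `ratios`.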

Clustering Method: DBSCAN

Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a well-established clustering method57. See ‘Parameter Estimation for DBSCAN’ of the Supporting Information for the outline of DBSCAN. This method has several advantages: (i) no prior knowledge of the number of clusters is required; (ii) any shape of a cluster can be generated; and (iii) outliers can be handled. DBSCAN was applied to the structural distribution of $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$ expressed in the 4D PC space spanned by PC1, PC2, PC3, and PC4; the cumulative contribution ratio for the 4D space was more than 70%. Finally, twenty-one clusters were generated (Figures 2 and 3) by setting the parameters as $\epsilon = 5.0$ Å and $M_{\mathrm{in}} = 38$. The relationship between the parameters and resultant clusters is illustrated in Figure S6. DBSCAN was performed using the Python library scikit-learn56.
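For readers unfamiliar with DBSCAN, the following pure-Python sketch shows the core of the algorithm. The study itself used the scikit-learn implementation; this simplified O(n²) version is illustrative only. The names `dbscan`, `eps`, and `min_pts` are hypothetical, and treating the second parameter quoted above as DBSCAN's minimum-points parameter is an assumption.

```python
import math

NOISE = -1  # label for outliers that belong to no cluster

def dbscan(points, eps, min_pts):
    """Return one cluster label per point (0, 1, ...; -1 marks an outlier).

    A point is a 'core' point if at least `min_pts` points (itself included)
    lie within distance `eps`; clusters grow outward from core points.
    """
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = NOISE            # may later be relabelled as a border point
            continue
        cluster += 1                     # start a new cluster from this core point
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:     # only core points extend the cluster
                queue.extend(nb_j)
    return labels

if __name__ == "__main__":
    # Two tight groups and one isolated point in a 4D space (toy data).
    pts = [(0, 0, 0, 0), (0.5, 0, 0, 0), (0, 0.5, 0, 0),
           (10, 10, 10, 10), (10.5, 10, 10, 10), (10, 10.5, 10, 10),
           (50, 50, 50, 50)]
    print(dbscan(pts, eps=1.0, min_pts=3))
```

Because cluster membership depends only on local density, neither the number of clusters nor their shape has to be specified in advance, and isolated points are reported as outliers rather than forced into a cluster.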


Centre-of-Geometry Points of CTDf around S100B

To determine the CTDf-binding sites on S100B in the simulations, we calculated centre-of-geometry points of CTDf conformations in the structural ensembles $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$ (Figure 4A) and $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(600\,\mathrm{K})$ (Figure 4B). First, each snapshot in a structural ensemble was superimposed on the last snapshot (reference structure) of the NPT simulation, using the backbone atoms of S100B. As a result, each CTDf conformation was translated and rotated according to the superposition of S100B. Subsequently, centre-of-geometry points of the CTDf conformations were computed by using the backbone atoms of each residue in CTDf.

Orientation of CTDf Relative to S100B

The orientation of CTDf also summarises the structural ensemble $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$ and therefore can be a key quantity for interpretation of the ensemble. We defined two unit vectors $\boldsymbol{u}_a$ and $\boldsymbol{u}_b$, each of which represents the direction between a C-alpha atom pair: $\boldsymbol{u}_a$ from V53 to T60 in S100B and $\boldsymbol{u}_b$ from S378 to T387 in CTDf, as illustrated in Figure 5A. In turn, we calculated the inner product $\boldsymbol{u}_a \cdot \boldsymbol{u}_b$ for each snapshot to quantify the orientation of CTDf relative to S100B.
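As a small illustration of this orientation measure, the sketch below computes the inner product of the two unit vectors from four C-alpha positions. The function names and argument order are hypothetical, and the coordinates in the usage example are invented for demonstration.

```python
import math

def unit_vector(start, end):
    """Unit vector pointing from `start` to `end` (3D coordinates)."""
    d = [e - s for s, e in zip(start, end)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]

def orientation(v53, t60, s378, t387):
    """Inner product u_a . u_b for one snapshot.

    u_a runs from the V53 to the T60 C-alpha atom of S100B and
    u_b from the S378 to the T387 C-alpha atom of CTDf.
    """
    u_a = unit_vector(v53, t60)
    u_b = unit_vector(s378, t387)
    return sum(a * b for a, b in zip(u_a, u_b))

if __name__ == "__main__":
    # Invented coordinates (in angstroms), for demonstration only.
    print(orientation((0, 0, 0), (10, 0, 0), (2, 5, 0), (12, 6, 1)))
```

The value lies between -1 and 1: values near 1 indicate that CTDf runs parallel to the V53-T60 direction of S100B, values near -1 indicate an antiparallel arrangement, and values near 0 a perpendicular one.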

Contact Probability Analysis


From the structural ensemble $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$, inter-residue contact probabilities between S100B and CTDf were computed. The centre of mass was first calculated by using the side-chain atoms of each residue. If the centre-of-mass distance of a residue pair between the two molecules was less than 6 Å, we judged the two residues to be in contact. The contact probability was calculated as $p_i = N_i / N_c$, where $N_c$ is the total number of snapshots in $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$, and $N_i$ is the number of snapshots in which residue $i$ of S100B is in contact with one or more residues in CTDf. In addition, contacting residues for the NMR models were calculated using the same criterion.
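The contact-probability definition can be sketched as follows, assuming that the side-chain centres of mass have already been computed for every snapshot. The nested-list data layout and the function name are assumptions made for this illustration.

```python
import math

def contact_probabilities(s100b_coms, ctdf_coms, cutoff=6.0):
    """Contact probability p_i = N_i / N_c for every S100B residue.

    s100b_coms[t][i]: side-chain centre of mass of S100B residue i in snapshot t
    ctdf_coms[t][j]:  side-chain centre of mass of CTDf residue j in snapshot t
    A residue i counts as 'in contact' in snapshot t if it lies closer than
    `cutoff` (angstroms) to at least one CTDf residue.
    """
    n_c = len(s100b_coms)                     # total number of snapshots
    counts = [0] * len(s100b_coms[0])         # N_i for each S100B residue
    for s_snap, c_snap in zip(s100b_coms, ctdf_coms):
        for i, s in enumerate(s_snap):
            if any(math.dist(s, c) < cutoff for c in c_snap):
                counts[i] += 1
    return [n_i / n_c for n_i in counts]

if __name__ == "__main__":
    # Toy data: two snapshots, two S100B residues, one CTDf residue.
    s100b = [[(0, 0, 0), (20, 0, 0)], [(0, 0, 0), (20, 0, 0)]]
    ctdf = [[(3, 0, 0)], [(30, 0, 0)]]
    print(contact_probabilities(s100b, ctdf))
```

In the toy data, the first S100B residue is within 6 Å of CTDf in one of the two snapshots and the second residue in none.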

Confirmation Analysis for NMR Models and the Structural Ensemble

Nuclear Overhauser effect spectroscopy (NOESY) provides NOE signals of inter-¹H pairs. From the strength of the signals, upper bounds of inter-¹H distances are inferred. In this study, such ¹H pairs and upper bounds were referred to as ‘NOE pairs’ and ‘NOE upper bounds’, respectively. The NOE upper bounds are often utilised to validate a simulated ensemble29,58. Here, to validate the ensemble $Q^{\mathrm{CTDf-S100B}}_{\mathrm{cano}}(300\,\mathrm{K})$, we used the NOE upper bounds that were used to construct the NMR models14 of [CTD-S100B]2. The data of the bounds were downloaded from PDB entry 1DT7. The NOE upper bounds used in this study are listed in Table S3. The NOE pairs were classified into two groups: inter-CTDf-S100B and intra-CTDf NOE pairs (Table S3). The NOE-like distance $r_{\mathrm{NOE-like}}(k)$ for an NOE pair is defined as:


$$ r_{\mathrm{NOE-like}}(k) = \left( N_c^{-1} \sum_{i}^{N_c} r_i^{-6}(k) \right)^{-1/6}, \qquad (1) $$

where $r_i(k)$ is the inter-¹H distance of the $k$-th NOE pair for the $i$-th structure in an ensemble. We judged that $r_{\mathrm{NOE-like}}(k)$ is consistent with the NOESY measurement if it satisfies the following inequality:

$$ r_{\mathrm{NOE-like}}(k) \le r^{\mathrm{NOE}}_{\mathrm{up}}(k) + \Delta r_{\mathrm{NOE}}, \qquad (2) $$

where $r^{\mathrm{NOE}}_{\mathrm{up}}(k)$ is the NOE upper bound for the $k$-th NOE pair and $\Delta r_{\mathrm{NOE}}$ is a tolerance that was set to 0.5 Å. The numbers of NOE-like distances satisfying Eq. 2 for inter- and intra-pairs were expressed as $n^{\mathrm{inter}}_{s}$ and $n^{\mathrm{intra}}_{s}$, respectively. ‘NOE satisfaction ratios’ were defined as $SR^{\mathrm{NOE}}_{\mathrm{inter}} = n^{\mathrm{inter}}_{s} / N_{\mathrm{inter}}$ and $SR^{\mathrm{NOE}}_{\mathrm{intra}} = n^{\mathrm{intra}}_{s} / N_{\mathrm{intra}}$, where $N_{\mathrm{inter}}$ and $N_{\mathrm{intra}}$ are the total numbers of inter-CTDf-S100B (= 63) and intra-CTDf NOE pairs (= 30), respectively. To assess whether the NMR models obeyed the NOE upper bounds, we calculated each distance of the NOE pairs in the models and judged whether it satisfied the following inequality:

$$ r_{\mathrm{NMR}}(k) \le r^{\mathrm{NOE}}_{\mathrm{up}}(k) + \Delta r_{\mathrm{NOE}}, \qquad (3) $$

where $r_{\mathrm{NMR}}(k)$ is the distance of the $k$-th NOE pair in an NMR model. The tolerance is the same as that in Eq. 2.
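The NOE-like distance of Eq. 1 and the satisfaction ratio built on Eq. 2 can be sketched in a few lines. The function names and the list-based input format are assumptions for illustration; the distances are taken to be in Å and given as floating-point values.

```python
def noe_like_distance(distances):
    """Eq. 1: r^-6-averaged distance of one NOE pair over all snapshots."""
    n_c = len(distances)
    mean_inv6 = sum(r ** -6 for r in distances) / n_c
    return mean_inv6 ** (-1.0 / 6.0)

def satisfaction_ratio(pair_distances, upper_bounds, tolerance=0.5):
    """Fraction of NOE pairs whose NOE-like distance satisfies Eq. 2.

    pair_distances[k]: distances of the k-th NOE pair in every snapshot
    upper_bounds[k]:   NOE upper bound of the k-th NOE pair
    """
    satisfied = sum(
        noe_like_distance(d) <= bound + tolerance
        for d, bound in zip(pair_distances, upper_bounds)
    )
    return satisfied / len(upper_bounds)

if __name__ == "__main__":
    # A pair that is close in half of the snapshots and far in the other half.
    print(noe_like_distance([2.0, 10.0]))
    print(satisfaction_ratio([[3.0, 3.0], [8.0, 8.0]], [3.0, 5.0]))
```

Because of the $r^{-6}$ weighting, short distances dominate the average: a pair that is close in only a fraction of the snapshots can still satisfy its upper bound, which is why this measure is suited to heterogeneous ensembles.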

Conformational Entropy

To assess the conformational variety of CTDf, we computed the conformational entropy (i.e., chain entropy) of CTDf for each of the two ensembles $Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ and $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ as follows. We first applied PCA to the combined ensemble ($Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ plus $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$) to obtain eigenvectors. Note that these eigenvectors provide common axes to view the two ensembles. Then, for each ensemble, we estimated the conformational probability function in the common 2D PC space, and defined the conformational entropy $S$ as:

$$ S(s) = -R_{\mathrm{gas}} \sum_{(x_1, x_2)} P_s(x_1, x_2) \ln P_s(x_1, x_2), \qquad (4) $$

where $(x_1, x_2)$ specifies an area in the 2D PC space, $R_{\mathrm{gas}}$ is the gas constant, $P_s(x_1, x_2)$ is the probability at the area, and the subscript $s$ is a state identifier: $s$ = ‘free state’ was used for the free-state ensemble $Q^{\mathrm{free-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$ and $s$ = ‘bound state’ for the bound-state ensemble $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$. The summation is taken over 2D regions; notably, the contributions of the probabilities $P(x_1, x_2) = 0$ to the entropy vanish according to the relation $\lim_{P \to 0} P(x_1, x_2) \ln P(x_1, x_2) = 0$. Similarly, we calculated the conformational entropy on the common 4D PC space as:

$$ S(s) = -R_{\mathrm{gas}} \sum_{(x_1, x_2, x_3, x_4)} P_s(x_1, x_2, x_3, x_4) \ln P_s(x_1, x_2, x_3, x_4), \qquad (5) $$

where $(x_1, x_2, x_3, x_4)$ specifies a volume in the common 4D PC space and $P_s(x_1, x_2, x_3, x_4)$ is the probability at the volume. Area sizes (bin sizes) were all set to 1.0 Å² for the 2D and to 1.0 Å⁴ for the 4D spaces.
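Eqs. 4 and 5 amount to a histogram-based entropy estimate, sketched below for a PC space of arbitrary dimension. The function names, the optional `weights` argument, and the choice of J mol⁻¹ K⁻¹ as the unit of the gas constant are assumptions made here, not details taken from the study.

```python
import math
from collections import Counter

R_GAS = 8.314462618  # gas constant in J mol^-1 K^-1 (unit choice is an assumption)

def bin_probabilities(points, bin_size=1.0, weights=None):
    """Histogram PC-space points into cubic bins and normalise to probabilities.

    points:  iterable of PC coordinates, e.g. (x1, x2) or (x1, x2, x3, x4)
    weights: optional statistical weight per point (hypothetical argument)
    """
    if weights is None:
        weights = [1.0] * len(points)
    hist = Counter()
    for p, w in zip(points, weights):
        hist[tuple(math.floor(x / bin_size) for x in p)] += w
    total = sum(hist.values())
    return {b: c / total for b, c in hist.items()}

def conformational_entropy(probabilities, r_gas=R_GAS):
    """S = -R_gas * sum(P ln P); empty bins (P = 0) contribute nothing."""
    return -r_gas * sum(p * math.log(p) for p in probabilities.values() if p > 0.0)

if __name__ == "__main__":
    # Four points falling into four different 1.0 x 1.0 bins: S = R ln 4.
    spread = [(0.2, 0.3), (1.5, 0.1), (0.1, 1.7), (1.2, 1.9)]
    print(conformational_entropy(bin_probabilities(spread)))
```

Only occupied bins are stored, so the sum runs over non-zero probabilities only. The same functions apply unchanged to 2D and 4D coordinates because the bin key is built from however many components each point has.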

Free Energy Landscape

To analyse bound CTDf conformations, we produced a free energy landscape of CTDf on the PC space, which was calculated from the ensemble $Q^{\mathrm{bound-CTDf}}_{\mathrm{cano}}(300\,\mathrm{K})$. From the conformational distribution function $P(x_1, x_2)$ defined in the above sub-section, a ‘free energy landscape’ (also termed ‘potential of mean force’ (PMF)) was defined as:

$$ \mathrm{PMF}(x_1, x_2) = -R_{\mathrm{gas}} T_0 \ln P(x_1, x_2), \qquad (6) $$

where $T_0$ (= 300 K) is the temperature. The minimum of $\mathrm{PMF}(x_1, x_2)$ was set to zero.
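Eq. 6 can be applied directly to a binned probability distribution, as in the following sketch. The dictionary input format, the function name, and the unit of the gas constant (J mol⁻¹ K⁻¹, giving the PMF in J mol⁻¹) are assumptions made for illustration.

```python
import math

R_GAS = 8.314462618  # gas constant in J mol^-1 K^-1 (unit choice is an assumption)

def free_energy_landscape(probabilities, temperature=300.0, r_gas=R_GAS):
    """PMF = -R_gas * T0 * ln P for every occupied bin, shifted so min(PMF) = 0.

    probabilities: dict mapping a bin index, e.g. (i1, i2), to its probability.
    Bins with zero probability are skipped because ln 0 is undefined.
    """
    pmf = {b: -r_gas * temperature * math.log(p)
           for b, p in probabilities.items() if p > 0.0}
    lowest = min(pmf.values())
    return {b: g - lowest for b, g in pmf.items()}

if __name__ == "__main__":
    toy = {(0, 0): 0.5, (1, 0): 0.25, (0, 1): 0.25}
    for bin_index, value in free_energy_landscape(toy).items():
        print(bin_index, round(value, 1), "J/mol")
```

In the toy distribution, the most populated bin defines the zero of the landscape, and a bin with half its probability lies $R_{\mathrm{gas}} T_0 \ln 2$ above it.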

RESULTS AND DISCUSSION


Multimodal Structural Distribution of the CTDf-S100B System To obtain a structural distribution at 300 K, we performed PCA for the ensemble ― CTDf (300K). The distribution was represented in the 2D PC space (Figure 2). 𝑄S100B cano

Contribution ratios for the first (PC1) and the second PC axis (PC2) were 35.9% and 18.3%, respectively (cumulative ratio = 54.2%). Considering the 4D PC space spanned by PC1, PC2, PC3, and PC4 axes (cumulative ratio = 70.3%), DBSCAN clustering was ― CTDf performed over 𝑄S100B . Because of the high cumulative ratio on the 4D PC space, cano

the clustering yielded well-separated structural clusters. Finally, 21 clusters were obtained (Figure 2). Because the distribution consists of several clusters, we referred it to as a ‘multimodal structural distribution’ in this study. We also analysed details of the multimodal distribution and considered its biological implications by calculating interatomic contacts between CTDf and S100B (see ‘Biological Implication of Multimodal Interactions’ of the Supporting Information). A measure to quantify the structural similarity between clusters would be useful to analyse clusters. Toward this end, a one-dimensional measure has been successfully defined59. However, since the multimodal distribution currently obtained indicates complicated pathways in the PC space, structural similarities between clusters are not defined simply by a one-dimensional metric. Furthermore, it may be useful to compare the free-state and bound-state clusters, as performed in the prior study59. However, a considerably large variation of the conformational distribution was observed between

the free- and bound-states in the present study, indicating the difficulty in identifying corresponding clusters between the two states.
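The density-based clustering step described above can be illustrated with a compact DBSCAN written against the standard library only. This is a sketch under stated assumptions, not the implementation used in this study: the radius `eps`, the threshold `min_pts`, and the toy 4D points are hypothetical (the actual DBSCAN parameters are estimated in the Supporting Information).

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for an outlier."""
    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)           # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1                   # provisional outlier
            continue
        cluster += 1                         # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster          # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:         # j is also a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Toy coordinates in a 4D PC space: two dense groups and one isolated point.
toy_points = [(0, 0, 0, 0), (0.1, 0, 0, 0), (0, 0.1, 0, 0),
              (5, 5, 5, 5), (5.1, 5, 5, 5), (5, 5.1, 5, 5),
              (20, 20, 20, 20)]
toy_labels = dbscan(toy_points, eps=0.5, min_pts=3)
```

Points labelled -1 play the role of the black dots (outliers) in Figure 2, and cluster populations follow from counting the labels.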

Figure 2. Structural distribution for ensemble $Q_{\mathrm{cano}}^{\mathrm{S100B-CTDf}}$ (300 K) projected in 2D (PC1 and PC2) space. A coloured circle corresponds to a complex structure; the same coloured circles belong to the same cluster. Each black dot represents a complex structure that does not belong to any cluster (i.e., an outlier). DBSCAN was applied to the ensemble in 4D space constructed by PC1, …, and PC4, and then 21 clusters were obtained (see Materials and Methods). Clusters 𝐶1, …, 𝐶5 represent the top five largest clusters, the populations of which are 26.7%, 8.0%, 7.8%, 4.1%, and 2.1%, respectively. The white star indicates the first NMR model, which is almost structurally identical to the other models. Complex structures indicated by $c_{i-j}$ are shown in Figure 3.

Figure 3 shows representative complex structures chosen from the five largest clusters 𝐶1, 𝐶2, 𝐶3, 𝐶4, and 𝐶5 indicated in Figure 2. The largest cluster 𝐶1 contains complex structures with several different CTDf conformations: extended ($c_{1-1}$), partially bent ($c_{1-2}$ and $c_{1-4}$), and helical ($c_{1-3}$) conformations (Figure 3A). Note that structures in $c_{1-3}$ are similar to the NMR models14 (red-coloured ribbon in Figure S7). Moreover, such NMR-like structures were obtained even though the V-McMD simulations were initiated from the CTDf-dissociated initial structures (Figure S2A). The

second largest cluster 𝐶2 includes bent conformations ($c_{2-1}$) (Figure 3B), whereas clusters 𝐶3 and 𝐶5 comprise extended conformations ($c_{3-1}$ and $c_{3-2}$, and $c_{5-1}$) (Figure 3B) and cluster 𝐶4 adopts helical structures ($c_{4-1}$) (Figure 3B). As shown in Figures 2 and 3, the CTDf-S100B heterodimer likely forms a variety of complex structures. This result suggests that CTDf interacts with S100B in various ways. In terms of structural classification, the CTDf-S100B complex might correspond to the ‘random complex’ reported in comprehensive reviews4,60. Our ensemble included not only the NMR-like structure but also various other complex structures, whereas the reported NMR experiment14 provided only structural models similar to each other. Our ensemble might therefore appear to be inconsistent with the NMR structure. However, as demonstrated below, the NMR models represent a subset of the whole ensemble elucidated in the present study.

Figure 3. Representative structures in each cluster in Figure 2. Green and red regions include atoms used for PCA (see Figure S5). The red-and-white coloured model is CTDf, and the white sphere indicates the N-terminus of CTDf. Yellow spheres are calcium ions. (A) Structures in the largest cluster 𝐶1. (B) Structures in 𝐶2, …, and 𝐶5.

Distribution of Centre-of-Geometry and Orientation of CTDf Relative to S100B

For $Q_{\mathrm{cano}}^{\mathrm{CTDf-S100B}}$ (300 K) and $Q_{\mathrm{cano}}^{\mathrm{CTDf-S100B}}$ (600 K), we depicted the distribution of the centre-of-geometry points of CTDf around S100B (Figure 4). Most points were concentrated in a pocket of S100B at 300 K (Figure 4A), with few exceptions. This pocket corresponds to the hydrophobic pocket to which CTD binds in the NMR models14. In contrast, at 600 K, the points were distributed widely in space (Figure 4B). Although the approach to the pocket might be considered to be biased by the steric restraints applied during the V-McMD simulations, the effects of the restraints appeared to be negligible because CTDf was able to approach the pocket in addition to other regions at 600 K (Figure 4B).

Figure 4. Centre-of-geometry spheres of CTDf around S100B (A) at 300 K and (B) at 600 K. The spheres are coloured in orange. Yellow spheres are calcium ions.

Probability distribution of the inner product 𝒖a ⋅ 𝒖b indicates that the orientation of CTDf tended to be parallel or antiparallel to the helix axis (V53-T60) of S100B (Figure 5B) and that the CTDf conformations parallel to the helix were the most likely. Notably, this parallel orientation corresponds to that of the NMR models14. In addition, this

directional tendency of CTDf reflects the shape of the hydrophobic groove formed in the CTDf binding site of S100B (see Figure S11).
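The orientation measure used here, the inner product of two unit vectors, can be sketched as follows. The sketch assumes that each vector is built from two C-alpha coordinates (V53 to T60 for the S100B helix and S378 to T387 for CTDf, as in Figure 5); the direction convention, the function names, and the toy coordinates are assumptions made for illustration.

```python
import math


def unit_vector(start, end):
    """Unit vector pointing from `start` to `end` (3D coordinates)."""
    diff = [e - s for s, e in zip(start, end)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def orientation(helix_start, helix_end, pep_start, pep_end):
    """Inner product u_a . u_b: +1 parallel, -1 antiparallel, 0 perpendicular."""
    u_a = unit_vector(helix_start, helix_end)   # e.g. C-alpha of V53 -> T60
    u_b = unit_vector(pep_start, pep_end)       # e.g. C-alpha of S378 -> T387
    return sum(a * b for a, b in zip(u_a, u_b))


# Toy coordinates (in Å): a peptide lying along the helix axis, then reversed.
toy_parallel = orientation((0, 0, 0), (0, 0, 10), (1, 0, 2), (1, 0, 8))
toy_antiparallel = orientation((0, 0, 0), (0, 0, 10), (1, 0, 8), (1, 0, 2))
```

Collecting this value over all snapshots of an ensemble, with their statistical weights, would give a probability distribution of the kind shown in Figure 5B.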

Figure 5. Orientation of CTDf relative to a helix (V53–T60) of S100B. (A) Two unit vectors 𝒖𝑎 and 𝒖𝑏 are illustrated on the first NMR model. Green spheres are C-alpha atoms of V53 and T60, and magenta spheres are those of S378 and T387, which define the two unit vectors (𝒖𝑎 and 𝒖𝑏). Blue- and red-coloured ribbon models are S100B and CTDf, respectively. Small yellow spheres are calcium ions. (B) Probability distribution of the inner product (𝒖𝑎 ⋅ 𝒖𝑏) at 300 K. The NMR model is located at 𝒖𝑎 ⋅ 𝒖𝑏 ≈ 1.0 (parallel). The opposite direction is 𝒖𝑎 ⋅ 𝒖𝑏 = −1.0 (antiparallel).

Validation of NMR Models and Structural Ensemble

We validated the NMR models of the CTDf-S100B heterodimer by examining whether the distance of each NOE pair satisfies Eq. 3. We found that, on average, 87.7% of the intra-CTDf NOE pairs satisfy Eq. 3 (Table S2A), whereas only 36.6% of the inter-CTDf-S100B NOE pairs meet Eq. 3 (Table S2B). This low satisfaction ratio implies that the CTDf-binding site of S100B was not determined precisely in the NMR models. This is consistent with the instability of the NMR models that was shown in previous MD studies22–25,29–31.
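A satisfaction ratio of this kind can be computed as sketched below. Eq. 3 is not reproduced in this section, so the sketch assumes that a pair is satisfied when its distance does not exceed the NOE upper bound plus a tolerance (0.5 Å, the tolerance quoted for the reconstructed models); the function name and the toy distances are hypothetical.

```python
def satisfaction_ratio(distances, upper_bounds, tolerance=0.5):
    """Percentage of pairs whose distance is within upper bound + tolerance (Å).

    Assumed criterion (not necessarily the exact form of Eq. 3):
        r <= r_upper + tolerance
    """
    if len(distances) != len(upper_bounds):
        raise ValueError("one distance is needed per upper bound")
    satisfied = sum(1 for r, r_up in zip(distances, upper_bounds)
                    if r <= r_up + tolerance)
    return 100.0 * satisfied / len(distances)


# Toy inter-atomic distances and NOE upper bounds in Å (made-up numbers).
toy_distances = [3.2, 5.6, 4.9, 7.5]
toy_upper = [3.5, 5.0, 5.0, 6.0]
toy_ratio = satisfaction_ratio(toy_distances, toy_upper)
```

Averaging this percentage over the deposited models would give a per-ensemble figure comparable in spirit to the values reported in Tables S2A and S2B.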

To validate the structural ensemble computed from our simulation, we examined whether each NOE-like distance satisfies the inequality Eq. 2. We found that 86.6% of the inter-NOE-like distances satisfied Eq. 2, as did 84.1% of the intra-NOE-like distances (Table S3A and S3B). These relatively high satisfaction ratios suggest that our ensemble is not inconsistent with the NOESY measurement14. Furthermore, to examine whether the residues at which chemical shift perturbation (CSP) occurred33 were consistent with the contacting residues observed in our structural ensemble, we calculated the contact probability of each residue pair between S100B and CTDf (Figure 6). We found that the ten most frequently contacting residues in S100B were, from the most frequent, E45, L44, M79, A83, T59, V56, F87, V80, K55, and C84. Except for K55, these residues corresponded to the CSP-occurring residues (yellow stars in Figure 6). However, the residues V53, T81, and H85 were not in contact with CTDf in our simulation, although they exhibited CSPs. This discrepancy might be explained as follows: the three residues of S100B in the NMR models are not exposed to solvent, which implies that they cannot interact directly with CTDf. Although CSPs are in general caused by direct interactions or allosteric effects from neighbouring residues, the former case is thus not likely. Because our contact analysis did not take the latter effects into account, the discrepancy may have been caused by allosteric effects.
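A per-residue contact probability and the resulting ranking can be sketched as follows. The contact definition is given in Materials and Methods and is not repeated here, so the sketch simply takes, for each snapshot, the set of S100B residues already judged to be in contact; the optional `weights`, the function name, and the toy data are assumptions, and the normalisation used for Figure 6 may differ.

```python
from collections import Counter


def contact_probability(snapshots, weights=None):
    """Weighted fraction of snapshots in which each residue is in contact."""
    if weights is None:
        weights = [1.0] * len(snapshots)     # equal weights if none are supplied
    total = sum(weights)
    acc = Counter()
    for residues, w in zip(snapshots, weights):
        for res in set(residues):            # count each residue once per snapshot
            acc[res] += w
    return {res: value / total for res, value in acc.items()}


# Toy ensemble of four snapshots (made-up contacts, equal weights).
toy_snapshots = [{"E45", "L44"}, {"E45", "M79"}, {"E45"}, {"L44"}]
toy_prob = contact_probability(toy_snapshots)
# Rank residues from the most to the least frequently contacting.
toy_ranking = sorted(toy_prob, key=lambda res: (-toy_prob[res], res))
```

Comparing such a ranking with the set of CSP-occurring residues is then a matter of set intersection and difference.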

[Figure 6 plot. Vertical axis: contact probability (0.00–0.14). Horizontal axis: sequence of S100B (residues 39–92). Labelled residues: S41, F43, L44, E45, V52, V53, K55, V56, T59, T78, M79, V80, T81, T82, A83, C84, H85, E86, F87, F88.]

Figure 6. Inter-CTDf-S100B contact probability as a function of the sequence number of S100B. The sequence number of the dimer interface is not shown in this figure. The definition of a ‘contact’ is given in the Materials and Methods section. Yellow stars specify residues at which chemical shift perturbations occurred33 and blue stars represent residues in contact with CTD in the NMR models14.

The contacting residues of S100B with CTDf in the NMR models were also examined (blue stars in Figure 6). We found only three frequently-contacting residues in the models: M79, E45, and V52, which belong to the set of contacting residues derived from the computed structural ensemble and the CSPs. These results suggested that the NMR models belong to a subset of the structural ensemble obtained in the present study. To investigate NMR models further, we reconstructed NMR models via the same restraints (NOE distance restraints, dihedral-angle restraints, and hydrogen-bond restraints) that were used in the original NMR study14 (see ‘Reconstruction of NMR

models’ of the Supporting Information), using Xplor-NIH software61,62. As expected, the resultant models (Figure S8A) were similar to the original NMR models14 (PDB ID: 1DT7). In the above reconstruction, it was observed that only a single pair (No. 92 in Table S3A) of the 101 intra-CTD NOE pairs supported the helix formation for CTD. Moreover, the chemical shift index (CSI) information deposited in the BMRB database (accession number, 4702)63 suggested the existence of only a single helical turn between the 384-th and 387-th residues (Figure S9). Notably, these two residues are consistent with the NOE pair No. 92. It was therefore inferred that CTD does not form a long helix over two turns upon binding to S100B. We thus rebuilt the NMR models again without the dihedral-angle and hydrogen-bond restraints; all of the NOE distance restraints satisfied Eq. 3 with the tolerance 0.5 Å. We confirmed the disappearance of the long helix (Figure S8B). Moreover, five of the ten models involved one helical turn, which is supported by the NOE distance pair No. 92 and the CSI information. Then, we further checked effects of the dihedral-angle and hydrogen-bond restraints on the obtained models in Figure S8A. All of the NOE distance restraints in the models of Figure S8A satisfied Eq. 3 where 𝛥𝑟NOE = 0.5 Å. However, seven dihedral-angle restraints of CTDf in the models were violated; the violation data were outputted when the reconstruction by Xplor-NIH software61,62 was completed. The violation indicated that the helix-related restraints induce stress in the CTD conformation. Considering that the original NMR models (1DT7) involved violations of the NOE-distance restraints, we

presumed that the dihedral-angle and hydrogen-bond restraints induce structural stress, which in turn may result in violations of the NOE-distance restraints depending on the modelling process. The multimodal structural distribution (Figure 2) may also be supported by other S100B-related complex structures (Figure S10) that have been reported previously64–66. These complexes indicate that each of the different peptides binds to S100B in various conformations. For example, the RSK1 peptide adopts various conformations on S100B (Figure S10B). Although these complexes are not directly related to our CTDf-S100B system, the structural heterogeneity of the peptides in these S100B-related complexes may provide further evidence supporting the multimodal structural distribution.

Conformational Entropy Gain of CTDf in Complex with S100B

To compare the free-state ensemble with the bound-state one of CTDf ($Q_{\mathrm{cano}}^{\mathrm{free-CTDf}}$ (300 K) and $Q_{\mathrm{cano}}^{\mathrm{bound-CTDf}}$ (300 K), respectively), we mapped them onto the 2D PC space, producing conformational distributions (Figure 7). The contribution ratios for PC1, PC2, PC3, and PC4 were 59.3%, 19.6%, 8.4%, and 4.5%, respectively: cumulative ratio = 78.9% for PC1 and PC2 and 91.8% for PC1 to PC4. The conformational distribution on the 2D PC space demonstrated that both the free and the bound states were disordered (Figure

7) because they have a variety of conformations: helical (𝑠1), partially helical (𝑠2, 𝑠3, and 𝑠6), bent (𝑠4), and extended (𝑠5) conformations (Figure 8). We also found that the distribution for $Q_{\mathrm{cano}}^{\mathrm{bound-CTDf}}$ (300 K) was broader than that for $Q_{\mathrm{cano}}^{\mathrm{free-CTDf}}$ (300 K) in the 2D space. This tendency was also observed in our previous simulation of the NRSF-mSin3 complex52 and thus may be a general rather than a special feature of IDP-partner complex formation.

Figure 7. Conformational distributions of $Q_{\mathrm{cano}}^{\mathrm{bound-CTDf}}$ (300 K) (green) and $Q_{\mathrm{cano}}^{\mathrm{free-CTDf}}$ (300 K) (magenta) on 2D space constructed by PC1 and PC2 axes. These axes were derived from the combined ensemble ($Q_{\mathrm{cano}}^{\mathrm{bound-CTDf}}$ plus $Q_{\mathrm{cano}}^{\mathrm{free-CTDf}}$), and hence the axes for the two distributions are identical. Conformations in circles 𝑆1,…,𝑆6 are illustrated in Figure 8.

Figure 8. CTDf conformations chosen from 𝑆1,…,𝑆6. Red regions were used in the PCA of Figure 7, whereas white regions were not used. The white sphere represents the N-terminus of CTDf.

To quantify the difference between the two distributions, we computed the conformational entropy 𝑆, which is defined in Eq. 4 and 5, of each distribution in the 2D and 4D PC spaces. Here, we define an entropy difference as ∆𝑆 = 𝑆(bound state) − 𝑆(free state). The ∆𝑆 values were 0.0033 and 0.0046 kcal/(mol ⋅ K) in the 2D and 4D PC spaces, respectively; the corresponding free-energy differences were 0.99 and 1.40 kcal/mol. We also calculated the conformational entropy as defined in earlier studies67–69 (see Eq. S32 in ‘Quasi-Harmonic Conformational Entropy’ of the Supporting Information). The resultant free-energy difference was 4.8 kcal/mol, which was larger than those calculated from Eq. 4 and 5. In general, entropy is difficult to compute from simulations, and the ∆𝑆 values obtained above carry some ambiguity; in particular, their error cannot be computed. Thus, the resultant ∆𝑆 for the thermodynamics of the binding mechanism is not

utilised quantitatively. However, Eq. 4, 5, or S32 may serve as a good measure to assess the conformational variety of a given distribution. Because the original molecular conformation is defined in a high-dimensional space, the 2D representation (i.e., Figure 7) may give a misleading impression of the conformational variety. Note that Eq. 5 and S32 are applicable to higher-dimensional conformational spaces. We concluded that the bound-state ensemble has a larger conformational variety than the free-state ensemble. One may further suspect the conformational entropy gain (∆𝑆 > 0) because the properties of IDR peptides are sensitive to force field parameters70. However, in our previous study, the free-state ensemble $Q_{\mathrm{cano}}^{\mathrm{free-CTDf}}$ (i.e., the ensemble of a single CTDf in isolation) was supported by circular dichroism spectroscopy experiments21. Additionally, as we used an identical force field in both the previous21 and present studies, the difference obtained between the free- and bound-state ensembles should reflect an actual difference. A physicochemical reason for the gain of conformational variety might be as follows: CTDf in isolation forms compact conformations (e.g., magenta dots in 𝑠2 in Figure 7) via its intramolecular hydrogen bonds or hydrophobic interactions. In contrast, when CTDf binds to S100B, the hydrophobic residues of CTDf can be exposed to hydrophobic surfaces of S100B; see the surface on S100B in Figure S11. Thus, the peptide adopts extended conformations more frequently than in the free state. Such a

hydrophobic surface may enable fuzzy complex formation in general, because another receptor, mSin3, for an IDP, NRSF, also presents a wide hydrophobic surface, to which NRSF binds through various binding modalities (Figure S12)52. The gain of conformational variety could decrease free energy upon binding of CTD to S100B, and compensate the free-energy elevation induced by confinement of CTDf around the binding site of S100B. Assuming that the gain occurs when CTD binds to another partner molecule, the binding free energy may decrease. This decline of the free energy may induce the ability to bind to many partners, which is termed the ‘hub property’ (discussed in the subsequent section).
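The link between the breadth of a distribution, its entropy, and the entropic free-energy term discussed in this section can be illustrated with a short sketch. Eq. 4 and 5 are not reproduced here, so the sketch assumes a Shannon-type form S = −R Σ p ln p over histogram bins, which may differ from the expressions actually used; the function names and the toy histograms are hypothetical. The only value taken from the text is ∆S = 0.0033 kcal/(mol·K) for the 2D PC space.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def grid_entropy(probabilities):
    """Shannon-type entropy -R * sum(p ln p) over populated bins, in kcal/(mol K)."""
    total = sum(probabilities)               # normalise in case the input is counts
    return -R_KCAL * sum((p / total) * math.log(p / total)
                         for p in probabilities if p > 0)


def entropic_free_energy(delta_s, temperature=300.0):
    """T * dS in kcal/mol for dS given in kcal/(mol K)."""
    return temperature * delta_s


narrow = [0.7, 0.1, 0.1, 0.1]       # toy 'free-state' histogram (made up)
broad = [0.25, 0.25, 0.25, 0.25]    # toy 'bound-state' histogram (made up)
toy_delta_s = grid_entropy(broad) - grid_entropy(narrow)

# dS quoted in the text for the 2D PC space, converted at 300 K.
reported_2d = entropic_free_energy(0.0033)
```

With these inputs the broader toy histogram yields the larger entropy, and the conversion at 300 K reproduces the 0.99 kcal/mol quoted for the 2D PC space.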

Hub Property

Most IDRs act as hubs in a protein-protein interaction network because of their ability (hub property) to bind to many partner molecules. CTD functions as such a hub. Here, we suggest that changes in the free-energy landscape control the hub property depending on the partner molecule. By performing PCA for $Q_{\mathrm{cano}}^{\mathrm{bound-CTDf}}$ (300 K), we produced a free-energy (PMF) landscape (Eq. 6) on the 2D PC space (Figure 9). The contribution ratios were 64.3% and 17.2% for PC1 and PC2, respectively (cumulative ratio = 81.5%). The four bound CTDf conformations, which were derived from the four different partner complexes, were projected onto the landscape. The landscape contains several free-energy basins; i.e., it

is multimodal. Furthermore, it covers S100B-bound along with other bound conformations. The S100B-bound conformation is located in the lowest free-energy basin. The cyclinA- and Sir2-bound conformations are in a low free-energy basin, which includes extended conformations, whereas the CBP-bound conformation is at a high free-energy region. Although the four bound conformations were involved in the sampled region, their free energy should depend on the partner molecule to be bound. For example, although the CBP-bound conformation is located at an unstable region on the landscape of Figure 9, the free energy may decrease upon binding to CBP. Such changes in the free energy landscape may regulate the hub property. To confirm that the landscape varies depending on each partner molecule, other CTD-partner systems should be investigated in future studies.

Figure 9. Free-energy (PMF) landscape of the bound-state ensemble ($Q_{\mathrm{cano}}^{\mathrm{bound-state}}$ (300 K)) in 2D PC space. The colour bar shows the PMF value. Each black mark indicates the position of a bound CTDf conformation in each of the four complexes: CTD-S100B (circle), CTD-Cyclin A (triangle), CTD-Sir2 (square), and CTD-CBP (star). PCA was performed using distance pairs of seven C-alpha atoms (residues H380–K386) of CTDf. The seven C-alpha atoms were selected from the binding region common to the four complexes.

CONCLUSION

Intrinsically disordered proteins are ubiquitous in eukaryotic cells and often act as hubs in protein-protein interaction networks via their molecular interactions. We investigated the molecular interactions of CTDf upon binding to S100B by performing V-McMD, a generalised ensemble all-atom MD simulation, using an explicit solvent. Our simulations produced a structural ensemble of the S100B-CTDf system, demonstrating that CTDf and S100B form a variety of complex structures. Furthermore, binding sites on S100B were determined by the distribution of the centre-of-geometry

points of CTDf on S100B. The distribution showed that CTDf concentrated on the hydrophobic site to which CTDf binds in the NMR models. We also analysed the orientation of CTDf relative to S100B, demonstrating that the most probable CTDf orientation corresponded to that in the NMR models. To validate our computational data, we compared them with experimental (NOEs14 and CSP33) data. In particular, more than 80% of the computed NOE-like distances were consistent with the NOE data. Conversely, we noticed that 63.4% of the inter-CTDf-S100B NOE-pair distances in the NMR models were inconsistent with the NOE data. Notably, the comparison with CSP data demonstrated that the computed contacting residues of S100B with CTDf corresponded to CSP-occurring residues. To examine the difference between the bound-state ($Q_{\mathrm{cano}}^{\mathrm{bound-CTDf}}$) and the free-state ($Q_{\mathrm{cano}}^{\mathrm{free-CTDf}}$) ensembles, we quantified the conformational variety using expressions of conformational entropy. The entropy of the bound-state ensemble exceeded that of the free-state ensemble. The gain of conformational variety suggests that conformational entropy may not operate disadvantageously against the molecular binding. Although the multimodal structural distribution (Figure 2) comprises one of the thermodynamic descriptions for the S100B-CTDf system, the dissociation constant and kinetic constant, which constitute other thermodynamic descriptions, will also improve the understanding of the multimodal interactions. These constants have been obtained experimentally33,71 but were not further discussed here because the enhanced sampling method

does not sample dissociated structures sufficiently to obtain a quantitatively correct dissociation constant, and the method cannot in principle provide information regarding time evolution. Simulations using a larger periodic box would be able to sample dissociated structures adequately; however, these require extensive computational time. V-McMD provides useful information for the calculation of kinetic constants even though this method cannot simulate the kinetics of a physical system. Specifically, the information is in the form of a free energy landscape that describes free-energy minima and probable pathways for conformational changes among these minima. Such data help canonical MD simulations to intensively sample structures along the pathways. Consequently, they may provide kinetic constants among these energy minima. This strategy will be implemented in future studies. Moreover, the free energy landscape (Figure 2), which is also called the multimodal distribution here, enabled qualitative assessment of kon value of CTDf. This landscape showed multimodal interactions, which implies ease of the first contact with S100B, and therefore kon value would be large. This inference matches the experimental result indicating that kon value of CTD was larger than those of other peptides tested71. In general, MD simulations may pass over small-volume regions (i.e., minor basins or narrow pathways) in a conformational space despite their combination with an enhanced sampling method. The smaller the regions, the higher the possibility of passing them over72. In contrast, the larger the regions, the higher the possibility of their

detection. We have confirmed that such regions are sampled reproducibly using different sets of simulation runs21,52. Therefore, the free-energy landscape (or the conformational distribution) obtained from the current enhanced sampling is regarded as an overall feature of the free-energy landscape. Finally, insights into the atomistic interactions between CTDf and S100B may serve to expand structure-based drug design to IDRs. For example, such understanding would enable modelling IDR-mimic peptides/compounds73, thereby yielding novel compounds against S100B.

ASSOCIATED CONTENT Supporting Information The Supporting Information is available free of charge on the ACS Publications website, and includes Figure S1; Influence of Initial Conformations on the Structural Ensemble; Figure S2; Figure S3; Table S1; Intra-S100B and Steric Restraints; V-McMD; Actual Sampling Procedure: MD and MC; Figure S4; Ensemble Reweighting; Principal Component Analysis (PCA); Figure S5; Parameter Estimation for DBSCAN; Figure S6; Table S2; Satisfaction Ratio of the NMR Model with Upper Bounds; Table S3; Figure S7; Reconstruction of NMR Models; Figure S8; Figure S9; Figure S10; Quasi-Harmonic Conformational Entropy; Shape of CTDf Binding Interface on S100B; Figure S11; Surface

of mSin3 Receptor Against a NRSF Disordered Peptide; Figure S12; Biological Implication of the Multimodal Interactions; Figure S13; Figure S14

AUTHOR INFORMATION Corresponding Author *Email: [email protected]

ACKNOWLEDGEMENTS We thank Dr. Hajime Tamaki (Institute for Protein Research, Osaka University) whose instruction enabled us to reconstruct the NMR models. The MD simulations were performed on the TSUBAME2.5 and TSUBAME3.0 supercomputers at the Tokyo Institute of Technology, provided through the HPCI System Research Project (Project IDs: hp140032, hp150015, hp170020, hp170025, hp180050, and hp180054). J.H. was supported by JSPS KAKENHI Grant No. 16K05517 and by the Development of Core Technologies for Innovative Drug Development based upon IT from the Japan Agency for Medical Research and Development, AMED. K.K. was supported by JSPS KAKENHI Grant No. 16K18526 and the Cooperative Research Program of Institute for Protein Research, Osaka University, CR-17-05. H.N. was supported by a Grant-in-Aid for Scientific Research on Innovative Areas (24118008) and by a Grant-in-Aid for

Challenging Exploratory Research (16K14711) from JSPS. S.I. was supported by a JSPS Research Fellowship for Young Scientists Grant No. 17J07112.

REFERENCES (1)

Patil, A.; Kinoshita, K.; Nakamura, H. Hub Promiscuity in Protein-Protein Interaction Networks. Int. J. Mol. Sci. 2010, 11 (4), 1930–1943.

(2)

Haynes, C.; Oldfield, C. J.; Ji, F.; Klitgord, N.; Cusick, M. E.; Radivojac, P.; Uversky, V. N.; Vidal, M.; Iakoucheva, L. M. Intrinsic Disorder Is a Common Feature of Hub Proteins from Four Eukaryotic Interactomes. PLoS Comput. Biol. 2006, 2 (8), e100.

(3)

Ekman, D.; Light, S.; Björklund, A. K.; Elofsson, A. What Properties Characterize the Hub Proteins of the Protein-Protein Interaction Network of Saccharomyces Cerevisiae? Genome Biol. 2006, 7 (6), R45.

(4)

van der Lee, R.; Buljan, M.; Lang, B.; Weatheritt, R. J.; Daughdrill, G. W.; Dunker, A. K.; Fuxreiter, M.; Gough, J.; Gsponer, J.; Jones, D. T.; et al. Classification of Intrinsically Disordered Regions and Proteins. Chem. Rev. 2014, 114 (13), 6589– 6631.

(5)

Sugase, K.; Dyson, H. J.; Wright, P. E. Mechanism of Coupled Folding and Binding of an Intrinsically Disordered Protein. Nature 2007, 447 (7147), 1021–1025.

(6)

Tompa, P.; Fuxreiter, M. Fuzzy Complexes: Polymorphism and Structural Disorder in Protein-Protein Interactions. Trends Biochem. Sci. 2008, 33 (1), 2–8.

(7)

Babu, M. M. The Contribution of Intrinsically Disordered Regions to Protein

ACS Paragon Plus Environment

37

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 38 of 48

Function, Cellular Complexity, and Human Disease. Biochem. Soc. Trans. 2016, 44 (5), 1185–1200. (8)

Kasahara, K.; Shiina, M.; Higo, J.; Ogata, K.; Nakamura, H. Phosphorylation of an Intrinsically Disordered Region of Ets1 Shifts a Multi-Modal Interaction Ensemble to an Auto-Inhibitory State. Nucleic Acids Res. 2018, 46 (5), 2243–2251.

(9)

Borgia, A.; Borgia, M. B.; Bugge, K.; Kissling, V. M.; Heidarsson, P. O.; Fernandes, C. B.; Sottini, A.; Soranno, A.; Buholzer, K. J.; Nettels, D.; et al. Extreme Disorder in an Ultrahigh-Affinity Protein Complex. Nature 2018, 555 (7694), 61–66.

(10) Rogers, J. M.; Wong, C. T.; Clarke, J. Coupled Folding and Binding of the Disordered Protein PUMA Does Not Require Particular Residual Structure. J. Am. Chem. Soc. 2014, 136 (14), 5197–5200.

(11) Brzovic, P. S.; Heikaus, C. C.; Kisselev, L.; Vernon, R.; Herbig, E.; Pacheco, D.; Warfield, L.; Littlefield, P.; Baker, D.; Klevit, R. E.; et al. The Acidic Transcription Activator Gcn4 Binds the Mediator Subunit Gal11/Med15 Using a Simple Protein Interface Forming a Fuzzy Complex. Mol. Cell 2011, 44 (6), 942–953.

(12) Terakawa, T.; Kenzaki, H.; Takada, S. P53 Searches on DNA by Rotation-Uncoupled Sliding at C-Terminal Tails and Restricted Hopping of Core Domains. J. Am. Chem. Soc. 2012, 134 (35), 14555–14562.

(13) Levine, Z. A.; Larini, L.; LaPointe, N. E.; Feinstein, S. C.; Shea, J.-E. Regulation and Aggregation of Intrinsically Disordered Peptides. Proc. Natl. Acad. Sci. 2015, 112 (9), 2758–2763.

(14) Rustandi, R.; Baldisseri, D.; Weber, D. Structure of the Negative Regulatory Domain of P53 Bound to S100B(ββ). Nat. Struct. Biol. 2000, 7 (7), 570–574.

(15) Hartman, K. G.; Mcknight, L. E.; Liriano, M. A.; Weber, D. J. The Evolution of S100B Inhibitors for the Treatment of Malignant Melanoma. Future Med. Chem. 2013, 5 (1), 97–109.

(16) Lowe, E. D.; Tews, I.; Cheng, K. Y.; Brown, N. R.; Gul, S.; Noble, M. E. M.; Gamblin, S. J.; Johnson, L. N. Specificity Determinants of Recruitment Peptides Bound to Phospho-CDK2/Cyclin A. Biochemistry 2002, 41 (52), 15625–15634.

(17) Mujtaba, S.; He, Y.; Zeng, L.; Yan, S.; Plotnikova, O.; Sachchidanand; Sanchez, R.; Zeleznik-Le, N. J.; Ronai, Z.; Zhou, M.-M. Structural Mechanism of the Bromodomain of the Coactivator CBP in P53 Transcriptional Activation. Mol. Cell 2004, 13 (2), 251–263.

(18) Avalos, J. L.; Celic, I.; Muhammad, S.; Cosgrove, M. S.; Boeke, J. D.; Wolberger, C. Structure of a Sir2 Enzyme Bound to an Acetylated P53 Peptide. Mol. Cell 2002, 10 (3), 523–535.

(19) Kinjo, A. R.; Suzuki, H.; Yamashita, R.; Ikegawa, Y.; Kudou, T.; Igarashi, R.; Kengaku, Y.; Cho, H.; Standley, D. M.; Nakagawa, A.; et al. Protein Data Bank Japan (PDBj): Maintaining a Structural Data Archive and Resource Description Framework Format. Nucleic Acids Res. 2012, 40 (D1).


(20) Berman, H.; Henrick, K.; Nakamura, H. Announcing the Worldwide Protein Data Bank. Nat. Struct. Mol. Biol. 2003, 10 (12), 980–980.

(21) Iida, S.; Mashimo, T.; Kurosawa, T.; Hojo, H.; Muta, H.; Goto, Y.; Fukunishi, Y.; Nakamura, H.; Higo, J. Variation of Free-Energy Landscape of the P53 C-Terminal Domain Induced by Acetylation: Enhanced Conformational Sampling. J. Comput. Chem. 2016, 37 (31), 2687–2700.

(22) Allen, W. J.; Capelluto, D. G. S.; Finkielstein, C. V.; Bevan, D. R. Modeling the Relationship between the P53 C-Terminal Domain and Its Binding Partners Using Molecular Dynamics. J. Phys. Chem. B 2010, 114 (41), 13201–13213.

(23) Chen, J. Intrinsically Disordered P53 Extreme C-Terminus Binds to S100B(ββ) through “Fly-Casting.” J. Am. Chem. Soc. 2009, 131 (6), 2088–2089.

(24) Staneva, I.; Huang, Y.; Liu, Z.; Wallin, S. Binding of Two Intrinsically Disordered Peptides to a Multi-Specific Protein: A Combined Monte Carlo and Molecular Dynamics Study. PLoS Comput. Biol. 2012, 8 (9), e1002682.

(25) Kannan, S.; Lane, D. P.; Verma, C. S. Long Range Recognition and Selection in IDPs: The Interactions of the C-Terminus of P53. Sci. Rep. 2016, 6, 23750.

(26) Fadda, E.; Nixon, M. G. The Transient Manifold Structure of the P53 Extreme C-Terminal Domain: Insight into Disorder, Recognition, and Binding Promiscuity by Molecular Dynamics Simulations. Phys. Chem. Chem. Phys. 2017, 19 (32), 21287–21296.


(27) James, L. C.; Tawfik, D. S. Conformational Diversity and Protein Evolution – a 60-Year-Old Hypothesis Revisited. Trends Biochem. Sci. 2003, 28 (7), 361–368.

(28) Bosshard, H. R. Molecular Recognition by Induced Fit: How Fit Is the Concept? Physiology 2001, 16 (4), 171–173.

(29) McDowell, C.; Chen, J.; Chen, J. Potential Conformational Heterogeneity of P53 Bound to S100B(ββ). J. Mol. Biol. 2013, 425 (6), 999–1010.

(30) Whitlow, J. L.; Varughese, J. F.; Zhou, Z.; Bartolotti, L. J.; Li, Y. Computational Screening and Design of S100B Ligand to Block S100B-P53 Interaction. J. Mol. Graph. Model. 2009, 27 (8), 969–977.

(31) Pirolli, D.; Carelli Alinovi, C.; Capoluongo, E.; Satta, M. A.; Concolino, P.; Giardina, B.; De Rosa, M. C. Insight into a Novel P53 Single Point Mutation (G389E) by Molecular Dynamics Simulations. Int. J. Mol. Sci. 2010, 12 (1), 128–140.

(32) Higo, J.; Umezawa, K.; Nakamura, H. A Virtual-System Coupled Multicanonical Molecular Dynamics Simulation: Principles and Applications to Free-Energy Landscape of Protein-Protein Interaction with an All-Atom Model in Explicit Solvent. J. Chem. Phys. 2013, 138 (18), 184106–184117.

(33) Wilder, P. T.; Lin, J.; Bair, C. L.; Charpentier, T. H.; Yang, D.; Liriano, M.; Varney, K. M.; Lee, A.; Oppenheim, A. B.; Adhya, S.; et al. Recognition of the Tumor Suppressor Protein P53 and Other Protein Targets by the Calcium-Binding Protein S100B. Biochim. Biophys. Acta 2006, 11, 1284–1297.


(34) Kasahara, K.; Ma, B.; Goto, K.; Dasgupta, B.; Higo, J.; Fukuda, I.; Mashimo, T.; Akiyama, Y.; Nakamura, H. MyPresto/Omegagene: A GPU-Accelerated Molecular Dynamics Simulator Tailored for Enhanced Conformational Sampling Methods with a Non-Ewald Electrostatic Scheme. Biophys. Physicobiology 2016, 13, 209–216.

(35) Mashimo, T.; Fukunishi, Y.; Kamiya, N.; Takano, Y.; Fukuda, I.; Nakamura, H. Molecular Dynamics Simulations Accelerated by GPU for Biological Macromolecules with a Non-Ewald Scheme for Electrostatic Interactions. J. Chem. Theory Comput. 2013, 9 (12), 5599–5609.

(36) Ryckaert, J.; Ciccotti, G.; Berendsen, H. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 341, 327–341.

(37) Fukuda, I.; Yonezawa, Y.; Nakamura, H. Molecular Dynamics Scheme for Precise Estimation of Electrostatic Interaction via Zero-Dipole Summation Principle. J. Chem. Phys. 2011, 134 (16), 164107.

(38) Evans, D.; Morriss, G. The Isothermal/Isobaric Molecular Dynamics Ensemble. Phys. Lett. A 1983, 98 (8), 7–10.

(39) Kamiya, N.; Watanabe, Y. S.; Ono, S.; Higo, J. AMBER-Based Hybrid Force Field for Conformational Sampling of Polypeptides. Chem. Phys. Lett. 2005, 401 (1–3), 312–317.


(40) Fox, T.; Kollman, P. A. Application of the RESP Methodology in the Parametrization of Organic Solvents. J. Phys. Chem. B 1998, 102 (41), 8070–8079.

(41) Åqvist, J. Ion-Water Interaction Potentials Derived from Free Energy Perturbation Simulations. J. Phys. Chem. 1990, 94 (21), 8021–8024.

(42) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79 (2), 926–935.

(43) Berg, B. A.; Neuhaus, T. Multicanonical Ensemble: A New Approach to Simulate First-Order Phase Transitions. Phys. Rev. Lett. 1992, 68 (1), 9–12.

(44) Kidera, A. Enhanced Conformational Sampling in Monte Carlo Simulations of Proteins: Application to a Constrained Peptide. Proc. Natl. Acad. Sci. U. S. A. 1995, 92 (21), 9886–9889.

(45) Hansmann, U. H. E.; Okamoto, Y. Prediction of Peptide Conformation by Multicanonical Algorithm: New Approach to the Multiple-Minima Problem. J. Comput. Chem. 1993, 14 (11), 1333–1338.

(46) Hansmann, U. H. E.; Okamoto, Y.; Eisenmenger, F. Molecular Dynamics, Langevin and Hybrid Monte Carlo Simulations in a Multicanonical Ensemble. Chem. Phys. Lett. 1996, 259 (3–4), 321–330.

(47) Nakajima, N.; Nakamura, H.; Kidera, A. Multicanonical Ensemble Generated by Molecular Dynamics Simulation for Enhanced Conformational Sampling of Peptides. J. Phys. Chem. B 1997, 101 (5), 817–824.

(48) Nakajima, N.; Higo, J.; Kidera, A.; Nakamura, H. Flexible Docking of a Ligand Peptide to a Receptor Protein by Multicanonical Molecular Dynamics Simulation. Chem. Phys. Lett. 1997, 278 (October), 297–301.

(49) Ikebe, J.; Umezawa, K.; Kamiya, N.; Sugihara, T.; Yonezawa, Y.; Takano, Y.; Nakamura, H.; Higo, J. Theory for Trivial Trajectory Parallelization of Multicanonical Molecular Dynamics and Application to a Polypeptide in Water. J. Comput. Chem. 2011, 32 (7), 1286–1297.

(50) Higo, J.; Kamiya, N.; Sugihara, T.; Yonezawa, Y.; Nakamura, H. Verifying Trivial Parallelization of Multicanonical Molecular Dynamics for Conformational Sampling of a Polypeptide in Explicit Water. Chem. Phys. Lett. 2009, 473 (4–6), 326–329.

(51) Sugihara, T.; Higo, J.; Nakamura, H. Parallelization of Markov Chain Generation and Its Application to the Multicanonical Method. J. Phys. Soc. Japan 2009, 78 (7), 074003.

(52) Higo, J.; Nishimura, Y.; Nakamura, H. A Free-Energy Landscape for Coupled Folding and Binding of an Intrinsically Disordered Protein in Explicit Solvent from Detailed All-Atom Computations. J. Am. Chem. Soc. 2011, 133 (27), 10448–10458.

(53) Higo, J.; Nakamura, H. Virtual States Introduced for Overcoming Entropic Barriers in Conformational Space. Biophysics (Oxf). 2012, 8, 139–144.

(54) Iida, S.; Nakamura, H.; Higo, J. Enhanced Conformational Sampling to Visualize a Free-Energy Landscape of Protein Complex Formation. Biochem. J. 2016, 473, 1651–1662.

(55) Hayward, S.; Kitao, A.; Berendsen, H. J. C. Model-Free Methods of Analyzing Domain Motions in Proteins from Simulation: A Comparison of Normal Mode Analysis and Molecular Dynamics Simulation of Lysozyme. Proteins Struct. Funct. Genet. 1997, 27 (3), 425–437.

(56) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Müller, A.; Nothman, J.; Louppe, G.; et al. Scikit-Learn: Machine Learning in Python. … Mach. Learn. … 2012, 12, 2825–2830.

(57) Ester, M.; Kriegel, H. P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proc. 2nd Int. Conf. Knowl. Discov. Data Min. 1996, 226–231.

(58) Ikebe, J.; Standley, D. M.; Nakamura, H.; Higo, J. Ab Initio Simulation of a 57-Residue Protein in Explicit Solvent Reproduces the Native Conformation in the Lowest Free-Energy Cluster. Protein Sci. 2011, 20 (1), 187–196.

(59) Tajielyato, N.; Li, L.; Peng, Y.; Alper, J.; Alexov, E. E-Hooks Provide Guidance and a Soft Landing for the Microtubule Binding Domain of Dynein. Sci. Rep. 2018, 8 (1), 13266.

(60) Sharma, R.; Raduly, Z.; Miskei, M.; Fuxreiter, M. Fuzzy Complexes: Specific Binding without Complete Folding. FEBS Lett. 2015, 589 (19 Part A), 2533–2542.


(61) Schwieters, C. D.; Kuszewski, J. J.; Tjandra, N.; Marius Clore, G. The Xplor-NIH NMR Molecular Structure Determination Package. J. Magn. Reson. 2003, 160 (1), 65–73.

(62) Schwieters, C.; Kuszewski, J.; Marius Clore, G. Using Xplor–NIH for NMR Molecular Structure Determination. Prog. Nucl. Magn. Reson. Spectrosc. 2006, 48 (1), 47–62.

(63) Ulrich, E. L.; Akutsu, H.; Doreleijers, J. F.; Harano, Y.; Ioannidis, Y. E.; Lin, J.; Livny, M.; Mading, S.; Maziuk, D.; Miller, Z.; et al. BioMagResBank. Nucleic Acids Res. 2007, 36 (Database), D402–D408.

(64) Gógl, G.; Alexa, A.; Kiss, B.; Katona, G.; Kovács, M.; Bodor, A.; Reményi, A.; Nyitray, L. Structural Basis of Ribosomal S6 Kinase 1 (RSK1) Inhibition by S100B Protein. J. Biol. Chem. 2016, 291 (1), 11–27.

(65) Charpentier, T. H.; Thompson, L. E.; Liriano, M. A.; Varney, K. M.; Wilder, P. T.; Pozharski, E.; Toth, E. A.; Weber, D. J. The Effects of CapZ Peptide (TRTK-12) Binding to S100B-Ca2+ as Examined by NMR and X-Ray Crystallography. J. Mol. Biol. 2010, 396 (5), 1227–1243.

(66) Jensen, J. L.; Indurthi, V. S. K.; Neau, D. B.; Vetter, S. W.; Colbert, C. L. Structural Insights into the Binding of the Human Receptor for Advanced Glycation End Products (RAGE) by S100B, as Revealed by an S100B–RAGE-Derived Peptide Complex. Acta Crystallogr. Sect. D Biol. Crystallogr. 2015, 71 (5), 1176–1183.

(67) Schlitter, J. Estimation of Absolute and Relative Entropies of Macromolecules Using the Covariance Matrix. Chem. Phys. Lett. 1993, 215 (6), 617–621.

(68) Andricioaei, I.; Karplus, M. On the Calculation of Entropy from Covariance Matrices of the Atomic Fluctuations. J. Chem. Phys. 2001, 115 (14), 6289–6292.

(69) Karplus, M.; Kushick, J. N. Method for Estimating the Configurational Entropy of Macromolecules. Macromolecules 1981, 14 (2), 325–332.

(70) Robustelli, P.; Piana, S.; Shaw, D. E. Developing a Molecular Dynamics Force Field for Both Folded and Disordered Protein States. Proc. Natl. Acad. Sci. 2018, 115 (21), E4758–E4766.

(71) Wafer, L. N.; Streicher, W. W.; McCallum, S. A.; Makhatadze, G. I. Thermodynamic and Kinetic Analysis of Peptides Derived from CapZ, NDR, P53, HDM2, and HDM4 Binding to Human S100B. Biochemistry 2012, 51 (36), 7189–7201.

(72) Higo, J.; Kasahara, K.; Nakamura, H. Multi-Dimensional Virtual System Introduced to Enhance Canonical Sampling. J. Chem. Phys. 2017, 147 (13), 134102.

(73) Pelay-Gimeno, M.; Glas, A.; Koch, O.; Grossmann, T. N. Structure-Based Design of Inhibitors of Protein-Protein Interactions: Mimicking Peptide Binding Epitopes. Angew. Chemie Int. Ed. 2015, 54 (31), 8896–8927.


TOC figure: Multimodal Structural Distribution (shown in principal component space)
