Multiscale Model for the Assembly Kinetics of Protein Complexes - The

Jan 6, 2016 - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, ...
0 downloads 0 Views 3MB Size
Subscriber access provided by UNIV OF CALIFORNIA SAN DIEGO LIBRARIES

Article

A Multi-Scale Model for the Assembling Kinetics of Protein Complexes Zhong-Ru Xie, Jiawen Chen, and Yinghao Wu J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.5b08962 • Publication Date (Web): 06 Jan 2016 Downloaded from http://pubs.acs.org on January 13, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

A Multi-scale Model for the Assembly Kinetics of Protein Complexes Zhong-Ru Xie, Jiawen Chen, Yinghao Wu* Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461.

*

Corresponding authors: Yinghao Wu Phone: (718) 678-1232, Fax: (718) 678-1018, E-mail:[email protected]

1 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 2 of 45

Yinghao Wu

ABSTRACT The assembly of proteins into high-order complexes is a general mechanism for these biomolecules to implement their versatile functions in cells. Natural evolution has developed various assembling pathways for specific protein complexes to maintain their stability and proper activities. Previous studies have provided numerous examples of the misassembly of protein complexes leading to severe biological consequences. Although the research focusing on protein complexes has started to move beyond the static representation of quaternary structures to the dynamic aspect of their assembly, the current understanding of the assembly mechanism of protein complexes is still largely limited. To tackle this problem, we developed a new multi-scale modeling framework. This framework combines a lowerresolution rigid-body-based simulation with a higher-resolution Cα-based simulation method so that protein complexes can be assembled with both structural details and computational efficiency. We applied this model to a homo-trimer and a hetero-tetramer as simple test systems. Consistent with experimental observations, our simulations indicated very different kinetics between protein oligomerization and dimerization. The formation of protein oligomers is a multi-step process that is much slower than dimerization but thermodynamically more stable. Moreover, we showed that even the same protein quaternary structure can have very diverse assembly pathways under different binding constants between subunits, which is important for regulating the functions of protein complexes. Finally, we revealed that the binding between subunits in a complex can be synergistically strengthened during assembly without considering allosteric regulation or conformational changes. Therefore, our model provides a useful tool to understand the general principles of protein complex assembly.

2 ACS Paragon Plus Environment

Page 3 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

1. Introduction Proteins are building blocks of almost all biological functions. The majority of these biomolecules are oligomerized into complexes in living cells1-2. The formation of some complexes is dynamic, such as the molecular platforms in the transient processes of signal transduction3-4 or transcriptional regulation5-6. Some complexes, in contrast, are permanent, such as ATP synthase7, RNA polymerase8 and other supramolecular motors9. From the perspective of evolution, the oligomerization of proteins is advantageous for a number of reasons. For instance, subunits in a protein complex will gain thermodynamic stability after oligomerization10. Moreover, oligomerization often leads to further opportunities for functional control. For example, the presence of multiple binding sites in a protein oligomer facilitates high-order cooperation or allosteric regulation during ligand binding11. Both the stability and functions of protein complexes are largely determined by the multistep process by which their subunits assemble. Numerous studies have shown that the misassembly of proteins can result in severe biological consequences12. Therefore, studies of the assembly pathways for protein complexes are essential in biophysics and molecular biology. However, compared to the intensive attention in properties of binary protein interactions and the determination of protein quaternary structures, the current understanding of the assembly mechanism of protein complexes is still largely limited due to the high-order complexity in the assembling pathways. In principle, protein complex assembly is modulated by multiple factors, such as stoichiometry, spatial arrangement of subunits, binding rates and equilibrium constants among individual subunits. Quantitative evaluations of these determinants and their impacts on complex assembly are important for understanding the biological functions of protein complexes in cells. The kinetics of complex assembly have been experimentally observed in a number of molecular systems, such as the spliceosomal snRNP core13, the 26S proteasome14, and the pre-initiation transcription complex15. The in vitro biophysical techniques, such as analytical ultracentrifugation (AUC)16, scattering techniques17, NMR spectroscopy18, isothermal titration calorimetry (ITC)19, and mass spectrometry20, permit a quantitative analysis of the stoichiometry or binding parameters of complexes but lose 3 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 4 of 45

Yinghao Wu

the biological relevance of their assembly21. Other in vivo approaches, such as crosslinking22, Forster resonance energy transfer (FRET)23, fluorescence recovery after photobleaching (FRAP)24, can be used to detect the assembly processes in physiological environments. The information that these approaches can provide, however, is incomplete due to the multiple levels of cellular factors. In addition to experimental measurements, a large variety of computational models have also been developed to simulate the association of protein complexes. Computational simulations can reach dimensions that are currently unapproachable in the laboratory. Mathematical models were first applied to describe the binding kinetics of protein oligomers25. However, specific information, such as spatial heterogeneities and stochasticity could not be captured. In contrast, atom-based or coarse-grained molecular dynamic simulations were used to account for the structural description of proteins26-42. A primary limitation of these structural-based simulations is the intensive computational consumption, which prohibits them from being applied to the full assembly processes of large molecular complexes in a realistic time scale. Alternatively, lower-resolution particle-based simulation approaches compromise both molecular details and computational efficiency43-50. The central problem in these methods is how to characterize the binding between subunits of an oligomer with physical details while maintaining the computational affordability. In this article, we present a multi-scale model to tackle this problem. Multi-scale modeling is a powerful tool to address the multi-level complexity in biological problems. One category of models is based on the coarse-graining atomic structure of biomolecules to study their dynamics and functions51-57. Other lower-resolution models are used to bridge processes at the cellular level58-62. Our model consists of simulations on two different scales. The low-level simulation uses Cα-based coarse-grained (CG) models of protein structure to evaluate the binding rates between two subunits in a complex, whereas the high-level simulation uses a rigid-body (RB) representation to model the process of complex assembly. The two levels of simulations are integrated by feeding the binding rates estimated from the Cα-based model into the RB model as input parameters. A homo-trimer that contains three identical subunits was first used as a test system. Compared to the process of dimerization, the assembly kinetics of the trimer are much 4 ACS Paragon Plus Environment

Page 5 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

slower. However, assembled trimers are thermodynamically more stable than are dimers. The functional importance of this phenomenon is discussed in the text. We further applied our method to study the assembly of a hetero-tetramer, which includes two types of subunits. We found that the assembling pathways are highly dependent on the values of the given binding rates between subunits. Moreover, the tetramer contains both homogenous and heterogeneous binding interfaces. It was shown that these two types of interactions are mutually strengthened, indicating a mechanism of cooperation during assembly. In summary, our model provides a useful tool to understand the general principles of protein complex assembly. This method can be applied to study the assembly of specific bio-molecular complexes with both structural detail and computational efficiency.

2. Methods 2.1. General framework of the multiscale model A study of the full process of complex assembly is beyond the scope of current structural-based simulation methods. The simplified lower-resolution model is a suitable tool to tackle the problem. We recently developed a coarse-grained model to simulate the kinetics of molecular binding63. In this model, the binding affinity ∆G0 and the on rate kon are two important factors that determine the thermodynamic and kinetic properties of binding. The binding affinity is an independent parameter in simulations; it can be derived from experimental measurement or by computational prediction using currently available programs. The kinetic on rate, however, is affected by both the diffusion constant of each protein and the intrinsic binding parameter rass. The rass is the rate of association between two interacting proteins within certain binding criteria; it is a specific parameter resulting from the coarse-grained nature of our simplified model and depends on the choice of different binding criteria, such as the distance and orientation between two proteins. The values of rass were arbitrarily determined in our previous study; its physical properties and relation to the experimentally measured kon have not been systematically evaluated. 5 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 6 of 45

Yinghao Wu

By introducing a multiscale modeling framework, we revisited the definition of rass and used it to link the macroscopic parameters, such as the kinetic rate, with the microscopic coefficients, such as the energy at the binding interfaces. The schematic of the multiscale framework is presented in Fig. 1. Specifically, Cα-based CG simulation was applied to calculate the kon of association between two interacting proteins or protein subunits64-65. The value of kon depends on an energy parameter that is used in the simulation. We adjusted the energy parameter to reproduce the experimentally measured or computationally predicted value of kon (Fig. 1a). Given the same energy parameter, a different set of Cα-based CG simulations was then used to estimate the rate of association rass (Fig. 1b), assuming that the physical properties of binding between these two proteins can be captured by the selected energy parameter. Finally, the derived rass, together with the diffusion constants and binding affinity of interacting proteins, was used to guide the RB-based simulation (Fig. 1c), which contains a large number of molecules in the system. To simulate the process of complex assembly, the model has been extended by placing multiple binding sites on the surface of each rigid body, depending on the quaternary arrangement of a specific complex. The associations between each different pairs of subunits in the complex need to be calculated individually by the same process, as shown in Fig. 1d. All derived rates are integrated into the RB simulation so that multiple copies of all subunit types can be assembled together. The details of each step in this multiscale modeling framework will be expanded as follows. 2.2. Cα-based Kinetic Monte-Carlo (KMC) simulations of protein-protein association In an attempt to evaluate the binding between two proteins or two subunits in a protein complex, a Monte-Carlo method based on coarse-grained protein structural models has been developed to simulate the process of protein association64-65 and is briefly described as follows. The details of the method can be found in our previous studies. In the simulations, each residue in a protein is represented by the Cα atom. Proteins are connected by a virtual backbone that is formed between every two consecutive Cα atoms. The simulation begins from an initial conformation, in which a pair of proteins or protein subunits is randomly placed in a 3D box (Fig. 1a). Following the initial conformation, both proteins randomly diffuse in the simulated box, and a 6 ACS Paragon Plus Environment

Page 7 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

periodic boundary condition is applied. Within each time step, the diffusions of each protein consist of a random translational movement and a random rotation along three Euler angles. The translational and rotational diffusion constants for each protein are adopted from fitting the data that were calculated by a precise boundary element method66-67. Diffusions are guided by the following target function: E Tot = E GO + E Rp

(1)

The first term is a GO-like potential68 that is applied to all native contacts: 0, EGO = ∑ij µδ (rij ); δ (rij ) =  1,

rij − rij0 ≥ d c rij − rij0 < d c

(2)

In the above equation, any pair of Cα atoms between residues i in one protein and j in the other is defined as a native contact if its corresponding distance in the native structure is smaller than 7.5 Ǻ. The rij is the Cα distance between residues i and j in the current conformation, rij0 is the distance in the native structure, and dc is a distance cutoff equals to 2 Ǻ. The adjustable parameter µ defines the energy depth of the GO potential. The second term in Equation (1) is a repulsive penalty effective when the Cα distance between any residue pair is smaller than 3 Ǻ during simulations. The simulation steps of biased diffusions are iterated until two originally separated proteins either form an encounter complex or fail within a given time length. Assuming that the formation of encounter complexes is rate limiting, we generated a large number of Ntot simulation trajectories and counted the frequency ρ at which two proteins successfully associate among these trajectories. Because each trajectory includes n simulation steps, the total simulation time ttot of one trajectory is n×∆t, in which ∆t is the time length of each simulation step. Given the volume of the simulation box V, the on rate of protein association can be calculated as:

kon =

c ρV ttot

(3)

7 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 45

2016-01-06

Yinghao Wu

In Equation (3), c is a constant that converts the unit from molecule/nm3 to M. As we will illustrate in the results, changes in the energy parameter µ will result in different frequency values for successful association. Therefore, for a pair of interacting proteins whose on rate has been experimentally or computationally determined, we adjusted the parameter µ to optimize their binding interactions so that our calculated kon fit the predetermined value. This parameter was then used to estimate the rate rass, as described in the next part.

2.3. Derivation of the association rate rass by the given energy parameter µ We derived the value of the energy parameter µ to fit the experimentally measured kon in 2.2. This value was further used to calculate the intrinsic rate of association rass for the RB simulation on the next level. The same algorithm of Cα-based KMC simulation was used. However, the initial configuration and simulation strategy were modified. The modifications are specified as follows. First, instead of placing two proteins or protein subunits randomly in a 3D box as shown in Fig. 1a, a different initial conformation is constructed. Specifically, two proteins or protein subunits are placed randomly, but their binding interfaces are within the given distance cutoff dc, and the range of their packing angles is within the cutoff Θc (Fig. 1b). The same cutoffs dc and Θc were used in our previously developed RB-based simulation

63

. In the RB model, an

association between two proteins or protein subunits needs to meet two criteria: 1) the distance between the functional sites of two molecules dij has to be within the given cutoff dc; and 2) to capture the orientational constraint of binding, the packing angles of the two molecules need to fall within the specific range Θc. In the Cα-based simulations, the initial conformations are confined by the same values of dc and Θc so that the intrinsic rate of association rass for the RB model can be appropriately estimated under the same standard. The second modification from the Cα-based simulation of 2.2 is the removal of the periodic boundary condition. Starting from the initial conformation, two molecules either formed an encounter complex or separated far away from each other by the end of each simulation trajectory. The rass is an intrinsic property between two individual 8 ACS Paragon Plus Environment

Page 9 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

proteins; it is concentration independent. Therefore, the application of a periodic boundary condition representing the concentration information is not valid. Finally, the last modification was made by fixing the length of simulation time to ∆tRB. The ∆tRB is the simulation time step of the RB model. By the definition of rass and ∆tRB, between the two molecules that meet the binding criteria, an association can occur at the probability of

rass×∆tRB within each time step of RB simulation. To estimate the value of rass, Cα-based simulations should be carried out with the same time scale. Consequently, each trajectory of Cα-based simulation consists of n* steps so that the total length of simulation time for each trajectory satisfies ∆tRB=n*×∆t, in which ∆t is the time step of Cα-based simulation. We generated a large number of trajectories by the modified Cα-based simulation described above. Within each trajectory, the two proteins randomly diffused with each other after the initial conformation, without the periodic boundary condition. The Go-like potential with the energy parameter µ derived from the last part that fit the experimentally measured kon was fixed to guide the simulations. After all simulation trajectories were completed, we counted the frequency ρ* that two proteins successfully entered into encounter complexes among these trajectories. Therefore, starting from an initial conformation in which two interacting proteins were placed within the binding criteria of the RB model, the Cα-based simulation gives the probability of association ρ* within the time length ∆tRB. Based on the definition of the RB model, rass was calculated as rass= ρ*/n*×∆t= ρ*/∆tRB. The calculated rass was then fed into the RB-based simulations to guide protein interactions or complex assembly in a system that contains a large number of molecules, as will be described in the next part. The stepwise procedure of deriving the rass can be summarized by the following algorithmic flowchart: i.

Input the energy parameter µ from 2.2.

ii.

Generate Nt trajectories of modified Cα-based simulation, each of which contains n*= ∆tRB /∆t steps. For a specific trajectory: ii.a.

Construct the initial conformation by the given cutoffs dc and Θc; set the status of complex formation equal to 0.

ii.b.

Perform diffusions for each molecule separately. 9 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 10 of 45

Yinghao Wu

ii.c.

Update the new conformation by calculating the energy using Equation (1) and energy parameter µ.

ii.d.

Check the new conformation; if it satisfies the binding criteria, set the status of complex formation equal to 1 and go to ii.g.

ii.e.

Update the simulation time with ∆t.

ii.f.

If the total length of simulation time equals ∆tRB, go to ii.g; otherwise, go to ii.b.

ii.g. iii.

Exit the current simulation trajectory.

Determine the value of ρ* by counting the number of trajectories at which the status of complex formation equals 1.

iv.

Set rass=ρ*/∆tRB. Output the value of rass to 2.4.

2.4. An RB-based diffusion reaction algorithm of complex assembling simulations Our previously developed diffusion-reaction simulation model 63 was extended to study the kinetic process of complex assembly. In this model, each molecule was simplified as a spherical rigid body with a given radius. We simulated the kinetic and thermodynamic properties of binary molecular interactions in different cellular environments. We placed only one functional site on the surface of each rigid body to delineate the binding interface between two interaction proteins because in the original version of the RB model, each molecule had only one binding partner. Binding occurs when the corresponding functional sites of two rigid bodies are closer than a predefined distance cutoff and when the two functional sites are oriented properly. However, in a protein complex, each subunit can have more than one binding partner. Therefore, in this study, the RB model has been upgraded by adding multiple binding sites to the surface of each rigid body. The spatial assignment of each binding site depends on the quaternary arrangement of a specific complex. Details about the binding site assignment for each testing complex will be described in the results.

10 ACS Paragon Plus Environment

Page 11 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

Based on the binding site assignment for each subunit in a studied complex, simulations are started from an initial configuration, in which a large number of copies for all species of subunits are randomly distributed in a 3D simulation box. The number of subunits and the size of simulation box in the initial set-up are given by the concentrations and the stoichiometry of the protein complex in study. Followed by the initial conformation, the system is evolved by a diffusion-reaction process. Each simulation time step in this process consists of two scenarios. In the first scenario, molecules are chosen to undergo random diffusions. The acceptance rate of diffusion movements for each molecule is determined by its corresponding diffusion coefficients. If a final complex or a partial transient complex is formed during the process of assembly, all subunits in it will move together, with a relatively smaller value of the diffusion coefficient. The second scenario is to simulate the reaction kinetics of the system. The scenario is divided into association and dissociation parts. In the association part, any pair of subunits that fulfill the distance and orientational binding criteria will associate together with the probability Pon=rass×∆tRB, in which rass is the intrinsic rate of association that has been calculated by Cα-based simulation, as introduced in 2.2 and 2.3. The binding criteria were defined previously in

63

by the distance and orientational

cutoffs dc and Θc, respectively (Fig. 1c). In contrast, any associated pair of subunits will break into separate monomers followed by a calculated probability Poff=C0×kon×e∆G

×∆tRB, in which kon and ∆G are the experimentally derived on rate and binding affinity,

respectively, and C0 is the standard unit of concentration. After both scenarios are sequentially performed in the corresponding time step, the new configuration is updated. Finally, as the above process is iterated, the kinetics of the system evolve in both Cartesian and constitutional spaces.

3. Results 3.1. Validate the multiscale modeling framework by a case study of protein dimerization To test whether our modeling framework is able to capture the kinetic properties of protein association, we first applied the method to study the dimerization of a specific protein complex. In detail, the complex between CD8-α and the human MHC molecule HLA-A2 (PDB id 1AKJ) is used as a test system69. Both the kinetic rate kon and 11 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 12 of 45

Yinghao Wu

dissociation constant Kd of its dimerization have been experimentally measured70. The kon equals 1.4×105 M-1s-1, and the Kd is 126 µM (corresponding to a binding affinity ∆G of 9.0 RT). To computationally validate these experimental values, Cα-based KMC was first used to optimize the energy parameter µ in the GO-like potential. Cα-based structural models of CD8 and HLA-A2 were randomly placed in a cubic box of 10 nm in length. Freely translational and rotational diffusions were applied to both molecules. The theoretically estimated translational diffusion constants for CD8 and HLA-A2 are 9.1×107 nm2/s and 7.7×107 nm2/s, respectively. Their estimated rotational diffusion constants are 2.5×106 rad2/s and 4.0×105 rad2/s, respectively. As described in the methods, the diffusions are guided by the energy parameter µ, which is adjustable to mediate the constraints from native contacts. We changed different values of µ from 0 to 2.0 kT. For each testing value, 103 simulation trajectories were carried out. Each trajectory contained 104 simulation steps with a time step of 1.0 ns. At the end of each trajectory, we checked whether an encounter complex was successfully formed. The criteria of forming an encounter complex are as follows: the distances of more than 50% native contacts are below the cutoff value. Given these simulation parameters, the kon can be calculated from the frequency of forming an encounter complex based on Equation (3). For instance, if only one encounter complex was observed among all 103 trajectories, the calculated kon is 6.0×104 M-1s-1. The simulation results of calculated kon for all tested µ are plotted in Fig.

2. The figure shows that the rates are higher at larger values of µ. Moreover, when µ equals 0.25 kT, the calculated kon is 1.2×105 M-1s-1, which is the closest to the experimental value. Therefore, the value of µ was fixed at 0.25 kT in the following studies to evaluate the intrinsic rate of association rass between RB models of CD8 and HLA-A2. In the RB model, if the distance between the functional sites of two proteins is smaller than dc and their packing angles from the native complex are smaller than Θc, their probability of dimer association is rass within the next time step ∆tRB. Following the definition of rass, 103 trajectories of Cα-based simulations were generated for a second time. In the beginning of each trajectory, CD8 and HLA-A2 were placed in a random configuration in which the distance between their binding interfaces is smaller than the given value of dc, and their orientation from the native complex is smaller than Θc. Biased 12 ACS Paragon Plus Environment

Page 13 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

diffusions were followed by the initial configuration with the same diffusion constants, guided by the GO-like potential with the energy parameter 0.25 kT. Each trajectory contained 102 simulation steps with a time step of 0.1 ns so that the total length of each trajectory is 10 ns, which equals the length of a simulation step in RB-based model. The value of rass was then calculated by counting how frequently an encounter complex was formed with the same criteria. The calculated rass will change with different cutoff values of dc and Θc. However, these cutoff values and resultant rass are covariant so that different choices will not affect the kinetics of the system. To verify this statement, three sets of simulations were carried out with different values of dc and Θc. In the first simulation, when dc equaled 10 Angstroms and Θc equaled 30 Degrees, we found 65 encounter complexes among 103 trajectories. Therefore, the calculated rass is 0.0065 ns-1. Similarly, we found 4 encounter complexes when dc equaled 25 Angstroms and Θc equaled 30 Degrees in the second simulation and found 18 encounter complexes when dc equaled 10 Angstroms and Θc equaled 45 Degrees in the third simulation. All three of the calculated

rass, together with their corresponding dc and Θc, were further used as input parameters in the following RB-based simulations, in which 200 RB models of CD8 and HLA-A2 were initially distributed in a box of 100 nm in length. As shown in Fig. 3a, in all three simulations, with the given parameters, the number of formed dimers increased with time and stopped growing after the system reached equilibrium. Using the same values of molecular concentration, binding affinity and rate constants, the correlation between time and the number of dimers can be solved analytically 63, as shown by the dashed curve in Fig. 3a. This figure illustrates that simulations with different cutoff parameters lead to similar kinetic profiles, indicating that the association rate depends on only the energy parameter but is not sensitive to the choice of the specific cutoff values. This result reflects the robustness of the multiscale simulation model. Moreover, the good agreement between the analytic solution and our numerical simulation indicates the accuracy of our model. We further investigated how binding energy influences the kinetics of dimerization by changing the value of µ. In detail, when µ equals 1.0 kT, the calculated

kon increases to approximately 7.0×105 M-1s-1 (Fig. 2). This larger value of µ was used in the Cα-based simulations, in which the initial configurations of CD8 and HLA-A2 were 13 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 14 of 45

Yinghao Wu

constrained in a range in which dc equaled 10 Angstroms and Θc equaled 30 Degrees. Consequently, 180 encounter complexes were observed among 103 trajectories, resulting in a rass of 0.018 ns-1. Without changing other parameters, we substituted the rass with a new value in the RB-based simulation. The kinetic profiles of the RB simulation with different values of µ are plotted in Fig. 3b. Compared to the original value (0.25 kT) that matches the experimental data (1.4×105 M-1s-1), this larger µ gives a higher value of kon (7.0×105 M-1s-1) and drives the system to reach equilibrium more quickly, as shown by the read solid curve in the figure. However, because their binding affinities are the same (9 RT), the numbers of complexes are very close in both simulations after they reached equilibrium, indicating that the association rate affects only the kinetics and not the thermodynamics of the system. Finally, it is worth mentioning that when µ equals 1.0 kT, our simulation profile of dimerization kinetics is slightly slower than the analytic solution (red dashed curve in Fig. 3b). We speculate that this discrepancy resulted from the unphysical nature of the GO-like potential. As we will mention in the discussion, the Golike potential cannot capture the long-range interactions that play a dominant role in the rapid association of proteins. Therefore, the accuracy of kinetics for higher values of kon is more affected. Nevertheless, our results demonstrate that the kinetics of protein interactions can be quantitatively estimated by our multiscale simulation framework. This framework bridges kon (the macroscopic quantity that is experimentally measurable) with

µ (the atomic properties of molecular interactions) by linking simulations between Cαbased and lower-resolution levels. This strategy will be used in the following sections to simulate the assembly of different protein complexes.

3.2. Simulating the assembly kinetics of a homogenous trimer Different from protein dimerization, in which each molecule has only one binding interface, subunits in a protein complex might contain multiple binding sites. The manybody interactions among these subunits will lead to more complicated dynamics during their oligomerization. Therefore, based on the validation of the multiscale modeling framework in the last section, we further applied the simulation methods to two typical types of protein oligomers. The first type is a homo-trimer that is described in this section. Specifically, phospholipase A2 (PLA2) (PDB id 1A3F) is used as a test system71. The 14 ACS Paragon Plus Environment

Page 15 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

crystal structure of PLA2 complex contains three identical subunits in C3 symmetry (Fig. 4a). There are three binding interfaces in the complex that are exactly same. Each interface is shared by a pair of subunits. Consequently, this homo-trimer contains only one type of interface. Because it is difficult to experimentally measure the binding parameters of binary interactions in a protein oligomer, currently available computational software was used to estimate the values of these constants. For instance, the kon of binding between two subunits in the trimer equals 3.36×105 M-1s-1, which was calculated by TransComp72, and the calculated binding affinity equals 8.5 kT, which was generated from PPEPred73. Using these parameters as inputs, Cα-based KMC simulations were carried out to estimate the rass. As shown in Fig. 4c, the structural models of two subunits were placed in a cubic box of 10 nm in length. After 103 simulation trajectories, we found that when µ equals 0.05 kT, the calculated kon is the closest to the value that generated from TransComp. Afterwards, 103 trajectories of Cα-based simulations were generated a second time. In the beginning of each trajectory, two subunits of the trimer were placed in a random configuration in which the distance between their binding interfaces was smaller than 10 Angstroms, and their orientation from the native The total length of each trajectory was 10 ns, which equals the time step of the follow-up RB-based simulation. As a result, we found 60 encounter complexes among 103 trajectories, and the calculated

rass was 0.006 ns-1. Following the determination of rass, simulations of trimer assembly were tested by an improved version of the RB-based model. In the new model, each subunit in a trimer is represented by a rigid body whose radius is 2 nm (Fig. 4b). Two different binding sites are assigned to the surface of each rigid body. One binding site is called the donor site (white), and the other site is called the acceptor site (gray). These two sites form a packing angle of 60° with the center of the corresponding rigid body. Binding occurs only when the donor site on one subunit is close to the acceptor site on another subunit. The binding criteria and probability have been determined by previous Cα-based simulations. To test the kinetics of assembly, 100 RB models of PLA2 subunits were initially distributed in a box of 50 nm in length. Following the initial conformation, the system was evolved by the diffusion-reaction algorithm. If trimers or dimers were formed among subunits, they moved together as an entity until dissociated or encountered with 15 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 16 of 45

Yinghao Wu

additional subunits. Fig. 4d shows a snapshot taken from the simulation trajectory, in which each subunit is plotted as a blue sphere. Trimers that share the similar quaternary arrangement as the crystal structure can be found in the figure, as highlighted by the dashed circle. In addition to the trimers, we also found a number of dimers and monomers that were not assembled together in the system. More quantitative information about assembly kinetics was further provided in Fig. 5b. In the figure, the numbers of both the dimers and trimers that formed during the simulation time were plotted. The figure shows that the number of dimers (black curve) increased at the beginning of the simulation but decreased to a much lower level later. Along with the decrease in dimers, the number of trimers monotonically increased by a much slower kinetics. At the end of the simulation, approximately 30 trimers were formed, indicating that almost all subunits were oligomerized. These results suggest that trimers are assembled through a two-step process. Subunits are first associated into dimers, which serve as transient intermediates. These relatively rapid kinetics are followed by a second step in which a trimer is finally formed by adding an additional subunit to the dimer. To further differ the kinetics of trimer assembly from dimerization, a control simulation was carried out. In this simulation, the same number of subunits was included. However, each subunit in this simulation contains only one binding site. In other word, these molecules can form only dimers. Given the same volume of simulation box and the same values of binding constants, the kinetic profile of dimerization is plotted by the red curve in Fig. 5a. Consistent with the analytic solution that is plotted by the black dashed curve, the system of homodimerization reached equilibrium at 0.01 s, and approximately thirty interactions were obtained after equilibrium. In other words, sixty subunits were involved in dimers, and the remaining forty were unbound. In comparison, the kinetic profile of the trimer assembly is plotted by the blue curve. Much more interactions were observed after the system reach equilibrium. Specifically, 90 interactions corresponded to 30 trimers, indicating that only ten subunits were unbound. More importantly, we found that the process of trimer assembly was much slower than that of dimerization. The system reached equilibrium after 0.05 s. However, despite the slower kinetics, the curve of trimer assembly showed smaller fluctuations than did dimerization after equilibrium, indicating that the system of trimers is not only thermodynamically more stable but also 16 ACS Paragon Plus Environment

Page 17 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

more resistant to external noises. These kinetic features reflected from our simulations imply functional significance. Although the initial seeding (the intermediate of dimerization) during complex assembly leads to a slower kinetics, the intrinsic cooperation between subunits makes the final complex more stable than the binary interactions. These properties assure that functions of protein oligomers could exhibit an all-or-none transition only when there is a persistent and high dose of stimulation, which is called a threshold response74. This response is biologically important for cells to remain functional in a stochastic environment.

3.3. Simulating the assembly kinetics of a heterogeneous tetramer Finally, we studied the assembly kinetics of a homo-oligomer. However, a large portion of the protein complexes in cells are heterogeneous that contain different types of subunits. In these hetero-oligomers, more than one type of binding interface simultaneously coexists between various subunits, and the pathways of assembly are suggested to be determined by the properties of these interfaces. One of the simplest examples of hetero-oligomers is a hetero-tetramer that contains two types of subunits (A and B). Fig. 6a shows a schematic in which two subunits A and two subunits B are spatially arranged in the tetramer. As a result, two types of binding interfaces are indicated in the figure. One interface is between subunits A and B, and the other interface is between two subunits of type A. Previous studies have shown that hetero-tetramers with the same quaternary organization as Fig. 6a have very different assembling pathways. For instance, in the tetramer of Tryptophan synthase, a homodimer is formed between subunits A in the initial stage of assembly75, whereas in the tetramer of Nitrile hydratase, a heterodimer is formed between subunits A and B in the beginning

76

. To

understand the general principles of assembling dynamics for hetero-oligomers, we did not focus on one specific protein system, but rather used the hetero-tetramer A2B2 as a testing template and changed the binding constants of both the AB and AA interfaces within a wide range. As in the initial configuration, 100 RB models of subunit A and 100 RB models of subunit B were initially distributed in a box of 100 nm in length. The radius of the rigid body for subunit A was 5 nm, and radius of the rigid body for subunit B was 2.5 nm. 17 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 18 of 45

Yinghao Wu

The rigid body surface of each subunit B contains one binding site that can interact with a subunit A. In contrast, each subunit A contains two binding sites. One site is for homodimerization, and the other site is for hetero-dimerization. The binding criteria of AA and AB interactions were determined by the cutoffs dc (2.5 nm) and Θc (30 Degrees). Fig. 6b shows a snapshot taken from the middle of a simulation trajectory, in which each subunit A is plotted as a blue sphere and each subunit B is plotted as a yellow sphere of smaller size. Tetramers that share the same quaternary arrangement as Fig. 6a were obtained in the simulation, as highlighted by the red dashed circle. In addition to the tetramer, we also found a number of partial complexes that have not been fully assembled in the system. Additional numerical experiments were carried out to test the details of assembly dynamics. In the first experiment, three simulation scenarios were designed to study the relationship between different binding interfaces. In the first scenario, the affinity between two subunits A equals 0. The only binding is between subunits A and B. In the second scenario, the affinity between subunits A and B equals 0. The only binding is between two subunits A. Finally, in the third scenario, both homogeneous and heterogeneous binding are turned on. The numbers of AA and AB interactions after simulations of all three scenarios are recorded in Fig. 6c. As shown in the figure, few AA interactions were formed in the first scenario due to the lack of AA binding affinity (slashed bar). Similarly, few AB interactions were formed in the second scenario due to the lack of AB binding affinity (grey bar). More AB interactions were formed than AA interactions because AB interactions occurred between 100 subunits A and 100 subunits B, whereas AA interactions occurred only among the 100 subunits A. When the AA and AB binding affinities were turned on simultaneously in the third scenario, both homogeneous and heterogeneous interactions were obtained in the system (black bar). Interestingly, the number of AB interactions that formed in the third scenario was greater than the number of AB interactions that formed in the first scenario, and the number of AA interactions was greater than that in the second scenario, although they shared the same molecular concentrations and values of binding affinities. This result indicates a synergistic binding between two types of interfaces. Without allosteric regulation, one type of binding can strengthen the binding of another type in a complex. We propose the following mechanism to explain this cooperation. Given both AA and AB binding 18 ACS Paragon Plus Environment

Page 19 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

affinities, partial complexes can be formed either between two subunits A or between subunits A and B. All subunits in these intermediate complexes move together with lower diffusion constants. This loss of mobility facilitates the binding of additional subunits to the intermediate complexes. To systematically investigate the kinetic properties of the assembly process, we further changed the association rates of two binding interfaces under different values of binding affinities. In principle, assembly can be affected by four parameters: the affinity and rate of AA interaction (∆GAA and rass(AA)) or the affinity and rate of AB interaction (∆GAB and rass(AB)). To narrow down the parameter space, both association rate and binding affinity were discretized into two values. In detail, -7 kT was used to represent a weak binding, and -13 kT was used to represent a strong binding. Similarly, a fast association was indicated by setting rass=1.0 ns-1, and a slow association was indicated by setting rass=0.01 ns-1. Consequently, there were four types of binding for each AA and AB interaction: fast/weak; fast/strong; slow/weak and slow/strong. All possible assembly pathways of a tetramer with both AA and AB interfaces therefore can be described in a simplified way by 4×4=16 sets of parameters. We fixed the binding affinity at -13 kT to discuss the behaviors of assembly kinetics under different values of association rates (Fig. 7). The kinetic profiles of assembly for all parameter combinations are plotted in supplemental Fig. S1. Fig. 7 indicates that different association rates between subunits lead to very diverse assembly pathways, even though the binding affinities are unchanged. For instance, when AB association is slow and AA association is fast (Fig. 7a), the number of AA interactions (red curve) initially increases but decreases later. This result indicates that the first step during the assembly pathway is the formation of AA homodimers. In contrast, when the AB association is fast and the AA association is slow (Fig. 7d), the number of AB interactions (black curve) decreases after peaking in the beginning. This result suggests an alternative pathway of assembly in which the first step is the formation of AB heterodimers. These results are supported by the examples of Tryptophan synthase and Nitrile hydratase that we introduced earlier. Finally, the assembly kinetics when both AB and AA association rates are high (Fig. 7b) is much faster than the kinetics when both rates are low (Fig. 7d). Taken together, our simulations demonstrate that given the binding properties between individual subunits, we are able to 19 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 20 of 45

Yinghao Wu

capture the assembly landscape of a complex with specific quaternary structural organization.

4. Concluding Discussion Oligomerization is a ubiquitous mechanism by which proteins carry out their diverse functions in cells. Recent studies have shown a relationship between the assembly pathway and the evolution of protein complexes76-77. Therefore, an ordered assembly process is an important strategy that nature has developed to maintain a proper conformations of protein complexes. Accordingly, our research focus has started to move beyond the static representation to the dynamic aspect of protein quaternary structures. Computational simulations provide unique contributions during this process. However, high-resolution methods based on molecular dynamic simulations can hardly be used to study the fully assembly processes of large protein complexes because of the limited molecular size and time scale that these methods are able to reach. On the other hand, low-resolution models fail to provide a quantitative description of the structure and energetics of protein complexes. Faced with this dilemma, we developed a multi-scale model that combines low-resolution and high-resolution simulation methods so that the protein complex assembly can be studied with both binding details and computational efficiency. Practically, the association rate between two proteins was calculated by structure-based modeling. Using this rate as an input parameter, a low-resolution RBbased model was used to simulate the association among a large number of proteins. The kinetic profile that was generated from RB simulations fit the analytic solution, demonstrating the accuracy of our multi-scale model. The assembly processes of specific protein complexes were further tested by designing the stoichiometry, multiple binding interfaces and binding rates in the RB model. Consistent with previous experimental observations, our simulations indicated very different kinetics between protein oligomerization and dimerization. The formation of protein oligomers is a multi-step process that is much slower than dimerization but thermodynamically more stable. Moreover, we showed that even the same protein quaternary structure can have very diverse assembly pathways under different binding constants between subunits, which is important for regulating the functions of protein complexes. Finally, we revealed that the 20 ACS Paragon Plus Environment

Page 21 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

binding between subunits in a complex can be synergistically strengthened during assembly without considering allosteric regulation or conformational changes. The mechanism of this cooperation was speculated. The future extension of our method will overcome the current limitations of the work that is presented in this paper. Specifically, the 20 amino acids have sidechains with different hydrophilic and hydrophobic properties. These properties cannot be reflected in our coarse-grained structural model of protein complexes, in which each residue is represented by a Cα atom. Fortunately, this simplified representation of amino acids did not greatly affect the results of CG simulations because we used the Go-like potential to describe the interaction between two interaction proteins or protein subunits. The Go-like potential that was used in the target function of CG simulations is not based on physical properties, but rather is biased towards the native contacts of interacting proteins. One advantage of this native-biased energy function is that the rate constants can be modulated by easily adjusting the single parameter in the function. However, the Go-like potential did not capture the long-range interactions between two proteins, for instance, the electrostatic interaction. The long-range electrostatic interactions play a dominant role in the fast association of proteins whose rates can be greater than 107 M-1s-1 78. Therefore, the current model can be applied only to proteins whose associations are mainly mediated by diffusion. Future studies will implement a set of physically based energy functions to simulate protein binding processes more realistically. Additionally, the intramolecular degrees of freedom of each protein subunit were fixed in the RB-based simulation. In other words, the conformational changes were not considered during complex assembly. However, previous studies have illustrated that conformational changes are important in protein complex assembly79. Specifically, the flexibility of protein subunits was found to facilitate quaternary structural assembly. Allosteric modulation has become an emerging approach to control protein oligomerization in drug design. Although the effect of conformational flexibility is not reflected by the RB model, it can be estimated by our CG simulation. For instance, the elastic network model (ENM)80 will be integrated into the current model of CG simulation so that protein conformations can be changed during association. Finally, many protein complexes consist of multi-domain subunits. The structure of a multi-domain protein subunit cannot be simply coarse-grained by a rigid 21 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 45

2016-01-06

Yinghao Wu

body. Therefore, in the future, the current RB model will be improved by a new representation in which each multi-domain protein subunit will be modeled by a chain of multiple rigid bodies. Each rigid body corresponds to one structural domain in the subunit. Based on these further improvements, our model can potentially be applied to study a number of different problems. First, the approach can be used to understand the assembling pathway of specific molecular systems. One example is the assembly of death domain (DD)-induced signaling complexes. DD-based protein aggregations are observed during the processes of many cell signaling pathways. For instance, in the NF-κB pathway, MyD88, IRAK4 and IRK1/2 form a signaling platform called Myddosome

4

through interactions between their DDs. The crystal structure of the Myddosome complex has recently been derived. The complex possesses of a single stranded left-handed helical symmetry, comprising 6 MyD88, 4 IRAK4 and 4 IRAK2 DD subunits. However, the detailed kinetics of this complex assembly are unknown. Based on the spatial arrangement of molecular compositions and the interactions between subunits as revealed by the crystal structure, it was speculated that the assembly is hierarchical, starting from six MyD88s, continuing with four IRAK4s and ending with four IRAK1/2s. This proposed hierarchical assembly mechanism can be tested by our multiscale model in the future. Furthermore, the maturation of our model will allow us to predict the assembly pathway for any given protein complex and how mutations at the binding interface between two subunits change the original pathway. Consequently, this method can be used to explore the correlation between assembly pathways and the evolution of protein quaternary structures. Previous studies have indicated that the assembly of protein complexes intimately reflects their evolutionary information

76-77

. By applying our

multiscale simulations to a set of protein complexes whose subunits are evolutionarily conserved but whose quaternary structural arrangements are different, we will be able to directly map the assembly order of these complexes to the pathway of their evolution. This mapping will further increase the potential of our method in the design of new protein oligomers for biomedical applications.

22 ACS Paragon Plus Environment

Page 23 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

Acknowledgements This work was supported by a start-up grant from the Albert Einstein College of Medicine. Computational support was provided by the Albert Einstein College of Medicine High Performance Computing Center.

Supporting Information The simulation results of assembly kinetics under all 16 combinations of binding parameters of the tested heterogeneous tetramer

23 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 24 of 45

Yinghao Wu

Figure Legends Figure 1: In the schematic of our multiscale simulation framework, we adjusted the energy parameter to reproduce the experimentally measured or computationally predicted value of

kon (a). Given the same energy parameter, a different set of Cα-based CG simulations was then used to estimate the rate of association rass (b). Finally, the derived rass, together with the diffusion constants and binding affinity of interacting proteins, was used to guide the RBbased simulation (c) which contains a large number of molecules in the system. The model has been extended to simulate assembly of protein oligomers (d). The rates of association between each different pairs of subunits in an oligomer are calculated individually by the same process. All derived rates are integrated into the RB simulation, in which multiple copies of all subunit types can be assembled together.

Figure 2: The complex between CD8-α and HLA-A2 (PDB id 1AKJ) is used as a test system to validate the multiscale model. The association between CD8 and HLA-A2 in Cα-based simulations is guided by the Go-like potential, in which an energy parameter, µ, was adjusted to mediate the constraints from native contacts. We changed different values of µ from 0 to 2.0 kT. The calculated kons are higher at larger values of µ. Moreover, when µ equals 0.25 kT, the calculated kon is 1.2×105 M-1s-1, which is the closest to the experimental value.

Figure 3: Three sets of simulations were carried out with different values of dc and Θc (a). These simulations led to similar kinetic profiles, indicating that association rate depends on only the energy parameter but not the choice of specific cutoff values. Moreover, the good agreement between the analytic solution and our numerical simulation indicates the accuracy of our model. Furthermore, a larger value of µ was used to investigate how binding energy influences the kinetics of dimerization (b). Compared with the original value (0.25 kT) that matches the experimental data (1.4×105 M-1s-1), this larger µ gives a higher value of kon (7.0×105 M-1s-1) and drives the system to reach equilibrium more quickly.

Figure 4: The phospholipase A2 (PLA2) (PDB id 1A3F) (a) is used as a test system to simulate the assembly kinetics of homo-oligomers. Each subunit in this trimer is represented by a rigid body. Two different binding sites are assigned on the surface of each rigid body (b). Cα-based KMC simulations were first carried out to estimate the rass between two subunits in the trimer (c). Simulations of trimer assembly were further tested by RB-based model. A

24 ACS Paragon Plus Environment

Page 25 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

snapshot is taken from the simulation trajectory (d). Trimers that share the similar quaternary arrangement as the crystal structure are highlighted in the figure by the dashed circle.

Figure 5: We show that the kinetics of trimer assembly differs largely from the kinetics dimerization (a). The process of trimer assembly is much slower than dimerization. However, the curve of trimer assembly shows smaller fluctuations than dimerization after equilibrium, indicating that the system of trimers is not only thermodynamically more stable but also more resistant to external noises. Moreover, we show that during the process of trimer assembly, the number of dimers rises at the beginning of the simulation, but reduces to a much lower level later (b). Along with the decrease of dimers, the number of trimers monotonically increases by a much slower kinetics. These results suggest that trimers are assembled through a two-step process.

Figure 6: A tetramer that includes two subunits A and two subunits B (a) was used as a testing template to simulate the assembly of hetero-oligomers. A snapshot is taken from the trajectory of RB simulations (b), in which each subunit A is plotted as a blue sphere and each subunit B is plotted as a yellow sphere with smaller size. One derived hetero-tetramers is highlighted by the red dashed circle. Three simulation scenarios were first designed to study the relationship between different binding interfaces. In the first scenario, the affinity between two subunits A equals 0. The only binding is between subunits A and B. In the second scenario, the affinity between subunits A and B equals 0. The only binding is between two subunits A. Finally, in the third scenario, both homogeneous and heterogeneous binding are turned on. The numbers of AA and AB interactions after simulations of all three scenarios are plotted as histogram in (c).

Figure 7: We fixed the binding affinity to test the kinetics of assembly under different values of association rates. The kinetic profiles of complex assembly when AB association is slow and AA association is fast are plotted in (a). The kinetic profiles of complex assembly when both AB and AA associations are fast are plotted in (b). The kinetic profiles of complex assembly when both AB and AA associations are slow are plotted in (c). Finally, the kinetic profiles of complex assembly when AB association is fast and AA association is slow are plotted in (d). The figures show that different association rates between subunits lead to very diverse assembly pathways, even though the binding affinities are unchanged.

25 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 26 of 45

Yinghao Wu

References (1) Ali, M. H.; Imperiali, B., Protein Oligomerization: How and Why. Bioorg Med Chem 2005, 13, 5013-5020. (2) Levy, E. D.; Teichmann, S., Structural, Evolutionary, and Assembly Principles of Protein Oligomerization. Prog Mol Biol Transl Sci 2013, 117, 25-51. (3) Kagan, J. C.; Magupalli, V. G.; Wu, H., Smocs: Supramolecular Organizing Centres That Control Innate Immunity. Nat Rev Immunol 2014, 14, 821-826. (4) Motshwene, P. G.; Moncrieffe, M. C.; Grossmann, J. G.; Kao, C.; Ayaluru, M.; Sandercock, A. M.; Robinson, C. V.; Latz, E.; Gay, N. J., An Oligomeric Signaling Platform Formed by the TollLike Receptor Signal Transducers Myd88 and Irak-4. J Biol Chem 2009, 284, 25404-25411. (5) Tan, K.; Shlomi, T.; Feizi, H.; Ideker, T.; Sharan, R., Transcriptional Regulation of Protein Complexes within and across Species. Proc Natl Acad Sci U S A 2007, 104, 1283-1288. (6) Diez, D.; Hutchins, A. P.; Miranda-Saavedra, D., Systematic Identification of Transcriptional Regulatory Modules from Protein-Protein Interaction Networks. Nucleic Acids Res 2014, 42, e6. (7) Nakamoto, R. K.; Baylis Scanlon, J. A.; Al-Shawi, M. K., The Rotary Mechanism of the Atp Synthase. Arch Biochem Biophys 2008, 476, 43-50. (8) Hurwitz, J., The Discovery of Rna Polymerase. J Biol Chem 2005, 280, 42477-42485. (9) Schliwa, M.; Woehlke, G., Molecular Motors. Nature 2003, 422, 759-765. (10) Brooijmans, N.; Sharp, K. A.; Kuntz, I. D., Stability of Macromolecular Complexes. Proteins 2002, 48, 645-653. (11) Edelstein, S. J., Cooperative Interactions of Hemoglobin. Annu Rev Biochem 1975, 44, 209-232. (12) Ellis, R. J., Protein Misassembly: Macromolecular Crowding and Molecular Chaperones. Adv Exp Med Biol 2007, 594, 1-13. (13) Raker, V. A.; Plessel, G.; Luhrmann, R., The Snrnp Core Assembly Pathway: Identification of Stable Core Protein Heteromeric Complexes and an Snrnp Subcore Particle in Vitro. Embo j 1996, 15, 2256-2269. (14) Gallastegui, N.; Groll, M., The 26s Proteasome: Assembly and Function of a Destructive Machine. Trends Biochem Sci 2010, 35, 634-642. (15) Baldick, C. J., Jr.; Cassetti, M. C.; Harris, N.; Moss, B., Ordered Assembly of a Functional Preinitiation Transcription Complex, Containing Vaccinia Virus Early Transcription Factor and Rna Polymerase, on an Immobilized Template. J Virol 1994, 68, 6052-6056. (16) Ghirlando, R., The Analysis of Macromolecular Interactions by Sedimentation Equilibrium. Methods 2011, 54, 145-156. (17) Some, D., Light-Scattering-Based Analysis of Biomolecular Interactions. Biophys Rev 2013, 5, 147-158. (18) Walters, K. J.; Ferentz, A. E.; Hare, B. J.; Hidalgo, P.; Jasanoff, A.; Matsuo, H.; Wagner, G., Characterizing Protein-Protein Complexes and Oligomers by Nuclear Magnetic Resonance Spectroscopy. Methods Enzymol 2001, 339, 238-258. (19) Velazquez-Campoy, A.; Leavitt, S. A.; Freire, E., Characterization of Protein-Protein Interactions by Isothermal Titration Calorimetry. Methods Mol Biol 2015, 1278, 183-204. (20) Hernandez, H.; Robinson, C. V., Determining the Stoichiometry and Interactions of Macromolecular Assemblies from Mass Spectrometry. Nat Protoc 2007, 2, 715-726. (21) Gell, D. A.; Grant, R. P.; Mackay, J. P., The Detection and Quantitation of Protein Oligomerization. Adv Exp Med Biol 2012, 747, 19-41. 26 ACS Paragon Plus Environment

Page 27 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

(22) Fadouloglou, V. E.; Kokkinidis, M.; Glykos, N. M., Determination of Protein Oligomerization State: Two Approaches Based on Glutaraldehyde Crosslinking. Anal Biochem 2008, 373, 404-406. (23) Piston, D. W.; Kremers, G. J., Fluorescent Protein Fret: The Good, the Bad and the Ugly. Trends Biochem Sci 2007, 32, 407-414. (24) Sprague, B. L.; McNally, J. G., Frap Analysis of Binding: Proper and Fitting. Trends Cell Biol 2005, 15, 84-91. (25) Nakabayashi, J.; Sasaki, A., A Mathematical Model for Apoptosome Assembly: The Optimal Cytochrome C/Apaf-1 Ratio. J Theor Biol 2006, 242, 280-287. (26) Wieczorek, G.; Zielenkiewicz, P., Influence of Macromolecular Crowding on ProteinProtein Association Rates--a Brownian Dynamics Study. Biophysical journal 2008, 95, 5030-5036. (27) Ermakova, E., Lysozyme Dimerization: Brownian Dynamics Simulation. Journal of molecular modeling 2005, 12, 34-41. (28) Forlemu, N. Y.; Njabon, E. N.; Carlson, K. L.; Schmidt, E. S.; Waingeh, V. F.; Thomasson, K. A., Ionic Strength Dependence of F-Actin and Glycolytic Enzyme Associations: A Brownian Dynamics Simulations Approach. Proteins 2011, 79, 2813-2827. (29) Long, H.; Chang, C. H.; King, P. W.; Ghirardi, M. L.; Kim, K., Brownian Dynamics and Molecular Dynamics Study of the Association between Hydrogenase and Ferredoxin from Chlamydomonas Reinhardtii. Biophysical journal 2008, 95, 3753-3766. (30) Frembgen-Kesner, T.; Elcock, A. H., Absolute Protein-Protein Association Rate Constants from Flexible, Coarse-Grained Brownian Dynamics Simulations: The Role of Intermolecular Hydrodynamic Interactions in Barnase-Barstar Association. Biophysical journal 2010, 99, L75-77. (31) Zimmer, M. J.; Geyer, T., Do We Have to Explicitly Model the Ions in Brownian Dynamics Simulations of Proteins? The Journal of chemical physics 2012, 136, 125102. (32) Dlugosz, M.; Huber, G. A.; McCammon, J. A.; Trylska, J., Brownian Dynamics Study of the Association between the 70s Ribosome and Elongation Factor G. Biopolymers 2011, 95, 616-627. (33) Huber, G. A.; Kim, S., Weighted-Ensemble Brownian Dynamics Simulations for Protein Association Reactions. Biophysical journal 1996, 70, 97-110. (34) Rojnuckarin, A.; Livesay, D. R.; Subramaniam, S., Bimolecular Reaction Simulation Using Weighted Ensemble Brownian Dynamics and the University of Houston Brownian Dynamics Program. Biophysical journal 2000, 79, 686-693. (35) Zou, G.; Skeel, R. D., Robust Biased Brownian Dynamics for Rate Constant Calculation. Biophysical journal 2003, 85, 2147-2157. (36) Zhou, H. X., Brownian Dynamics Study of the Influences of Electrostatic Interaction and Diffusion on Protein-Protein Association Kinetics. Biophysical journal 1993, 64, 1711-1726. (37) Northrup, S. H.; Erickson, H. P., Kinetics of Protein-Protein Association Explained by Brownian Dynamics Computer Simulation. Proceedings of the National Academy of Sciences of the United States of America 1992, 89, 3338-3342. (38) Merlitz, H.; Rippe, K.; Klenin, K. V.; Langowski, J., Looping Dynamics of Linear DNA Molecules and the Effect of DNA Curvature: A Study by Brownian Dynamics Simulation. Biophysical journal 1998, 74, 773-779. (39) Mereghetti, P.; Gabdoulline, R. R.; Wade, R. C., Brownian Dynamics Simulation of Protein Solutions: Structural and Dynamical Properties. Biophysical journal 2010, 99, 3782-3791. (40) Lin, J.; Beratan, D. N., Simulation of Electron Transfer between Cytochrome C2 and the Bacterial Photosynthetic Reaction Center: Brownian Dynamics Analysis of the Native Proteins and Double Mutants. The journal of physical chemistry. B 2005, 109, 7529-7534.

27 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 28 of 45

Yinghao Wu

(41) De Rienzo, F.; Gabdoulline, R. R.; Menziani, M. C.; De Benedetti, P. G.; Wade, R. C., Electrostatic Analysis and Brownian Dynamics Simulation of the Association of Plastocyanin and Cytochrome F. Biophysical journal 2001, 81, 3090-3104. (42) Haddadian, E. J.; Gross, E. L., A Brownian Dynamics Study of the Interactions of the Luminal Domains of the Cytochrome B6f Complex with Plastocyanin and Cytochrome C6: The Effects of the Rieske Fes Protein on the Interactions. Biophysical journal 2006, 91, 2589-2600. (43) Hattne, J.; Fange, D.; Elf, J., Stochastic Reaction-Diffusion Simulation with Mesord. Bioinformatics 2005, 21, 2923-2924. (44) Ander, M.; Beltrao, P.; Di Ventura, B.; Ferkinghoff-Borg, J.; Foglierini, M.; Kaplan, A.; Lemerle, C.; Tomas-Oliveira, I.; Serrano, L., Smartcell, a Framework to Simulate Cellular Processes That Combines Stochastic Approximation with Diffusion and Localisation: Analysis of Simple Networks. Syst Biol (Stevenage) 2004, 1, 129-138. (45) Rodriguez, J. V.; Kaandorp, J. A.; Dobrzynski, M.; Blom, J. G., Spatial Stochastic Modelling of the Phosphoenolpyruvate-Dependent Phosphotransferase (Pts) Pathway in Escherichia Coli. Bioinformatics 2006, 22, 1895-1901. (46) Stiles, Jr.; Bartol, T. M., Monte Carlo Methods for Simulating Realistic Synaptic Microphysiology Using Mcell. Computational Neuroscience 2001, 87-127. (47) Andrews, S. S.; Bray, D., Stochastic Simulation of Chemical Reactions with Spatial Resolution and Single Molecule Detail. Phys Biol 2004, 1, 137-151. (48) Ridgway, D.; Broderick, G.; Lopez-Campistrous, A.; Ru'aini, M.; Winter, P.; Hamilton, M.; Boulanger, P.; Kovalenko, A.; Ellison, M. J., Coarse-Grained Molecular Simulation of Diffusion and Reaction Kinetics in a Crowded Virtual Cytoplasm. Biophysical Journal 2008, 94, 3748-3759. (49) Frazier, Z.; Alber, F., A Computational Approach to Increase Time Scales in Brownian Dynamics-Based Reaction-Diffusion Modeling. Journal of Computational Biology 2012, 19, 606618. (50) Lee, B.; LeDuc, P. R.; Schwartz, R., Stochastic Off-Lattice Modeling of Molecular SelfAssembly in Crowded Environments by Green's Function Reaction Dynamics. Physical Review E 2008, 78. (51) Noid, W. G.; Chu, J. W.; Ayton, G. S.; Krishna, V.; Izvekov, S.; Voth, G. A.; Das, A.; Andersen, H. C., The Multiscale Coarse-Graining Method. I. A Rigorous Bridge between Atomistic and Coarse-Grained Models. Journal of Chemical Physics 2008, 128, 244114. (52) Ayton, G. S.; Noid, W. G.; Voth, G. A., Multiscale Modeling of Biomolecular Systems: In Serial and in Parallel. Current Opinion in Structural Biology 2007, 17, 192-198. (53) Marrink, S. J.; Risselada, H. J.; Yefimov, S.; Tieleman, D. P.; de Vries, A. H., The Martini Force Field: Coarse Grained Model for Biomolecular Simulations. Journal of Physical Chemistry B 2007, 111, 7812-7824. (54) Ahmed, A.; Gohlke, H., Multiscale Modeling of Macromolecular Conformational Changes Combining Concepts from Rigidity and Elastic Network Theory. Proteins-Structure Function and Bioinformatics 2006, 63, 1038-1051. (55) Wu, Y. H.; Dousis, A. D.; Chen, M. Z.; Li, J. L.; Ma, J. P., Opus-Dom: Applying the FoldingBased Method Vecfold to Determine Protein Domain Boundaries. Journal of Molecular Biology 2009, 385, 1314-1329. (56) Sherwood, P.; Brooks, B. R.; Sansom, M. S. P., Multiscale Methods for Macromolecular Simulations. Current Opinion in Structural Biology 2008, 18, 630-640. (57) Kalli, A. C.; Campbell, I. D.; Sansom, M. S. P., Multiscale Simulations Suggest a Mechanism for Integrin inside-out Activation. Proceedings of the National Academy of Sciences of the United States of America 2011, 108, 11890-11895.

28 ACS Paragon Plus Environment

Page 29 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

(58) Ramis-Conde, I.; Drasdo, D.; Anderson, A. R. A.; Chaplain, M. A. J., Modeling the Influence of the E-Cadherin-Beta-Catenin Pathway in Cancer Cell Invasion: A Multiscale Approach. Biophysical Journal 2008, 95, 155-165. (59) Ramis-Conde, I.; Drasdo, D., From Genotypes to Phenotypes: Classification of the Tumour Profiles for Different Variants of the Cadherin Adhesion Pathway. Physical Biology 2012, 9. (60) Chakrabarti, A.; Verbridge, S.; Stroock, A. D.; Fischbach, C.; Varner, J. D., Multiscale Models of Breast Cancer Progression. Annals of Biomedical Engineering 2012, 40, 2488-2500. (61) Krobath, H.; Rozycki, B.; Lipowsky, R.; Weikl, T. R., Binding Cooperativity of Membrane Adhesion Receptors. Soft Matter 2009, 5, 3354-3361. (62) Walker, D. C.; Georgopoulos, N. T.; Southgate, J., From Pathway to Population - a Multiscale Model of Juxtacrine Egfr-Mapk Signalling. Bmc Systems Biology 2008, 2. (63) Xie, Z.-R.; Chen, J.; Wu, Y., A Coarse-Grained Model for the Simulations of Biomolecular Interactions in Cellular Environments. Journal of Chemical Physics 2014, 140, 054112. (64) Chen, J.; Xie, Z. R.; Wu, Y., A Multiscale Model for Simulating Binding Kinetics of Proteins with Flexible Linkers. Proteins 2014. (65) Xie, Z. R.; Chen, J.; Wu, Y., Linking 3d and 2d Binding Kinetics of Membrane Proteins by Multi-Scale Simulations. Protein Sci 2014. (66) Aragon, S., A Precise Boundary Element Method for Macromolecular Transport Properties. J Comput Chem 2004, 25, 1191-1205. (67) Aragon, S.; Hahn, D. K., Precise Boundary Element Computation of Protein Transport Properties: Diffusion Tensors, Specific Volume, and Hydration. Biophys J 2006, 91, 1591-1603. (68) Hills, R. D., Jr.; Brooks, C. L., 3rd, Insights from Coarse-Grained Go Models for Protein Folding and Dynamics. Int J Mol Sci 2009, 10, 889-905. (69) Gao, G. F.; Tormo, J.; Gerth, U. C.; Wyer, J. R.; McMichael, A. J.; Stuart, D. I.; Bell, J. I.; Jones, E. Y.; Jakobsen, B. K., Crystal Structure of the Complex between Human Cd8alpha(Alpha) and Hla-A2. Nature 1997, 387, 630-634. (70) Wyer, J. R.; Willcox, B. E.; Gao, G. F.; Gerth, U. C.; Davis, S. J.; Bell, J. I.; van der Merwe, P. A.; Jakobsen, B. K., T Cell Receptor and Coreceptor Cd8 Alphaalpha Bind Peptide-Mhc Independently and with Distinct Kinetics. Immunity 1999, 10, 219-225. (71) Segelke, B. W.; Nguyen, D.; Chee, R.; Xuong, N. H.; Dennis, E. A., Structures of Two Novel Crystal Forms of Naja Naja Naja Phospholipase A2 Lacking Ca2+ Reveal Trimeric Packing. J Mol Biol 1998, 279, 223-232. (72) Qin, S.; Pang, X.; Zhou, H. X., Automated Prediction of Protein Association Rate Constants. Structure 2011, 19, 1744-1751. (73) Su, Y.; Zhou, A.; Xia, X.; Li, W.; Sun, Z., Quantitative Prediction of Protein-Protein Binding Affinity with a Potential of Mean Force Considering Volume Correction. Protein Sci 2009, 18, 2550-2558. (74) Qian, H., Cooperativity in Cellular Biochemical Processes: Noise-Enhanced Sensitivity, Fluctuating Enzyme, Bistability with Nonlinear Feedback, and Other Mechanisms for Sigmoidal Responses. Annu Rev Biophys 2012, 41, 179-204. (75) Banik, U.; Ahmed, S. A.; McPhie, P.; Miles, E. W., Subunit Assembly in the Tryptophan Synthase Alpha 2 Beta 2 Complex. Stabilization by Pyridoxal Phosphate Aldimine Intermediates. J Biol Chem 1995, 270, 7944-7949. (76) Marsh, J. A.; Hernandez, H.; Hall, Z.; Ahnert, S. E.; Perica, T.; Robinson, C. V.; Teichmann, S. A., Protein Complexes Are under Evolutionary Selection to Assemble Via Ordered Pathways. Cell 2013, 153, 461-470.

29 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 30 of 45

Yinghao Wu

(77) Levy, E. D.; Boeri Erba, E.; Robinson, C. V.; Teichmann, S. A., Assembly Reflects Evolution of Protein Complexes. Nature 2008, 453, 1262-1265. (78) Zhou, H. X.; Bates, P. A., Modeling Protein Association Mechanisms and Kinetics. Curr Opin Struct Biol 2013, 23, 887-893. (79) Marsh, J. A.; Teichmann, S. A., Protein Flexibility Facilitates Quaternary Structure Assembly and Evolution. PLoS Biol 2014, 12, e1001870. (80) Atilgan, A. R.; Durell, S. R.; Jernigan, R. L.; Demirel, M. C.; Keskin, O.; Bahar, I., Anisotropy of Fluctuation Dynamics of Proteins with an Elastic Network Model. Biophysical Journal 2001, 80, 505-515.

30 ACS Paragon Plus Environment

Page 31 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

Figure 1

31 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 32 of 45

Yinghao Wu

Figure 2

32 ACS Paragon Plus Environment

Page 33 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

Figure 3

33 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 34 of 45

Yinghao Wu

Figure 4

34 ACS Paragon Plus Environment

Page 35 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

Figure 5

35 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 36 of 45

Yinghao Wu

Figure 6

36 ACS Paragon Plus Environment

Page 37 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2016-01-06

Yinghao Wu

Figure 7

37 ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2016-01-06

Page 38 of 45

Yinghao Wu

TOC Graphic

38 ACS Paragon Plus Environment

Page 39 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

In the schematic of our multiscale simulation framework, we adjusted the energy parameter to reproduce the experimentally measured or computationally predicted value of kon (a). Given the same energy parameter, a different set of Cα-based CG simulations was then used to estimate the rate of association rass (b). Finally, the derived rass, together with the diffusion constants and binding affinity of interacting proteins, was used to guide the RB-based simulation (c) which contains a large number of molecules in the system. The model has been extended to simulate assembly of protein oligomers (d). The rates of association between each different pairs of subunits in an oligomer are calculated individually by the same process. All derived rates are integrated into the RB simulation, in which multiple copies of all subunit types can be assembled together. 304x254mm (96 x 96 DPI)

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The complex between CD8-α and HLA-A2 (PDB id 1AKJ) is used as a testing system to validate the multiscale model. The association between CD8 and HLA-A2 in Cα-based simulations is guided by the Golike potential, in which an energy parameter, µ, was adjusted to mediate the constraints from native contacts. We changed different values of µ from 0 to 2.0kT. The calculated kons are higher at larger values of µ. 271x207mm (300 x 300 DPI)

ACS Paragon Plus Environment

Page 40 of 45

Page 41 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

254x381mm (96 x 96 DPI)

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The phospholipase A2 (PLA2) (PDB id 1A3F) (a) is used as a testing system to simulate the assembling kinetics of homo-oligomers. Each subunit in this trimer is represented by a rigid body. Two different binding sites are assigned on the surface of each rigid body (b). Cα-based KMC simulations were first carried out to estimate the rass between two subunits in the trimer (c). Simulations of trimer assembly were further tested by RB-based model. A snapshot is taken from the simulation trajectory (d). Trimers that share the similar quaternary arrangement as the crystal structure are highlighted in the figure by the dashed circle. 381x190mm (96 x 96 DPI)

ACS Paragon Plus Environment

Page 42 of 45

Page 43 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

We show that the kinetics of trimer assembly differs largely from the kinetics dimerization (a). The process of trimer assembly is much slower than dimerization. However, the curve of trimer assembly shows smaller fluctuations than dimerization after equilibrium, indicating that the system of trimers is not only thermodynamically more stable, but also more resistant to external noises. Moreover, we show that during the process of trimer assembly, the number of dimers rises at the beginning of the simulation, but reduces to a much lower level later (b). Along with the decrease of dimers, the number of trimers monotonically increases by a much slower kinetics. These results suggest that trimers are assembled through a two-step process. 254x254mm (96 x 96 DPI)

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

A tetramer that includes two subunits A and two subunits B (a) was used as a testing template to simulate the assembly of hetero-oligomers. A snapshot is taken from the trajectory of RB simulations (b), in which each subunit A is plotted as a blue sphere and each subunit B is plotted as a yellow sphere with smaller size. One derived hetero-tetramers is highlighted by the red dashed circle. Three simulation scenarios were first designed to study the relationship between different binding interfaces. In the first scenario, the affinity between two subunits A equals 0. The only binding is between subunits A and B. In the second scenario, the affinity between subunits A and B equals 0. The only binding is between two subunits A. Finally, in the third scenario, both homogeneous and heterogeneous binding are turned on. The numbers of AA and AB interactions after simulations of all three scenarios are plotted as histogram in (c). 254x304mm (96 x 96 DPI)

ACS Paragon Plus Environment

Page 44 of 45

Page 45 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

We fixed the binding affinity to test the kinetics of assembly under different values of association rates. The kinetic profiles of complex assembly when AB association is slow and AA association is fast are plotted in (a). The kinetic profiles of complex assembly when both AB and AA associations are fast are plotted in (b). The kinetic profiles of complex assembly when both AB and AA associations are slow are plotted in (c). Finally, the kinetic profiles of complex assembly when AB association is fast and AA association is slow are plotted in (d). The figures show that different association rates between subunits lead to very diverse assembly pathways, even though the binding affinities are unchanged. 237x186mm (96 x 96 DPI)

ACS Paragon Plus Environment