Letter
Coupling an ensemble of homologs improves refinement of protein homology models Andre Wildberg, Dennis Della Corte, and Gunnar F. Schröder J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.5b00942 • Publication Date (Web): 28 Oct 2015 Downloaded from http://pubs.acs.org on November 2, 2015
Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Journal of Chemical Theory and Computation
Coupling an ensemble of homologs improves refinement of protein homology models

André Wildberg1†, Dennis Della Corte1 and Gunnar F. Schröder1,2*

1 Forschungszentrum Jülich, Institute of Complex Systems, ICS-6: Structural Biochemistry, 52425 Jülich, Germany
2 Physics Department, Heinrich-Heine Universität Düsseldorf, 40225 Düsseldorf, Germany

† Current address: Department of Chemistry and Biochemistry, University of California, 9500 Gilman Drive, La Jolla, CA 92093, USA
* Corresponding author

Abstract
Atomic models of proteins built by homology modeling or from low-resolution experimental data may contain considerable local errors. The refinement success of molecular dynamics simulations is usually limited by both force field accuracy and the substantial width of the conformational distribution at physiological temperatures. We propose a method to overcome both problems by coupling homologous replicas during a molecular dynamics simulation, which narrows the conformational distribution and smooths, and even improves, the energy landscape by adding evolutionary information.

The interpretation of genomics data in terms of protein structure is an important post-genomic challenge. Building atomic models for individual amino acid sequences is becoming increasingly important for understanding the molecular effects of genetic variation. Homology modeling is a useful tool for building atomic models if the structure of a homologous protein is known. However, owing to the limitations of current methodology, such models may contain considerable errors. Similarly, atomic models built from low-resolution (e.g. X-ray diffraction or cryo-EM) or sparse experimental data may contain comparable errors.

Refinement approaches aim to correct these errors in atomic protein models. The types of errors we consider amenable to refinement include disrupted hydrogen-bond networks, small shifts of secondary structure elements, incorrect side-chain packing and rotamers, and wrong loop conformations. Correcting such errors is typically challenging because the energy differences between alternative, slightly different conformations are rather small. The Critical Assessment of Structure Prediction (CASP) experiment has a refinement category to test the performance of refinement methods [1,2].

Regular molecular dynamics (MD) simulations are generally unable to refine homology models: they do not consistently yield a structure that is closer to the native structure (as usually determined by high-resolution X-ray crystallography) [3]. Even though regular unrestrained MD simulations can sample closer-to-native structures, reliably selecting these structures is not possible [4].

The main causes of this limitation of MD simulations are 1) force field inaccuracies [5], 2) high energy barriers that need to be crossed, and 3) the fact that the Boltzmann distribution, which is approximated by the MD simulation, is broad at physiological temperatures. Simulation at physiological temperatures is, however, necessary for a proper entropy contribution; only then is the free energy of conformational states described correctly.

Position restraints have been used successfully to prevent the simulation from exploring the broad Boltzmann distribution; these restraints force the structure to sample a region around the starting model, which also leads to sampling closer-to-native structures with higher probability [5-7]. Position restraints, however, also set an upper limit on the extent of the conformational change, which might hinder sufficient sampling and refinement.

The goal of structure refinement is to determine the most probable conformation, which corresponds to the free energy minimum, rather than the conformational distribution. It has been shown that the most probable conformation can be approximated more robustly by averaging the structures from an MD ensemble than by selecting a single structure with a scoring function [6,7]. However, for the averaging to yield a good structure, the simulation must predominantly sample near-native structures.

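As a minimal sketch of the structure averaging discussed above, the following Python snippet superimposes every frame of an ensemble onto a reference with the Kabsch algorithm and then averages the coordinates. The function names (`kabsch_fit`, `average_structure`) and the choice of the first frame as the reference are illustrative assumptions; this is not the procedure of refs 6 and 7, only one common way to average an ensemble.

```python
import numpy as np

def kabsch_fit(mobile, target):
    """Rigidly superimpose `mobile` (N x 3) onto `target` (N x 3)."""
    mob_center = mobile.mean(axis=0)
    tgt_center = target.mean(axis=0)
    mob = mobile - mob_center
    tgt = target - tgt_center
    # Singular value decomposition of the 3 x 3 covariance matrix
    u, _, vt = np.linalg.svd(mob.T @ tgt)
    # Guard against a reflection so that the result is a proper rotation
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mob @ rot.T + tgt_center

def average_structure(frames, reference=None):
    """Average an ensemble of shape (frames, N, 3) after superposition.

    Assumption: the first frame serves as the reference if none is given.
    """
    frames = np.asarray(frames, dtype=float)
    ref = frames[0] if reference is None else np.asarray(reference, dtype=float)
    fitted = np.array([kabsch_fit(frame, ref) for frame in frames])
    return fitted.mean(axis=0)
```

Averaging only makes sense after superposition, because overall rotation and translation of the protein would otherwise blur the mean coordinates.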
We present here, in two steps, a modified MD protocol that addresses all three problems of regular MD simulations mentioned above (force field inaccuracies, high energy barriers, broad Boltzmann distribution).

In the first modification step, we simultaneously simulate eight identical replicas of the starting structure. These replicas are subjected to the same harmonic position restraints (on Cα-atoms), which force them to remain similar to each other. The positions of the restraints are constantly updated during the simulation and slowly follow the motion of the center of mass of all replicas. These adaptive restraints were inspired by deformable elastic network (DEN) restraints, which have been shown to guide structure refinement against X-ray diffraction and cryo-EM data [8-10]. Since the restraints are adaptive, the coupled replicas are allowed to undergo any conformational motion as long as they stay close together. The number of replicas was chosen here simply to limit the computational expense; it is not yet clear how the results depend on the number of replicas.

The harmonic restraints restrict larger motions more than smaller motions, which leads to a time-scale-dependent diffusion coefficient (cf. Supplementary Fig. 1). On short time-scales the diffusion coefficient is comparable to that of free MD simulations, which enables individual replicas to cross local energy barriers. In addition, entropic contributions of solvent and side-chains (which are not restrained) are not strongly affected, which means that the solvation free energy in particular is mostly unperturbed. On longer time-scales the diffusion coefficient decreases significantly, which reduces large conformational fluctuations. Smaller fluctuations mean that the system of coupled replicas is less likely to drift in random directions and will sample low free energy states more frequently than a free MD simulation. Furthermore, the coupling of replicas smooths the energy landscape, similar to particle swarm optimization, which has been applied to MD simulations before [11]. The motion of the center of mass is the result of an effective force averaged over all replicas. Because the replicas are at different positions on the energy landscape, the center of mass moves on a locally averaged, i.e. smoothed, energy landscape (see Methods).

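The time-scale-dependent diffusion coefficient mentioned above can be illustrated with a short sketch. It assumes the common definition of an apparent diffusion coefficient from the mean square displacement in three dimensions, D(τ) = MSD(τ) / (6τ); the text does not state which definition was used for Supplementary Fig. 1, and the function names are hypothetical.

```python
import numpy as np

def msd(traj, lag):
    """Mean square displacement for one lag (in frames).

    traj has shape (frames, atoms, 3); the average runs over atoms and time origins.
    """
    disp = traj[lag:] - traj[:-lag]
    return float((disp ** 2).sum(axis=-1).mean())

def apparent_diffusion(traj, lag, dt):
    """Apparent diffusion coefficient D(tau) = MSD(tau) / (6 tau), with tau = lag * dt.

    Assumption: this finite-lag form of the Einstein relation is used for illustration only.
    """
    return msd(traj, lag) / (6.0 * lag * dt)
```

For a restrained system the MSD levels off at long lag times, so D(τ) computed this way decreases with τ, which is the behavior described in the paragraph above.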
In the second modification step, the target sequence in seven of the replicas is replaced with homologous sequences. This is motivated by the observation that structure is much more conserved than sequence, which causes homologous proteins to fold into similar structures [12]. This fact can be exploited by coupling homologous proteins (with pairwise sequence identity of at least 50%) instead of identical replicas during an MD simulation. Keasar et al. [13-15] have proposed that such a coupling of homologous proteins with slightly different energy landscapes results in an energy landscape that is smoothed not only in structure space but also in sequence space. The methodology was implemented in the GROMACS 4.5.3 [16] software (see Methods).

Our method was tested successfully in the recent CASP11 experiment and was among the best performing methods [17]. Of all 37 refinement targets, 65 % could be improved. For the improved models the average GDT-HA score increased by 6.6; for those models that could not be improved it decreased by 3.9. To understand why and how our approach works, we studied the refinement of 5 representative homology models in more detail. To avoid any bias, we selected 5 models from a publicly available set of decoys that was not generated by us (the Badretdinov decoy set [18], see Supplementary Table 1).

Each model was simulated 5 times for 10 ns each. We compared three different simulation protocols: 1) a free MD simulation of the homology model, 2) coupled identical replicas, and 3) coupled homologous replicas.

Figure 1 shows, for each test case, the average and standard deviation of the distance root-mean-square deviation (dRMSD) values from 5 independent simulations. The dRMSD measures the deviation between two atomic models. It is calculated as the root-mean-square deviation between corresponding Cα-Cα distances in the two structures, considering all possible pairs of Cα-atoms.

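The dRMSD definition above can be written down in a few lines. The sketch below assumes that each unordered pair of Cα-atoms (i < j) is counted once and that both structures list corresponding atoms in the same order; the function name `drmsd` is illustrative.

```python
import numpy as np

def drmsd(coords_a, coords_b):
    """dRMSD between two (N x 3) arrays of corresponding C-alpha coordinates."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both structures need the same number of atoms")
    # All pairwise intra-structure distances, shape (N, N)
    dist_a = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
    dist_b = np.linalg.norm(b[:, None, :] - b[None, :, :], axis=-1)
    # Use each unordered pair i < j exactly once
    i, j = np.triu_indices(len(a), k=1)
    return float(np.sqrt(np.mean((dist_a[i, j] - dist_b[i, j]) ** 2)))
```

Because only internal distances enter, the measure needs no prior superposition of the two structures.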
Simulations with coupled replicas (both homologous (Fig. 1, first column) and identical (Fig. 1, second column)) show a clear improvement over regular free MD simulations (Fig. 1, third column). In free MD simulations the structure drifted away from the correct structure in all cases except 1hdn-1ptf, where a small number of frames were improved. In simulations with identical replicas 51 % of the frames were improved on average, but the improvements are small fluctuations around the null refinement (horizontal red line). In contrast, simulations with homologous replicas consistently improve the structure and sample conformations closer to the native structure most of the time.

The improvement of the structures from the different refinement protocols is quantified by the change in dRMSD of the average structure from each trajectory relative to that of the starting structure (see Fig. 2). Free MD simulations led to the lowest structural quality in all 5 cases. Comparison of the average improvements (Fig. 2, dashed lines) shows an offset between free MD simulation and coupling of identical replicas, which can be attributed to the particle swarm optimization effect. More importantly, another offset can be seen between coupling with identical and with homologous replicas; it represents the improvement that is due to the added evolutionary information, which we interpret as an improvement of the force field. Only for 1lpt-1mzl, which has the lowest starting quality of 3.8 Å RMSD, was the average dRMSD not improved (see Fig. 1 c.1); however, the dRMSD of the average structure was slightly improved (Fig. 2), clearly showing the benefit of structural averaging [6].

Figure 3 compares the result of the simulation with coupled homologous replicas (green) with the starting model (purple) and the native target structure (blue). Several secondary structure elements and loop regions are shifted towards the correct structure. In contrast to free MD simulations, the coupled-replica simulations yielded very consistent results: the average structures from 5 independent simulations are very similar, as visualized by multi-dimensional scaling [19] in Supplementary Fig. 2.

We found that refinement with coupled homologous replicas outperforms regular MD simulations in all test cases. The additional evolutionary information and the reduction in global fluctuations through coupling of homologous replicas lead to consistent sampling of structures closer to the native state than in free MD simulations. This insight will help to develop even more powerful refinement methods based on MD.

METHODS
Implementation of replica coupling in GROMACS 4.5.3. For the replica-coupled simulations, the simulation box was composed of 8 replicas, which were positioned at the corners of a cube. The distance between the replicas needs to be large enough to avoid electrostatic interactions between the replicas.

The replicas are coupled through adaptive position restraints on all Cα-atoms. GROMACS does not by default support dynamic updates of position restraints during a simulation. We implemented the necessary changes in the source code of GROMACS 4.5.3 [16] to enable such updates without reducing the speed of GROMACS. We implemented the changes only for domain decomposition runs (in the source code file domdec.c). For each Cα-atom i in each replica j, a position restraint $\mathbf{x}^0_{i,j}$ is defined at its initial position. The total energy is then the sum of the energy of the eight uncoupled (solvated) replicas and the time-dependent energy term for the position restraints, $E_\mathrm{posre}$, given by

$$E_\mathrm{posre}(t) = w \sum_{i=1}^{N} \sum_{j=1}^{M} \left( \mathbf{x}_{i,j}(t) - \mathbf{x}^0_{i,j}(t) \right)^2 \tag{1}$$

with the coordinates $\mathbf{x}_{i,j}$ of Cα-atom i in replica j, the number of atoms N, and the number of replicas M, which we chose to be 8. The force constant w was set to 100 kJ/(mol nm²). After a period of n = 500 steps, the position restraints are updated according to

$$\mathbf{x}^0_{i,j}(t + n\Delta t) = \mathbf{x}^0_{i,j}(t) + \kappa \left\langle \mathbf{x}_{i,j}(t) - \mathbf{x}^0_{i,j}(t) \right\rangle_j \tag{2}$$

with the integration time step Δt of 2 fs. The relaxation rate κ at which the position restraints follow the average coordinate displacement was set to 0.5. The same displacement vector $\langle \mathbf{x}_{i,j}(t) - \mathbf{x}^0_{i,j}(t) \rangle_j$, which is an average over the corresponding displacements in all replicas j, is added to all replicas, which leads to a coupling of the replicas. These adaptive restraints were inspired by deformable elastic network (DEN) restraints, which yield a similar effect for a γ-value of 1. The original DEN method employs a network of distance restraints (including long-range ones), which cannot be parallelized efficiently with domain decomposition. We therefore decided to use adaptive position restraints.

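The restraint energy and the periodic restraint update described above can be sketched in NumPy as follows. This is an illustration outside GROMACS, not the modified domdec.c code: the array layout (replicas, atoms, 3), the function names, and the energy prefactor (w times the summed squared deviations, without a factor of 1/2) are assumptions. The values of w and κ are taken from the text.

```python
import numpy as np

W = 100.0      # force constant w in kJ/(mol nm^2), value from the text
KAPPA = 0.5    # relaxation rate kappa, value from the text

def restraint_energy(x, x0, w=W):
    """Position restraint energy for coordinates x and restraint positions x0.

    Both arrays have shape (M replicas, N atoms, 3).
    Assumption: prefactor w without an additional factor of 1/2.
    """
    return float(w * np.sum((x - x0) ** 2))

def update_restraints(x, x0, kappa=KAPPA):
    """Shift all restraint positions by kappa times the replica-averaged displacement."""
    mean_disp = (x - x0).mean(axis=0)    # average over replicas j, shape (N, 3)
    return x0 + kappa * mean_disp        # broadcast: identical shift in every replica
```

In a simulation loop, `update_restraints` would be called once every n = 500 integration steps. Since every replica receives the same shift, the restraints follow the collective motion of the ensemble rather than that of any single replica.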
For identical replicas the assignment of corresponding atoms in different replicas is trivial. However, in the case of homologous replicas, a multiple sequence alignment is performed to assign each Cα-atom of the target sequence to the corresponding Cα-atoms in the homologs. If there are no gaps or insertions, the assignment is again trivial. If the alignment shows a gap for k sequences at a certain amino acid position, position restraints are applied and averaged only for the remaining (8-k) residues that are present at this position, which means that the displacement vector is averaged over (8-k) replicas. Insertions do not generate extra position restraints; those residues are instead left unrestrained and are free to move. The total number of position restraints is therefore always identical to the number of Cα-atoms in the target sequence.

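The gap handling described above amounts to a masked average over replicas. The following sketch assumes that the alignment has already been converted into a boolean array `present` of shape (replicas, target positions), where False marks a gap; this representation and the function name are hypothetical and not taken from the modified GROMACS code.

```python
import numpy as np

def masked_mean_displacement(x, x0, present):
    """Average displacement per target position over the replicas that have a residue there.

    x, x0   : arrays of shape (M, N, 3), current and restraint positions
    present : boolean array of shape (M, N); False marks an alignment gap
    """
    present = np.asarray(present, dtype=bool)
    # Displacements at gapped positions are replaced by zero and do not contribute
    disp = np.where(present[:, :, None], x - x0, 0.0)
    counts = present.sum(axis=0)          # (N,), i.e. 8 - k for 8 replicas with k gaps
    if np.any(counts == 0):
        raise ValueError("every target position needs at least one residue")
    return disp.sum(axis=0) / counts[:, None]
```

Residues inserted relative to the target have no column in `present` and therefore never receive a restraint, in line with the description above.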
Method availability. The modified GROMACS version with adaptive position restraints to couple multiple replicas is available from the SimTK website: http://simtk.org/home/adpt-gromacs.

MD Protocols. All simulations used the AMBER99SB-ILDN force field with TIP3P explicit water and an integration time step of 2 fs. The temperature was kept constant at 300 K with the Nosé-Hoover algorithm. Long-range electrostatic interactions were calculated with PME, and bond lengths were constrained with the P-LINCS approach. Na+/Cl- ions were added at physiological concentration. Before and after adding the solvent molecules, the structure was energy minimized to remove any steric clashes that may result from homology model building. Each simulation was repeated 5 times with a duration of 10 ns. For comparison we performed three different simulation protocols: 1) free MD simulation of a single protein structure, 2) replica-coupled simulation with identical sequences, and 3) replica-coupled simulation with homologous sequences.

The computational expense of the replica-coupled simulations is much larger than that of the single MD simulations, since the simulation system is eight times larger. The total amount of simulation time is equivalent to simulating a single protein in solvent for 4.25 µs.

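The quoted total of 4.25 µs is consistent with the following bookkeeping, which assumes that each replica of a coupled simulation counts as one single-protein simulation. That reading is an interpretation; the text does not spell out the decomposition.

```python
TEST_CASES = 5     # homology models
REPEATS = 5        # independent simulations per model and protocol
LENGTH_NS = 10     # length of each simulation in ns

# Number of solvated protein copies per protocol (assumed bookkeeping)
PROTEINS_PER_PROTOCOL = {
    "free MD": 1,
    "identical replicas": 8,
    "homologous replicas": 8,
}

def total_single_protein_time_ns():
    """Total simulated time expressed as single-protein nanoseconds."""
    copies = sum(PROTEINS_PER_PROTOCOL.values())
    return TEST_CASES * REPEATS * LENGTH_NS * copies

print(total_single_protein_time_ns() / 1000.0, "µs")  # prints "4.25 µs"
```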
Sequence Selection Strategy. To build the homologous replicas, seven homologous sequences were identified with a BLAST [20] search of the RefSeq [21] database. Sequences were selected manually according to two criteria: 1) their sequence identity with the target sequence needs to be between 50 and 80 %, and 2) the sequence identities between all pairs of the 8 sequences should ideally also be in the range 50–80 %. However, for some test cases the second criterion could not be strictly fulfilled. The sequences chosen have an average sequence identity of 61.8 % to the target and are shown in Supplementary Table 2. The homology models used as the replicas were generated with MODELLERv9 [22].

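The two selection criteria can be checked automatically for a candidate set of aligned sequences. The sketch below is illustrative only: the sequences in the study were selected manually, and the identity definition used here (identical residues divided by the number of columns without a gap) is an assumption that may differ from the value reported by BLAST.

```python
from itertools import combinations

def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned sequences of equal length ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns containing a gap in either sequence are ignored
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

def check_selection(target, homologs, low=50.0, high=80.0):
    """Return (criterion 1 fulfilled, criterion 2 fulfilled) for a candidate set."""
    # Criterion 1: identity of every homolog to the target within [low, high]
    crit1 = all(low <= percent_identity(target, h) <= high for h in homologs)
    # Criterion 2: identity of every pair among all sequences within [low, high]
    seqs = [target] + list(homologs)
    crit2 = all(low <= percent_identity(a, b) <= high for a, b in combinations(seqs, 2))
    return crit1, crit2
```

Returning the two criteria separately mirrors the text, where the first criterion is required and the second is only an ideal that could not always be met.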
Test case selection strategy. Homology models from the Badretdinov decoy set [18] were chosen as test cases. We aimed to cover a wide range of protein properties, such as size, secondary structure composition and shape. The five homology models that were selected have starting qualities between 2 and 4 Å RMSD to the solved crystal structure. The sequence lengths vary between 70 and 220 amino acids. The details of the selected models are shown in Supplementary Table 1. The naming scheme of the models is xxxxX-yyyy, where xxxx and yyyy are the PDB IDs of the template and the target, respectively, and X is the chain ID of the target.

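For scripted handling of the test cases, the naming scheme can be parsed with a small regular expression. The pattern below is a sketch; it treats the chain ID as optional, which is an assumption made because some model names in this work (e.g. 1hdn-1ptf) carry no chain letter.

```python
import re

# Four-character PDB ID, optional chain letter, hyphen, four-character PDB ID
MODEL_NAME = re.compile(
    r"^(?P<template>[0-9][A-Za-z0-9]{3})(?P<chain>[A-Za-z]?)-(?P<target>[0-9][A-Za-z0-9]{3})$"
)

def parse_model_name(name):
    """Split a model name into (template PDB ID, chain ID or None, target PDB ID)."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected model name: {name!r}")
    return match.group("template"), match.group("chain") or None, match.group("target")
```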
ACKNOWLEDGMENTS
The authors gratefully acknowledge the computing time granted on the supercomputer JUROPA at Jülich Supercomputing Centre (JSC).

SUPPORTING INFORMATION
Mean square displacement and multidimensional scaling plots for the different simulation protocols, and properties of the test targets and homologous replicas. This information is available free of charge via the Internet at http://pubs.acs.org/.

FIGURE CAPTIONS:
Figure 1: Comparison of different refinement protocols (1–3). Five test models (a–e) were each simulated 5 times with three MD protocols: coupled homologous replicas (1, left column), coupled identical replicas (2, middle column), and free MD simulation (3, right column). dRMSD values were averaged over the 5 independent trajectories (black lines). The standard deviation is shown in pink. dRMSD values below the null refinement line (red line) indicate improved frames. The bottom row (f) shows the average and standard deviation over the 5 test cases for each refinement protocol.

Figure 2: The average dRMSD for each test case and each refinement protocol is plotted, as well as the average over the five test cases (dotted lines). Coupling identical sequences leads to a clear improvement over free MD simulations. Coupling of homologous replicas leads to a consistent refinement of all 5 test cases. Interestingly, the main improvement from using homologous rather than identical replicas is observed for the two test cases (1dvrA-1ak2 and 1utrA-1utg) for which coupling of identical replicas was not successful.

Figure 3: For each test case, the starting structure (purple) is shown together with the best model from the simulations with coupled homologous replicas (green) and the native structure (blue). Refined regions are highlighted by red arrows. Figures were made with Chimera [23].

REFERENCES
1. MacCallum, J.L.; Pérez, A.; Schnieders, M.J.; Hua, L.; Jacobson, M.P.; Dill, K.A. Assessment of Protein Structure Refinement in CASP9. Proteins: Struct., Funct., Bioinf. 2011, 79, 74-90.
2. Nugent, T.; Cozzetto, D.; Jones, D.T. Evaluation of Predictions in the CASP10 Model Refinement Category. Proteins: Struct., Funct., Bioinf. 2014, 82, 98-111.
3. Fan, H.; Mark, A.E. Refinement of Homology based Protein Structures by Molecular Dynamics Simulation Techniques. Protein Sci. 2004, 13, 211-220.
4. Zhu, J.; Fan, H.; Periole, X.; Honig, B.; Mark, A.E. Refining Homology Models by Combining Replica exchange Molecular Dynamics and Statistical Potentials. Proteins: Struct., Funct., Bioinf. 2008, 72, 1171-1188.
5. Raval, A.; Piana, S.; Eastwood, M.P.; Dror, R.O.; Shaw, D.E. Refinement of Protein Structure Homology Models via Long, All-atom Molecular Dynamics Simulations. Proteins 2012, 80, 2071-2079.
6. Mirjalili, V.; Feig, M. Protein Structure Refinement through Structure Selection and Averaging from Molecular Dynamics Ensembles. J. Chem. Theory Comput. 2013, 9, 1294-1303.
7. Mirjalili, V.; Noyes, K.; Feig, M. Physics based Protein Structure Refinement through Multiple Molecular Dynamics Trajectories and Structure Averaging. Proteins: Struct., Funct., Bioinf. 2014, 82, 196-207.
8. Schröder, G.F.; Brunger, A.T.; Levitt, M. Combining Efficient Conformational Sampling with a Deformable Elastic Network Model Facilitates Structure Refinement at Low Resolution. Structure 2007, 15, 1630-1641.
9. Schröder, G.F.; Levitt, M.; Brunger, A.T. Super-resolution Biomolecular Crystallography with Low-resolution Data. Nature 2010, 464, 1218-1222.
10. Schröder, G.F.; Levitt, M.; Brunger, A.T. Deformable Elastic Network Refinement for Low-resolution Macromolecular Crystallography. Acta Cryst. 2014, D70, 2241-2255.
11. Huber, T.; van Gunsteren, W.F. SWARM-MD: Searching Conformational Space by Cooperative Molecular Dynamics. J. Phys. Chem. A 1998, 102, 5937-5943.
12. Chothia, C.; Lesk, A.M. The Relation between the Divergence of Sequence and Structure in Proteins. EMBO J 1986, 5, 823-826.
13. Keasar, C.; Elber, R.; Skolnick, J. Simultaneous and Coupled Energy Optimization of Homologous Proteins: A New Tool for Structure Prediction. Fold Des 1997, 2, 247-259.
14. Keasar, C.; Tobi, D.; Elber, R.; Skolnick, J. Coupling the Folding of Homologous Proteins. Proc. Natl. Acad. Sci. USA 1998, 95, 5880-5883.
15. Keasar, C.; Elber, R. Homology as a Tool in Optimization Problems: Structure Determination of 2D Heteropolymers. J. Phys. Chem. 1995, 99, 11550-11556.
16. Pronk, S.; Páll, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.; Apostolov, R.; Shirts, M.R.; Smith, J.C.; Kasson, P.M.; van der Spoel, D.; Hess, B.; Lindahl, E. GROMACS 4.5: A High-throughput and Highly Parallel Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29(7), 845-854.
17. CASP11 Refinement Results, http://predictioncenter.org/casp11/zscores_final_refine.cgi (accessed May 16, 2015).
18. Badretdinov, A. (1997). Sali Lab Decoy Sets for Model Assessment. ftp://salilab.org/pub/azat/models-a/ (accessed Jan 21, 2006).
19. MDSJ: Java Library for Multidimensional Scaling, Version 0.2. Algorithmics Group, University of Konstanz, 2009. http://www.inf.uni-konstanz.de/algo/software/mdsj/ (accessed Jun 12, 2015).
20. Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs. Nucleic Acids Res. 1997, 25(17), 3389-3402.
21. Pruitt, K.D.; Tatusova, T.; Maglott, D.R. NCBI Reference Sequences (RefSeq): A Curated Non-redundant Sequence Database of Genomes, Transcripts and Proteins. Nucleic Acids Res. 2007, 35, D61-D65.
22. Fiser, A.; Sali, A. Modeller: Generation and Refinement of Homology-based Protein Structure Models. Methods Enzymol. 2003, 374, 461-491.
23. Pettersen, E.F.; Goddard, T.D.; Huang, C.C.; Couch, G.S.; Greenblatt, D.M.; Meng, E.C.; Ferrin, T.E. UCSF Chimera: A Visualization System for Exploratory Research and Analysis. J. Comput. Chem. 2004, 25, 1605-1612.

TOC graphic (image not reproduced).

Figure 1 (image not reproduced): dRMSD (Å) versus time (ns), 0 to 10 ns. Columns: replica restraints with homologous sequences (1), replica restraints with identical sequences (2), free MD simulation of a single protein (3). Rows: a) 1dvrA-1ak2, b) 1hdn-1ptf, c) 1lpt-1mzl, d) 1pod-1poa, e) 1utrA-1utg, f) mean refinement +/- standard deviation over the test cases. Legend: start dRMSD, mean dRMSD, +/- standard deviation, null refinement.

Figure 2 (image not reproduced): legend entries "coupling impact" and "coupling + sequence impact".

Figure 3 (image not reproduced): structure panels for 1dvrA-1ak2, 1hdn-1ptf, 1lpt-1mzl, 1pod-1poa and 1utrA-1utg.