A Molecular Dynamics Simulation Study on Factor Xa - ACS Publications

May 6, 2010 - extensive hydrogen bonding network stabilizes the complex and, thus, is very ... chain of an fXa “apo-structure” and an fXa-ligand c...
0 downloads 0 Views 5MB Size
J. Phys. Chem. B 2010, 114, 7405–7412

7405

Stabilizing of a Globular Protein by a Highly Complex Water Network: A Molecular Dynamics Simulation Study on Factor Xa Hannes G. Wallnoefer,†,‡ Sandra Handschuh,† Klaus R. Liedl,*,‡ and Thomas Fox† Computational Chemistry, Department of Lead DiscoVery, Boehringer Ingelheim Pharma GmbH & Co. KG, 88397 Biberach, Germany, and Institute of General, Inorganic and Theoretical Chemistry, UniVersity of Innsbruck, Innrain 52a, 6020 Innsbruck, Austria ReceiVed: February 24, 2010

The role of water molecules is increasingly attracting attention in structural biology, and many studies have demonstrated their crucial contribution to the stability and function of proteins. Here, we present molecular dynamics studies on factor Xa (fXa) to investigate the effect of water molecules in this serine protease. fXa is a key enzyme in the blood coagulation cascade, and thus, an important target for antithrombotic drugs. A reasonable representation of the structure is crucial for an investigation at the molecular level and, thus, a prerequisite for structure-based drug design. Simulations of well-resolved fXa X-ray structures with different sets of water molecules show the importance of a well-determined water set for the simulation. We discuss implications of different water sets on the structure and dynamics of fXa. Introduction Water is generally recognized as the essential medium for any being. However, the deep impact as an essential constructive component of biomolecules is still underestimated.1,2 The interaction patterns of water molecules with up to four hydrogen bonds and, even more, their localization, which is important for the function and dynamics of proteins, RNA, and DNA,3 is often difficult to assess. A number of publications have emphasized the importance of water molecules inside proteins. Fenimore et al.4 showed the crucial role of water molecules not only for the structure but also for many motions of myoglobin. For alanine racemase,5 Mustata and Briggs analyzed the water positions in seven X-ray structures. Clustering of the water molecules resulted in 47 water clusters, many of which were situated in the interface between two monomers of alanine racemase. They showed that an extensive hydrogen bonding network stabilizes the complex and, thus, is very important for the structure. Moreover, the functional importance of water molecules was highlighted as a crucial constituent for proton equilibrium in the binding site. In molecular dynamics (MD) simulations, the consensus water, moreover, was important to keep the alanine racemase dimer structure stable. For serine proteases, several studies have been published. Bottoms et al.6 investigated the water in serine proteases and proposed that waters should be considered as an important structural characteristic. Moreover, 21 water binding sites were identified, which are conserved over the whole protein family.7-9 These sites are distributed over the whole structure and consist of either single water molecules or clusters of water molecules. They play an important role for the structural stability and the function of the enzymes. For Subtilisin Carlsberg, the fundamental role of water was shown via MD simulations.10 A water cluster analysis for trypsin and thrombin, which are other * Corresponding author. Phone: 0043 (0)512 507-5164. E-mail: [email protected]. † Boehringer Ingelheim Pharma GmbH & Co. KG. ‡ University of Innsbruck.

members of the serine protease family and highly homologous to factor Xa (fXa), showed the high structural conservation and important role of several water molecules.11,12 Such a high conservation over several homologous proteins often indicates a structural role of the waters.13 One example for structural conservation of water molecules is a channel that leads from the S1 subpocket of serine proteases, a decisive part of the binding pocket, to the protein surface. In all experimental X-ray structures with resolved water molecules, it is filled with a chain of water molecules that are considered to be important for the catalytic function.14-17 If some of these waters are not seen in a serine protease X-ray structure, this is probably due to low resolution of the experiment.18 It was also shown that missing or badly placed water molecules in fXa may lead to computational artifacts.19-23 fXa is one of the key enzymes in the blood coagulation cascade.24-35 It is the joining point for the extrinsic and the intrinsic pathway of fibrin production. In the case of abnormal function, this important defense of the body against vascular injuries can lead to severe thrombotic diseases, that is, unwanted clot formation. Due to this, fXa is a promising target for antithrombotic drugs, and consequently, a number of pharmaceutical companies have initiated discovery projects to obtain fXa inhibitors.26,36,37 fXa is synthesized in the liver and consists of two chains. A light chain of 139 residues and a heavy chain of 305 residues, which contains the catalytic region, and is connected via a disulfide (Cys132-Cys302) bridge to the light chain. Probably also due to the fact that fXa is a highly interesting target for drug design, a wealth of structural data is available: over 150 crystal structures of the protein or protein-ligand complexes have been deposited in the Brookhaven Protein Data Base (PDB).38 In contrast to this, to our knowledge, only a few computational simulations have been published so far.39-43 The first one, by Daura et al.,40 compared short MD simulations of the heavy chain of an fXa “apo-structure” and an fXa-ligand complex. However, the former structure (PDB-code: 1HCG) is no real apo-structure because the binding pocket is occupied by the

10.1021/jp101654g  2010 American Chemical Society Published on Web 05/06/2010

7406

J. Phys. Chem. B, Vol. 114, No. 21, 2010

C-terminal arginine of the next asymmetric unit. Moreover, residues 145-151 of the starting structure were not resolved in the X-ray experiment, and this gap of six residues forming a loop close to the active site was modeled by three glycines. Another study used MD simulations to compare the dynamics of zymogene and activated fX.43 Again, parts of the protein structures were missing and replaced by modeled residues with factor VIIa as template. A third MD study focused on the oligopeptide inhibitor ecotin from Escherichia coli.42 Studying the holo-structure, the authors proposed two conformations for the binding site. The widely observed conformation of the Phe174, the Tyr99, and the Trp215 of the S4 subpocket is the “open” conformation. In addition, a “closed” conformation resulting from an approach of the aromatic rings of the S4 subpocket, and the formation of π-π stacking interactions was observed. Steered MD was performed to remove ecotin from the binding site and to show the conversion from the “open” to the “closed” conformation in this subpocket. Another MD simulation study used rigid fXa39 to cluster water positions in the binding site and to evaluate ligand binding energies by displacement of the clusters. A common observation in all MD studies is a fluctuating and fairly high overall rmsd (root-mean-square deviation of the atom positions) values for the backbone of the proteins. Generally, MD simulations are considered stable when the backbone rmsd is in the low angstrom area. However, in the case of factor Xa simulations, Daura et al.40 reported rmsd values of over 2 Å for the backbone and over 3 Å for all atoms after of 1 ns of simulation. Venkateswarlu et al.,43 who simulated both the heavy and the light chain, had an overall backbone rmsd up to 7 Å and, again significantly, over 2 Å for the serine protease domain after 6.2 ns of simulation. Singh et al.42 observed rmsd values up to 3 Å for the backbone of the heavy chain. High rmsd’s from the (experimental) starting structure are often associated with inadequacies in the description of the biological systems. This could be due, in principle, to weaknesses in the force field used; however, numerous long-time MD simulations have shown that modern force fields allow the simulation of proteins with high stability.44-49 An alternative explanation could be that factor Xa is exceptionally sensitive to the exact setup of the MD simulation system. Especially, the system of water molecules at the surface or the interior of the protein might play a critical role in stabilizing the protein. Here, we suggest that a well-prepared and carefully chosen set of water molecules inside and in the close vicinity of factor Xa is a prerequisite to obtain stable MD simulations of the heavy chain of the protein, which points to the biological importance of these water molecules. The large amount of structural data available for factor Xa allows a detailed analysis of the X-ray water molecules largely independent of X-ray resolution and possible bias in structure determination. Several protocols to obtain a set of water molecules around fXa are described, and the influence of these water sets on the stability of fXa MD simulations are investigated. Methods Protein Structures. Eighty-four entries with an EC number of 3.4.21.6 (coagulation factor Xa) were extracted from the PDB.38 Entries without resolved water molecules (15 structures) were removed from the data set. The dimers (4 structures, PDB codes: 1XKB, 2GD4, 1IOD, and 3ENS) were divided into two single entries for further analysis, because the water structure differs in the monomeric units. This yielded a total of 73 structures for analysis.

Wallnoefer et al. Preliminary MD simulations in our laboratory showed that the light chain, which is connected to a negatively charged phospholipid membrane in the prothrombinase complex, is highly flexible in aqueous solution and moves several angstroms within a few nanoseconds. An alignment of the available X-ray structures showed a high conformational diversity of the light chain. To focus on the serine protease domain, we decided to cut off this flexible part and to use the globular heavy chain only. This is in line with most of the previously published work. Removing the light chain leaves the heavy chain side of the disulfide bridge Cys132-Cys302 a solvent-exposed mercapto group. In addition, a second independent set of 75 fXa monomers were prepared analogously from a BI (Boehringer Ingelheim) in-house fXa structure.50 Clustering of Water Molecules. All structures were aligned in MOE 2008.10.51 To identify water molecules close to the protein, in all structures, the solvent accessible surface area (SASA) for each water molecule was calculated with MOE. Then all water molecules with a SASA > 0 were removed, yielding 0-31 water molecules per structure (see the Supporting Information, S2). The water molecules of all structures were joined, resulting in well over 1000 coordinate sets. To reduce the number of waters to a manageable size and also to remove bias from individual structures, we decided to cluster the water molecules with an exclusion sphere algorithm.52 The resulting clusters were visually inspected, and positions were selected where molecules would have optimal distances to hydrogen bond partners. In two cases, the size of the cluster suggested that it contains two water positions within a hydrogen bond distance. Consequently, two water molecules were placed accordingly. In the following, we will call the placed waters “water hot spots”. Structure Preparation. For the MD simulations of the pdb data set, the structure with the PDB code 2BOH was chosen. It has a comparatively high resolution of 2.2 Å, a large number of resolved X-ray waters (221), and no gaps in the structure. As an example from the BI data set, a comparable protein with 1.95 Å resolution and 392 water molecules was chosen. In both structures, the ligand and the light chain were removed. One calcium ion and one sodium ion, which are known to be structurally important,53,54 were either kept or modeled according to other X-ray structures. The structures were protonated with the 3D-Protonate tool of MOE55 and then manually reviewed (especially for correct protonation states). The further preparation of the structures was performed in the leap tool of Amber10.56 The parameters for the Ca2+ ion were taken from Bradbrook et al.57 Three different setups for the water placements were performed: (1) all crystallographic water molecules were removed, (2) all crystallographic water molecules were retained, and (3) the crystallographic waters were replaced with the consensus water set obtained from the clustering. Finally, an octahedral box of water molecules with 10 Å distance from the protein to the box wall (about 6100 water molecules) was placed around the protein, resulting in a system containing about 22 000 atoms. MD Simulations. The sander module of Amber1056 with the ff99SB force field58 was used. After an extensive equilibration (see the Supporting Information, S1), 12.5 ns of NPT MD simulation without any additional restraints was performed. The resulting trajectories were analyzed for convergence in the usual MD simulation parameters (energies, density, temperature, etc.). Further analyses were performed with the program ptraj of the

MD Simulation Study on Factor Xa

J. Phys. Chem. B, Vol. 114, No. 21, 2010 7407

AMBER toolkit. A more detailed information of methods and parameters is given in the Supporting Information (S1-S3). Results Clustering. Alignment of the 73 fXa structures from the PDB and applying the SASA cutoff for the water oxygens yielded 1097 water molecules close to the proteins. Clustering reduced this number to 69 water clusters. Twenty-eight of the 69 clusters contained only one entry (see Supporting Information S3), meaning that this site is only occupied exactly once in the 73 protein structures. The largest cluster is associated with a void which is occupied by a water molecule in 68 of the 73 protein structures. Remarkably, after 21 clusters, a significant drop in the population is observed. Presumably, these are very localized and highly stable water molecules and, therefore, best resolved in X-ray structures. The singly occupied clusters were not considered further because here, the probability is high that this water site is very specific for a single protein structure or even an artifact of the X-ray structure. From the other clusters, one representative water molecule was chosen so that a reasonable (concerning the distance) hydrogen bonding network was possible. This resulted in a set of 38 water molecules. The analogous procedure for the BI data set gave 101 water clusters and a final set of 46 water molecules. Comparison of the two representative water sets showed that they share 36 water molecules if one assumes that after alignment of the two proteins (rmsd ) 0.23 Å), their mutual distances are smaller than 1 Å. Although all water hot spots have a SASA of 0, some of them can be designated as deeply buried and some are close to the surface. The latter can presumably exchange with bulk water during a MD simulation, whereas the deeply buried waters are not expected to do so. All of the deeply buried water hot spots are common to both sets. The crystal structures chosen for the MD simulation have a relatively high number of resolved water molecules. Nevertheless, eight water sites identified by the clustering algorithm for the PDB structures are not occupied in 2BOH. These additionally placed waters are as follows: (1) two hot spots in the S2 subpocket of the binding site, which would allow the placement of water molecules without interfering with the ligand; (2) one site located in the S1 subpocket at the beginning of the channel to the surface, again allowing the placement of a water molecule; (3) one site in a void between the side chains of Leu235, Thr210, and Pro124, connecting two R-helices and a β-sheet; and (4) four sites mediating H-bonds close to the surface. This analysis shows that even buried water molecules, which are putatively most important, are not necessarily resolved in an even high-quality crystal structure. Moreover, even a structure with many well-resolved water molecules may benefit from the inclusion of additional information from a larga number of X-ray structures. MD Simulations. rmsd. To elucidate the role of water molecules for the structural integrity of factor Xa, three sets of MD simulations were performed. One simulation started with an fXa structure containing no water molecules before the placement of the simulation water box (setup 1), one simulation started with all waters from the crystal structure (setup 2), and one simulation used water molecules placed on the water hot spots from the clustering (setup 3). Each of these setups was

Figure 1. CR rmsd values compared to the starting structure for the PDB structure 2BOH (top) and for the Boehringer Ingelheim in-house fXa structure (bottom). Green: simulation setup 1 (no waters before adding solvent box). Red: simulation setup 2 (crystallographic water molecules). Black: simulation setup 3 (water at clustered sites).

applied to a protein from the public PDB set (PDB code: 2BOH) and to a protein from the internal BI set, yielding six sets of MD simulations. The first indication of the quality of a simulation is the CR rmsd plot. It shows the conformational stability of the protein backbone relative to the starting structure. In this study, the CR carbon atom rmsd (Figure 1) increases over the course of the 12.5 ns simulation, to 1.58 (PDB) and 1.85 (BI) for simulation setup 3, to 1.65 (PDB) and 2.06 (BI) for simulation setup 2, and to 1.89 (PDB) and 2.42 (BI) Å for simulation setup 1. Thus, they show a clear stabilization of the system if the water molecules on the hot spots of the clustering or the waters from the crystal structure are included in the starting structures. More importantly, not only are the rmsd values smaller in average, but also the fluctuations are reduced, which indicates a smoother conformational behavior. This is especially pronounced for the BI simulation without waters in the starting structure (setup 1), where the rmsd value increases to almost 2.5 Å then decreases to about 1.7 A and again rises to over 2 Å between 5 and 10 ns of simulation. A more detailed picture of the conformational sampling is obtained with 2D rmsd plots, in which the mutual rmsd distances among all snapshots of the MD trajectory are calculated and plotted in a heat map style, which color-codes the conformational difference in the protein backbone. This way, conformational stability or transitions between regions in the phase space can be graphically analyzed. The CR atom 2D-rmsd plots for the 2 times 3 MD simulations are shown in Figure 2. The simulations from setup 1 without structural water (right column of Figure 2) show no stable

7408

J. Phys. Chem. B, Vol. 114, No. 21, 2010

Wallnoefer et al.

Figure 2. CR 2D rmsd plots for factor Xa simulations with PDB structure 2BOH (upper row) and in-house structure (lower row), color coded by rmsd (see color ramp on the right). From right to left, the panels show the results for setup 1 (no waters before soaking), setup 2 (crystallographic water molecules), and setup 3 (clustered water molecules).

regions. Despite the fact that the 1D rmsd value has reached a plateau, the 2D rmsd analysis shows that a conformational drift occurs during the simulation, and conformations from early stages of the simulations are not revisited later. This is especially true for the simulation starting from the BI structure; the deeply red off-diagonal areas indicate that there is decreasing similarity the longer the simulation lasts. A higher conformational stability is found if the crystallographic water molecules are retained in the simulation setup (setup 2, middle column of Figure 2); however, here again, the BI simulation shows a marked conformational drift and two clearly distinct conformations. For the 2BOH simulation, a conformational transition at 10 ns is visible. A further improvement is seen when the simulation setup places water molecules at the cluster sites (setup 3, left column of Figure 2). For the 2BOH simulation, over the whole course of the simulation, the rmsd value is below 1.3 Å, with a short exception at around 11 ns. Again, the BI simulation is less stable, as seen by a conformational transition at the end of the simulation. Overall, the improvement from crystallographic waters (setup 2) to waters placed at the clustering hot spots (setup 3) is less striking than compared to the case in which no water molecules (setup 1) are explicitly placed. This is not surprising because highly localized and presumably important water molecules should be resolved in any crystal structure. Nevertheless, a smoother conformational behavior with the additional information from the clustering can be observed. Therefore, a sampling of conformations close to the experimental picture is possible, whereas the simulations without explicitly placed water molecules in the starting structure show no converged behavior. B-Factors. In Figure 3, the positional fluctuations of the backbone atoms are plotted as B-factors on a per-residue basis. Interestingly, the simulation with the PDB structure and the BI structure show different regions in the protein with high B-factors, meaning that different regions are highly mobile during the simulations. The reason for this is not quite clear,

Figure 3. Calculated B-factors for backbone atoms vs protein residue number for the 2BOH (top) and the BI structure (bottom). Green: simulation setup 1 (no water molecules before adding bulk solvent). Red: simulations with the crystallographic waters (setup 2). Black: simulations with the clustered water structures (setup 3).

but it shows that MD simulations are very sensitive to the starting conditions, and even two structures of the same protein may behave differently: The two simulations sample different motions that are accessible for the protein. However, in both

MD Simulation Study on Factor Xa

J. Phys. Chem. B, Vol. 114, No. 21, 2010 7409

TABLE 1: B-Factors for All Six Simulations (setup 1,2, and 3 for 2BOH and the BI Structure)a 2BOH B-factors all atoms backbone atoms

BI structure

setup 1 setup 2 setup 3 setup 1 setup 2 setup 3 mean max mean max

56.41 608.65 19.82 300.82

47.26 781.90 15.40 220.79

45.96 773.51 13.88 100.9

65.44 997.72 28.95 343.44

58.25 816.97 20.28 186.26

55.68 830.8 17.88 94.66

a In all cases, the addition of the reasonable placed water molecules leads to stabilization. Again, as already shown with the backbone rmsd plots, the higher flexibility of the simulations with the BI structure can be observed.

systems, the simulations with the waters placed on the hot spots show the most homogeneous B-factor distribution over the whole protein. This is exactly what one expects for a globular protein such as the heavy chain of fXa, in which no domain motions relative to each other can occur. Most of the B-factors above 100 are related either to simulation setup 1 (no internal water in the starting structures) or simulation setup 2 (crystal structure waters). We suspect that most of these peaks indicate regions in the protein that are not adequately stabilized by water molecules. The mean and maximum B-factors are shown in Table 1. Here, a clear trend for stabilization, especially for the backbone atoms, is seen when reasonably placed water molecules have been added in the starting structures. As already shown with the rmsd plots, the simulations with the BI factor Xa structure show a higher flexibility. Examples for the Stabilization. In the following, we analyze some of the water hot spots in more detail to illustrate the reasons for the higher mobility observed above. The most occupied water hot spot, forming a water site with one water molecule, is situated at the center of the protein and is present in both crystal structures used as starting points for the structural models. If it was not present at the beginning of the simulation, we never observed during the 12.5 ns trajectory that this site is filled by another water molecule. This putative unphysical void inside the protein has significant consequences for the surrounding protein atoms. Figure 4a shows the structure of this site and the surrounding protein residues with the water and the induced movement if the water is missing. In the X-ray structure, this water is within H-bond distance to the backbone carbonyl oxygens of Gly186 and Leu38 and to the side chain oxygen of Thr30. If the water is not present during equilibration, the side chain of Ile202 turns to fill the empty water cavity as soon as the protocol allows movement of the protein heavy atoms. The van der Waals contact of Ile202 to Val221 breaks and a few picoseconds after the shift of Ile202, the side chain of Val221 flips toward the protein surface. In the residues next to Val221, the backbone angles change, inducing different H-bridge vectors for the carbonyl oxygens of Lys220 and Thr222. Exactly the same behavior can be observed in the BI structure MD simulation without water in the starting structure. Another example is a cluster of two water molecules next to the S4 subpocket of the binding site. Although this water site is close to the protein surface and only shielded by a single side chain from the bulk water, we never observed that a water molecule was entering from the bulk during our simulations. Figure 4b highlights the resulting structural distortion if the important water molecules are missing: Met168 tries to fill the hole with its side chain. This induces a permanent change in the local protein structure, which is not observed in the

simulations with correctly positioned water molecules. Because this is neighboring the binding site, this presumably would also lead to a different behavior of the protein toward a ligand. The third example is the explanation for the extreme difference in B-factors around residue 100 in the three 2BOH simulations (Figure 3). In the X-ray structure, a water site of four water molecules bridges the two residues Asn103 and Asp56 of two loops via a water-mediated loop-loop linkage (Figure 4c). The latter loop is situated inside the protein; therefore, it is restricted in its possibility to change conformation. Loop 100-105, however, is not stabilized by other structural elements, but it seems to be held in place by the bridge formed by the water molecules. During our simulations, we did not observe the spontaneous formation of this bridge, despite the fact that this area is well accessible from the solvent. Thus, unless the simulation setup places water molecules appropriately, loop 100-105 moves significantly, yielding high B-factors. Interestingly, the bulk waters in the comparable simulation of the BI structure do form a similar H-bond network to keep the 100-105 loop in position, resulting in a B-factor comparable to the simulations in which the water molecules are present from the beginning. In the simulation with setup 2 (crystallographic waters), three of the four water molecules are present in both crystal structures. This already leads to a stabilization, although no permanent and complete H-bond bridge appears. Discussion Water molecules have started to gain increased interest in drug design, and their crucial contribution to stability and function of biomolecules is widely accepted. However, their correct and realistic treatment in structural models is far from trivial. Here, we present a clustering approach to generate a representative intraprotein water network for factor Xa on the basis of vast structural information available for this protein. Splitting this into public accessible data from the Brookhaven’s Protein Data Bank and Boehringer Ingelheim in-house X-ray structures has two advantages: First, two completely independent starting points were generated. Simulating two different protein structures should result in a higher probability for the general validity of the results. Second, the protein structures available from the PDB were generated in several different laboratories and by many different crystallographers. Here, collecting all these different methods to measure and interpret electron density and applying different structural templates in a single data set should result in a more heterogeneous data set. On the contrary, the BI in-house data set should be much more homogeneous. The separate analysis of both data sets did, indeed, show that the resulting water sets have a significant overlap: although bestresolved water molecules, which are mostly deeply buried in the protein, are present in both sets, the less populated hydration sites differ between the sets. This is also reflected in the overall dynamic behavior observed in the MD simulations started from the PDB and the in-house protein structure. Whereas the absolute and detailed values (e.g. for rmsd plots and B-factors) differ, it is obvious that the overall trends when using either X-ray resolved waters or waters obtained from the clustering approach in the simulations are similar for both simulation sets. However, we note that some of the detailed differences may also be a consequence of the different protein starting structures. In any case, our results may serve as a caveat that results from a detailed analysis of MD simulation results may very much depend on subtle changes in the starting structure and simulation

7410

J. Phys. Chem. B, Vol. 114, No. 21, 2010

Wallnoefer et al.

Figure 4. (a) Water site 1 represents a buried water in the center of the protein that is surrounded by Ile202, Gly186, Thr39, and Leu38. Green: X-ray conformation; for clarity, the experimentally resolved water molecule is also shown. Cyan: conformation that forms during without the water molecule. The side chains try to fill the empty water site. (b) Water site 5 next to Met168. Green: X-ray structure of 2BOH. Cyan: conformation that forms during equilibration if the water molecules are not present. Experimentally, site no. 5 is filled by two water molecules (red). If they are not present in the starting structure, the side chain of Met168 moves to fill the unoccupied void, changing the environment around the binding subpocket S4. Contours: water occupancy without the waters in the starting structure. (c) Water site 8. The backbone structures at the end of the simulation for setup 1 (magenta), setup 2(cyan), and setup 3 (green) are shown. The side chain of Asn103 is connected to residues Gly55 and Asp56 by a chain of four water molecules. If this chain is present in the starting structure, it is stable throughout the whole simulation. If it is not present as in setup 1, it does not evolve from bulk water, leading to a significant movement of the loop bearing Asn103.

protocol, again highlighting that a number of truly independent simulations would be necessary to assess the dynamic behavior of a protein. Nevertheless, it could be shown clearly that starting a MD simulation with reasonably placed water molecules leads to a structural stabilization of the protein during the simulations. Moreover, a clear relation between microscopic events due to “missing” water molecules and the overall dynamics of the fXa protein structure could be demonstrated: the generation of a reasonable fXa structural ensemble is possible only with an extensive structure equilibration together with a realistic water network. Our molecular dynamics study shows that for the heavy chain of fXa, a globular enzyme with a very high water concentration, a reasonable preorganized water structure in the starting structure is crucial for the whole simulation. A number of important water molecules, which do not necessarily have to be deeply buried, are not replaced by bulk water from the simulation box. The aim of MD simulations is to generate flexible structural models of biomolecules. This is of fundamental importance for an accurate description of the system of interest. Hence, “motion” is the desired aim of a simulation. However, if this leads to unphysical structural distortions, or if the “wrong” or at least undesired part of the conformational space is sampled, an inaccurate structural ensemble is obtained. For further analysis, such as the calculation of free energies of binding, one would like to sample a part of the phase space in the proximity of the experimental structure. Thus, it is essential to check if one samples states that represent the system of interest. As we have shown, only the a priori inclusion of water

molecules at predefined positions in and around the protein ensured a reasonable sampling of phase space. Presumably, a long enough simulation would eventually lead to an equilibrated system that has water molecules at all the important sites; however, the time scale for such an equilibration is beyond current simulation capabilities, at least for routine applications. Conclusions We have demonstrated that for factor Xa, a correct internal water structure and a reasonable H-bond network as a preorganized feature is crucial for the generation of a representative structural ensemble via molecular dynamics simulation. Using all available structural data in a clustering approach was used to generate accurate water positions. Using this water entity was an important prerequisite to generate stable MD simulations. Because missing waters, especially those which are buried far from the protein surface, cannot be replaced by bulk water during the normal time span of a MD simulation, this usually results in more or less pronounced local distortion of the protein structure, and this often translates into overall changes in protein conformation and artifacts in the protein dynamics. The most severe errors that occur from omitting these important structural waters can be avoided by including the resolved X-ray waters. A further improvement can be achieved by using the clustered waters as an integral part of the protein starting structure. In summary, our results show that a thorough analysis of all available data on water molecules in and around proteins is very important. For factor Xa, the integration of important water molecules into the starting structure of a MD simulation was

MD Simulation Study on Factor Xa the key to obtaining stable trajectories. We are confident that such an analysis will improve the quality of generated trajectories also in other globular proteins beyond factor Xa. Supporting Information Available: S1: Detailed description of methods used. S2: Figure of resolved waters vs resolution in factor Xa X-ray structures. S3: Figure with cluster population. This material is available free of charge via the Internet at http:// pubs.acs.org. References and Notes (1) Lu, Y.; Wang, R.; Yang, C. Y.; Wang, S. Analysis of LigandBound Water Molecules in High-Resolution Crystal Structures of ProteinLigand Complexes. J. Chem. Inf. Model. 2007, 47 (2), 668–675. (2) Panigrahi, S. K.; Desiraju, G. R. Strong and Weak Hydrogen Bonds in the Protein-Ligand Interface. Proteins 2007, 67 (1), 128–141. (3) Brooks, C. L.; Karplus, M. Solvent Effects on Protein Motion and Protein Effects on Solvent Motion: Dynamics of the Active Site Region of Lysozyme. J. Mol. Biol. 1989, 208 (1), 159–181. (4) Fenimore, P. W.; Frauenfelder, H.; McMahon, B. H.; Parak, F. G. Slaving: Solvent Fluctuations Dominate Protein Dynamics and Functions. Proc. Natl. Acad. Sci. U.S.A. 2002, 99 (25), 16047–16051. (5) Mustata, G.; Briggs, J. M. Cluster Analysis of Water Molecules in Alanine Racemase and Their Putative Structural Role. Protein Eng., Des. Sel. 2004, 17 (3), 223–234. (6) Bottoms, C. A.; White, T. A.; Tanner, J. J. Exploring Structurally Conserved Solvent Sites in Protein Families. Proteins 2006, 64 (2), 404– 421. (7) Sreenivasan, U.; Axelsen, P. H. Buried Water in Homologous Serine Proteases. Biochemistry 1992, 31 (51), 12785–12791. (8) Guvench, O.; Price, D. J.; Brooks, C. L., III. Receptor Rigidity and Ligand Mobility in Trypsin-Ligand Complexes. Proteins 2005, 58 (2), 407–417. (9) Lesk, A. M.; Fordham, W. D. Conservation and Variability in the Structures of Serine Proteinases of the Chymotrypsin Family. J. Mol. Biol. 1996, 258 (3), 501–537. (10) Cruz, A.; Ramirez, E.; Santana, A.; Barletta, G.; Lo´pez, G. E. Molecular Dynamic Study of Subtilisin Carlsberg in Aqueous and Nonaqueous Solvents. Mol. Simul. 2009, 35 (3), 205–212. (11) Reichmann, D.; Phillip, Y.; Carmi, A.; Schreiber, G. On the Contribution of Water-Mediated Interactions to Protein-Complex Stability. Biochemistry 2008, 47 (3), 1051–1060. (12) Nilsson, M.; Ha¨ma¨la¨inen, M.; Ivarsson, M.; Gottfries, J.; Xue, Y.; Hansson, S.; Isaksson, R.; Fex, T. Compounds Binding to the S2-S3 Pockets of Thrombin. J. Med. Chem. 2009, 52 (9), 2708–2715. (13) Edsall, J. T.; McKenzie, H. A. Water and Proteins. II. The Location and Dynamics of Water in Protein Systems and Its Relation to Their Stability and Properties. AdV. Biophys. 1983, 16, 53–183. (14) Maxwell, M. K.; Enrico, D. C. Conserved Water Molecules in the Specificity Pocket of Serine Proteases and the Molecular Mechanism of Na+ Binding. Proteins 1998, 30 (1), 34–42. (15) Meyer, E.; Cole, G.; Radhakrishman, R.; Epp, O. Structure of Native Porcine Pancreatic Elastase at 1.65 Angstrom Resolution. Acta Crystallogr., Sect. B 1988, 44 (1), 26–38. (16) Silva, J.; Antunes, O. A. C.; de Alencastro, R. B.; De Simone, S. G. The Na+ Binding Channel of Human Coagulation Proteases: Novel Insights on the Structure and Allosteric Modulation Revealed by Molecular Surface Analysis. Biophys. Chem. 2006, 119 (3), 282–294. (17) Caldwell, R. A.; Boucher, R. C.; Stutts, M. J. Serine Protease Activation of near-Silent Epithelial Na+ Channels. Am. J. Physiol.: Cell Physiol. 2004, 286 (1), C190–C194. (18) Carugo, O.; Bordo, D. How Many Water Molecules Can Be Detected by Protein Crystallography. Acta Crystallogr., Sect. D 1999, 55 (2), 479–483. (19) Rao, M. S.; Olson, A. J. Modelling of Factor Xa-Inhibitor Complexes: A Computational Flexible Docking Approach. Proteins 1999, 34 (2), 173–183. (20) Verdonk, M. L.; Chessari, G.; Cole, J. C.; Hartshorn, M. J.; Murray, C. W.; Nissink, J. W.; Taylor, R. D.; Taylor, R. Modeling Water Molecules in Protein-Ligand Docking Using GOLD. J. Med. Chem. 2005, 48 (20), 6504–6515. (21) Doan, B. T.; Fraternali, F.; Do, Q. T.; Atkinson, R. A.; Palmas, P.; Sklenar, V.; Wildgoose, P.; Strop, P.; Saudek, V. A NMR and MD Study of the Active Site of Factor Xa by Selective Inhibitors. J. Chim. Phys. Phys. Chim. Biol. 1998, 95 (2), 443–446. (22) Fraternali, F.; Do, Q. T.; Doan, B. T.; Atkinson, R. A.; Palmas, P.; Sklenar, V.; Safar, P.; Wildgoose, P.; Strop, P.; Saudek, V. Mapping the Active Site of Factor Xa by Selective Inhibitors: An NMR and MD Study. Proteins 1998, 30 (3), 264–274.

J. Phys. Chem. B, Vol. 114, No. 21, 2010 7411 (23) Shokhen, M.; Khazanov, N.; Albeck, A. Screening of the Active Site from Water by the Incoming Ligand Triggers Catalysis and Inhibition in Serine Proteases. Proteins 2008, 70 (4), 1578–1587. (24) Borensztajn, K.; Peppelenbosch, M. P.; Spek, C. A. Factor Xa: at the Crossroads between Coagulation and Signaling in Physiology and Disease. Trends Mol. Med. 2008, 14 (10), 429–440. (25) Bounameaux, H. The Novel Anticoagulants: Entering a New Era. Swiss Med. Wkly. 2009, 139 (5-6), 60–64. (26) Ieko, M.; Tarumi, T.; Nakabayashi, T.; Yoshida, M.; Naito, S.; Koike, T. Factor Xa Inhibitors: New Anti-thrombotic Agents and Their Characteristics. Front. Biosci. 2006, 11 (1 P.1-446), 232–248. (27) McVey, J. H. Tissue Factor Pathway. Bailliere’s Clin. Haem. 1994, 7 (3), 469–484. (28) Broze, J. Tissue Factor Pathway Inhibitor and the Current Concept of Blood Coagulation. Blood Coagul. Fibrinolysis 1995, 6, SUPPL. 1. (29) Davie, E. W.; Fujikawa, K.; Kisiel, W. The Coagulation Cascade: Initiation, Maintenance, and Regulation. Biochemistry 1991, 30 (43), 10363– 10370. (30) Hauptmann, J.; Markwardt, F.; Walsmann, P. Synthetic Inhibitors of Serine Proteinases. XIV. Influence of 3- and 4-Amidinobenzyl Derivatives on the Formation and Action of Thrombin. Thromb. Res. 1978, 12 (5), 735– 744. (31) Hertzberg, M. Biochemistry of Factor X. Blood ReV. 1994, 8 (1), 56–62. (32) Stubbs, M. T.; Bode, W. Coagulation Factors and Their Inhibitors. Curr. Opin. Struct. Biol. 1994, 4 (6), 823–832. (33) Tidwell, R. R.; Webster, W. P.; Shaver, S. R.; Geratz, J. D. Strategies for Anticoagulation with Synthetic Protease Inhibitors. Xa Inhibitors versus Thrombin Inhibitors. Thromb. Res. 1980, 19 (3), 339– 349. (34) Warshel, A.; Naray-Szabo, G.; Sussman, F.; Hwang, J. K. How Do Serine Proteases Really Work. Biochemistry 1989, 28 (9), 3629–3637. (35) Hedstrom, L. Serine Protease Mechanism and Specificity. Chem. ReV. 2002, 102 (12), 4501–4523. (36) Kaiser, B. DX-9065a, a Direct Inhibitor of Factor Xa. CardioVasc. Drug ReV. 2003, 21 (2), 91–104. (37) Nazare, M.; Will, D. W.; Matter, H.; Schreuder, H.; Ritter, K.; Urmann, M.; Essrich, M.; Bauer, A.; Wagner, M.; Czech, J.; Lorenz, M.; Laux, V.; Wehner, V. Probing the Subpockets of Factor Xa Reveals Two Binding Modes for Inhibitors Based on a 2-Carboxyindole Scaffold: A Study Combining Structure-Activity Relationship and X-ray Crystallography. J. Med. Chem. 2005, 48 (14), 4511–4525. (38) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1), 235–242. (39) Abel, R.; Young, T.; Farid, R.; Berne, B. J.; Friesner, R. A. Role of the Active-Site Solvent in the Thermodynamics of Factor Xa Ligand Binding. J. Am. Chem. Soc. 2008, 130 (9), 2817–2831. (40) Daura, X.; Haaksma, E.; van Gunsteren, W. F. Factor Xa: Simulation Studies with an Eye to Inhibitor Design. J. Comput. Aided Mol. Des. 2000, 14 (6), 507–529. (41) Ostrovsky, D.; Udier-Blagovic, M.; Jorgensen, W. L. Analyses of Activity for Factor Xa Inhibitors Based on Monte Carlo Simulations. J. Med. Chem. 2003, 46 (26), 5691–5699. (42) Singh, N.; Briggs, J. M. Molecular Dynamics Simulations of Factor Xa: Insights into Conformational Transition of Its Binding Subsites. Biopolymers 2008, 89 (12), 1104–1113. (43) Venkateswarlu, D.; Perera, L.; Darden, T.; Pedersen, L. G. Structure and Dynamics of Zymogen Human Blood Coagulation Factor X. Biophys. J. 2002, 82 (3), 1190–1206. (44) Freddolino, P.; Liu, F.; Gruebele, M.; Schulten, K. Ten-Microsecond Molecular Dynamics Simulation of a Fast-Folding WW Domain. Biophys. J. 2008, 94 (10), L75–L77. (45) Klepeis, J. L.; Lindorff-Larsen, K.; Dror, R. O.; Shaw, D. E. LongTimescale Molecular Dynamics Simulations of Protein Structure and Function. Curr. Opin. Struct. Biol. 2009, 19 (2), 120–127. (46) Maragakis, P.; Lindorff-Larsen, K.; Eastwood, M. P.; Dror, R. O.; Klepeis, J. L.; Arkin, I. T.; Jensen, M. O.; Xu, H.; Trbovic, N.; Friesner, R. A.; Palmer, A. G.; Shaw, D. E. Microsecond Molecular Dynamics Simulation Shows Effect of Slow Loop Dynamics on Backbone Amide Order Parameters of Proteins. J. Phys. Chem. B 2008, 112 (19), 6155– 6158. (47) Perez, A.; Luque, F. J.; Orozco, M. Dynamics of B-DNA on the Microsecond Time Scale. J. Am. Chem. Soc. 2007, 129 (47), 14739–14745. (48) Romo, T. D.; Grossfield, A.; Pitman, M. C.; Deupi, X.; Cordomi, A.; Kobilka, B. A Microsecond Time Scale Molecular Dynamics Simulation of B2AR in a Membrane. Biophys. J. 2009, 96 (3, Supplement 1), 340a. (49) Wang, Y.; Tajkhorshid, E. Electrostatic Funneling of Substrate in Mitochondrial Inner Membrane Carriers. Proc. Natl. Acad. Sci. U.S.A. 2008, 105 (28), 9598–9603. (50) Nar, H.; Bauer, M.; Schmid, A.; Stassen, J. M.; Wienen, W.; Priepke, H. W. M.; Kauffmann, I. K.; Ries, U. J.; Hauel, N. H. Structural

7412

J. Phys. Chem. B, Vol. 114, No. 21, 2010

Basis for Inhibition Promiscuity of Dual Specific Thrombin and Factor Xa Blood Coagulation Inhibitors. Structure 2001, 9 (1), 29–37. (51) MOE (The Molecular Operating Environment) Version 2008.10, software available from Chemical Computing Group Inc., 1010 Sherbrooke Street West, Suite 910, Montreal, Canada H3A 2R7, http://www. chemcomp.com., 2009. (52) Golbraikh, A.; Tropsha, A. Predictive QSAR Modeling Based on Diversity Sampling of Experimental Datasets for the Training and Test Set Selection. J. Comput. Aided Mol. Des. 2002, 16 (5), 357–369. (53) Underwood, M. C.; Zhong, D.; Mathur, A.; Heyduk, T.; Bajaj, S. P. Thermodynamic Linkage between the S1 Site, the Na+ Site, and the Ca2+ Site in the Protease Domain of Human Coagulation Factor Xa. J. Biol. Chem. 2000, 275 (47), 36876–36884. (54) Griffon, N.; Di Stasio, E. Thermodynamics of Na+ binding to coagulation serine proteases. Biophys. Chem. 2001, 90 (1), 89–96.

Wallnoefer et al. (55) Labute, P. Protonate3D: Assignment of Ionization States and Hydrogen Coordinates to Macromolecular Structures. Proteins 2009, 75 (1), 187–205. (56) AMBER 10; University of California: San Francisco: 2008. (57) Bradbook, G. M.; Gleichmann, T.; Harrop, S. J.; Habash, J.; Raftery, J.; Kalb, J.; Yariv, J.; Hillier, I. H.; Helliwell, J. R. X-ray and Molecular Dynamics Studies of Concanavalin-a Glucoside and Mannoside Complexes Relating Structure to Thermodynamics of Binding. J. Chem. Soc. Faraday Trans. 1998, 94, 1603–1611. (58) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins 2006, 65 (3), 712–725.

JP101654G