J. Phys. Chem. B 2008, 112, 13391–13400
13391
Insight into the Phosphoryl Transfer of the Escherichia coli Glucose Phosphotransferase System from QM/MM Simulations Christophe Jardin,*,† Anselm H. C. Horn,† Gudrun Schu¨rer,‡ and Heinrich Sticht† Bioinformatik, Institut fu¨r Biochemie, Friedrich-Alexander-UniVersita¨t Erlangen-Nu¨rnberg, Fahrstrasse 17, 91054 Erlangen, Germany, and Computer-Chemie-Centrum, Friedrich-Alexander-UniVersita¨t Erlangen-Nu¨rnberg, Na¨gelsbachstrasse 25, 91052 Erlangen, Germany ReceiVed: February 14, 2008; ReVised Manuscript ReceiVed: August 12, 2008
Phosphoryl transfer is a key reaction in many aspects of metabolism, gene regulation, and signal transduction. One prominent example is the phosphoenolpyruvate:sugar phosphotransferase system (PTS), which represents an integral part of the bacterial sugar metabolism. The transfer between the enzymes IIAGlc and IIBGlc in the glucose-specific branch of the PTS is of particular interest due to the unusual combination of donor and acceptor residues involved in phosphoryl transfer: The phosphoryl group is initially attached to the N2 atom of His90 in IIAGlc and then transferred to the Sγ atom of Cys35 in IIBGlc. To gain insight into the details of the transfer mechanism, we have performed a QM/MM simulation which treats the entire active site quantummechanically. The transfer has a high dissociative character, and the N2-P bond gets immediately destabilized after complex formation by numerous interactions formed between residues of IIBGlc and the phosphoryl group. The final formation of a tight Sγ-P bond is accompanied by a reorientation of the side chain of the phosphoryl donor. This reorientation results in the loss of interaction between the imidazole ring and the phosphate group thus hindering the reverse transfer. A comparison to the transfer in protein tyrosine phosphatases, which also use a cysteine as acceptor of the phosphoryl group, reveals significant similarities in the conformation of the active site, the energy profile of the reaction, and in the pattern of interactions that stabilize the phosphoryl group during the transfer. Introduction Phosphoryl transfer, which is accomplished by a vast number of enzymes comprising various kinases and phosphatases, plays an important role in gene regulation, metabolism, and signal transduction.1 One well-known example for a concerted phosphoryl transfer is the phosphoenolpyruvate:sugar phosphotransferase system (PTS), which represents an integral part of the bacterial sugar metabolism.2–4 The PTS consists of several proteins and enzymes that sequentially transfer a phosphoryl group from phosphoenolpyruvate (PEP) to a sugar molecule which becomes translocated across the cell membrane. The individual components and complexes of PTS have been characterized extensively with respect to their structure and function revealing a transfer mechanism that involves a series of three bimolecular protein-protein complexes (Figure 1).2,3,5–10 Enzyme I (EI), which first gets phosphorylated by phosphoenolpyruvate (PEP), subsequently phosphorylates the phosphocarrier protein HPr. Thereafter, HPr transfers the phosphate to the IIA domain of different sugar-specific enzymes II (EII). The IIA, IIB, and IIC components of EII may occur either as separate polypeptides or linked together in a multidomain protein. In the glucose specific branch of PTS, IIBGlc and IICGlc are the cytoplasmic and membrane subdomains of the glucose transporter IICBGlc (Figure 1). For a comprehensive understanding of the underlying transfer mechanism, it is essential to examine the nature of the phosphoryl donors and acceptors of different * Corresponding author. Telephone: +49-9131-8524682. Fax: +49-91318522485. E-mail:
[email protected]. † Bioinformatik, Institut fu ¨ r Biochemie, Friedrich-Alexander-Universita¨t Erlangen-Nu¨rnberg. ‡ Computer-Chemie-Centrum, Friedrich-Alexander-Universita ¨ t ErlangenNu¨rnberg.
Figure 1. Illustration of the phosphoenolpyruvate (PEP)-dependent sugar transport system (PTS) in bacteria showing the transfer of the phosphoryl group from PEP to the sugar. The transfer proceeds through a series of bimolecular protein-protein complexes, involving the proteins enzyme I (EI), HPr, and the sugar-specific three-component enzyme II (EII). The respective components IIAGlc, IIBGlc, and IICGlc of the glucose-specific branch of the PTS are shown. The IIAGlc · IIBGlc complex, investigated in the present study, is highlighted by a dashed box.
PTS proteins: The first steps of the transfer are accomplished via histidines in EI, HPr, and IIA, while in IIBGlc a cysteine functions as acceptor of the phosphate group before it is transferred to a hydroxyl group of a sugar. While histidine phosphorylation is quite common,11 cysteine phosphorylation has up to present only been reported for a few proteins, e.g. eukaryotic protein-tyrosine phosphatases (PTPases).12 PTPases and IIB proteins, however, differ in the nature of the donor of the phosphoryl group: PTPases dephosphorylate a tyrosine of their substrate, while IIB proteins accept their phosphoryl group from a histidine of IIA. This unique combination of donor and acceptor groups involved in phosphoryl transfer renders the IIAGlc · IIBGlc system an interesting candidate
10.1021/jp801319k CCC: $40.75 2008 American Chemical Society Published on Web 09/25/2008
13392 J. Phys. Chem. B, Vol. 112, No. 42, 2008 for structural and mechanistic studies, which are also important for the design of specific inhibitors of the bacterial PTS in the future. A high-resolution structure is available for the unphosphorylated form of the IIAGlc · IIBGlc complex, while the phosphorylated state proved to be not sufficiently long-lived for NMR structure determination. The respective structure suggests a direct phosphoryl transfer mechanism involving a pentacoordinate transition state, but does not allow final conclusions on the dissociative or associative nature of the transfer nor on the energetics of the reaction.7 In such a situation, computational chemistry can play a key role for understanding enzymatic catalysis because it gives insight into structural and energetic details which are difficult to obtain from experiment. Such information includes the exact geometry of the transition state, the role of active site residues in the catalysis of reactions, the energy profile of the transfer, and the interplay of bond breakage and formation events. While the phosphoryl transfer in eukaryotic PTPases has already been the subject of several computational studies,13–22 there exists to the best of our knowledge no complementary study devoted to the transfer mechanism in the IIAGlc · IIBGlc complex. We have therefore investigated the mechanism (energy profile and the species involved) of the phosphoryl transfer in the IIAGlc · IIBGlc complex using a combined QM/MM procedure. Special attention was paid to the analysis of the interactions formed by IIA and IIB that stabilize the phosphoryl group during the transfer. To investigate the stability of both the educt and the product species, additional molecular dynamics (MD) simulations were performed for the histidine and cysteine phosphorylated forms of the complex. Finally, a comparison with the mechanism of the phosphoryl transfer in PTPases was made to understand the similarities and differences of phosphotransfer mechanisms. Computational Methods General Considerations. Theoretical studies on enzyme catalyzed reactions like transphosphorylation, which consist of cleavage and formation of chemical bonds, require a quantum mechanical description of the system. Although high-level ab initio or density functional theory (DFT) methods give good results, they are often computationally too expensive for a sufficiently large model of the active site. Semiempirical molecular orbital theory on the other side allows the treatment of a few hundred atoms but is also not generally applicable to biological systems with many thousands of atoms. Therefore, hybrid quantum mechanical/molecular mechanical (QM/MM) techniques, first introduced and developed by Warshel and coworkers,23 have been widely used to investigate enzyme reaction mechanisms in larger biological systems.24,25 They allow a quantum mechanical description of the active site, and a simultaneous consideration of the long-range effects of the enzyme environment. These techniques require a partitioning of the molecular system into two regions: in the QM-part the atoms are described quantum mechanically, while for the MMpart a molecular mechanical force-field is used. In the present study we chose a semiempirical QM/MM approach because preliminary model building suggested that there are numerous (∼10-15) residues, which might participate in phosphoryl transfer in the IIAGlc · IIBGlc complex. To allow inclusion of all these residues in the QM-part and to avoid potential problems arising from the QM-MM boundary, a large QM-region comprising 20 residues (356 atoms) was defined, that represents about 10% of the whole complex structure. For
Jardin et al. a system of this size, treating the QM-part semiempirically is a good compromise between speed and accuracy of the calculation. Computational Details. All calculations for the transfer reaction pathway were performed with the semiempirical program package VAMP.26 The general strategy for hybrid QM/ MM calculations implemented in VAMP is well established and has been described in detail elsewhere (see, for example, refs 27 and 28). The choice of the actual Hamiltonian for the QM-region took into account previous investigations, which reported a poor agreement between standard AM129 and MP2/6-31+G(d) results for the calculation of thiophosphate formation. However, the authors observed a significant improvement of the S-P bond length, when they used MNDO30,31 exclusively for sulfur and phosphorus atoms, and AM1 for all other elements.15 To verify the efficiency of a mixed parameter set for our purposes, we investigated a number of thiophosphate compounds with respect to the geometry and energetics of the reaction (see Supporting Information). Our results confirm previous findings15 indicating that the S-P bond lengths are well reproduced for this class of molecules by the AM1/MNDO(S,P) method (Table S1, Supporting Information). Furthermore, the reaction energies for several model systems calculated with the mixed method show an accuracy of ∼ 2 kcal mol-1 with respect to high-level ab initio MP2 and DFT B3LYP calculations (Table S2, Supporting Information). A similar accuracy has been reported previously for another hybrid QM-Hamiltonian, that was specifically calibrated to study enzymatic reactions.32 It can therefore be concluded that this accuracy represents the current limit of mixed QM-methods. Therefore, the AM1/MNDO(S,P) method was considered as appropriate method for the QM-region in our study, whereas the MM-region of the IIAGlc · IIBGlc complex was treated with the Tripos force field33 as implemented in VAMP. For polarizable continuum calculations we applied the COSMO model.34 As in our previous studies,28,35–37 we used an alternating geometry optimization strategy for the QM and MM-regions, allowing the MM-region to relax to the changes in the QMregion and thus avoiding too many time-consuming energy computations. No cutoff values were used for the calculation of the Coulomb and van der Waals energies, and all structures were optimized to consistency of both subsystems. The local minima (LM) and transition state (TS) were characterized by calculating the normal modes. Intrinsic reaction coordinates (IRC) calculations were performed to verify that the respective TS connects the neighboring local minima found on the potential energy surface (PES). For the QM/MM calculations of the transit pathway, energies given represent the total energies, which include the QM, MM and interaction energies. For the refined stationary points, free energies were calculated using the THERMO-module of Vamp. Additionally, relative stabilization energies for the stationary points considering the entire QMregion were obtained from density functional theory (DFT) calculations using the program package Turbomole38 with a BP86/def-SV(P)39–42 basis set. In the following we used the nomenclature of the IUPAC-IUB commission43 for the description of the atoms, and residues originating from IIBGlc are denoted in italics. Preparation of the System for the Simulation of the Transfer. The original structure of the unphosphorylated IIAGlc · IIBGlc complex was taken from the PDB (model 1 of 1O2F) and the phosphate group was inserted into the structure using Sybyl.44 The phosphorus atom was attached to the imidazole atom N2 of His90-IIAGlc, and the phosphoramidate
Insight into the Phosphoryl Transfer
J. Phys. Chem. B, Vol. 112, No. 42, 2008 13393
Figure 2. (a) Initial phosphorylated complex between IIAGlc (blue) and IIBGlc (orange) used in the present study. The most important residues in the QM-region (His75, His90-Asp94, and Cys35-Arg38) are shown in sticks and the rest of the complex in ribbons. (b) Schematic representation of the nucleophilic attack of the thiolate anion of the active site cysteine (Cys35-IIBGlc) at the phosphorus atom of the phosphorylated histidine substrate His90-IIAGlc. The stabilizing intramolecular interactions on the phosphoryl group and the thiolate anion of Cys35 deduced from the initial minimized structure of the phosphorylated complex are depicted with dashed red lines.
P-N2 bond length was set to 1.86 Å according to previous investigations of phosphohistidine in our group.45 The three N2-P-O angles were also set according to Homeyer et al.45 and the phosphoryl group was rotated around the P-N2 bond to accommodate as many stabilizing interactions within the surrounding residues as possible. Finally, the phosphorylated structure was minimized in 100 steps using the conjugated gradients method implemented in Sybyl. The structure of the phosphorylated complex, which contains 227 residues with more than 3400 atoms, was divided into a QM and a MM-region. To allow a quantum-mechanical treatment of all interactions relevant for the transfer in the active site, the QM-region includes 20 residues (356 atoms): Asp38, Thr73-His75, His90-Lys99 of IIAGlc and residues Cys35-Arg40 of IIBGlc; the remaining residues form the MM-region. The boundary between the QM and the MM-region was treated using a modified link atom approach.36,37 Additional weak harmonic constraints on the positions of the backbone C- and N-atoms (20 kcal mol-1 Å-2 on the atoms at the QM/MM frontier, 10 kcal mol-1 Å-2 otherwise) ensured that the resulting QM model system retained its position within the MM-region. Atomic charges of the MM environment were generated using the Gasteiger-Marsili method implemented in Sybyl.46 To avoid artificially large forces of the MM-region on the charged active site residues, the total charge in the MM-region was neutralized as described previously.28 The resulting model was used as the initial structure (IS) in our simulations of the transfer reaction. Molecular Dynamics Simulations. MD simulations for the histidine and cysteine phosphorylated complexes were carried out using the AMBER9 program package47 with the parm99SB48,49 parameter set augmented by parameters for phosphocysteine and phosphohistidine.45 The systems were neutralized by adding 12 sodium ions and then solvated in a box with at least a 10 Å water shell between the solute and the border of the box. After a two-stage minimization, the systems were brought up to 300 K during 0.5 ns simulation time using SHAKE constraints on hydrogens and small restraints on the protein. This equilibration period was followed by 10 ns unconstrained MD simulation using standard settings.50 Coordinate snapshots were saved every 1 ps.
Results and Discussion Overall Description of the Starting Structure of the Transfer Complex IIAGlc · IIBGlc. Molecular modeling of the phosphorylated complex based on the experimentally determined structure of the unphosphorylated form reveals that the phosphoryl group can readily be accommodated in the structure (Figure 2). Apart from the additional interactions formed by the phosphate group itself, phosphorylation causes only minor structural changes. The phosphoryl group is stabilized through polar interactions with the backbone amide of Asp94 and with N2H hydrogen atom of His75 (Figure 2). The distance between the donor site His90(N2) and acceptor site Cys35(Sγ) in the minimized phosphorylated structure is 5.64 Å, which is almost unchanged compared to the unphosphorylated structure (5.82 Å). For the acceptor site in IIBGlc we found several intramolecular interactions to the Sγ atom of Cys35 involving the backbone amides of Ile36, Thr37, and Arg38, as well as the NH side chain hydrogen of Arg38 (Table 1, Figure 2). A similar network of stabilizing interactions has been reported previously for PTPases7 and is consistent with the observation that both types of proteins exhibit significantly lowered pKa values of their active site cysteines. The active site cysteine of PTP1B,51 for example, has a pKa of 4.7,52 and the pKa of Cys35 in IIBGlc is ∼6.5.7 The pKa of cysteines is generally ∼8.5 and a reduction in the pKa of this magnitude requires a strong stabilization by surrounding residues. This lowered pKa ensures that the cysteine predominantly exists as a thiolate at physiological pH and is therefore primed for a nucleophilic attack on the phosphoryl group. In addition to the intramolecular hydrogen bonding network stabilizing the Cys35 thiolate, there already exist several polar interactions between residues in IIB and the phosphoryl group in this minimized structure. However, with the exception of the backbone amide of Ile36, the distances are still too large to form proper hydrogen bonds to the phosphoryl group (Table 1). Mechanism of the Phosphoryl Group Transfer in the IIAGlc · IIBGlc Complex. The transfer of the phosphoryl group from IIAGlc to IIBGlc proceeds via the nucleophilic attack of the Cys35 thiolate anion at the phosphorus atom. The nucleophilic attack was simulated using the Sγ-P distance as the
13394 J. Phys. Chem. B, Vol. 112, No. 42, 2008
Jardin et al.
TABLE 1: Calculated Distances (in Å) in the Active Site of IIAGlc · IIBGlc for Key Steps of the Phosphoryl Transfera centers Sγ(Cys35) N2(His90) Sγ(Cys35) NH(Asp94) N2H(His75) NH(Ile36) NH(Thr37) Oδ1H(Thr37) NH(Arg38) NH(Arg38) NηH(Arg38) NH(Ile36) NH(Thr37) NH(Arg38) NH(Arg38) NηH(Arg38) NηH/NH(Arg38)
P(PO3) P(PO3) N2(His90) O(PO3) O(PO3) O(PO3) O(PO3) O(PO3) O(PO3) O(PO3) O(PO3) Sγ(Cys35) Sγ(Cys35) Sγ(Cys35) Sγ(Cys35) Oδ(Asp94) Oδ(Asp38)
IS
LM1
TS
LM2a
LM2b
3.99 1.83 5.64 1.97 3.16 1.97 3.11 3.62 4.99 4.62 4.09 2.78 2.95 3.02 2.70 2.42/2.23 2.61/3.00
3.65 2.55 6.19 2.13 2.08 2.32 2.4 2.03 3.76 3.04 2.40 2.87 3.82 3.00 2.59 2.05/2.09 2.66/3.28
2.98 2.62 5.58 2.18 2.06 2.19 2.14 2.04 3.55 3.20 2.04 2.64 3.68 3.19 3.68 2.06/2.16 2.63/2.85
2.44 2.75 5.17 2.28 2.01 2.06 2.15 2.02 3.31 2.91 2.18 2.87 3.82 3.08 3.99 2.03/2.17 2.60/2.81
2.23 4.39 5.31 3.23 4.07 1.99 2.16 2.07 2.02 1.93 2.04 2.62 3.9 3.75 3.63 2.34/2.41 2.47/2.87
a IS refers to the initial phosphorylated complex obtained after energy minimization using the Sybyl MM force field. The three local minima (LM1, LM2a, LM2b) and the transition state (TS) are the results of the refined QM/MM calculations described in Computational Methods. For each distance the pair of interacting atoms is listed, and the amino acid, to which the respective atoms belong, is given in parentheses. Residues originating from IIAGlc and IIBGlc are shown in regular and italic letters, respectively. The multiple distances listed in the last two lines for each state indicate the presence of multiple interactions originating from the presence of bifurcated salt bridges.
Figure 3. Energy profile for the constrained phosphoryl transfer in the IIAGlc · IIBGlc complex. The nucleophilic attack was simulated using the Sγ-P distance as the reaction coordinate; the respective distance was stepwise reduced from 4.00 to 2.20 Å. The reactant complex (X), the transition state (Y), and the two forms of the phospho-Sγ-cysteine complex (Z1 and Z2) are appropriately labeled.
reaction coordinate, and the respective distance was stepwise reduced from 4.00 to 2.20 Å (Figure 3). The structure at each transit point was analyzed in detail, and we have identified a total of four points (X, Y, Z1, and Z2 in Figure 3) which exhibit structural properties of particular relevance to the transfer. These points are briefly described below and the respective structures were subsequently refined without Sγ-P constraint to obtain stationary points on the potential energy surface (PES). Point X (Sγ-P ) 3.80 Å) corresponds to a shallow minimum in the energy profile of the transfer pathway (Figure 3). The respective structure is characterized by a significant elongation of the phosphoramidate bond (P-N2 ) 2.57 Å) and an increased number of interactions with the phosphate group that mainly originate from IIBGlc. The transfer passes through one energy maximum for the sixth transit point on the PES (point Y in Figure 3). This transit point is close to the TS as emphasized by the small deviation from planarity (1.2°) of the phosphoryl group. Inspection of the distances between the phosphorus and the leaving (2.61 Å) and entering (3.00 Å) atoms gives first evidence for a mainly dissociative character of the TS.
The rest of the reaction pathway corresponds to the formation of the phosphorothioate bond. Interestingly, we observed two different conformations during the formation of the thiophosphate bond. The first one is formed for Sγ-P ) 2.4 Å and the second one by further reducing the Sγ-P distance to 2.2 Å (points Z1 and Z2 in Figure 3). The two conformations differ essentially in the orientation of the imidazole ring of His90. In Z1, the imidazole ring of His90 is still orientated toward the phosphoryl group (P-N2 ) 2.69 Å) allowing the reverse transfer of the phosphoryl group on this residue. In Z2, the P-N2 distance increases to 3.66 Å and the imidazole ring side chain of His90 is now mainly orientated toward the imidazole ring of His75. In order to characterize the conformations of the stationary points on the PES and the energetics of the reaction, the transit points X, Y, Z1, and Z2 were refined using solely the QMregion and removing the constraint on the Sγ-P distance. This approach results in a much faster convergence of the calculations, and because we used a large QM-region in the QM/MM setup, it provides an accurate description of the active site by itself. The absent MM-region was mimicked by a polarizable continuum ansatz ( ) 4.00) via the COSMO model. This led to the local minima (LM1, LM2a, and LM2b) and the transition state TS. The corresponding free energy profile of the phosphoryl transfer is depicted in Figure 4. The LM1 State. The “pre-transition” complex LM1 shows no major conformational changes with regard to X. Compared to the structure of the initial modeled complex (IS), there are several additional noncovalent interactions with the phosphate group, which mainly originate from IIBGlc. The backbone NH and side chain OγH of Thr37 and the side chain guanidine group of Arg38 now form additional hydrogen bonds to the phosphoryl moiety (Table 1). There is a total of seven interacting groups in a close (