9036
J. Phys. Chem. B 2007, 111, 9036-9044
Electrostatic Polarization Is Crucial for Reproducing pKa Shifts of Carboxylic Residues in Turkey Ovomucoid Third Domain Christopher M. MacDermaid and George A. Kaminski* Department of Chemistry, Central Michigan UniVersity, Mt. Pleasant, Michigan 48859 ReceiVed: February 14, 2007; In Final Form: May 15, 2007
We have computed pKa shifts for carboxylic residues of the serine protease inhibitor turkey ovomucoid third domain (residues Asp7, Glu10, Glu19, Asp27, and Glu43). Both polarizable and nonpolarizable empirical force fields were employed. Hydration was represented by the surface generalized Born and Poisson-Boltzmann continuum model. The calculations were carried out in the most physically straightforward fashion, by directly comparing energies of the protonated and deprotonated protein forms, without any additional parameter fitting or adjustment. Our studies have demonstrated that (i) the Poisson-Boltzmann solvation model is more than adequate in reproducing pKa shifts, most likely due to its intrinsically many-body formalism; (ii) explicit treatment of electrostatic polarization included in our polarizable force field (PFF) calculations appears to be crucial in reproducing the acidity constant shifts. The average error of the PFF results was found to be as low as 0.58 pKa units, with the best fixed-charges average deviation being 3.28 units. Therefore, the pKa shifts phenomena and the governing electrostatics are clearly many-body controlled in their intrinsic nature; (iii) our results confirm previously reported conclusions that pKa shifts for protein residues are controlled by the immediate environment of the residues in question, as opposed to long-range interactions in proteins. We are confident that our confirmation of the importance of explicit inclusion of polarization in empirical force fields for protein studies will be useful far beyond the immediate goal of accurate calculation of acidity constants.
I. Introduction Stability and function of proteins is strongly dependent on the acid-base chemistry of the molecules. Therefore, predicting protonation states of ionazable residues is a very important task in computational protein research. However, the question of finding the best technique for the task has not yet been answered. The difficulty stems from the intrinsically many-body nature of the protein residue pKa shifts. These shifts are usually determined with respect to an isolated residue (e.g., a carboxylic acid, such as propanoic, for aspartic and glutamic acid protein residues). The acidity constant for the simple isolated acids are well-known. The prediction task is thus reduced to finding the difference in deprotonation energy between the isolated acid and the same acid group positioned as a residue in the protein involved. The difference in the deprotonation energy is then a result of replacing a part of the solvation environment around the acid molecule by the protein molecule. Therefore, a number of the acid-solvent interactions are replaced by intramolecular residue-residue interactions, and the result of such a substitution is usually highly nonadditive. There have been three primary ways to approach this problem. First and the most obvious one is to simulate the residue in question and the relevant part of the protein with means of quantum chemistry, while treating the remaining aqueous solvent as a dielectric continuum.1 This approach has been successfully employed by some groups. However, there are obvious limitations to this technique. For example, the part of the protein simulated quantum mechanically cannot be too large, the level of the quantum mechanical theory cannot be too high, and the continuum solvation model has to be well parametrized for the * Author to whom correspondence should be addressed. E-mail:
[email protected].
specific quantum model involved. Combined quantum mechanical/molecular mechanical (QM/MM) approaches can partially remedy these problems,1 but they in turn give rise to the issues of QM/MM interface, parametrizing the solvation model which would work well in the presence of such an interface, the correct choice of the QM and MM regions, and so on. Another way to predict/reproduce pKa values and pKa shifts is by fitting a liner free energy relationship using a large database.2 The problem is that, on one hand, the optimal way to produce such a relationship still remains to be found and, on the other hand, such a technique performs well only for cases which are reasonably close to the fitting database. The third approach to calculating pKa shifts in proteins lies in empirical force field calculations explicitly evaluating deprotonation energies for the residue in question. The hydration can be simulated with an explicit model,3 but implicit continuum dielectric solvation models are more commonly used in this area.4,5 There are a number of ways to calculate the energy of solvation and interaction of the residue with the protein environment, but the most typical scheme is to employ fixedcharges description for the protein, and at the same time, to assign an effective dielectric constant to the protein region which is not accessible for the solvent. One example of such a model is the Tanford and Kirkwood technique.5 While generally attractive, these methods are known to perform well only for a limited range of protein residues.6 Assigning an effective dielectric constant to the protein interior has a vague physical meaning in itself, and the optimal value of this constant also remains questionable. Values from 2 to ca. 20 have been used,7 and the accuracy is still not guaranteed. Another problem with many works of this type is that these calculations seem to require taking into the account electrostatic interactions of residues
10.1021/jp071284d CCC: $37.00 © 2007 American Chemical Society Published on Web 06/28/2007
Polarization Crucial for pKa Shifts in OMTKY3
J. Phys. Chem. B, Vol. 111, No. 30, 2007 9037
separated by a large distance, while some experimental works suggest that pKa shifts depend mostly on the immediate environment of the ionizable residue.8 It would be very beneficial to avoid all these pitfalls by making sure the technique for calculating the pKa shifts is based upon a solid physical basis, not explicit fitting. Since the nature of the shifts is intrinsically many-body, this would require using a polarizable protein force field. Moreover, it would be preferable to employ a force field created not specifically for pKa calculations, but for a wider scope of protein simulations, a force field which would be physically robust and would have all the components required for such simulations (i.e., not only the electrostatic part but also the van der Waals, torsional, and all the other energy terms). What we report in this article are results of such calculations. We employed a recently developed complete polarizable force field for proteins, together with a Poisson-Boltzmann solvation model, as well as the OPLSAA force field and surface generalized Born (SGB) solvation technique.9,10 We chose carboxylic residues of the turkey ovomucoid third domain (OMTKY3) protein as our initial target, as pKa values for these residues have been extensively studied both experimentally and computationally. Our results demonstrate that this polarizable force field (PFF) is quite adequate in reproducing the pKa shifts. Moreover, we conclude that only the residues which are relatively close to the one to be ionized are essential in calculating the pKa shifts, a conclusion which is consistent with experimental observations. The rest of the paper is organized as follows. Section II contains a description of the methodology employed in this study. Section III contains the results as discussion. It is followed by conclusions in Section IV. II. Method II.1. General Scheme of the Calculations. To calculate the pKa shifts, the following two processes are considered. First, let us write down deprotonation reaction for the reference: propanoic acid, C2H5COOH for the Asp residues and butanoic acid, C3H7COOH, for the Glu residues of the OMTKY3 protein. All the reactions are occurring in aqueous solution: acid-H f acid- + H+. The change of free energy for this reaction is
∆G1 ) G(acid-) + G(H+) - G(acid-H)
(1)
Similarly, for the protein residue deprotonation reaction: A-H f A- + H+, the free energy change is -
+
∆G2 ) G(A ) + G(H ) - G(A-H)
Equation 5 above is used for calculation of the pKa values for the Asp and Glu residues, with the pKa(acid) values were set to 4.87 and 4.83 units for the propanoic (Asp) and butanoic (Glu) model acids, respectively.11 Values of the energies G(A-), G(A-H), G(acid-), and G(acidH) were obtain via geometry optimizations with PoissonBoltzmann (PBF) and surface generalized Born (SGB) continuum solvent models.12 All the geometry optimizations were carried out with the IMPACT software suite.13 All the pKa values listed in this article, both theoretical and experimental, were obtained at 298.15 K. II.2. Force Fields. We have tested performance of both a polarizable force field (PFF) and the fixed-charges OPLS-AA.14 The procedure for building the PFF and the resulting polarizable force field have been described elsewhere.12,15 Given below is a brief outline. The total energy Etotal is computed as a sum of the electrostatic term Eelectrostatics, the nonelectrostatic van der Waals part EvdW, bond stretching and angle bending energies Ebond and Eangles, and the torsional term Etorsion:
Etotal ) Eelectrostatics + EvdW + Ebonds + Eangles + Etorsion
The total electrostatic energy of the model results from interactions of fixed-magnitudes atomic charges, fixed point dipoles, and inducible dipoles:
U({qij}, {µi}) ) +
pKa(acid) ) ∆G1/(2.303RT)
(3a)
pKa(A) ) ∆G2/(2.303RT)
(3b)
The pKa shift, i.e., the difference between the acidity constants for the residue and the acid is ∆pKa ) pKa(A) - pKa(acid) ) [G(A-) - G(A-H) - G(acid-) + G(acid-H)]/(2.303RT) (4) or
pKa(A) ) pKa(acid) + [G(A-) - G(A-H) G(acid-) + G(acid-H)]/(2.303RT) (5)
(
)
1
∑i χi‚µi + 2µi‚R-1 i ‚µi
1
1
qijSij,k‚µk + ∑ µi‚Ti,j‚µi ∑ qijJij,klqkl + ∑ 2 ij*kl 2 i*j ij,k
(7)
Here Jij,kl is a scalar coupling between bond-charge increments on sites i,j and k,l qij and qkl; Sij,k is a vector coupling between a bond-charge increment on sites i,j and a dipole on site k µk; a rank-two tensor coupling Ti,j describes interactions between dipoles on sites i and j. Ri is the polarizability of site i. Parameters χi describe “dipole affinity” of site i. Following the formalism of the Coulomb interactions, we assume
Jij,kl )
1 1 1 1 - - + rik ril rjk rjl
Sij,k )
(2)
Furthermore, the acidity constants pKa(acid) and pKa(A) for these deprotonation processes are as follows:
(6)
Ti,j )
rik 3
rik
(
-
rjk rjk3
(8)
(9)
)
rik rik 1 1-3 2 3 rij rij
(10)
Inducible dipoles are placed on all the heavy atoms and on some polar hydrogens, as described in ref 15. Fixed charges q and permanent dipoles defined by the “dipole affinities” χ are present on all the atoms. “Virtual sites” representing electron lone pairs were placed on sp2 and sp3 oxygen atoms with the virtual siteoxygen distances set to 0.47 Å in the both cases. Dipole-dipole and charge-dipole screening is used to damp the polarization response when the perturbing site is at short distances. Each dipole and each charge has an individually set screening length. The corresponding length parameters for two atoms are added together to obtain the effective screening length for the pair interaction. The values of the electrostatic parameters were fitted to reproduce quantum mechanically determined electrostatic po-
9038 J. Phys. Chem. B, Vol. 111, No. 30, 2007
MacDermaid and Kaminski
Figure 1. Propanoic acid used as the reference pKa model for aspartate residues. pKa ) 4.87.11
tential around an isolated molecule (permanent charges and dipoles) and the same molecule with dipolar electrostatic probes placed at hydrogen-bonding and random positions (polarizabilities). B3LYP density functional theory (DFT) with the cc-pVTZ(-f) basis set of Dunning16 was employed. The adequacy of the method has been shown in a previous work.15 All the quantum mechanical calculations were carried out with the Jaguar software suite.17 The overall nonelectrostatic pair potential form is
Enb )
Aij/rij12 - Bij/rij6 + Cij exp(rij/Rij) ∑ i