5884
J. Phys. Chem. B 2005, 109, 5884-5890
Accurate Prediction of Absolute Acidity Constants in Water with a Polarizable Force Field: Substituted Phenols, Methanol, and Imidazole George A. Kaminski* Department of Chemistry, Central Michigan UniVersity, Mount Pleasant, Michigan 48859 ReceiVed: January 10, 2005
Validity of a force field with explicit treatment of electrostatic polarization in a form of inducible point dipoles for computing acidity constants was tested by calculating absolute pKa values of substituted phenols, methanol, and imidazole in water with the molecular dynamics technique. The last two systems were selected as tyrosine and histidine side-chain analogues, respectively. The solvent was represented by an explicit polarizable water model. Similar calculations were also performed with a modified OPLS-AA nonpolarizable force field. The resulting pKa values were compared with available experimental data. While the nonpolarizable force field yields errors of about 5 units in the absolute pKa values for the phenols and methanol, the polarizable force field produces the acidity constant values within a ca. 0.8 units accuracy. For the case of imidazole, the fixed-charges force field was capable of reproducing the experimental value of pKa (6.4 versus the experimental 7.0 units), but only at a cost of dramatically underestimating dimerization energy for the imidazolium-water complex. At the same time, the polarizable force field yields an even more accurate result of pKa ) 6.96 without any sacrifice of the accuracy in the dimerization energy. It has also been demonstrated that application of Ewald summation for the long-range electrostatics is important, and substitution of a simple cutoff procedure with Born correction for ions can lead to underestimation of absolute pKa values by more than 5 units. The accuracy of the absolute acidity constants computed with the polarizable force field is very encouraging and opens road for further tests on more diverse organic molecules sets, as well as on proteins.
I. Introduction
reaction for a molecule AH:
Accurate estimation of acidity constants for organic and biological molecules is an important and challenging task for the computational chemistry community. This importance results from a wide variety of organic and biological applications, especially those related to determining pKa values of protein and drug molecules.1 The most popular technique involves fitting parameters of a linear free energy relationship to a large training set of molecules.2 If the descriptors in the relationship are chosen correctly, this approach allows us to reproduce acidity constants for a molecule similar, but not identical, to those in the training set. It is more attractive, however, to be able to predict pKa values for a molecule based on purely physical simulations rather than on knowledge of properties of a set of similar compounds. On one hand, this should allow us to obtain results for any new molecule, even if there are no data for a similar one to fit parameters to. On the other hand, parameter fitting in this case could be less resource consuming. Finally, a number of other important molecular properties could be obtained in the course of such simulations, e.g., ionic salvation energies, molecular structure in solutions, etc. This is why a number of attempts have been made in this direction.3 Nevertheless, a level of accuracy of ca. 0.8 pKa units cannot be achieved without introducing extra correction terms or/and explicit fitting to an extensive training set. The reason for this is in the nature of the pKa calculations. Let is consider deprotonation * E-mail:
[email protected].
∆G(aq)
AH(aq) 98 A-(aq) + H+(aq) Then the acidity constant value for AH is defined as
Ka ) [A-][H+]/[AH]
pKa ) - log(Ka)
pKa ) ∆G(aq)/(2.303RT)
(1) (2)
To determine the pKa experimentally, one needs to measure the concentrations of the protonated and deprotonated forms of the substance, [AH] and [A-]. On the other hand, to find the value of the acidity constant computationally, it is usually necessary to compute the free energy change ∆G(aq), which includes a difference of two energies with large magnitudess gas-phase deprotonation free energy for AH and difference in the hydration energies of AH and A-. Each of these terms typically has a magnitude of hundreds of kcal/mol, and, to obtain the absolute pKa within 0.8 units accuracy, it is necessary to compute the total ∆G(aq) value with the accuracy of ca. 1.1 kcal/mol. At the same time, a true thermodynamic sampling in the form of either Monte Carlo or molecular dynamics is desirable in computing the AH-A- solvation energy change to avoid problems arising because of incomplete configuration space sampling. The true free energy change can then be computed with the statistical perturbation theory (SPT).4 The most practical way to conduct such simulations is with an empirical force field, such as, for example, OPLS-AA,5
10.1021/jp050156r CCC: $30.25 © 2005 American Chemical Society Published on Web 03/09/2005
Absolute Acidity Constants in Water
J. Phys. Chem. B, Vol. 109, No. 12, 2005 5885
AMBER,6 MMFF,7 etc. The problem is, however, that all these widely used force fields do not include electrostatic polarization in any explicit form. Therefore, one can expect that they might not perform adequately in simulations, in which a significant change of the electrostatic properties of the system involved takes place. Change from a neutral AH molecule to the A- anion in aqueous solution and gas phase is certainly one of such cases. For example, it has been demonstrated in an earlier work that, although the OPLS-AA yields excellent liquid-state properties for organic molecules, the same force field overestimates magnitudes of gas-phase dimerization energies for some of the systems by more than 2 kcal/mol.8 This is why this work was carried out with a polarizable empirical force field, and the superiority of this approach over using a nonpolarizable modified OPLS-AA force field proved to be possible to demonstrate. The rest of the paper is organized as follows. Section II describes the methodology employed in these calculations. Results and discussion are given in section III. Finally, section IV contains conclusions and a brief description of the future directions of the presented research. II. Method
EVdW )
A. Force Fields. We have tested performance of both a polarizable force field (PFF) and a modified fixed-charges OPLS-AA. The procedure for building the PFF has been described elsewhere.8 Given below is a brief outline. The total energy Etotal is computed as a sum of the electrostatic term Eelectrostatics, the nonelectrostatic van der Waals part EVdW, bond stretching and angle bending energies Ebond and Eangles, and the torsional term Etorsion:
Etotal ) Eelectrostatics + EVdW + Ebonds + Eangles + Etorsion
(3)
The total electrostatic energy of the model results from interactions of fixed-magnitudes atomic charges, fixed-point dipoles, and inducible dipoles:
U({qij},{µi}) )
permanent dipoles defined by the “dipole affinities” χ are present on all the atoms. “Virtual sites” representing electron lone pairs were placed on sp2 and sp3 oxygen atoms with the virtual siteoxygen distances set to 0.47 Å in the both cases. Dipole-dipole and charge-dipole screening is used to damp the polarization response when the perturbing site is at short distances. Each dipole and each charge has an individually set screening length. The corresponding length parameters for two atoms are added together to obtain the effective screening length for the pair interaction. The values of the electrostatic parameters were fitted to reproduce quantum mechanically determined electrostatic potential around an isolated molecule (permanent charges and dipoles) and the same molecule with dipolar electrostatic probes placed at hydrogen-bonding and random positions (polarizabilities). B3LYP density functional theory (DFT) with the ccpVTZ(-f) basis set of Dunning9 was employed. The adequacy of the method has been shown in a previous work.8 All the quantum mechanical calculations in this work were carried out with the Jaguar software suite.10 The overall nonelectrostatic pair potential form is
(
1
∑i χi‚µi + 2µi‚Ri-1‚µi
)
+
1
∑
qijJij,klqkl + 2 ij*kl 1 qijSij,k‚µk + µi‚Ti,j‚µi (4) 2 i*j ij,k
∑
∑
Here Jij,kl is a scalar coupling between bond-charge increments on sites i,j and k,l, qij and qkl; Sij,k is a vector coupling between a bond-charge increment on sites i,j and a dipole on site k, µk; a rank-two tensor coupling Ti,j describes interactions between dipoles on sites i and j. Ri is the polarizability of site i. Parameters χi describe “dipole affinity” of site i. Following the formalism of the Coulomb interactions, we assume
Jij,kl )
1 1 1 1 - - + rik ril rjk rjl
Sij,k )
Ti,j )
rik
rjk
rik
rjk3
(
3
(5)
(6)
)
rikrik 1 1-3 2 3 rij rij
(7)
Inducible dipoles are placed on all the heavy atoms and on some polar hydrogens, as described in ref 8. Fixed charges q and
Aij/rij12 - Bij/rij6 + Cij exp(rij/Rij) ∑ i