7664
J. Phys. Chem. B 1997, 101, 7664-7673
Calculating Acid-Dissociation Constants of Proteins Using the Boundary Element Method Andre´ H. Juffer,*,† Patrick Argos,‡ and Hans J. Vogel† Department of Biological Sciences, The UniVersity of Calgary, Calgary, Canada, and European Molecular Biology Laboratory (EMBL), Heidelberg, Germany ReceiVed: May 14, 1997; In Final Form: June 30, 1997X
The boundary element method (BEM) in combination with continuum electrostatics is employed to compute the acid-dissociation constants Ka (pKa ) -log Ka) of titrating sites in proteins. The boundary element method determines the electrostatic potential from the solution of two coupled integral equations, valid on a triangulated is computed from the shift of the pKa of a surface enclosing the macromolecule. First the intrinsic pKintr a model compound containing the titrating site upon its transfer from solution to the protein; during this step interactions between titrating sites are neglected. Subsequently, interactions between titrating sites are included by means of a Monte Carlo scheme to sample protonation states of the protein. A convenient vector-matrix formulation in terms of the BEM is given which allows the use of a single atom or a detailed charge model (or a combination of both) to describe the titrating sites. The method has been applied to four proteins: bovine pancreatic trypsin inhibitor, calbindin, lysozyme, and ovomucoid third-domain. Different choices for the dielectric constant of proteins ranging from 4 to 78.5 were investigated in a systematic fashion. Comparisons are made with pKa values calculated by the finite difference method and those determined experimentally for these four proteins. Our results indicate that accurate pKa values are obtained with the BEM when a dielectric constant for the protein of 20 or higher is used. For calbindin, different choices for the ionic strength were considered and comparison was made with pKa values obtained experimentally and from a simulation model using explicit ions for the solvent.
Introduction One major determinant for acid-dissociation constants Ka (pKa ) -log Ka) of titrating sites in proteins is electrostatic interactions.1,2 For the calculation of electrostatic potentials in and around solvated proteins, the technique most commonly employed is the finite difference method.1,3-11 In this method, the protein is placed in a cavity surrounded by a highly polarizable solvent (water or electrolyte) considered as a dielectric continuum. Electrostatic potentials are computed by solving the (non)linear Poisson-Boltzmann differential equation. For this purpose, the protein/solvent system is positioned on a three-dimensional lattice and the PB equation is solved at the lattice points. The position of the protein/solvent relative to the grid is a potential source of error for any finite difference type of calculation, and this has been the subject of a recent investigation carried out by Bruccoleri et al.12 Given the lattice potentials, the potential at any position in the protein/solvent system can be computed. The most sophisticated procedures use a focusing and rotational averaging scheme to improve the accuracy of computed electrostatic potentials.5 The finite difference method has been successfully applied to the calculation of solvation free energies, reaction rates, pKa’s and binding free energies.9 The efficiency and simplicity of implementation of the finite difference method makes it a rather attractive method for the calculation of titration curves of proteins. Consequently, it has been extensively used for the calculation of pKa values of titrating sites in proteins. In most studies the procedure * Author to whom all correspondence should be addressed: Department of Biological Sciences, Faculty of Science, The University of Calgary, 2500 University Drive N. W., Calgary, Alberta T2N 1N4, Canada. Phone: +1403-220-4193/7382. Fax: +1-403-289-9311. email:
[email protected]. ucalgary.ca. † University of Calgary. ‡ EMBL. X Abstract published in AdVance ACS Abstracts, August 15, 1997.
S1089-5647(97)01594-0 CCC: $14.00
developed originally by Bashford and Karplus13 has been employed.14-22 This procedure entails a partitioning of the calculation into two steps. First the intrinsic pKa (pKintr a ) is computed from the shift of the pKa of a model compound containing the titrating site upon transfer of the model compound from solution to the protein. At this stage, interactions between titrating sites are neglected. In the second step, an additional shift is calculated by considering the interactions between titrating sites. This is done by means of a Monte Carlo scheme which samples protonation states at a certain pH.15 Earlier attempts, such as those carried out by Tanford and Roxby,23 used an iterative scheme to reach an average distribution of protons distributed over titrating sites. Bashford and Karplus14 showed that the Tanford and Roxby method entails the use of a mean field approximation which results in large errors when strong interactions between titrating sites exist. Alternative methods are based on a variational approach in combination with a distance dependent dielectric constant and a modified Born equation24 and the binding polynomial treatment of multiple ligand binding.25,26 A recent development in the calculation of titration curves for protein residues through the finite difference method is presented in the work of Antosiewicz et al.2 These authors use a “full-charge” titration model in the sense that upon protonation of a titration site, the proton charge is distributed over several atoms instead of placing it at a single atom (simple charge model). They observed improvement with respect to the simple charge model, but they also found more sensitivity of the computed pKa on the details of the full-charge model. A fullcharge model for pKa calculations was also employed by Yang et al.17 The concept of a full-charge titration model is also used in this work. As an alternative to the finite difference method, the boundary element method (BEM)27 may be used to calculate electrostatic potentials in and around solvated macromolecules.28 The BEM © 1997 American Chemical Society
Calculating Acid-Dissociation Constants
J. Phys. Chem. B, Vol. 101, No. 38, 1997 7665
depends on a triangulated surface (or any other set of surface elements with a proper connectivity and orientation) which envelopes the molecule of interest. By solving two coupled integral equations valid on the surface, the potential inside and outside the surface can be computed. The method has been demonstrated to be more accurate than the finite difference method: the relative errors for calculated electrostatic potentials inside a spherical surface are close to zero,28-32 while for the finite difference method the errors are 5-15%.5,13 Unfortunately, the efficiency of the BEM strongly depends on the number of triangle vertices of the triangulated surface because of a dense matrix involved in the computation. Therefore the method requires considerably more computer memory than the finite difference method. This makes any BEM application somewhat less efficient when applied to macromolecules, and therefore the method has not been used very often for solvated proteins as of yet. In addition, the determination of an accurate triangulated surface around a protein is not a trivial matter. Nevertheless, several applications of the BEM have already been reported in the literature, mainly for small molecules and in combination with quantum chemistry methods.33-35 An important application of the BEM is the computation of the electrostatic contribution to solvation free energies.36 A few applications of the BEM have been reported for proteins. For example, Yoon and Lenhoff37 calculated the interaction free energy for a protein (ribonuclease) with a charged surface with credible results, while Zhou38a considered the electrostatic interaction energy between two proteins by means of the BEM. The last author also developed a method to calculate translational and intrinsic viscosity by exploiting the relations between hydrodynamics and electrostatics based on a BEM approach.38b,c Some work has been done to combine the finite difference method and the BEM39,40a and to improve the efficiency of the BEM.32,41 In this work pKa values are computed using the BEM in combination with a Monte Carlo simulation to sample the protonation states of the protein of interest. Essentially the change of the pKa of a model compound (free amino acid) upon transfer from solution to the protein is determined.13 Juffer et al.31 and Zhexin et al.40b have already employed the BEM to calculate the pKa of titrating sites in protein with reasonable success. However, these authors did not include any site-site interactions. In the present work, this part is included leading to much improved pKa values compared to experimental determined values. Methods pKa Shift. pKa values of titrating sites in protein can be calculated from the shift of the pKa of a model compound containing a titrating site upon transfer from the solution to the protein.13,14 The intrinsic pKintr is defined as the pKa of a a titrating site having no interaction with other titrating sites in the protein and can be computed from the shift of the pKa of a model compound upon transfer from solution into the protein while neglecting interactions with other sites. This results in
pKintr a
∆∆GQ + 2.30RTpKa ) 2.30RT pKa
(1)
Here, pKa is the negative logarithm of the acid-dissociation constant of a model compound, R is the gas constant, and T is the temperature. ∆∆GQ is the difference in the standard Gibbs free energy change ∆GQ for the deprotonation reaction
HA + H2O T H3O+ + A
(2)
where A is the model compound in solution and in the protein. ∆∆GQ can be calculated from differences in solvation free energies (change of free energy upon transfer of a solute molecule from vacuum to solution) according to Vfs Vfp ∆∆GQ ) ∆∆Gsolv,A/AH - ∆∆Gsolv,A/AH
(3)
The first term is the difference in solvation free energy between A (deprotonated state) and AH (protonated state) when transferred from vacuum to solvent (superscript v f s ). The second is the difference in solvation free energy between A and AH when transferred from vacuum to the protein (superscript v f p). In the most simple way, the solvation free energy for a solute molecule may be determined from a simple thermodynamic cycle.42 First, in vacuum, the solute molecule is discharged. Then, a cavity is created in the solvent large enough to fit the solute molecule (apolar contribution). Next, the molecule is charged again in the presence of the solvent. Only the electrostatic contribution to the solvation free energy is required for ∆∆GQ in eq 1, since the apolar terms cancel (at least within the approximations inherent to the method employed in this work). Boundary Element Method and Continuum Electrostatics. The electrostatic contribution to the solvation free energy for a solute molecule is calculated from continuum electrostatics by means of the BEM. The solute is considered to be a linear dielectric medium placed in a cavity (solute region) surrounded by a highly polarizable solvent (water or electrolyte). The cavity surface Σ smoothly follows the shape of the solute molecule. The charge distribution of the solute is given by a set of N partial charges qi located at positions ri which act as source charges for the polarization of the surrounding solvent. The reaction field due to the induced bound charges on the surface and in the solvent region in turn polarizes the solute molecule. The solute polarization is modeled by assigning a dielectric constant to the solute region. The electrostatic potential φ(r) at position r in the cavity satisfies the Poisson differential equation N
∇2 φ(r) ) -
qi
δ(r - ri) ∑ i)1
(4)
1 0
Here, 1 and 0 refer to the dielectric constant of the solute molecule and the vacuum permittivity respectively and δ(r ri) is the Dirac delta function. The electrostatic potential ψ(r) at r in the solvent region is assumed to satisfy the linear Poisson-Boltzmann equation according to
∇2ψ(r) ) κ2ψ(r)
(5)
Here, κ is the inverse Debye screening length defined by
κ2 )
2IF2 20RT2
(6)
In this equation, I is the ionic strength, F is Faraday’s constant, 2 is the dielectric constant of the solvent, R is the gas constant, and T is the temperature. Instead of solving this set of differential equations (4, 5), the BEM converts these equations into two coupled integral equations valid on the surface.28-31 From the solution of the integral equations, the total electrostatic potential at any position (in or outside the surface) can be determined. The integral equations on the surface employed in this work are31
7666 J. Phys. Chem. B, Vol. 101, No. 38, 1997
( )
2 1 1 + φ(r0) 2 1
[
∫Σ L1(r;r0) φ(r) + L2(r;r0) F(ri;r0) ∑ i)1
( )
∫Σ N
)
[
]
W)
∂φ(r) L3(r;r0) φ(r) + L4(r;r0) dσ ∂n qi ∂F(ri;r0)
∑ i)1
1 0
(8)
∂n0
Here, F(ri;r0) ) 1/(4π|ri - r0|). The kernels Li are complicated functions depending on the geometry of the surface (and also on 1, 2, and κ).31 Yoon and Lenhoff30 derive rather different integral equations, while Zauhar and Morgan28,29 do not include ionic strength. For the zero ionic strength case (κ ) 0), a single integral equation for φ(r0)31 or ∂φ(r0)/∂n28,29 on the surface is required. This set of integral equations can be solved numerically for the unknown functions φ(r0) and ∂φ(r0)/∂n by means of the BEM. For this purpose, the surface Σ is triangulated. The triangle vertices serve as collocation points; that is, at these points the unknown functions exactly satisfy the integral equations. The integrals in eqs 7 and 8 are replaced by sums of integrals over triangles. This procedure results in a matrix equation28-31 according to
Sx ) b ) Bq
(9)
The 2M × 2M (M is the number of triangle vertices of the triangulated surface) surface matrix S depends on the geometry of the system and contains the kernels Li, and the vector b with dimension 2M corresponds to the source terms of the integral equations (the summation terms in eqs 7 and 8). This vector can be written in terms of a matrix B and a vector q: B contains F(ri;r0) and its normal component, while q contains the source charges of the solute molecule. B has dimension 2M × N, and q has dimension N. The vector x with dimension 2M contains the electrostatic potential and the normal component of the electric field at the collocation points. This matrix equation may be solved for x by means of LU-decomposition methods.43 The electrostatic potential inside the surface at a point r0 is given from
φ(r0) )
∫Σ
[
]
∂φ(r)
L1(r;r0) φ(r) + L2(r;r0)
∂n N
dσ + qi
F(ri;r0) ∑ i)1
(10)
1 0
The surface integral gives the reaction potential, and the summation term is the electrostatic potential, from source charges located in a polarized medium (solute). Using BEM, eq 10 can be written as
φ ) Rx + Dq
surface depends only on the total potential31 or the normal component of the electric field28,29 on the surface. The amount of electrostatic work W to assemble the charge distribution qi in the presence of the solvent is
(7)
1 0
1 ∂φ(r0) 1 1+ 2 2 ∂n
]
∂φ(r) dσ ∂n
qi
N
)
Juffer et al.
(11)
Here, the matrix R is the “reaction potential matrix” of dimension N × 2M. The matrix D with dimension N × N is the direct Coulomb term of eq 10, and φ is a vector with dimension N containing the total electrostatic potential at all N charges. A similar equation for the potential in the solvent region can be derived as well.31 For the case of zero ionic strength, the total electrostatic potential inside (and outside) the
1
N
∑qiφ(ri)
2 i)1
(12)
where φ(ri) is the total electrostatic potential at ri according to eq 10. In matrix-vector notation, W can be written as
1 1 1 W ) qTφ ) qTRx + qTDq 2 2 2 1 1 1 ) qTRS-1Bq + qTDq ) qTAq 2 2 2
(13)
where A ) RS-1B + D. A is symmetric and has dimension N × N. The superscript T means “transpose”. The electrostatic contribution to the solvation free energy is the difference ∆Wsolv in solvent and vacuum. In vacuum, A is the matrix C ) 1D (there is no reaction potential and no solute polarization), where C corresponds to the vacuum Coulomb interaction.
1 ∆Wsolv ) qT(A - C)q 2
(14)
Background and Site-Site Interaction. Given the BEM expression for the total electrostatic energy of a solvated solute molecule (eq 13), one can distinguish between background terms and site-site interactions. First of all, the vector q in eqs (11, 13, and 14) can be written as
q)t+b
(15)
The vector t of length N refers to titrating sites’ charges (charges depending on the pH of the solution), while b also of length N refers to background charges (pH independent charges). It is important to realize that both vectors contain zero elements. If there are Nt titrating charges, then t contains N - Nt zeros. The vector t can be arranged such that for instance the first Nt elements corresponds to the Nt titrating sites charges, while the remaining N - Nt elements are zeros. For b this means that the first Nt elements are zeros, while the remaining N - Nt elements correspond to background charges. Note that rearranging q also requires rearranging of A in eq 13, since A depends on the positions of source charges. Employing the symmetry of A, eq 13 can now be written as
1 1 W ) tTAt + tTAb + bTAb 2 2
(16)
The terms on the right correspond to the work to be done to assemble the charges t in their own field (including self-terms and site-site interactions), the work to be done to assemble t against the field from the background charges, and the work to be done to assemble the charges b in their own field, respectively. It is always possible to write A ) P + Q. Here, P reflects the site-site interactions, but it does not include the self-terms. It contains zero elements along the diagonal (or a band of zeros along the diagonal if one uses a full-charge model to define titrating sites) to exclude the self-terms. Therefore Q ) A P reflects the remaining terms and includes the self-terms. One can now write
Calculating Acid-Dissociation Constants
1 1 1 W ) tTPt + tTQt + tTAb + bTAb 2 2 2
J. Phys. Chem. B, Vol. 101, No. 38, 1997 7667
(17)
Here, the first term is the site-site interaction term which is exploited in the Monte Carlo scheme (see discussion below). The remainine three terms are used in the calculation of the pKintr values. Since differences of ∆∆GQ (∆∆Wsolv) are a required, the last term of eq 17 cancels and can be left out in the calculation such that one only needs to consider the difference of changes in the self-energy of the titrating sites’ charges and the interaction energy between the titrating sites’ charges and background charges for A and AH in the model compound and in the protein. This is precisely what has been done by Bashford and Karplus.13 In fact, the sum of the second and third term on the right of eq 17 is the amount of work to be done to assemble the charge distribution t against the selfterms and the field of the background charges. Equation 17 is completely general in the sense that it is valid for any kind of titration model to describe the titrating sites, because the partitioning of solute charges into titrating site charges and background charges is always possible (eq 16). It will apply also for the case when a mixed titration model is used: certain sites may be described by a simple charge model while for others a full-charge model is used (as is done in this work). Such a mixed model will not change the form of eq 17 but will rearrange the vectors and matrices involved. Monte Carlo Simulation. The titration curve of a protein is44
θ)
∑σ sσpσ
(18)
Here, θ is the total occupancy (total number of bound protons) and ranges from 0 to the total number of titrating sites, σ refers to a protonation state, sσ is the amount of bound protons for state σ, and pσ is the probability of observing state σ and is given by2,13
pσ )
∑isσ,i(pKiintr - pH) - Wσ/RT) ∑σ exp(2.30 ∑isσ,i(pKiintr - pH) - Wσ/RT) exp(2.30
(19)
In this equation, sσ,i is the protonation state of a titrating site i when the protein is the protonation state σ and is equal to 0 or 1 and Wσ corresponds to the site-site interaction term in eq 17. In eq 17, t is depending on σ, and upon change of the protonation state of the protein some of the Nt charges may become equal to zero. Equation 18 is correct only for protein solutions in which the concentration of protein is low such that protein-protein interaction can be neglected.44 Implementation. A charge distribution for a protein was taken from the molecular dynamics simulation package Gromacs developed by Berendsen and co-workers.45 Only polar hydrogens were included. For Lys, His, and the C- and N-terminal titrating site, a full-charge model for the deprotonated and protonated state was utilized. Charge distributions for Arg and Tyr are available only for the protonated state. For the titrating sites located in the side chains of a Asp and Glu residue, the site consisted of a single atom (CG and CD atom, respectively). All Cys side chains were excluded from titration because of the complication of Cys side chains involved in disulfide bridges. An overview of the charges for the protonated and deprotonated state for all titrating residues is given in Table 1. Atoms carrying a partial charge other than those listed in Table 1 are background charges. According to the Gromacs force field,45 Coulomb interactions between bonded atoms and second neighbor atoms are to be excluded from the direct Coulomb potential (the C and D
matrices). This is essential: inclusion of these interactions gave completely unrealistic solvation free energies, since these interactions are strong, making it unfavorable to place the molecule in the solvent. The structures of model compounds for titrating sites were taken to correspond to the structure of the corresponding residues in the protein. If the C- or N-terminal residue has a titrating side chain, this site was removed from the model compound when employed for the C- or N-terminal titrating site, while the N- or C-terminal titrating site was removed when the model compound was employed for the side-chain titrating site. The pKa values of the model compounds were Arg 12.0, Asp 4.0, Glu 4.4, His 6.3, Lys 10.4, Tyr 9.6, N-terminus 7.5, and C-terminus 3.8. The BEM algorithm developed earlier31 is employed to estimate the electrostatic contribution to the solvation free energy. To achieve higher accuracy while using a coarser triangulated surface (less and therefore larger triangles), the technique of “peak separation” was utilized. The two unknown functions (eqs 7 and 8) on the triangulated surface are separated in a smooth (or weakly varying) and a nonsmooth (or strongly varying) part: φ ) φ j + φ ˆ and the same for the normal component. The nonsmooth part (indicated by the “hat” symbol), for which an explicit formula is given in ref 31, shows up in the right-hand side of the matrix eq 9), so that only the smooth part (indicated by the “bar” symbol) needs to be calculated: the smooth part instead of the total potential and normal component becomes the solution of the matrix equation in eq 9. The same technique is used for eq 10. For example, using this procedure, errors in computed electrostatic potential at points close to the surface inside a spherical surface with radius 2.0 nm can be made almost zero while using only about 200-500 triangles. A recently developed triangulation procedure for proteins is described in full detail elsewhere.46 In short, the protein/solvent system is mapped onto a three-dimensional lattice with a specified lattice spacing. To each lattice point, a box is assigned. Boxes in the solvent region or in the interior of the protein are removed from the set. The remaining set of boxes define the outer contour of the protein. The set of box sides facing the solvent are triangulated by dividing each side in two triangles. This also defines the orientation of the triangulated surface. After smoothing this initial triangulated surface somewhat, the triangles vertices are mapped onto the dotted solvent-accessible surface computed by different means.47 This latter step ensures that the final triangulated surface corresponds as much as possible to the solvent-accessible surface. A lattice spacing of 0.18 nm was employed throughout this work. A Monte Carlo program has been written employing the sitesite interaction term in eq 17. The corresponding interaction matrix needs to be calculated only once such that given that matrix, the Monte Carlo program is very efficient. To avoid that strongly coupled sites are trapped in a minimum which would result in poor sampling of the protonation states of these sites, the program was allowed to change the protonation states of several sites simultaneously.16 In this work, at the most two sites were changed simultaneously. Titration curves for all titrating sites were obtained for the pH range -10 to +20 at an interval spacing of 0.5 except for pH larger than 14 and smaller than -1.0, where a spacing of 1.0 was employed. The site pKa was estimated from the corresponding site titration curve by assuming a linear relation between pH and site occupancy around the pH for which the site occupancy is about 0.5. The pKa is defined as the pH for which the site occupancy is 0.5.
7668 J. Phys. Chem. B, Vol. 101, No. 38, 1997
Juffer et al.
TABLE 1: List of Charges for Deprotonated and Protonated Statesa residue
atom
deprotonated
protonated
N-term
N H1 H2 H3 CA CZ OH HH CE NZ HZ1 HZ2 HZ3
-0.83 0.415 0.415 0.0 0.0 -0.66 -1.15 0.0 0.0 -0.83 0.415 0.415 0.0
0.129 0.248 0.248 0.248 0.127 0.34 -0.548 0.398 0.127 0.129 0.248 0.248 0.248
Arg Tyr Lys
a
residue
atom
deprotonated
protonated
Asp Glu His
CG CD CG ND1 HD1 CD2 CE1 NE2 HE2 C OT O HO
0.27 0.27 0.0 0.0 0.19 0.13 0.26 -0.58 0.0 0.27 -0.635 -0.635 0.0
1.27 1.27 -0.05 0.38 0.3 0.0 -0.24 0.31 0.3 0.53 -0.38 -0.548 0.398
C-term
This is based on the Gromacs force field.45
TABLE 2: pKa Values for BPTIe residue
ref 20
a pKintr a
pKaa
b pKintr a
pKab
c pKintr a
pKac
d pKintr a
pKad
expt
N-term Arg-1 Asp-3 Glu-7 Tyr-10 Lys-15 Arg-17 Arg-20 Tyr-21 Tyr-23 Lys-26 Tyr-35 Arg-39 Lys-41 Arg-42 Lys-46 Glu-49 Asp-50 Arg-53 C-term
7.2 18.1 3.4 5.1 9.9 10.4 12.2 13.1 10.1 11.3 10.4 9.5 12.2 10.3 13.0 10.0 3.7 2.4 12.9 3.7
6.3 11.3 3.5 5.7 10.7 10.6 11.8 11.0 10.2 11.5 10.2 9.9 11.5 11.2 12.6 9.0 3.9 0.8 15.3 4.0
5.3 19.6 2.6 3.3 10.0 10.9 12.4 14.6 9.9 9.2 10.3 8.8 12.3 12.2 12.5 8.3 3.7 -1.6 14.7 3.6
7.4 12.1 3.8 4.5 9.7 10.4 11.9 11.8 9.9 9.6 10.4 9.3 11.8 10.7 12.2 10.1 4.3 3.3 13.2 3.9
7.2 14.4 3.2 3.5 9.2 10.5 12.0 13.1 9.8 9.3 10.5 8.6 12.3 11.2 12.3 10.0 3.9 2.1 13.2 3.5
7.4 12.1 3.9 4.3 9.6 10.4 12.0 11.9 9.9 9.7 10.3 9.2 11.8 10.7 12.1 10.2 4.2 3.4 12.9 3.8
7.2 14.0 3.7 3.4 9.1 10.5 12.1 12.9 9.7 9.1 10.5 8.7 12.3 11.0 12.2 10.1 4.0 2.3 12.9 3.4
7.7 12.3 3.9 4.3 9.5 10.4 12.0 12.1 9.8 9.6 10.5 9.2 11.9 10.7 12.2 10.3 4.3 3.7 12.7 3.9
7.6 13.3 3.4 3.5 9.1 10.5 12.1 12.8 9.6 9.3 10.5 8.8 12.3 11.0 12.4 10.3 4.0 2.9 12.8 3.5
0.1 3.0 3.7 9.9 10.6 10.4 11.5 10.6 11.1 10.8 10.6 3.8 3.4 12.9 3.6
a Dielectric constant of protein was set to 4. b Dielectric constant of protein was set to 20. c Dielectric constant of protein was set to 36. d Dielectric constant of protein was set to 78.5. e Ionic strength was 0.15 M, the temperature was 298 K, and the dielectric constant of the solvent was 78.5. The second column contains the finite difference results obtained by Antosiewicz et al.20
Results The pKa values of titrating sites in bovine pancreatic trypsin inhibitor (BPTI, entry code 4PTI48 in the Brookhaven Protein Data Bank49), calbindin (entry code 3ICB50), lysozyme (triclinic form, entry code 1LZT51), and ovomucoid third-domain (entry code 1PPF52) were calculated. The pKa values obtained are summarized in Tables 2 to 5. BPTI. The results for BPTI are presented in Table 2. The second column lists the pKa values computed using the finite difference method by Antosiewicz.20 The third and fourth columns list the results from this work with the dielectric constant of BPTI set to 4. In general the agreement with experimental results (last column in Table 2) is credible. The negative pKa value obtained for Asp-50 is not close to the experimental result of 3.4, and the same is true for Arg-53. This is in part due to a strong interaction between these two sites, since the distance between both sites is 0.4-0.5 nm. Both sites are described with a simple charge model. At low pH, where Arg-53 is protonated, the positive charge on the CZ atom of Arg-53 (see Table 1) will promote the dissociation of the proton from Asp-50. This lowers the pKa with respect to the intrinsic value for Asp-50. At high pH, where Asp-50 is deprotonated, the positive charge on the CG atom of Asp-50, will promote the deprotonation of Arg-53. In addition, there is a favorable interaction between the negative CZ of Arg-53 in the deprotonated state with the positive charge of CG of Asp-50 in the deprotonated state. This all results in a lower pKa with respect
to the intrinsic value for Arg-53. Otherwise, all arginines pKa values are in correspondence with the finite difference results, except for Arg-53. In general, the finite difference calculation gives a somewhat better agreement with experimental result with the exception of Glu-7. The pKa of Glu-7 is calculated by BEM to be 3.3, while the experimental result is 3.7; the finite difference calculation results here in 5.1. The calculations for BPTI were also performed with dielectric constants of 20, 36, and 78.5 for the protein (columns 5-10 in Table 2). The value of 20 was suggested by Antosiewicz et al.,19 while the value of 78.5 was used by Kesvatera et al.53 The value of 36 was suggested by Smith et al.54 These authors estimated the dielectric constant of BPTI by means of molecular dynamics simulations. For several titrating sites, a much better agreement is found with experimental results, while for other sites almost no change in the calculated pKa was seen (e.g. Lys15, Lys-26, C terminus). The most notable improvement is found for Asp-50 and Arg-53: their interaction is clearly screened more effectively by using a higher dielectric constant. Also the pKa values for the N terminus and Lys-46 correspond better with experimental results. A further increase of the dielectric constant to 36 and 78.5 gives for a few sites some improvement (e.g. N terminus, Arg-53), but the changes are not large. Calbindin. Table 3, parts A-C, lists the computed pKa values for the lysine residues in the Ca2+ form of calbindin (there are two Ca2+ ions bound to the protein molecule). We only
Calculating Acid-Dissociation Constants
J. Phys. Chem. B, Vol. 101, No. 38, 1997 7669 TABLE 5: pKa Values for Lysozymea
TABLE 3: pKa Values for Lysine Residues in Calbindin: Temperature Was 298 K and Dielectric Constant of the Solvent Was 78.5
ref residue 13b pKintr pKa expt residue a
A. Dielectric Constant of Calbindin ) 4 a residue ref 53d pKintr a
Lys-1 Lys-7 Lys-12 Lys-16 Lys-25 Lys-29 Lys-41 Lys-55 Lys-71 Lys-72
12.0 12.3 11.8 11.4 12.8 11.5 10.8 12.1 11.1 11.8
pKaa
14.5 13.8 14.4 13.6 14.2 13.3 13.4 11.5 >20.0 >20.0 14.1 12.8 12.3 10.8 16.4 16.1 12.1 10.3 13.7 11.6
b c pKintr pKab pKintr pKac expt a a
13.7 11.9 12.0 11.6 18.5 12.3 10.9 13.6 10.6 11.8
13.7 11.3 12.0 11.0 18.5 11.6 10.3 13.5 10.1 10.6
15.0 11.2 10.9 11.4 19.1 12.2 10.8 12.7 10.6 11.8
14.9 10.5 11.5 11.2 19.2 11.6 10.1 12.6 10.3 10.9
10.6 11.4 11.0 10.1 11.8 11.0 10.9 11.4 10.7 11.0
B. Dielectric Constant of Calbindin ) 20 a pK a pKintr b pK b pKintr c pK c residue ref 53d pKintr expt a a a a a a
Lys-1 Lys-7 Lys-12 Lys-16 Lys-25 Lys-29 Lys-41 Lys-55 Lys-71 Lys-72
12.0 12.3 11.8 11.4 12.8 11.5 10.8 12.1 11.1 11.8
13.5 13.5 13.4 13.0 15.0 13.3 12.5 14.0 12.4 13.3
12.9 13.0 12.5 11.7 14.9 12.1 11.2 13.6 11.1 12.9
11.3 10.9 11.0 10.6 12.7 11.2 10.7 11.3 10.3 10.9
11.2 10.9 10.7 9.9 12.7 10.7 10.3 11.1 10.1 10.6
10.9 10.4 10.5 10.2 12.2 10.9 10.5 10.6 9.9 10.4
10.8 10.3 10.2 9.6 12.2 10.6 10.4 10.5 9.9 10.3
10.6 11.4 11.0 10.1 11.8 11.0 10.9 11.4 10.7 11.0
C. Dielectric Constant of Calbindin ) 78.5 residue
ref 53d
a pKintr a
pKaa
b pKintr a
pKab
expt
Lys-1 Lys-7 Lys-12 Lys-16 Lys-25 Lys-29 Lys-41 Lys-55 Lys-71 Lys-72
12.0 12.3 11.8 11.4 12.8 11.5 10.8 12.1 11.1 11.8
11.5 11.5 11.3 11.1 11.9 11.3 10.9 11.7 10.9 11.4
11.3 11.3 11.1 10.8 11.8 10.9 10.6 11.5 10.6 11.1
11.2 11.6 11.1 10.9 11.6 11.0 10.7 11.4 10.7 11.1
11.1 11.1 10.9 10.7 11.6 10.8 10.5 11.3 10.5 10.9
10.6 11.4 11.0 10.1 11.8 11.0 10.9 11.4 10.7 11.0
a Ionic strength was set to zero. b Ionic strength was 0.1 M. c Ionic strength was set to 0.3 M. d pKa values from Kesvatera et al.53 These were calculated for zero ionic strength (no added salt, counterions only).
TABLE 4: pKa Values for Ovomucoid Third-Domain: Ionic Strength Was 0.1 M, Temperature Was 298 K, and Dielectric Constant of the Solvent Was 78.5 ref a pK a pKintr b pK b pKintr c pK c pKintr d pK d expt residue 20 pKintr a a a a a a a a N-term Asp-7 Glu-10 Tyr-11 Lys-13 Glu-19 Tyr-20 Arg-21 Asp-27 Lys-29 Tyr-31 Lys-34 Glu-43 His-52 Lys-55 C-term
5.3 3.4 3.2 7.1 6.8 3.2 5.5 2.5 5.3 2.3 3.5 6.0 0.7 6.1 4.1 9.9 14.6 10.1 12.7 10.0 12.5 12.6 16.3 12.6 13.7 2.5 4.7 4.2 4.8 4.1 10.8 11.6 7.9 10.4 9.3 12.4 12.5 12.3 12.7 12.9 5.5 6.6 6.5 5.3 4.6 11.6 12.1 12.0 11.8 11.8 14.1 >20.0 >20.0 14.7 14.4 14.4 14.0 13.6 12.5 13.0 4.4 5.1 4.8 5.5 5.1 6.3 7.4 9.3 7.2 7.5 10.5 9.4 13.5 10.5 11.7 2.9 4.9 2.3 4.5 3.1
7.0 4.4 4.9 10.9 11.3 4.2 9.9 12.3 4.4 11.2 12.5 11.5 4.5 6.6 10.3 4.1
7.0 3.1 3.6 9.1 12.1 3.9 9.1 12.6 3.7 11.3 12.2 11.8 4.4 6.8 11.2 3.1
7.3 4.3 4.8 10.5 11.3 4.3 9.8 12.3 4.2 11.1 11.2 11.4 4.7 6.7 10.4 4.0
7.2 3.2 3.8 9.2 11.7 3.7 9.1 12.6 3.5 11.4 10.8 11.5 4.4 6.7 11.1 3.1
2.6 4.1 3.2 2.3
4.7 2.3
a
Dielectric constant of protein was set to 4. b Dielectric constant of protein was set to 20. c Dielectric constant of protein was set to 36. d Dielectric constant of protein was set to 78.5. The second column contains the finite difference results obtained by Antosiewicz et al.20
computed the lysine residues’ pKa’s rather than the total titration curve because detailed information is only available for these residues.53,55 Table 3A-C contains also pKa values calculated
ref 13b
pKintr a
pKa
expt
N-term 6.4 Lys-1 9.6 Arg-5 Glu-7 2.1 Lys-13 11.6 Arg-14 His-15 4.0 Asp-18 2.9 Tyr-20 14.0 Arg-21 Tyr-23 11.7 Lys-33 9.6 Glu-35 6.3 Arg-45 Asp-48 1.0 Asp-52 7.0
A. Dielectric Constant of Lysozymne ) 4 -1.2 -3.8 7.9 Tyr-53 >20.0 >20.0 >20.0 12.1 10.1 8.9 10.8 Arg-61 >20.0 >20.0 13.3 13.7 Asp-66 1.7 4.5 -0.6 2.0 4.7 2.9 2.6 Arg-68 11.3 15.2 14.6 17.3 10.5 Arg-73 13.1 12.7 14.2 13.0 Asp-87 1.2 4.0 0.0 3.6 9.0 5.5 5.8 Lys-96 10.4 13.6 15.3 10.8 3.2 2.9 Lys-97 10.6 13.5 13.3 10.3 15.3 12.2 10.3 Asp-101 7.9 12.1 9.8 4.1 15.6 15.5 Arg-112 15.0 14.7 14.1 14.1 9.8 Arg-114 14.2 13.8 14.2 13.6 10.6 Lys-116 9.9 11.2 11.9 10.4 11.1 9.0 6.1 Asp-119 3.2 4.2 3.5 2.5 9.5 5.6 Arg-125 13.5 13.2 8.4 5.2 4.3 Arg-128 12.7 12.6 10.0 7.0 3.6 Cterm 2.3 6.6 2.7 2.8
N-term 6.4 Lys-1 9.6 Arg-5 Glu-7 2.1 Lys-13 11.6 Arg-14 His-15 4.0 Asp-18 2.9 Tyr-20 14.0 Arg-21 Tyr-23 11.7 Lys-33 9.6 Glu-35 6.3 Arg-45 Asp-48 1.0 Asp-52 7.0
B. Dielectric Constant of Lysozyme ) 30 7.0 6.2 7.9 Tyr-53 >20.0 12.5 10.7 10.4 10.8 Arg-61 13.7 12.0 12.0 Asp-66 1.7 4.0 4.0 3.0 2.6 Arg-68 12.1 10.9 11.4 10.5 Arg-73 12.3 12.5 12.5 Asp-87 1.2 4.0 7.0 5.9 5.8 Lys-96 10.4 10.9 3.8 2.7 2.9 Lys-97 10.6 11.2 10.2 9.0 10.3 Asp-101 7.9 4.9 12.5 13.4 Arg-112 12.5 10.0 9.4 9.8 Arg-114 12.3 10.5 10.1 10.6 Lys-116 9.9 10.3 5.1 3.8 6.1 Asp-119 3.2 3.9 11.8 12.3 Arg-125 12.2 4.1 3.2 4.3 Arg-128 12.1 4.9 3.5 3.6 C-term 2.3 4.1
a
11.9 13.8 2.9 12.7 12.3 2.6 11.3 11.3 3.4 12.6 12.2 10.2 3.3 12.3 12.1 2.7
12.1 2.0 3.6 10.8 10.3 4.1 10.4 2.5 2.8
b
Ionic strength was 0.15 M and the temperature was 298 K. Finite difference results obtained by Bashford and Karplus.13
by Kesvatera et al.53 (first column). These authors used explicit ions (no added salt, counterions only) and the same dielectric constant (of water, 78.5) for protein and solvent, so polarization effects were not included. Table 3A shows pKa values computed by the BEM where the ionic strength ranged from 0 to 0.3 M and the dielectric constant of calbindin was set to 4. For zero ionic strength, the BEM results are occasionally off by 2 pKa units or more with respect to experiment (e.g. Lys-25 and Lys-55). Lys-25 is close to Asp-47 (about 0.35 nm), which explains the high intrinsic pKa value for this site. The site-site interaction is apparently not strong enough to bring the pKa down significantly. Lys-55 is close to the CdO group of Glu-65 (about 0.3 nm), which results also in a rather high pKa. The pKa values improve by increasing the ionic strength from 0 to 0.1 M. For all lysines, the values for the pKa go down. A further increase of ionic strength results for five sites in a further decrease in pKa, while others display an increase. With the exception of Lys-1 and Lys-25, this increase is minor. With ionic strength included, the overall agreement with the computational results from Kesvatera et al.53 and experimental results is much better. Finally, as is to be expected, all pKa values are lower than the corresponding intrinsic pKa value due to the unfavorable sitesite interaction. The only exception is Lys-12 when the ionic strength is 0.3 M, which shows an increased pKa with respect to the intrinsic value. Table 3B presents the same series of calculations as in Table 3A except that the protein dielectric constant is set to 20 now. For the zero ionic strength case, the most notable differences with respect to Table 3A is Lys-25 and Lys-55, which both have lower pKa values (14.9 and 13.6, respectively). For 0.1 M (the fifth and sixth columns of Table 3B), the correspondence
7670 J. Phys. Chem. B, Vol. 101, No. 38, 1997 with experimental pKa values is much better with respect to the same calculation in Table 3A. A further increase of ionic strength does not change the pKa values as calculated by the BEM much. Better agreement of calculated pKa values with the ones calculated by Kesvatera et al.53 is found for all values of ionic strength employed. Table 3C lists the calculated pKa values when the dielectric constant of the protein is set equal to that of the solvent (78.5). The calculations were done for two values of the ionic strength (0.1 and 0.3 M). For 0.1 M (third and fourth columns) there is now a close correspondence between experimental and calculated pKa values (except maybe for Lys-16). A further increase of the ionic strength to 0.3 M (fifth and sixth columns) decreases the pKa values only slightly. Ovomucoid Third-Domain. The results for ovomucoid third-domain are given in Table 4. The third and fourth columns list the results when a dielectric constant of 4 was employed for the protein. The first column lists the finite difference results obtained by Antosiewicz et al.20 The agreement of the pKa from the BEM with the finite difference method and experiment is credible, but there are some large deviations. The most pronounced difference between experiment and the BEM exists for Glu-10. This site is close to Lys-13 (at a distance of about 0.5 nm), except for the HZ1 atom of Lys-13, which is at a distance of only 0.39 nm from the CD atom of Glu-10. At low pH, the positive charge on Lys-13 side chain (especially on the HZ1 atom) promotes the dissociation of the proton from Glu10, resulting in a lower pKa of 0.6 compared to the pKintr a value of 6.0. The pKa of Lys-13 is higher than the intrinsic pKa. It is likely that, at high pKa, the positive charge on the CD atom of Glu-10 favors the protonated state because of the smaller positive charge HZ1 (0.248) compared to 0.415 in the deprotonated state, in spite of the fact that the NZ atom of Lys-13 is negatively charged (Table 1). Apparently, the details of the charge distribution assigned to a titrating site becomes important at small distances. The pKa for Asp-27 was calculated to be 6.5, while the experimental result is 2.3. The BEM results are close to the finite difference result (5.5), though. His-52 displays a rather high pKa. The same applies for Tyr-31, which is predicted to have a pKa larger than 20. The finite difference calculation resulted in 14.1 with a detailed charge model for Tyr, but gave 20.820 when a simple model was used. On the other hand, the BEM result for Tyr-11 was in agreement with the finite difference result. Table 4 also shows the results for the pKa values using a higher dielectric constant (20) for ovomucoid third-domain (fifth and sixth columns). For some titrating sites the effect of a higher dielectric constant is rather dramatic. For Glu-10 the calculated pKa is now 4.1 (was 0.7), which is in full agreement with experiment. Correspondingly, the pKa value of Lys-13 is now 13.7, which is to be compared with 16.3 when a dielectric constant of 4 was used. For the N terminus a more reasonable value of 6.8 instead of 3.2 was calculated. Asp-27 went from 6.5 to 4.6. On the other hand, the pKa of some sites are not much different from the value when a dielectric of 4 for the protein was employed (e.g Glu-19). In other cases, the pKa value when a dielectric constant is taken to be 4 is closer to the experimental value than when a dielectric constant of 20 was used (e.g. C terminus). A futher increase of the dielectric constant to 36 does not result in large differences. Glu-10 now has a pKa of 3.6, which is lower than the experimental value (4.1), while the pKa of Asp-27 is further decreased to 3.7 (was 4.6). No major changes were observed when the dielectric constant of ovomucoid third-domain was set to 78.5 (columns 9 and 10 in Table 4).
Juffer et al. Lysozyme. In Table 5A the results for lysozyme are presented when the lysozyme dielectric constant was set to 4. The discrepancies for the computed pKa from both the BEM and finite difference method (second and seventh columns, calculated by Bashford and Karplus13) with respect to experiment are for some sites rather large (Asp-52, Tyr-53, Asp-87, Asp-101). In addition, the BEM results for some other sites are also quite far off the experimental results (e.g. Glu-35, Asp66). The most notable ones are the N terminus, which has a negative pKa value, and Asp-18, which failed to titrate at all. Some other sites are in close agreement with experiment (such as the C terminus and Glu-7), and some sites are closer to experiment than the finite difference computed value (e.g. Asp48 and HIS-15). Overall, the agreement with experiment is somewhat better for the finite difference method than for the BEM. The large disagreement with experimental results (or what would be expected if no experimental values are available) for some sites can be explained in terms of interactions with nearby sites or background atoms. The large intrinsic pKa for Asp-48 results from a interaction with the nearby negatively charged ND2 atom of ASN-59. Arg-61 displays a high pKa. This site is close to Asp-48. The interaction between Tyr-53 and Asp66 results also in large deviations in the pKa values with respect to experimental values. The HH atom of Tyr-53 is close (0.25 nm) to the CG atom of Asp-66, which results in a further decrease of the pKa of Asp-66. The same applies for Arg-45 and Arg-68, which are close to each other: the former has a lower pKintr than Arg-68 such that it is more susceptible for a deprotonation that Arg-68. In turn, the latter displays a higher pKa due to the favorable interaction between the negatively charged CZ atom of Arg-45 in the deprotonated state and the positively charged CZ of Arg-68 in the protonated state. The discrepancy in pKa with respect to the experimental value for the N terminus is in part due to an interaction with the Lys-1 side chain titrating site. The unfavorable interaction between these two sites results in further decrease of the pKa with respect to the intrinsic value. Glu-35 is buried and the basic pKa value of 9.0 apparently arises from a strong interaction with background charges. Table 5B lists the computed pKa value when a dielectric constant of 3054 was used. Clearly, the overall agreement with experimental results is excellent, although some discrepacies still exists (e.g. Glu-35). The most notable change is for Tyr53, for which the pKa is calculated to be 11.9 and close to the experimental result of 12.1. Correspondingly, the pKa of Asp66 is now 2.9 (was -0.6). The pKa of the N terminus is now found to 6.2 (was -3.8). Also, Asp-18, which failed to titrate when the dielectric constant of lysozyme was 4, now has a pKa of 2.7 (experiment 2.9). The pKa value of the C terminus did not change upon increase of the dielectric constant, while His15 displays only a minor change. Asp-119 is about one pKa unit off the experimental value but is in correspondence with the finite difference result. In Figure 1A,B, the calculated versus the experimentally determined pKa values for lysozyme are plotted. These plots are rather typical for all proteins employed in this work. The correlation coefficient rc improves when a higher dielectric constant 1 for the protein is used. For the case of lysozyme rc ) 0.77 for 1 ) 4.0 and rc ) 0.97 for 1 ) 30.0. Discussion In this work, the boundary element method (BEM) has been employed to calculate acid-dissociation constants Ka (pKa ) -log Ka) of titrating sites in proteins. In general, the results are promising, but occasionally large deviations from experi-
Calculating Acid-Dissociation Constants
Figure 1. (A) Calculated (x-axis) versus experimentally (y-axis) determined pKa values for lysozyme. Ionic strength was 0.15 M, and the temperature was 298 K. The dielectric constant of lysozyme was set to 4. The correlation coefficient is 0.77, the intercept is 3.41, and the slope is 0.43. (B) Calculated (x-axis) versus experimentally (yaxis) determined pKa values for lysozyme. Ionic strength was 0.15 M, and the temperature was 298 K. The dielectric constant of lysozyme was set to 30.0. The correlation coefficient is 0.97, the intercept is 0.62, and the slope is 0.95.
mental results exist, especially when the dielectric constant of the protein under investigation is taken to be 4. In all these cases, the large deviation from experiment can be explained in terms of a strong interaction of the titrating site with a nearby charged or polar group or another titrating site implying that the screening effect of the solvent is too small or the Coulomb interaction is simply too strong. The interaction can be made smaller by taking a larger value for the dielectric constant of the protein, although this would reduce the reaction potential also (since it depends on the ratio of the protein and solvent dielectric constant). Alternatively, one could utilize the molecular surface instead of the solvent-accessible surface as employed in this work, which would magnify the effect due to solvent polarization. For all proteins studied, a close agreement of computed pKa values with experimental pKa values were obtained when a higher dielectric constant (e.g 20 or 30 instead of 4) for the protein was employed. The choice of the dielectric constant for a protein remains a point of discussion. Values between 2 and 80 have been suggested.1,53,54,56-58 Gilson and Honig56 concluded that the dielectric constant of a protein is low, between 2.5 and 4. This range was obtained from a modified version of the KirkwoodFro¨hlich dielectric theory60 where fluctuations in the total electric moment are determined from a normal mode analysis. On the other hand, molecular dynamics simulations carried out for
J. Phys. Chem. B, Vol. 101, No. 38, 1997 7671 several proteins54,59 suggested a higher dielectric constant ranging from 10 to 40 and are also based on the KirkwoodFro¨hlich dielectric theory. These higher values with respect to the result of Gilson and Honig56 arise from the highly flexible charged side chains located on the surface of the protein.59 This contribution is not utilized in the work of Gilson and Honig,56 which included vibrational modes only. This explains the lower dielectric constant determined by these authors. On the other extreme, a single dielectric constant of water was used to predict B f Z transition of DNA57 and to estimate pKa values53,61 and ion binding constants.62,63 The calculations of the latter two are based on a continuum description for water, but ions are included explicitly by means of a Monte Carlo simulation scheme. Nevertheless, the calculated binding constants were close to experimental results. The calculated pKa values for lysine residues in calbindin reported by Kesvatera et al.53 were determined at zero ionic strength. This is not completely correct since in the Monte Carlo simulation carried out by Kesvatera et al.53 counterions were included, so there is always a degree of screening due to ions. In addition, the lysine residues are located on the surface of calbindin and display a higher mobility than buried residues. Therefore, it seems very appropriate to use a high dielectric constant. In relation to pKa calculations, Antosiewicz et al.19 found that the “best” results for pKa values were obtained when the dielectric constant of a protein was taken to be 20 and suggested that a higher value for the dielectric constant would account for conformational relaxation and specific ion binding which would influence the shift in pKa. These effects are normally not included. Conformational flexibility was included in the work of You and Bashford,64 and Beroza and Case.65 These authors conclude that the inclusion of conformational degrees of freedom will improve agreement of calculated pKa values with experimental results. On the other hand, using a detailed charge model for the description of titrating sites while using a low dielectric constant for the protein also resulted in a better agreement with experimental results,2 although occasionally large deviations from experiment still remained. Our results indicate that an increase of the dielectric constant did not improve the value of the pKa for all titrating sites to the same degree, suggesting that a model containing different values for the dielectric constant for different regions of the protein (e.g a low value for the core of the protein and a rather high one for outer protein region) seems to be the most appropriate one. This conclusion forms the basis of the work of Demchuk and Wade,11 which indeed used two dielectric constants for the protein (15 and 80) and obtained credible estimates of pKa’s for several proteins using the finite difference method. These authors use a desolvation criterion to decide whether a titrating site belongs to the low or high dielectric constant region. The Gromacs45 charge distribution employed in this work was not originally developed for the calculation of pKa values. Nevertheless, it appears that the Gromacs charge distribution is performing well in the sense that good estimates of the pKa values of titrating sites in proteins can be obtained. Bashford and Karplus13 relied on the CHARMM66 charge distribution for nontitrating atoms, while formal charges of 0 and +1/-1 were used for a titrating site (simple charge model). Apparently, this leads to different intrinsic pKa values. For instance, for lysozyme, the pKintr value for Tyr-53 was found to be larger a than 20 (employing 4 for the dielectric constant of lysozyme, Table 5), while Bashford and Karplus13 found a value of 12.3. Although, the concept of intrinsic pKa is useful for the partitioning of the effects on pKa into several terms (such as background terms, self-terms), it appears that it depends too much on the details of the charge distribution taken. In that
7672 J. Phys. Chem. B, Vol. 101, No. 38, 1997 respect, it should not be given too much emphasis since it is only an intermediate result and could lead to incorrect conclusions concerning the relative importance of several terms contributing to the observed pKa shift. Although different choices for the charge distribution may result in rather different intrinsic pKa values, in the end rather similar pKa values are obtained, so the effects of differences between charge distributions employed in pKa calculations appear to be small. The computational efficiency of any BEM applications remains a complication. A large protein requires a substantial number of triangles to describe the surface; this will reduce the efficiency of the method. The most time-consuming part appears to be the calculation of the inverse surface matrix and the matrices involved in the site-site interactions. This is in part due to the peak separation method employed.31 Recently, Zauhar and Varnek41 developed a faster and more space-efficient BEM method. Their method calculates explicitly only interactions between surface elements that are relatively close to each other, while interactions between elements at larger distances are computed from a grid-based multipole expansion. This results in a rather sparse surface matrix, which decreases memory requirements. The method has been developed for the zero ionic case only. The approach employed in this work results in a smaller but dense matrix instead of a larger matrix which is sparse. To be able to employ BEM on a routine basis in future work, it is essential to use a relative low number of triangles. The combination of peak separation31 (to achieve the same accuracy but with a smaller number of triangles) and Zauhar and Varnek’s41 space efficient method (which should be extended to be able to include ionic strength also) could be a useful combination for further improvement of current BEM algorithms. Finally, we want to discuss the major differences between the finite difference method and the BEM. As may be clear from the Methods section, for the BEM the Coulomb potential is calculated exactly, while the reaction potential is computed by means of a numerical scheme according to the BEM. This means that any error in the electrostatic potential arises from the reaction potential term only. This reaction potential strongly depends on the location of the triangulated surface. The finite difference method calculates the total electrostatic potential. Solvation free energies for some solute molecules are calculated from two finite difference calculations: solute in vacuum and in solvent, respectively.42 With the BEM, the electrostatic contribution to solvation free energies can be calculated from a single calculation36 (see also the discussion leading to eq 14). An important extension of continuum electrostatics is the inclusion of polarizabilities assigned to specified locations (e.g. atoms and bonds) in proteins. This can be done with the BEM in a rather straightforward manner, as was demonstrated by Rashin67 and van Duijnen.68 The concept of polarizabilities replaces a low dielectric constant assigned to the protein molecule. The inclusion of polarizabilities makes the polarization of the protein due to the reaction field also more dependent on the location of the surface. Employing a single dielectric constant for the protein region will screen the direct interactions between source charges in the same way, namely, by a factor of 1. The strength of an induced dipole moment due to the reaction field depends on the location of the polarizability with respect to the dielectric boundary. Therefore, one can expect to see some influence on the computed pKa values. Polarizabilities cannot be included in finite difference methods at the same level of accuracy. A crude approximation would be the assignment of different values of dielectric constants to specified regions within the protein.69 Taken together, we conclude that the BEM should become a powerful method for those cases in
Juffer et al. which continuum electrostatics can be employed (such as pKa calculations). Conclusions This work has demonstrated that it is feasible to use the boundary element method for the calculation of acid-dissociation constants of titrating sites in proteins with high accuracy. The most time-consuming part of the computation appears to be the calculation of the surface matrix and the matrices involved in the site-site interactions. These matrices need to be calculated only once, though. In spite of the currently lower efficiency of the boundary element method in comparison to the finite difference method, the BEM is an attractive alternative to the finite difference method. Credible results were obtained for the proteins studied in this work. The better agreement was found when a higher dielectric constant of 20 instead of 4 was employed. A further increase of the dielectric constant generally did not greatly improve the results much more, suggesting that as long as the dielectric constant is taken rather high (even as high as 78.5) accurate values for the pKa can always be obtained. On the other hand, a single uniform dielectric constant for the protein is not completely appropriate, since for some titrating sites, changes in the dielectric constant did not change the pKa of these sites very much or resulted even in a poorer agreement with experimental results. This reflects the variation of dielectric properties of individual titrating sites. Consequently, a model including different values for the dielectric constant for different regions of the protein under investigation is the most appropriate one. Finally, inclusion of explicit ions into the description of solvent as was done by Kesvatera et al.53 does not greatly improve the pKa’s with respect to BEM or finite difference results as long as a relatively high dielectric constant for the protein is utilized. Acknowledgment. The work was supported by grants for A. H. J. from the Alberta Heritage Foundation for Medical Research (AHFMR) and the Human Frontier Science Program (HFSP). H. J. V is the recipient of a Scientist Award of the AHFMR. The authors would like to thank Dr. Johan Kemmink from the European Molecular Biology Laboratory (EMBL, Heidelberg, Germany) and Dr. Jacob de Vlieg from the Unilever Corporation (Vlaardingen, The Netherlands) for helpful discussions. The authors are indebted to Craig Shepherd from the University of Calgary for carefully reading the manuscript. References and Notes (1) Harvey, S. C. Proteins: Struct. Funct. Genet. 1989, 5, 78. (2) Antosiewicz, J.; McCammon, J. A.; Gilson, M. K. Biochemistry 1996, 35, 7819. (3) Warwicker, J.; Watson, H. C. J. Mol. Biol. 1982, 157, 671. (4) Klapper, I.; Hagstrom, R.; Fine, R.; Sharp, K.; Honig, B. Proteins: Struct. Funct. Genet. 1986, 1, 47. (5) Gilson, M. K.; Sharp, K. A.; Honig, B. H. J. Comput. Chem. 1987, 9, 327. (6) Davis, M. E.; McCammon, J. A. Chem. ReV. 1990, 90, 509. (7) Sharp, K.; Honig, B. J. Phys. Chem. 1990, 94, 7684. (8) Mohan, V.; Davis, M. E.; McCammon, J. A.; Pettitt, B. M. J. Phys. Chem. 1992, 96, 6428. (9) Honig, B.; Sharp, K.; Yang, A.-S. J. Phys. Chem. 1993, 97, 1101. (10) (a) Holst, M. J; Kozack, R. E.; Saied, F.; Subramaniam, S. Proteins: Struct. Funct. Genet. 1994, 18, 231. (b) Holst, M. J.; Saied, F. J. Comput. Chem. 1995, 16, 337. (11) Demchuk, E.; Wade, R. W. J. Phys. Chem. 1996, 100, 17373. (12) Bruccoleri, R. E.; Novotny, J.; Davis, M. E.; Sharp, K. A. J. Comput. Chem. 1997, 18, 268. (13) Bashford, D.; Karplus, M. Biochemistry 1990, 29, 10219. (14) Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 9556. (15) Lim, C.; Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 5610. (16) Beroza, P.; Fredkin, D. R.; Okamura, M. Y.; Feher, G. Proc. Natl. Acad. Sci. US.A. 1991, 88, 5804.
Calculating Acid-Dissociation Constants (17) Yang, A.-S.; Gunner, M. R.; Sampogna, R.; Sharp, K.; Honig, B. Proteins: Struct. Funct. Genet. 1993, 15, 252. (18) Oberoi, H.; Allewell, N. M. Biophys. J. 1993, 65, 48. (19) Antosiewicz, J.; McCammon, J. A.; Gilson, M. K. J. Mol. Biol. 1994, 238, 415. (20) Antosiewicz, J.; Briggs, J. M.; Elcock, A. H.; Gilson, M. K.; McCammon, J. A. J. Comput. Chem. 1996, 17, 1633. (21) Beroza, P.; Fredkin, D. R. J. Comput. Chem. 1996, 17, 1229. (22) Warwicker, J.; Gane, P. J. FEBS Lett. 1996, 385, 105. (23) Tanford, C.; Roxby, R. Biochemistry 1972, 11, 2192. (24) Mehler, E. K. J. Phys. Chem. 1996, 100, 16006. (25) Schellman, J. A. Biopolymers 1975, 14, 999. (26) Gilson, M. K. Proteins: Struct. Funct. Genet. 1993, 15, 266. (27) Brebbia, P. K.; Walker, S. Boundary Element Methods in Engineering Science; McGraw-Hill: New York, 1980. (28) Zauhar, R. J.; Morgan, R. S. J. Mol. Biol. 1985, 186, 815. (29) Zauhar, R. J.; Morgan, R. S. J. Comput. Chem. 1988, 9, 171. (30) Yoon, B. J.; Lenhoff, A. M. J. Comput. Chem. 1990, 11, 1080. (31) Juffer, A. H.; Botta, E. F. F.; van Keulen, B. A. M.; van der Ploeg, A.; Berendsen, H. J. C. J. Comput. Phys. 1991, 97, 144. (32) Bharadwaj, R.; Windemuth, A.; Sridharan, S.; Honig, B.; Nicholls, A. J. Comput. Chem. 1995, 16, 898. (33) Miertus, S.; Scrocco, E.; Tomasi, J. J. Chem. Phys. 1981, 55, 117. (34) Pascual-Ahuir, J. L.; Silla, E.; Tomasi, J.; Bonaccorsi, R. J. Comput. Chem. 1987, 8, 778. (35) (a) de Vries, A. H.; van Duijnen, P. Th. Biophys. Chem. 1992, 43, 139. (b) de Vries, A. H.; van Duijnen, P. Th.; Juffer, A. H.; Rullmann, J. A. C.; Dijkman, J. P.; Merenga, H.; Thole, B. T. J. Comput. Chem. 1995, 16, 37. (36) (a) Rashin, A. A.; Namboodiri, K. J. Phys. Chem. 1987, 91, 6003. (b) Rashin, A. A. J. Phys. Chem. 1990, 94, 1725. (c) Rashin, A. A.; Bukatin, M. A. Biophys. Chem. 1994, 51, 167. (37) Yoon, B. J.; Lenhoff, A. M. J. Phys. Chem. 1993, 96, 3130. (38) (a) Zhou, H.-X. Biophys. J. 1993, 65, 955. (b) Zhou, H.-X. Biophys. J, 1995, 69, 2286. (c) Zhou, H.-X. Biophys. J. 1995, 69, 2298. (39) Vorobjev, Y. N.; Grant, J. A.; Scheraga, H. A. J. Am. Chem. Soc. 1992, 114, 3189. (40) (a) Zhexin, X.; Fuhua, H.; Yunyu, S. J. Phys. Chem. 1994, 98, 12782. (b) Zhexin, X.; Fuhua, H.; Yunyu, S. J. Comput. Chem. 1995, 16, 512. (41) Zauhar, R. J.; Varnek, A. J. Comput. Chem. 1996, 17, 864. (42) Sitkoff, D.; Sharp, K. A.; Honig, B. J. Phys. Chem. 1994, 98, 1978. (43) Press, W. H.; Flannerly, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipies. The Art of Scientific Computing; Cambridge University Press: Cambridge, 1986.
J. Phys. Chem. B, Vol. 101, No. 38, 1997 7673 (44) Hill, T. L. J. Chem. Phys. 1955, 23, 623. (45) van der Spoel, D.; van Buuren, A. R.; Apol, E.; Meulenhoff, P. J.; Tieleman, D. P.; Sijbers, A. L. T. M.; van Druunen, R.; Berendsen, H. J. C. Gromacs User Manual; University of Groningen: Groningen, 1996. (46) Juffer, A. H.; Vogel, H. J. J. Comput.-Aided Mol. Des., submitted. (47) (a) Eisenhaber, F.; Argos, P. J. Comput. Chem. 1993, 14, 1272. (b) Eisenhaber, F.; Lijnzaad, P.; Argos, P.; Sander, C.; Scharf, M. J. Comput. Chem. 1995, 16, 273. (48) Marquart, M.; Walter, J.; Desenhofer, J.; Bode, W.; Huber, R Acta Crystallogr. (Sect. B) 1983, 39, 480. (49) Berstein, F. C.; Koetzle, T. F.; Williams, G. J. B.; Meyer, E. F.; Brice, M. P; Rodgers, J. R.; Kennard, T.; Shimanouchi, T.; Tasumi, M. J. Mol. Biol. 1977, 112, 535. (50) Szenenyi, D. M. E., Moffat, K. J. Biol. Chem. 1986, 261, 8761. (51) Hodson, J. M.; Brown, G. M.; Sieker, L. C. Acta Crystallogr. B 1990, 46, 54. (52) Bode, W.; Wei, A.-Z.; Huber, R.; Meyer, E.; Travis, J.; Neumann, S. EMBO J. 1986, 5, 2453. (53) Kesvatera, T.; Jo¨nsson, B.; Thulin, E.; Linse, S. J. Mol. Biol. 1996, 259, 828. (54) Smith, P. E.; Brunne, R. M., Mark, A. E.; van Gunsteren, W. F. J. Phys. Chem. 1993, 97, 2009. (55) (a) Zhang, M.; Vogel, H. J. J. Biol. Chem. 1993, 268, 22420. (b) Zhang, M.; Thulin, E.; Vogel, H. J. J. Protein Chem., 1994, 13, 527. (56) Gilson, M. K.; Honig, B. H. Biopolymers 1986, 25, 2097. (57) Soumpasis, K.-M. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 5116. (58) King, G.; Lee, F. S.; Warshel, A. J. Chem. Phys. 1991, 95, 4366. (59) Simonson, T.; Brooks, C. L., III. J. Am. Chem. Soc. 1996, 118, 8452. (60) Fro¨lich, H. Theory of Dielectrics; Clarendon Press: Oxford, 1949. (61) Svensson, B.; Jo¨nsson, B. J. Comput. Chem. 1995, 16, 370. (62) Fushiki, M.; Svensson, B.; Jo¨nsson, B.; Woodward. C. Biopolymers 1991, 31, 1149. (63) Svensson, B.; Jo¨nsson, B.; Woodward. C. Biophys. Chem. 1990, 38, 179. (64) You, T. J.; Bashford, D. Biophys. J. 1995, 69, 1721. (65) Beroza, P.; Case, D. A. J. Phys. Chem. 1996, 100, 20156. (66) Brooks, B.; Bruccoleri, R.; Olafson, B.; States, D.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4, 187-217. (67) Rashin, A. A. Int. J. Quantum. Chem., Quantum Bio. Symp. 1988, 15, 103. (68) van Duijnen, P. Th.; Juffer, A. H.; Dijkman, H. P. J. Mol. Struct. (THEOCHEM) 1992, 260, 195. (69) Sharp, K.; Jean-Charles, A.; Honig, B. J. Phys. Chem 1992, 3822.