4132
J . Phys. Chem. 1990, 94, 4732-4739
Determinatkm of Net Atomic Charges Uslng a Modified Partial Equalization of Orbital Electronegativity Method. 1. Application to Neutral Molecules as Models for Polypeptides Kyoung Tai No,+J. Andrew Grant, and Harold A. Scheraga* Baker Laboratory of Chemistry, Cornell University, Ithaca. New York 14853-1 301 (Received: October 2, 1989; In Final Form: January 4, 1990)
Modified methods for the determination of net atomic charges, based on an existing scheme utilizing the partial equalization of orbital electronegativity method, are proposed. The new methods introduce (as constraints) experimental dipole and, where possible, quadrupole moments of a set of molecules chosen to resemble the different atomic environments found in proteins. As a consequence, the atomic charges reproduce accurately the experimentalvalues of molecular dipole and quadrupole moments. The electrostatic potential surface, calculated with the new net atomic charge set, compares well to a surface obtained from a moderate level ab initio calculation (6-31G**). However, it is observed that the new charges are of twice the magnitude of those obtained from investigations in which the charges are determined from a least-squares fit to the calculated ab initio electrostatic potential (potential-derived method). The methodology outlined leads to a direct calculation of the permanent molecular charge distribution represented as a set of distributed monopoles, dependent only on the connectivityof the molecule. Hence, the method circumvents assumptions concerning the transferability of charges and can therefore approximately describe any redistribution of charge density upon assembly of amino acid residues to form a polypeptide. It is envisioned that the method introduced here will be of use in providing an approximate representation of molecular charge distributions in extended systems, specifically large polypeptide molecules.
Introduction Accurate but compact descriptions of molecular charge distributions are very important for calculating molecular structures and properties by using the relatively recent methods of computer simulation. As a consequence, numerous schemes of differing reliability and efficiency have been devised as solutions to the problem.'-'O They can be classified broadly as methods involving distributed multipolar multipoint charge distributions," and point net atomic charge distribution^.^-'^ Atomand/or bond-centered multipole expansions are the most accurate of these three methods, accounting in a very precise way for the anisotropic nature of molecular electron charge distributions. Multipoint charge models have been introduced as a simpler if less elegant and less precise way to describe the distortions in molecular charge distributions. In these models, monopoles are positioned within the molecular framework at places other than atomic centers. However, by far the simplest method for describing the distribution of charge in a molecule, particularly large biomacromolecular systems, is to use atom-centered monopoles (point charges). Despite the clear concept that the molecular charge density is approximated by a superposition of atom-centered spherical charge distributions, there exists no clear definition of how such net atomic charges should be defined. Early methods made use of quantum mechanical calculations, employing population schemes to partition electron density onto atomic centers. Such techniques, although often providing useful qualitative interpretations, are usually both very method and basis-set dependent, and it can be unclear as to what therefore governs a suitable choice of net atomic charges to incorporate into an empirical potential. An improved approach is to calculate an ab initio molecular electrostatic potential (generally less sensitive to basis-set size), and fit a set of atomic monopoles that best reproduces such a potential (potential-derived (PD) Charges derived by this method usually reproduce the permanent electrical moments of a given molecule better than a population analysis but still are subject to inherent limitations of the Hartree-Fock wave function, and the method is applicable only to those molecules for which an ab initio calculation is feasible.
To treat large molecules using techniques for the direct determination of net atomic charges (circumventing the necessity to invoke assumptions based on transferability of such charges from smaller model compounds), several purely empirical schemes have originated. The Huckel-Del Re procedure is a combination of the Hiickel method15 to treat n-bonded systems and the Del Re methodI6 to treat u-bonded systems. This has been employed for calculations of net atomic charges of large molecules, such as segments of DNA and a transfer RNA.7J7J8 The essence of the model is a simple parametrization of the rigorous equations determining the dynamics of electrons and may be employed for large biomolecular systems. There are, in addition, computationally simple methods for the determination of net atomic charges based on the concept of electronegativity equalization (EN). The first of these techniques was the full equalization of atomic electronegativity devised by Sander~0n.l~This method, although widely used, has the disadvantage that it does not take into account the nature of the chemical bond between a given atom type. In response to this ( I ) Bentley, J . Chemical Applications of Atomic and Molecular Electrostatic Potentials; Politzer, P., Truhlar, D. G., Eds.; Plenum Press: New York, 1981; p 54. (2) Stone, A. J . Chem. Phys. Lett. 1981, 83, 233. (3) Stone, A . J.; Alderton, M. Mol. Phys. 1985, 56, 1047. (4) Snir, J.; Nemenoff, R. A.; Scheraga, H. A. J . Phys. Chem. 1978,82, 2497, 2527. (5) Singh, U. C.; Kollman, P. A. J . Comput. Chem. 1984, 5, 129. (6) Matsuoka, 0.;Clementi, E.; Yoshimine, M. J . Chem. Phys. 1976,64, 1351.
(7) Poland, D.; Scheraga, H. A. Biochemistry 1967, 6, 3791. ( 8 ) Momany, F. A.; McGuire, R. F.; Burgess, A. W.; Scheraga, H. A. J . Phys. Chem. 1975, 79, 2361. (9) Abraham, R. J.; Hudson, B. J . Comput. Chem. 1985,6, 173. (IO) Mortier, W. J.; Ghosh, S . K.; Shankar, S. J . Am. Chem. Soc. 1986, 108, 4315.
( I I ) Scrocco, E.; Tomasi, J. Adu. Quantum Chem. 1978, 1I , 1 15. (12) Momany, F. A. J . Phys. Chem. 1978,82, 592. (13) Smit, P. H.; Derissen, J. L.;van Duijneveldt, F. B. Mol. Phys. 1979, 37. 521. (14) Cox, S. R.; Williams, D. E. J . Comput. Chem. 1981, 2, 304. ( 1 5 ) Hiickel, E. 2.Phys. 1932, 76,628. (16) Del Re, G. J . Chem. Soc. 1958,4031. (17) Lavery, R.; Zakrzewska, K.; Pullman, A . J . Comput. Chem. 1984, 1
5. .,
'On leave from the Department of Chemistry, Soong Si1 University, Sang Do 1 - 1 , Dong Jak Gu,Seoul, Korea. *To whom correspondence should be addressed.
0022-3654/90/2094-4732$02.50/0
~~~
363.
(18) Abraham, R. J.; Smith, P. E. J . Comput. Chem. 1987, 9, 288. ( I 9) Sanderson, R. T. Chemical Bonds and Bond Energv; Academic Press: New York, 1976; p 15.
0 1990 American Chemical Society
Determination of Net Atomic Charges. 1 difficulty, Gasteiger et a1.zo*21 developed a method utilizing the concept of orbital electronegativity, introduced by Hinze and This enables the differentiation of atoms in molecules in different hybridization states. Further, recognizing that total equalization of electronegativity produces chemically incorrect results as well as being physically unrealistic, they introduced a simple model facilitating only the partial equalization of orbital electronegativity (PEOE). The underlying feature of this method is that charge is allowed to flow throughout the covalent framework of the molecule, which in turn determines the orbital electronegativity of the constituent atoms. Allowing for charge transfer generates an electrostatic field that makes the further transfer of charge more difficult. This effects a damping of the charge transfer, which ultimately inhibits the full equalization of electronegativity. In the PEOE formalism, this process is modeled by the introduction of an empirical damping factor. Since the method is iterative, the net atomic charge of a given atom depends on the entire connectivity of the molecule, despite the fact that only through-bond interactions are considered. A more sophisticated treatment of electronegativity equalization expresses an effectiue electronegativity of an atom as a function of the charge on all the other atoms in the molecule. In this method, the damping factor of the PEOE method is replaced by the explicit calculation of changes in external fields due to the redistribution of charge during the formation of chemical bonds. In this formalism, referred to as the full equalization of orbital electronegativity ( FEOE),Z3the net atomic charges are now a function of molecular conformation. Descriptions of these three methods,Iez3 their parametrization, and applications are summarized in refs IO and 24. An attractive feature of all of these methods is that they involve direct calculation of the charge distribution for a given molecule. Unlike the PD method, they are all able, at various levels of sophistication, to model the redistribution of charge resulting from the assembly of molecular fragments, such as amino acid residues, into larger molecules. In general, the parameters in electronegativity equalization methods are usually taken from atomic properties, viz., valence-state ionization potentials and electron affinities. In the original PEOE method of Gasteiger et a1.,20-2'these parameters are used without modification for describing atomic orbital electronegativities in molecules. This had the tendency to result in partial charges whose dipole moments were too low. To overcome this difficulty, Abraham and Smith1*proposed a parametrization or orbital electronegativity based on experimental dipole moments. This was found to improve the values of calculated dipole moments. The goal of the work reported here is to obtain a set of parameters suitable for describing orbital electronegativities of the functional groups found in polypeptide molecules. Hence, net atomic charges can subsequently be determined for an arbitrary amino acid sequence of a polypeptide. Since it is desirable, in any approach to determine the structure of polypeptide molecules, that the net atomic charges remain independent of conformation while having the capacity to describe the wide range of chemical bond types found, a modified PEOE method is well suited for this purpose. The method is parametrized by using a set of suitable gas-phase experimental dipole and quadrupole moments as constraints as part of an optimization procedure. It is hoped that the introduction of multipolar molecular properties into the parametrization will lead to an improved description of the valence orbital electronegativity, essential to model more accurately the "flow" of charge throughout the covalent framework. In addition, the mechanism of electron transfer upon formation of chemical bonds is modified by introducing a number of damping factors describing the different possible bond types in a polypeptide molecule.
The Journal of Physical Chemistry, Vol. 94, No. 11, 1990 4133
Methods The total net atomic charge Q" on an atom A is obtained from an iterative procedurez1that attains convergence when the transfer of charge between all bonds in a molecule is zero. The degree of charge transfer (dqAB'"') from atom A to atom B, in a given bond during the nth iteration of the iterative procedure is described byZo
where xA("-I)and xB("-I)are the electronegativities of atoms A and B at the (n - 1)th iteration, and xA+is the electronegativity of the positive ion of atom A. Unlike the original method?O in which only a single damping factor is used, fAB now represents the damping factor for a given bond type, in this case the bond A-B. The damping factor, although formally only a parameter in the optimization procedure, corresponds in a crude way to those properties that govern the nature of the electron distribution in different chemical bonds, such as the polarizabilities and hybridization states of the participating atoms. We do not, however, wish to stress a precise physical correspondence of this parameter to any such molecular properties. The net atomic charge on atom A at the nth iteration, QA("),is obtained from Q A ' ~ ) = QA'"
+ xxdqAB(n) n B
(Ib)
where QA(0)represents the initial net atomic charge, which is set equal to zero for the atoms in neutral molecules, and the index B sums over the atoms covalently bonded to atom A, while the index n sums over the number of iterative cycles required before convergence is attained. Both the damping factors and the sets of electronegativity parameters are to be obtained by using j dipole and k quadrupole moments as constraints; hence, the following function F will be minimized:
where S =
(3) In this equation, airepresents one of the i unknown parameters, either an electronegativity term or a damping factor, while (a) represents the complete set of modified PEOE parameters to be determined. The dipole moment and quadrupole moment tensor of a certain molecule are given by P and 8,respectively, while w pand we are statistical weighting factors. The ratio wp!we was taken as 3 throughout this work, on the basis of experience in carrying out the optimization. This value arises largely due to the fact that the magnitudes of the quadrupole moments are larger than those of the dipole moments and that they are measured to less precision. The calculated dipole and quadrupole moments are obtained by using the distribution of net atomic charges. The dipole moment is thus given by N
P = Eriei i
(4)
and the elements of the quadrupole tensor in traceless Cartesian form are given by N
W x x= 0 . 5 2 Q i ( 2 x ? - y t - z t ) (20) (21) (22) (23) (24)
Gasteiger, J.; Marsili, M. Tetrahedron 1980, 36, 3219. Guillen, M. D.; Gasteiger, J. Tetrahedron 1983, 39, 1331. Hinze, J.; JaffE, H. H . J . Am. Chem. Soc. 1962,84, 540. Nalewajski, R . F. J . Am. Chem. SOC.1984, 106, 944. Mortier, W. J.; Genechten, K. V.; Gasteiger, J. J . Am. Chem. SOC. 1985, IO?, 829.
i
e',
N
= 1.5cQixgi i
(6)
where N is the number of charge sites (atoms) in the molecule,
4134
The Journal of Physical Chemistry, Vol. 94, No. 11, 1990
TABLE I: Classification of Damping Factors (f)According to Bond Type, As Used for the Various PEOE Methods damping factors
ORG,ORGDH, DPI, DQPI'
DPS, DQPS
fi
f
f2 f3
f4
fs
bond type H-sp3 atom H-sp2 atom sp'-sp' atoms sp3-sp* atoms sp2-sp2 atoms
'For all these models a single damping factor was used for all bond types, but with a different numerical value for each model.
No et al. led to difficulties in the original method for parametrizing the electronegativity set for hydrogen. To overcome these difficulties, the electronegativity of the proton xH+ is treated here as an entirely independent parameter in the optimization. Hence, unlike the other atoms, there is no physical significance attached to the parameter xH+other than that it scales the transfer of charge during the equalization of electronegativity. During the course of the optimization, the dipole moments of 25 molecules and quadrupole moments of 7 molecules were used. These molecules, chosen as model compounds to resemble the functional groups of polypeptides, are amides, urea, ketones, alcohols, aldehydes, acids, sulfides, ethers, and amines. Results and Discussion
rI(xlryI,zI) denotes the vector from the center of mass to the ith atom, and represents elements of the nondiagonal quadrupole moment tensor. The quadrupole moment tensor, calculated by using eqs 5 and 6, is then diagonalized to enable comparison with experimentally available data; hence @2alC = Rte'R (7) where R is a unitary transformation matrix that essentially describes the orientation of the Cartesian frame in diagonal form to that of the original (arbitrarily chosen) frame. The diagonal components of 8' are the 8, ebb, and 8, in eq 3. Only the first nonzero electric multipole moment is independent of the origin.25 Care was taken to ensure that, for polar molecules, the choice of origin used for the calculated quadrupole moment coincided with that implicit in the reported experimental data, this generally being the center of mass. The observed quadrupole moments as well as the definitions of the relevant coordinate systems are summarized in refs 26 and 27. In the work presented here, the originalZo(denoted ORG) as well as several modified PEOE methods are investigated. In the model ORGDH, the original electronegativity set was preserved, but the damping factor and the proton electronegativity, xH+,were optimized. For the models DPl and DQPl a single damping factor was employed while all the electronegativity parameters were optimized by minimizing the function F given by eq 2 . Finally, for the models denoted DP5 and DQP5 five damping factors were used, corresponding to five different bond types that were optimized along with all of the electronegativity parameters. The notation used to describe the models is intended to indicate the constraints used in the minimization as well as the number of damping factors used. Thus, the models labeled DPl and DP5 use only dipole moments as constraints during optimization; similarly, the models denoted DQPl and DQP5 use both dipole and quadrupole moments as constraints. A description of the damping factors for the different bond types is given in Table I. In this calculation, the orbital electronegativity of an atom i in the nth iteration is described as a function of charge associated with a given atomic valence state; thus
xi(") = a, + b,Q,(")
(8)
where a, is the electronegativity of the neutral atom and 6, is a coefficient describing the response of the electronegativity of the atom in a given valence state to the changes in electron density. Introduction of a quadratic term, cI(QI("))*, in eq 8 does not have much influence in reproducing the experimental dipole and quadrupole moments, i.e., it produces only negligible changes in the value of F of eq 2. Since inclusion of the quadratic term increases the number of unknown parameters and has led to the calculation of electronegativity sets that differ significantly from those of the neutral atom, it is not considered in this work. The uniqueness of the hydrogen atom, in that it lacks an electronic core, distinguishes hydrogen-containing bonds from any others in a molecule. The lack of a second ionization potential (25) Buckingham, A. D. Q.Rev. Chem. SOC.,London 1959, 13, 183. (26) (a) Gierke, T. D.; Tigelaar, H. L.; Flygare, W. H. J. Am. Chem. SOC. 1972, 94, 330. (b) Flygare, W. H.; Benson, R. C. Mol. fhys. 1971, 20, 225. (27) Rabinowitz, J. R.; Rein, R. Int. J. Quantum Chem. 1972, 6, 669.
Table I1 summarizes the optimized electronegativity sets and damping factors of the different computational models, parametrized according to the methods described above. The corresponding molecules and multipole moments are given in Tables I11 and IV. A notable feature of the results presented in Table I1 is the difficulty in achieving a consistent description of the behavior of the electronegativity of the hydrogen atom and the proton. The charge coefficient of the hydrogen atom (bH) is determined to be between 28 and 34 throughout all the optimization procedures. This compares to a similar value, 32.83, obtained in a study of haloalkanes,z8 also using dipole moments as a constraint, and highlights the sensitivity of the electronegativity of bonded hydrogen atoms during the process of bond formation. The value of xH+,which is responsible for scaling the amount of charge transferred between hydrogen and atoms bonded to it, lies between 2.8 and 3.5, compared to 20.02 in the original methodZo and a value of 12.85 obtained in the study of haloalkanes.z8 The low value in this study compensates for the high value obtained for the charge coefficient by facilitating the transfer of charge away from the hydrogen atom. The large and perhaps unexpected difference in the description of the hydrogen atom between our method and the earlier methodZomerely reflects the difficulty in describing bond formation involving hydrogen atoms by utilizing a simple charge-transfer mechanism, in the absence of an inner electron core for the hydrogen atom. The charge coefficient of sp3 carbon, bc(sp3),was found to be between 4.6 and 5.2. The sp3 carbon atoms are usually located between high- and low-electronegativity atoms or atomic groups in the molecules chosen in this study. The low charge coefficient of sp3 carbon is necessary for the transfer of sufficient charge from low- to high-electronegativity atoms. When a single damping factor was used (ORGDH, DPl, and DQPI), its optimized value was found to be quite near to 0.5, as in the original PEOE method.20 In DP5 and DQP5, the four types of damping factors used for single bonds have values between 0.43 and 0.66, whereas the damping factor fsfor the double bonds, C=O, N-C-0, and 0 4 - 0 , is close to 1. This corresponds to highly delocalized electron distributions in which the orbital electronegativities of such bonded atoms become nearly equivalent. The limitations of an atom-centered point charge model prevent it from describing correctly the sometimes complex features of anisotropic molecular charge distributions. This is especially important for molecules possessing nonbonding electron pairs and double bonds, such as those found in amides or amino acids. Nonetheless, the results presented in Tables I11 and IV suggest that an atomic-centered charge model parametrized by using information from dipole and quadrupole moments is physically more meaningful than a model using dipole moment considerations alone. In Table 111, the dipole moments computed by using each model are listed and compared with experimental data. The original PEOE method yields poor dipole moments in general, but specifically for the amides and acids possessing double bonds. This is not surprising since the orbital parameters for this model were obtained purely from atomic properties, which cannot be expected (28) Hammarstrom, L. G.; Liljefors, T.;Gasteiger, J. J . Comput. Chem. 1988, 9, 424.
The Journal of Physical Chemistry, Vol. 94, No. 1 1 , 1990 4735
Determination of Net Atomic Charges. 1 TABLE II: Electronegativity Parameter Sets for the Various Models
atomic species
params of eq 8
WP2)
a
CbP?
a
=O
b a b
-0-
a
>N-=
b a b
>N-
a
b
b
-S-
a
H atom
a
b b XH*
damping factor
ORG" (ORGDH) 8.97 9.32 7.98 9.18 17.07 13.79 14.18 12.92 12.87 11.15 11.54 10.82 10.14 9.13 7.17 6.24
8.060 17.256 1.976 4.625 18.060 7.015 12.975 14.624 15.102 7.478 12.555 1 1.533 10.684 10.040 7.674 28.689
DQP 1 7.582 10.508 7.893 5.180 16.986 13.252 12.043 11.314 13.008 11.150 11.052 16.809 10.1 19 8.137 7.145 33.569
DP5 8.290 10.074 8.092 4.862 14.668 13.662 12.325 14.167 14.261 11.914 10.981 12.650 9.995 6.833 7.831 32.8 13
DQP5 8.218 8.288 7.967 4.862 14.284 13.857 12.941 12.808 15.478 11.914 12.184 13.538 10.435 5.162 7.71 1 31.957
20.02 (9.315) 0.5 (0.505)
3.435 0.505
2.787 0.495
3.186
f,0.428
fs 0.944
3.389 0.482 0.569 0.501 0.530 0.972
0.181
1.526 (0.511)b
DPI
/2
0.566
f,0.659 h0.494
value of S in eq 3
26.54 (15.51)
0.484
2.495 (0.819)b
"The parameters of the ORG model were taken from ref 20. *Contributionto the value of S from dipole moments only. TABLE 111: Comparison of Calculated and Experimental Gas-Phase Dipole Moments"
dipole moment, D molecule
ORG
ORGDH
HCHO CHSCHO CHjCHzCHO e(CH,),CHCHO t-(CH,)2CHCHO CH,OH CH,CH2OH N H2C H 2CH20H HCOOH C H 3COOH CHjCH2COOH NHZCHO CHiNHCHO (CH,),NCHO t-CH,CONHCH, NH2CONH2 CH,COCH, CH,COCH,CH, H 2s CHjSH CH,SCH, CHpSSCH, CHlNHCH, CH,OCH, CH,CH2CH,
2.45 2.61 2.62 2.66 2.72 1.95 1.60 2.29 1.08 1.47 1.53 2.34 1.67 2.53 1.69 2.20 2.77 2.50 0.87 1.31 1.25 1.42 0.86 1.83 0.02
2.92 2.94 2.98 3.04 3.06 2.39 1.78 2.79 1.15 1.14 1.19 3.08 2.65 3.03 2.32 3.07 2.97 2.63 1.25 1.66 1.46 1.65 0.90 2.07 0.02
DP 1 2.44 2.65 2.65 2.70 2.76 1.73 1.39 2.96 1.62 1.68 1.61 3.93 3.76 3.37 3.48 4.56 2.83 2.67 1.23 1.41 1.52 1.98 1.14 1.54 0.02
DQPl 2.66 2.69 2.66 2.69 2.8 1 1.71 1.28 2.85 1.54 1.65 1.57 3.89 3.75 2.83 3.71 4.50 2.93 2.69 1.16 1.45 1.66 1.97 1.16 1.62 0.06
DP5 2.34 2.64 2.58 2.68 2.85 1.70 1.30 3.09 1.48 1.65 1.47 3.91 3.76 3.63 3.67 4.40 2.98 2.77 1.13 1.32 1.47 2.04 0.95 1.30 0.03
DQP5 2.26 2.65 2.63 2.66 2.72 1.71 1.38 2.90 1.30 1.57 1.53 3.90 3.70 3.57 3.71 4.45 3.05 2.89 1.12 1.44 1.73 1.98 1.08 1.57 0.02
exptl 2.34 2.69 2.52 2.69 2.86 1.69 1.41 3.05 1.415 1.70 1.46 3.71 3.8b 3.8* 3.8b 4.56 2.93 2.78 0.97 1.532 1.50 1.98 1.01 1.30 0.08
"The geometries and dipole moments of gas-phase molecules are taken from refs 30 and 31, respectively. bTaken from ref 18. to reproduce molecular properties such as multipole moments. In contrast, dipole moments calculated with all of the parameter sets DPI, DQPl, DP5, and DQP5 agree well with the experimental data, particularly the DQP5 set, which shows an excellent correlation. Calculated and observed quadrupole moments are presented in Table IV. Just as for dipole moments, the original PEOE methods exhibit large deviations from the experimental data. Given the inherent limitations in using only monopole functions to describe molecular quadrupole moments, the parameter set DQP5 reproduces the available experimental data well. Table V lists and compares dipole moments calculated by using methods other than those based upon equalization of electronegativity. This table includes a set of dipole moments obtained by fitting to the electrostatic potential and field, calculated from
an ab initio wave function (6-31G** basis set). The precise form of the fitting function is given in footnote c of Table V, and these charge sets, Q,henceforth will be referred to as the field-potential derived (FPD) charge set. A key feature of these results is that dipole moments calculated by using the FPD method are larger than the experimental values. This is related in part to the limitations of the S C F wave function; therefore, it has been suggestedIs that this kind of overestimation be reduced by applying a scaling factor of 0.91 to the calculated net atomic charges. The application of somewhat arbitrary scaling parameters reduces the appeal of a method intended to predict properties ab initio. In addition, the direct application of this method is limited to a set of molecules for which an S C F wave function can be computed. In Table VI, the net atomic charges for the formamide molecule are reported for the various schemes discussed so far. It is evident
N o et al.
4736 The Journal of Physical Chemistry, Vol. 94, No. 11, I990 TABLE IV: Comparison of Calculated and Experimental Gas-Phase Quadrupole Moments" molecule HCHO
ORG 0.60 0.37 -0.97 2.08 -0.1 1 -1.97 3.78 -0.05 -3.72 4.67 -0.08 -4.59 2.90 -1.21 -1.69 2.10 -0.97 -1.12 3.51 -1.41 -2.10
CHJCHO CHIC H 2C H O HCOOH NHZCHO CH3SCHI CHIOCH,
quadrupole moment, D DP 1 DQPl 0.54 0.46 0.33 0.28 -0.87 -0.74 2.32 1.90 -0.10 0.20 -2.22 -2.10 4.1 1 3.61 -0.08 0.30 -4.03 -3.91 6.32 5.50 -1.01 -0.45 -5.31 -5.05 4.40 3.17 0.24 -0.10 -4.65 -3.09 2.48 2.74 -I .3 1 -1.21 -1.28 -1.43 2.93 3.09 -1.23 -1.16 -1.86 -1.17
ORGDH 0.93 0.57 -1 .50 2.92 -0.15 -2.41 5.1 1 -0.65 -4.45 6.68 -0.89 -5.78 3.91 -1.31 -2.60 2.46 -1.13 -1.34 4.01 -1.64 -2.37
A DP5 0.43 0.26 -0.70 1.92 0.15 -2.08 3.66 0.23 -3.88 5.51 -0.54 -4.98 3.32 -0.81 -2.51 2.42 -1.17 -1.25 2.48 -0.98 -I .50
DQP5 0.40 0.24 -0.64 1.76 0.15 -1.92 3.26 0.36 -3.62 5.48 -0.47 -5.01 3.36 -0.30 -3.06 2.82 -1.37 -1.45 2.98 -1.17 -I .80
exptP 0.2 -0. I -0.1 1 .o
0.2 -1.2 3.1 1.1
-4.2 5.2 0.1 -5.3 3.4 -0.3 -3.1 3.2 -1.5 -1.7 3.3 -1.3 -2.0
'Quadrupole moments are written (in three successive lines) in order of their magnitudes. bThe observed quadrupole moments are summarized in refs 26 and 27.
TABLE V: Comparison of Calculated Dipole Moments with Tbose Obtained by Using Different Methods
DM" HCHO CH3CHO HCOOH CHjCOOH NH2CHO CHINHCHO (CH3)zNCHO r-CH3CONHCH3 NH2CONHI CHICOCH,
dipole moment, D FPDC DP5 2.77 2.34 2.64 1.59 1.48 1.65 4.27 3.91 3.76 3.63 3.67 4.40 2.98
CNDO~
2.53 2.68 1.51 1.75 3.75 3.73 3.68 3.82 4.67 2.82
1.94 2.50 0.90 1.17 3.52 3.33 3.33 3.89 4.88 2.91
DQP5 2.26 2.65 1.30 1.57 3.90 3.70 3.57 3.7 1 4.45 3.05
exptl 2.34 2.69 1.42 1.70 3.71 3.gd 3.gd 3.8d 4.56 2.93
"Calculated with the dipole moment parametrized Hockel-Del Re method.I8 bCalculated from the wave function of the CNDO method.'* 'The electrostatic field and potential were calculated at 900 points located between molecular van der Waals radii + I and +4 A by using the 6-31G** basis set. The optimum atom-centered point charges, which describe the potentials and field of ab initio calculations, were obtained by minimizing the following function, D N
D=
L(x x N i
N
+ E(vb- Via'c(Q))2)
(Eiyb- Eijaic(Q))2
j=xy.r
i
where N is the number of points at which the components of the electrostatic field ( E ) and electrostatic potential (V), are calculated with the set of point charges Q. ab, ab initio; calc, calculated. dTaken from ref 18.
TABLE VI: Net Atomic Charges of Formamide Calculated by Using Several Methods
charge, e N
0 C H(C) H,, Hcis dipole moment, D quadrupole moment, D
DC
A
FPD STO-3G
FPD' 6-31G**
-0.787 (-0.353)* -0.433 (-0.330) 0.650 (0.260) -0.071 (0.030) 0.325 (0.196) 0.325 (0.196)
-1.002 (-0.336) -0.588 (-0.444) 0.728 (0.173) -0.007 (0.134) 0.439 (0.236) 0.439 (0.236)
2.82 (2.80)
4.26 (4.23)
2.17 ( I .95) -0.86 (-0.66) -1.31 (-1.30)
3.55 (3.16) -1.17 (-0.98) -2.38 (-2.17)
1.040 (1.061)
0.019 (0.047)
method Mulliken pop STO-3G 6-31G**
" FPD charges are calculated based on the formula in footnote c of Table V .
ORG
PEOE DP5
DQP5 -0.525 -0.367 0.119 0.105 0.311 0.311
exptl
-0.445 -0.279 0.254 0.494 0.213 0.208
-0.729 -0.581 0.562 0.105 0.328 0.315
-0.366 -0.293 0.217 0.132 0.155 0.155
-0.397 -0.388 0.118 0.159 0.254 0.254
2.29
4.24
2.34
3.90
3.90
3.71
2.06 -0.56 -1.05
4.1 1 -1.88 -2.24
2.90 -1.21 -1.69
3.32 -0.81 -2.51
3.36 -0.30 -3.06
3.41 -0.3 -3.1
3.318
0.296
2.370
0.075
0.1 I 5
bThe values in parentheses are obtained at a local minimum; see text. the electrostatic potential, the second term in the formula for D in footnote c of Table V, was used as constraints for obtaining the charge (unit of D: e4/aO2).
The Journal of Physical Chemistry, Vol. 94, No. 11, 1990 4737
Determination of Net Atomic Charges. 1
TABLE VII: Calculation of Gas-Phase DiPole Momentso for the Molecules Not Included in the ODtimization of the Modified PEOE Parameters dipole moment, D DP5 molecule tetrahydropyran I-isopropyl alcohol pivalaldehyde t-n-propyl alcohol cyclohexanone n-morpholine Opa
DQP5
exP
Pa
Mb
WtOt
Pa
Pb
WtOl
Wa
Mb
WIOI
1.71 1.55 2.75 1.22 2.98 2.41
0.58 0.42 0.24 0.65 1.19 0.05
1.82 1.61 2.76 1.38 3.21 2.42
1.52 1.37 2.86 1.46 2.81 2.20
0.82 0.86 0.21 0.31 1.33 0.05
1.73 1.62 2.67 1.50 3.12 2.20
1.53 1.40 2.56 1.54 2.74 1.68
0.82 0.73 0.76 0.21 0.86 0.30
1.74 1.58 2.67 1.55 2.87 1.71
and kb are the components of the dipole moment. Because of the symmetry, only two components are given.
that there are large differences in magnitude between the empirical PEOE schemes and the FPD method. Clearly, direct comparison of the different charge sets is meaningless since such a quantity does not correspond to a physical observable but rather serves as a convenient mechanism for providing an approximate description of the zeroth-order electrostatic properties in molecules, such as multipole moments, the electrostatic potential, and electrostatic interactions between two molecules or different fragments of a large molecule. It is the prediction of these properties that serves as the criterion for testing empirical net atomic charges. Following this prescription, Table VI illustrates the prediction of the dipole and quadrupole moments of formamide; the charges reproduce the ab initio electrostatic potential calculated by using a 6-31G** basis set, as judged by the small value of D. As expected, the modified PEOE methods, parametrized with respect to the multipole moments, reproduce these quantities well, and the FPD method gives the smallest statistical deviation (smallest value of D)in predicting the electrostatic potential. However despite their differences in magnitude, the FPD and DPS/DQPS charge sets predict very similar potential surfaces (small values of D). The differences in magnitude of charges can be depicted pictorially employing the bond dipole descriptions of these charge distributions as shown in Figure la,b. A feature of the FPD method is the presence of near-optimal (nonminima) solutions (NONMS) at which the value of the fitting function, D,is only a little higher than that at the minimum. For example, the charge set given in parentheses in Table VI for the FPD method corresponds to an NONMS, the charge distribution of which is very close to that obtained from the empirical PEOE methods. The bond dipole description of this charge distribution is given in Figure 1b, which illustrates that it is very similar to that of the DP5 bond dipole description (Figure IC). The electrostatic potential surfaces of formamide drawn in the plane of the molecule, computed from an ab initio S C F wave function as well as from the different empirical charge sets, are presented in Figure 2. It is important to consider throughout the following discussion that the a b initio potential surface is subject to the limitations of the Hartree-Fock method. It is envisioned that the accuracy achieved by the calculation, however, reproduces at least the qualitative features of the true electrostatic potential. The surface obtained by using the distributed multipole analysis of Stones (Figure 2b) is capable of reproducing the ab initio potential surface (Figure 2a) up to the region defined by the van der Waals surface to very high accuracy. It is likely that this accurate description of charge density' will be of immense value in the development of potentials used in the simulation of small rigid molecules, although there are obstacles still to be overcome before it can be applied to potentials describing charge distributions of large molecules with many conformational degrees of freedom. The comparison of the FPD potential surface (Figure 2c) to the a b initio surface illustrates the limitations of atomcentered monopole models in general, which are unable to describe angular dependencies close to the van der Waals surface. However, in the region 1-2 A from the van der Waals surface, significant since it represents typical intermolecular separations, agreement between these two surfaces appears good. Likewise, the potential surfaces produced from the modified PEOE charge sets reproduce the ab initio potential surface in this region well (Figure 2e,f). The potential surface produced from the Mulliken
a
6-31GnnPD ( a t global minimum)
b
6-310"' PD
C
DP5
(at an NONMSI
Figure 1. Bond dipole descriptions of the net atomic charges of formamide calculated from (a) the 6-31G** FPD method (at the global minimum), (b) the 6-31G** FPD method (at an NONMS), (c) the modified PEOE method with the DP5 parameter set. The magnitudes of the bond dipole moments are described by arrows, drawn to a scale of 0.3 e A cm-I. The point charges used for the calculation of the bond dipole moments are all described in Table VI.
point charges (Figure 2d) appears to be unsuitable for calculations of electrostatic energies, as does the original PEOE charge set (Figure 2g), which produces a very weak electrostatic potential at both the positive and negative regions of the formamide potential surface. A further evaluation of the quality of the modified PEOE charges has been undertaken by predicting the dipole moments of six molecules not used in the parametrization of the model and comparing to experimental data (Table VII). The calculated components of the total dipole moment are obtained by projecting the net dipole moment onto the principal molecular rotation axes (see the following paperz9 for details). Close agreement with experiment is found for the molecules tetrahydropyran, t-isopropyl alcohol, pivalaldehyde, and t-n-propyl alcohol. However, for the molecules cyclohexanone and n-morpholine, the predicted dipole moments are slightly larger than the experimental ones. However, the error originates mainly from the poor experimental geometry30J1which is apparent since when this geometry is used there remains a large discrepancy between the calculated and independently measured moment of inertia for these molecules. Of the two revised parameter sets, DQPS provides the better description of both dipole and quadrupole moments, which further suggests that this is the set to be utilized in future applications of the modified PEOE method.
Summary The modified PEOE method described here provides a simple and direct approach for calculating net atomic charges for those groups found in polypeptides and proteins. The procedure is not complicated to implement. All that is required for input data is a molecular geometry and atom label list. From the input data it is necessary for the code to determine the connectivity of the (29) No, K. T.; Grant, J. A.; Jhon, M. S.; Scheraga, H. A. J . fhys. Chem., following paper in this issue. (30) Lundolt-B6rnstein; Hellwege, K.-H., Hellwege, A. M., Eds.; Springer-Verlag: Berlin, 1976; Vol. 7, Structure Data of Free Polyatomic MolauL. (31) (a) Reference 30, 1974; Vol. 6, Molecular Constants. (b) Nelson, R. D., Jr.; Lide, D. R., Jr.; Maryott, A. A. Natl. Stand. ReJ Data Sef., Natl. Bur. Stand. 1967. No. 10.
4738
The Journal of Physical Chemistry, Vol. 94, No. 1 I 1990
No et ai.
~
,
.
,
-4
-2
-4
-2
,
,
,
.
,
.
0 ANGSTROM
2
4
-4
-2
0 ANGSTROM
2
4
0
2
4
-4
-2
0 ANGSTROM
2
4
2
4
-4
-2
0
2
4
ANGSTROM
-4
-2
0
ANGSTROM
ANGSTROM
4-
2.
I a
L
0 -
z -2.
-4
1
E -4
-2
0 ANGSTROM
2
4
Figure 2. Electrostatic potential surfaces of formamide calculated (a) from the a b initio 6-31G** wave function, (b) from the distributed multipole analysis2*’(DMA,truncated at quadrupole) with the 6-31G** basis set, (c) with the 6-31G** FPD p i n t charges, (d) with Mulliken point charges of the 6-31G** basis set, (e) with modified PEOE charges calculated with the DP5 parameter set, (f) with modified PEOE charges calculated with the DQP5 parameter set, and (g) with PEOE charges calculated with the original PEOE parameter set. The units of the electrostatic potentials on the contours are kJ/mol. The full lines indicate regions of positive potential, while the dashed lines indicate regions where the potential is negative.
The Journal of Physical Chemistry, Vol. 94, No. 11, 1990 4739
Determination of Net Atomic Charges. 1 Modified PEOE Procedure input coordinalar can be
31 internal coord~nal(ls
determine
initio! Choiges reod l h e modified PEOE poromelers
iterative procedure
equmn
a
u s e equation l o
Aq"';.'
for a l l the bond pairs
u s e equotion l b
usuaIIy converges for
n < 10
Figure 3. Description of the algorithm to calculate the net atomic charges of the atoms of a molecule by using the modified PEOE procedure. -0.362
0.020 HC N --
"T 1 10.075
H 0.020
0.147
-0.357
"T ' I
Q.025
-0.374
C 0.119
H 0.324
0.040
H-C-H
O.OI4
I
H 0.014
"7-
-0.369
7
0.025
C-H
0.154 L
10.066
H 0.324
H 0.025
O.OI4
Figure 4. Partial charges of N-acetylalanineNhethylamide, computed by the modified PEOE method, with the DQPS parameter set (see Figure 3).
molecule (number and types of bonds associated with each atom) and characterize the hybridization state of the individual atoms, according to the classes given in Table 11. This done, it is straightforward to code the iterative PEOE procedure, described
in Methods, obtaining the electronegativity parameters from a set of internally stored "look up" parameters. A pictorial representation of the method is given in Figure 3. An example of a set of conformation-independent charges for N-acetylalanine N'-methylamide, computed by this method, is given in Figure 4. It should be possible to interface any such code as that just described to already existing application codes designed for molecular mechanics calculations. The cpu requirement is approximately linear with the number of atoms in the molecule under consideration but remains trivial even for a large protein. It has been demonstrated that, despite the simplicity of the scheme, it is capable of predicting experimental dipole and quadrupole moments accurately and reproduces within the limitations of the monopole model sections of the electrostatic potential in the region of typical intermolecular separations. In the PEOE model, appreciable transfer of charge takes place during the iterative procedure between atoms separated by one, two, or three covalent bonds; hence, in an approximate way, the method describes the dependence of the charge distribution on atomic environment. Like most empirical methods, with the exception of FEOE,23 the net atomic charges describing the molecular system remain independent of conformation. The difference in the parameter sets DP and DQP arises from the presence of delocalized electron distributions in molecules containing double bonds, especially amides, carboxylic acids, and urea. Since the quadrupole moments play an important role in determining the net atomic charge of these molecules, it is the parameter set DQP5 that is suggested as the most suitable for the calculation of net atomic charges in polypeptide molecules. The extension of this work to ionic species, amino acid zwitterions, and aromatic molecules is presented in the following paper.29 Acknowledgment. We thank Drs. D. R. Ripoll and R. L. Williams for many helpful discussions. This work was supported by the National Institute of General Medical Sciences, National Institutes of Health (GM-14312), by the National Science Foundation (DMB84-018 1l), and by the National Foundation for Cancer Research. K.T.N. thanks the Korea Science and Engineering Foundation for support. The research was conducted using the Cornell National Supercomputer Facility, a resource of the Cornell Theory Center, which receives major funding from the National Science Foundation and the IBM Corp., with additional support from New York State and members of the Corporate Research Institute.