Determination of partial atomic charges from ab initio molecular

Mar 1, 1978 - Edward G. Look and Harry D. Gafney. The Journal of ..... Carlos Alemán, M. Cristina Vega, Lydia Tabernero, and Jordi Bella. The Journal...
0 downloads 0 Views 1MB Size
592

The Journal of Physical Chemistry, Vol. 82, No. 5, 7978

Frank A. Momany

Determination of Partial Atomic Charges from Ab Initio Molecular Electrostatic Potentials. Application to Formamide, Methanol, and Formic Acid Frank A. Msmany Department of Chemistry, Memphis State University, Memphis, Tennessee 38 152 (Received Juk 29, 1977) Publication costs assisted by Memphis State University

A method is presented for obtaining partial atomic charges from a combination of molecular orbital and experimental data. The procedure involves the calculation of an ab initio molecular electrostatic potential, at the minimal STO-3G and 4-31G basis set levels, between a neutral molecule and a positive charge located at various points beyond the van der Waals contact region surrounding the molecule. Partial atomic charges located on the atoms were determined by fitting the classical electrostatic Coulomb potential to the potential obtained from molecular orbital calculations. Electroneutrality was preserved and the magnitude and direction of the experimental dipole moments of the molecule were used in the fitting procedures. The differences between classical Coulombic potentials and ab initio electrostatic potentials are discussed for atomic charges obtained from other procedures using molecular orbital methods. The extension of the procedure described here to other methods of defining the charge distribution is discussed.

Introduction For many years a variety of computational methods have been used to determine the net atomic charges of atoms in mo1ecules.l-l2 The most widely used technique is that of the “population analysis” proposed by Mulliken.z Other procedures, primarily modifications of the Mulliken scheme, have also been p r o p o s e d . l ~In ~ ~each ~ ~ ~case, the problem arises as to how to partition the overlap charge between atoms of different atomic number. In a different approach,8J1 the electronic charge associated with a given atom is obtained by integrating the electronic density over the region of space belonging to that atom. Such a procedure requires a criterion for partitioning the space of the molecule.s The criterion that the gas phase dipole moment of the molecule should be reproduced has not been used in any of the above methods, although the Lowdinl definition conserves the dipole moment of a two-center charge distribution. Thus, even though the dipole moment calculated from the wave functions may be in agreement with the experimental value, the partial atomic charges obtained from the wave functions may give a poor value for the dipole moment. Clearly, when one wishes to use the point charges in a calculation of the classical Coulombic electrostatic potential, as is currently being done in conformational energy calculations on polypeptides, proteins, and nucleic acids, this failure to reproduce the correct magnitude and direction of the dipole moment must be considered. In order to circumvent these problems the molecular electrostatic potential obtained from molecular orbital calculations will be used here to deduce partial atomic charges. The electrostatic interaction energy, W(r), between a molecular electronic and nuclear charge distribution and an external unit positive charge placed a t a point r in the neighboring space around the molecule is given by13

Wr) = 4V(r)

(1)

where, V(r), the electrostatic potential is defined as13-16

When q is chosen to be a positive unit charge (+e), then 0022-3654/78/2082-0592$01 .OO/O

the numerical value of V(r) is the same as W(r), and for convenience the term “electrostatic potential” will be retained throughout to designate W(r). In eq 2, ZAis the nuclear charge of atom A, located a t RAand PPyare elements of the first-order density matrix obtained by summing the products of the coefficients of the SCF-MO’s over all occupied molecular orbitals. The first summation runs over the nuclei of the molecule, the second summations are carried over the electrons. The integrals appearing in the second summation are the nuclearelectron attraction integrals between a hydrogen nucleus and the distribution of atomic orbitals, x*(r’).xV(r’). When V(r) is calculated using the SCF-MO’s for the isolated molecule, it will be designated the unperturbed electrostatic potential, implying that the electronic density is not allowed to redistribute itself (i.e., polarize) because of the perturbing influence of the positive point charge. By allowing the complete system (Le., molecule and point charge) to be carried to self-consistency, the perturbed electrostatic potential, which includes charge transfer and polarization effects, would be obtained. Only the unperturbed potential will be considered in this paper, the results of the perturbed potential will be presented e1~ewhere.l~ The electrostatic potential described above is a rigorously defined physical property of a molecule,18 since in principle no approximations are necessary in order to determine V(r). Further, V(r) is the expectation value of a one-electron operator so that its SCF approximation is correct to one order higher than the SCF wave function emp10yed.l~In contrast, partial atomic charges obtained from a population analysis represent an attempt to condense the charge distribution into a set of point charges, whose magnitudes depend strongly upon the method used to partition the charge distribution, and also on the basis set employed. Recently (see ref 19 and references therein), it has been shown that when the same partitioning method was used with different atomic basis sets, the point charges differed greatly from one basis set to another.lg It will be shown here that by using the electrostatic potential to determine partial atomic charges, the dependence upon basis set differences can be reduced by a simple scaling procedure. Further by introducing an experimental quantity such as the dipole moment into the 0 1978 American Chemical Society

Molecular Electrostatic Potentials

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978 593

and the atoms of the molecule at the distances described above, the Coulombic expression for the electrostatic potential should be valid. In fact, the procedure developed here should provide information about the effects of Method “through-bond’’ interactions on classical Coulombic interaction potentials. This problem will be discussed Molecular Electrostatic Potential. The quantume1~ewhere.l~ mechanical treatment applied here to obtain the electrostatic potential was the ab initio STO-NG m e t h ~ d . ~ J J ~ Data Fitting Procedure. The method used to obtain a least-squares fit of the Coulombic point charge electrostatic The two different basis sets used were the minimal potential to that obtained from ab initio calculations by STO-3G and extended 4-31G basis. Most calculations were varying the partial atomic charges is described below. carried out with the minimal basis set, but selected points Three procedures were adopted in this work. In the first on the potential surface were compared to results comone, only the electostatic potentials (Le,, V(r) and V&)) puted with the 4-31G basis set. Using these basis sets and were used to obtain an optimized set of partial charges on eq 2, the molecular electrostatic potential was calculated the atoms of the molecule. In this case the dipole moment at each coordinate position on a grid of points surrounding resulting from the charges obtained from the optimization the molecule. From the solution of the Roothaan equaof the fit of the potential would not be expected to agree t i o n ~a, first-order ~~~ density matrix7 is obtained for the with the experiment, but would tend to agree more closely isolated molecule, and this density matrix (expanded with to charges obtained from a MO population analysis. In zeros for the unoccupied 1s hydrogen basis function) was the second case, the magnitude and direction of the extaken as the density matrix for calculation of the elecperimental dipole moments were fit, simultaneously with trostatic potential for the unperturbed state (i.e., neutral the electrostatic potentials. In this latter case, the resulting molecule plus + l e charge). The calculated energy (corcharges must reproduce the experimental dipole comporesponding to the isolated molecules density matrix) at nents very closely. However, the fit to the electrostatic each grid point thus gave the electrostatic potential depotentials may not be as good as in case one, since an sired. added constraint has been introduced into the fitting The electrostatic potential was calculated for each point procedure. In the third case, the ab initio electrostatic in space defined by the coordinates of the + l e charge potential is first scaled by a constant factor to obtain an probe. The grid must be chosen carefully. For example, electrostatic potential which when fit by both previous the ab initio potential exhibits bonding properties at methods (Le., case one and two above) results in two sets distances of the order of bond lengths between the charge of atomic charges which are nearly identical. In all three probe and some atoms of the molecule. Further, a recases, electroneutrality of the molecule is preserved. pulsive interaction becomes important a t short contact The procedure adopted was to minimize the deviations distances, even for atoms which exhibit attractive interbetween the ab initio and classical potential using the actions with the +le charge at longer distances. For these function reasons, the grid of points was chosen so that no contacts shorter than normal van der Waals radii were included. 1/2 Further, for marginally close contacts, weighting factors were included to reduce the weight given to these points in the refinement of the partial atomic charges. The long-range electrostatic potential was accurately fit by I-. including experimental dipole moment data in the fitting where u is the standard deviation V(ri) and Vel(ri)are the procedure. Thus grid points did not have to be taken at ab initio and classical values (obtained from a starting set large distances from the molecule. In fact, a grid of only of partial atomic charges) of the electrostatic potential, several Angstroms in depth between the inner van der respectively, and the summations are over the number of Waals distance and outer limit, and of a density of 0.5 A atoms in the molecule, M , and the number of grid points, between points was found to be sufficient. A set of partial N,at which the potential is calculated. The weighting atomic charges obtained from other methods was used as factor f,(N) was chosen so that close interactions (Le., those a starting set in the optimization procedure, and tests for at contact distance slightly within the van der Waals convergence to an optimum fit of the potentials as a radius) will have less weight in the fitting procedure (1.0 function of the number and density of grid points used A for hydrogens, 1.5 A for carbons, nitrogens, and oxygens) were carried out. Where possible, molecular symmetry was than those points beyond the cutoff distances. For these used to help select the most preferable positions of the grid close points, f,(Nwas found to best eliminate distortions of electrostatic potential points. when a value of 0.25 was used; a value of 1.0 was used for Empirical Electrostatic Potential. The classical anaall other points. The minimization was carried out by logue of the ab initio electrostatic potential is the Couvarying the partial atomic charges (starting from some lombic potential. The electrostatic potential of interaction initial set described in the Results section), by use of a of a +le charge located at any point, ri,with a set of partial modified multidimensional Newton-Raphson procedure, atomic charges can be calculated from the equation as described by Shipman et a1.21 For any variation in charge, electroneutrality was maintained by requiring that (3) the charges sum (algebraically) to zero. To determine charges which will give the best fit to the where Vel (kcal/mol) is the classical potential, ri (A) is a experimental dipole moment as well as the ab initio grid point, q, is the partial atomic charge on the j t h atom electrostatic potential, three constraints were imposed. which is positioned at Rj (A), M is the number of atoms, The first was the condition of electroneutrality, which and 332.1 is the conversion to kcal/mol. The summation eliminates one variable. The other two constraints arose is taken over all atoms (with charge) in the molecule. If from the requirement to fit the magnitude and sign of two we ignore problems such as “through-bond” interactions, of the dipole moment components. This effectively reand maintain the contact distances between the +le charge duced the number of independent variable charges in a charge fitting procedure, the final set of partial atomic charges are brought into best agreement with both theory and experiment.

c

9

594

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978

Frank A. Momany I

I

TY,

A 3

1

1

I

I

2

I

0 -I

-2 I

I

I

I

I

a

1

,

I

I

1

I

c -0

H-N1

c -0

H-NI

- 2 - 1 0 1 2 3 4 Figure 1. Isopotential STO-3G unperturbed electrostatic potential map for formamlde. Energy units are kcal/mol and distance units are Angstrams, A shows the y - r plane, B the x-z plane.

-

molecule from M to M 3. Although it is in principle possible to introduce three constraints from the dipole moment equations (Le., px, py, p J , the molecules studied here generally have only two major dipole components; thus, even in those cases where all three dipole components are nonzero, only two constraints were imposed while fitting all the dipole moment components. The ( M - 3) independent variables were then allowed to change in the minimization procedure to achieve a least-squares fit of the classical to the quantum mechanical electrostatic potential (eq 4). As each charge was varied, a new solution to the dipole moment equations and equation for electroneutrality was obtained, so that a simultaneous fit to all the data was achieved. Convergence was accepted when the deviation in fitting parameter, 8, varied by less than 0.001 kcal/mol from one cycle to the next. In each case, several starting sets of charges were examined (see the Results section for differences in starting sets), and all converged to the same final values.

Results The molecules studied were formamide, methanol, and formic acid. Atomic coordinates were taken from experimental gas phase m i c r o ~ a v e ~or~ *electron ~~ d i f f r a ~ t i o nstudies, ~~ and dipole moments were those determined from microwave spectroscopy.22~23~26~z6 The coordinates and dipole moments are given in Table I. Formamide. In Figure 1A,B is shown the isopotential map of the electrostatic potential obtained from a MOSTO-3G basis set, for the (y-z and x-z) planes of the unperturbed formamide molecule. The regions of the maps where V(r) < 0 correspond to attractive interactions, V(r) > 0 to repulsive ones. The strongly attractive regions appear around the electronegative atoms such as the carbonyl oxygen, and above the plane of the molecule over the nitrogen and oxygen. The map of the unperturbed molecule is in agreement with similar maps calculated using different basis sets and somewhat different geometry.13J7 One notable difference between the map calculated here and that of Bonaccorsi et alaz7is the lower energy of our map perpendicular to the plane of the molecule and above the nitrogen. It appears that by using

- 2 - 1 0 I 2 3 4 Figure 2. Fbrmamide isopotential energy difference map. A V = VSTm - V,,,. The Mulliken population charges used to calculate V,,, are the STO-3G population charges shown. The standard deviation IS 13.8 kcal/mol. Energy and distance units are the same as in Figure 1. All charges must be divided by 1000 to obtain atomic units.

I

I

I

- 2 - 1

I

I

I

I

I

I

0 I 2 3 4 Figure 3. Formamide. Same as Figure 2. The charges used are CNDOlP charges. The standard deviation was 13.3 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

the microwave geometry in which the NH2hydrogens are slightly puckered (in the negative x direction in Figure l), the interaction of the + l e charge with the nitrogen becomes more attractive than when the planar geometry is used.27 In Figures 2-4, the isopotential energy difference maps for energies calculated from several sets of atomic charges obtained from molecular orbital calculations compared to the ab initio energy map of Figure 1 are shown. The charges on the atoms are those obtained from a Mulliken population analysis for the particular calculation. In Figure 2, the STO-3G population charges found here are used, and the energy difference map in the region around

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978 595

Molecular Electrostatic Potentials

TABLE I: Experimental Cartesian Coordinates and Dipole Moments of Formamide, Methyl Alcohol, and Formic Acid Cartesian coordinates, A Atomic X Y z Dipole moment, D no., Z FormamideC 8

0

0.99136

6

0

0

7 1

0 0

1 1

2.03967 1.37600 0 1.81072 - 0.51 0 3 1 - 0.46271

0 -1.01264 - 0.84320 0.89561

-0.18051 -0.10944

fix =

0.2a

p y = 2.36; pz = 2.87

ptot = 3.72‘

Methanold 0

8 6

0 0

-1.42600 0.32833 -1.79050 - 1.79050 -1.79050

1 1 1 1

- 0.90210 1.03156 -0.51573 -0.51573

1

0

-1.10600 0.44535 0.74086 1.42887

8 8

1

- 1.28608 0.96551 -1.25517

f i x = 0.98f f i y = 1.38 f i z = 0.0 @tot= 1.70

0.893 27

- 0.89327

Formic Acide 0

0

6

0 0 0 0

0 0 0 0 0

p x = 0.26g p y = 1.39 fiz =

0.0

ptot= 1.415

Transformed from the principal axis a Assumed value because of nonplanarity of experimental microwave structure. Reference 23. e Reference 24. ReferReference 22. and b; the experimental values are f i a = 3.62 D,p b = 0.85 D. ence 25. g Reference 26. I

I

4 ’

A

I

I

I

32-

::@-

-I

-

-2

-

represented by the CND0/2 charges than by the STO-3G charges, but the -NH2 end is again not as positive in the classical potential as the STO-3G electrostatic potential. The dipole moment calculated from the CNDO/2 charges is also low, being 2.3 D. The isopotential energy difference map calculated using Mulliken population charges obtained from an extended ab initio basis set29(designated 11.7.1) and compared to the §TO-3G unperturbed potential is shown in Figure 4. In this case, the energy region around the carbonyl oxygen is more negative when calculated by the classical calculation, and slightly too positive at the -NH2 end. The dipole moment calculated from these atomic charges is 3.6 D, in close agreement with the experimental value. Starting from the three sets of charges shown in Figures 2-4, the data fitting procedure described in the previous section was carried out. It was found that the set of charges resulting from the optimization procedure were relatively insensitive to the number of grid points used in the fitting procedure, but only after 10 points/atom (or -60-100 points/molecule) were included. In some initial grid sets, very close contacts were included to investigate their effect on the minimization procedure. Points which were less than 1.0A. from the hydrogen atoms were found to cause dramatic changes in the final set of charges. Similarly, points within 1.3 A of oxygen and nitrogen could also distort the final charge set. These close contacts were then eliminated from the grid map, and marginally close contacts (Le., between 1.3 and 1.5 A from oxygen) at or near the cutoff radius defined previously were given reduced weights (i.e., f 0.25) in the fitting procedure. At this point, addition of more grid points at longer distances from all the atoms of the molecule had a very little effect on the final charge distribution. The set of charges obtained without the dipole fitting criterion and the isopotential energy difference map using these charges are shown in Figure 5. The charge set and difference map obtained when the dipole constraint is included is shown in Figure 6. Clearly, the fit to the electrostatic potential, without the dipole condition, is also better than that when the dipole constraint was used, and is also better than that obtained with any set of Mulliken charges. The atomic

-

I

H-N-CC=O I

I

I

1

1

I

- 2 - 1

0

I

2

3

4

I

Figure 4. Forrnarn!.de. Same as Figure 2. The charges used are 11.7.1 Mulliken population charges. The standard deviation was 7.8 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

the carbonyl oxygen shows clearly that the carbonyl oxygen does not exert as much attractive energy in the Coulombic electrostatic energy as does the unperturbed molecular orbital electrostatic potential calculated using the same basis set as that from which the atomic charges were obtained. The -NH2 region is not as positive in the classical potential as the §TO-3G unperturbed potential. The dipole moment calculated using the STO-3G atomic charges is only 2.1 D, as compared to the 2.6 D moment obtained from the wave functions, and both are low compared to the experimental value of 3.72 D. Figure 3 shows the isopotential energy difference map using charges obtained from CND0/2 calculations.28 The region around the carbonyl oxygen is somewhat better

-

596

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978

Frank A. Momany

I

A 3-

I

I

I

1

I

I

I

I

I

I

I

- 2 - 1 0 I 2 3 4 Flgure 5. Formamide. Same as Figure 2. The charges are those resulting from fitting to the STO-3G potential with electroneutrality. The standard deviation is 3.7 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

I

.”’

I

1

I

1

I

I

I

I

I

I

I

I

- 2 - 1 0 I 2 3 4 Flgure 6. Formamide. Same as Figure 2. The charges are those resulting from fitting both the STO-3G potential and the experimental dipole components. The standard deviation is 6.7 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

charges obtained without the dipole constraint give a dipole moment of 2.8 D, and they are very different from the STO-3G Mulliken population charges (see Figure 2). The charges of Figure 5 are considerably closer to those obtained with the extended 11.7.1 basis set (see Figure 4) than to any of the other Mulliken charge sets. In the case shown in Figure 6, the final energy deviation is not as good as that obtained when no dipole condition was used (i.e., a standard deviation of 6.8 kcal/mol in Figure 6, compared to a standard deviation of 3.7 kcal/mol for the fit without the dipole moment constraint, shown in Figure 5). However, the charges are not radically different from those found without the dipole condition.

I

I

I

I

I

]

-4 -3 -2 -I 0 I 2 3 Flgure 7. Isopotential STO-3G unperturbed electrostatic potential map for methanol. Energy units are in kcal/mol and distance units are Angstroms. A shows the x-y plane, B the x-L plane.

Obviously, the demand that the long-range electrostatic potential (Le., dipole moment terms) be satisfied causes some distortion in the electrostatic short-range potential for the atom centered model. However, since both shortand long-range effects are fit when the dipole moment components are included, it is believed that the set of partial charges of Figure 6 is the best atom-centered charge distribution one can obtain without modification of the electrostatic potential for the structure of formamide used here. In the next section a “scaled” electrostatic potential will be examined, which satisfies both the theoretical and experimental constraints with the same set of charges. Methanol. The STO-3G electrostatic potential for methanol is shown in Figure 7. The negative energy region around the oxygen is similar in magnitude to the negative region found around the carbonyl oxygen of formamide. The positive energy region around the hydroxyl hydrogen is also similar in magnitude to the region around the amide hydrogens of formamide. The repulsive energy region around the methyl group is considerably less positive than that around the hydroxyl hydrogen, as one would expect if these hydrogens were not highly positively charged. It is interesting to note that a negative potential energy region in the vicinity of the methyl hydrogens is not found here, although a negative energy region has been observed by Politzer and Daiker30 using CND0/2 molecular wavefunctions. Possibly, the polar hydroxyl group is responsible for the lack of a negative region in our calculation, although differences between the STO-3G and CND0/2 methods may also be the reason. In Figure 8 is shown the isopotential energy difference map comparing the STO-3G ab initio electrostatic potential with the Coulombic potential calculated using the partial atomic charges obtained from the population analysis of our STO-3G calculation on isolated methanol. The dipole moment obtained from these STO-3G charges is 1.11D, while the wavefunction gave 1.49 D. The latter value is closer to the experimental value, 1.70 D. The energy differences in Figure 8 indicate that the oxygen is not sufficiently negatively charged, and the hydroxyl hydrogen not sufficiently positive to fit the unperturbed potential in these regions. The charges on the methyl group seem to reproduce the ab initio potential quite well

1

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978 597

Molecular Electrostatic Potentials

-I

-2

2I -

B -2

-3 H-,C-0-H

O

I

I

I

I

I

I

-3 -2 -i 0 I 2 3 Figure 8. Methanol isopotential energy difference map. A V = V ,, - Vemp. The partial atomic charges used to calculate Vempare the STO-3G Mulliken population charges shown. The standard deviation is 6.9 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2. -4

I

-3 -2 -I 0 I 2 3 Figure 9, Methanol. Same as Figure 8. The charges are those resulting from fitting to the STO-3G potential with electroneutrality. The standard deviation is 3.7 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

a t the methyl end. Using only the electroneutrality condition and fitting the electrostatic potential by allowing M - 1charges to vary, the charges and energy differences shown in Figure 9 were obtained. The resulting dipole moment for this set of charges was 1.42 D, very close to the dipole moment obtained from the STO-3G wave functions. The deviations in energy around the hydroxyl group are significantly lower than that shown in Figure 8 for the STO-3G partial charges. The dipole moment is also closer to the experimental value than was the value obtained from the STO-3G charges. Finally, the dipole components were included in the fitting procedure, and the results are shown in Figure 10. There is a significant improvement in the energy differences surrounding the polar hydroxyl group, relative to the isopotential difference map from the STO-3G charges, but the region around the

I

1

I

I

I

-4 -3 -2 -I 0 I 2 3 Figure 10. Methanol. Same as Figure 8. The charges are those resulting from fitting both the STO-3G potential and the experimental dipole components. The standard deviation is 4.1 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

0 -4

IH'

H-C-0-H

-4 -3 -2 -I 0 I 2 3 Flgure 11. Isopotential 4-31G unperturbed electrostatic potential map for methanol. A shows the x-y plane, B the x-z plane. Energy and distance units are the same as in Figure 1.

methyl group showed some increased energy differences. The overall fit (standard deviation of 4.1 kcal/mol) was significantly better than that found using the STO-3G charge set (standard deviation of 6.9 kcal/mol), but not as good as the fit using only electroneutrality (standard deviation of 3.7 kcal/mol). It is important to establish the extent of the effect that the basis set used in the ab initio calculation will have on the electrostatic potential. We have examined this problem by carrying out a calculation of the unperturbed potential on methanol, using the extended 4-31G basis set. The results of this calculation are shown in Figure 11, and can be compared to the results for the STO-3G basis set shown in Figure 7. In Figure 12 is shown the energy difference map between the two basis sets &e., AV = VS,,,, - V4-,,,). From Figure 12 we can immediately see that (1)in the negative energy region, the 4-31G energy

598

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978

Frank A. Momany

I A I

2I -5

0

-2 -I

I

1

1

I

I

IH‘

I

I

I

I

I

-3 -2 -I 0 I 2 3 Figure 12. Methanol isopotentialenergy difference map. A V = V ,, - V,. Energy and distance units are the same as in Figures 1 and 2.

I

-4

I

-4

1

-3

-2

I

-I

0

I

I

I

/ ,

I

I

\Yz

I

2

1

3

Flgure 13. Methanol isopotential energy difference map. A V = \/e370 - Vemp. The partial atomic charges used to calculate V,, are the 4-31G populatlon charges shown. The standard deviation is 7.5 kcal/mol. Energy, distance, and charge units are the Same as in Figures 1 and 2.

is much more negative than the STO-3G energy, and (2) in the positive energy region, the 4-31G energy is much more positive than the STO-3G energy. The general features of the maps of Figures 11 and 7 are very similar in their positive and negative regions. The greater enhancement of the energy regions for the 4-31G basis set also shows up in the calculation of the dipole moment from the wavefunctions. For example, the 4-31G dipole moment components from the wavefunction are = 1.479, pya= 1.816, and ptot = 2.342 D. These values deviate considerably from the experimental values given in Table I, and are in poorer agreement than the STO-3G values. The partial atomic charges obtained from a population analysis of the 4-31G calculation are shown in Figure 13, are more highly polarized (i.e., show greater charge separation) than those from the minimal basis set (see Figure 8), and these atom centered charges lead to dipole components of px =

I

I

I



I

I

-4 -3 -2 -I 0 I 2 3 Flgure 14. Methanol isopotential energy difference map. A V = VsTo.,G(scaled)- VemP.The scaling factor was 1-19 and the charges without brackets are those obtained from fitting the scaled potential with electroneutralii. The charges shown in brackets are those obtained by fitting both the scaled potential and the dipole moment components. The standard deviation with and without the dipole constraint is 4.4 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

2.557, py = 1.535, and ptot = 2.982 D. Figure 13 also shows the energy difference map for the 4-31G potential compared to the classical Coulombic potential using the atom centered charges obtained from the 4-31G Mulliken population analysis. Clearly we can say at this point that the energy differences between the two basis set calculations are certainly as large as the energy differences found when we compare the two ab initio electrostatic potentials to the classical Coulombic potentials found using the charges obtained by the population analysis of each. We now must consider which basis set is more likely to give us the best set of partial atomic charges when fit with the dipole moment constraint. The minimal basis set gives a better dipole moment in the case of methanol, but more importantly, the percent increase in dipole moment, going from the minimal to extended basis set, is directly reflected in the average percent increase in magnitude of the values of both positive and negative electrostatic potential. That is, calculating the energy difference between the STO-3G and 4-31G energy values grid point by grid point, we obtain an average increase (both negative and positive) in energy of -57%. This leads us to suggest that in order to obtain the best electrostatic potential to fit, we should increase the STO-3G interaction energies by a percentage equivalent to the percent increase in STO-3Gdipole moment (from the charges found without the dipole constraint, see Figure 9) required to reach the experimental value of the dipole moment. This requires a 19% increase in the STO-3G electrostatic potential energy values. Similarly, a 43% decrease in the 4-31G potential should give an equivalent result. The results of increasing the STO-3G potential energy by 1.19 and subsequent fitting of the electrostatic potential with only electroneutrality as a constraint gave the partial atomic charges shown in Figure 14. Figure 14 also shows the charge distribution and energy difference map obtained when the dipole moment constraint was included. It must be carefully noted here that scaling the electrostatic potential obtained from the

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978 599

Molecular Electrostatic Potentials

I

I

1

I

I

I

I

I

I

-3 -2 -I 0 I 2 3 Flgure 15. Isopotential STO-3G unperturbed electrostatic potential map for formic acid. A shows the x-y plane, B the x-z plane. Energy and distance units are the same as in Figure 1.

STO-3G or 4-31G basis sets is not equivalent to scaling the partial atomic charges obtained by a Mulliken population analysis using the SCF-MO's. One example of this can be seen by looking at the carbon atom. The STO-3G partial atomic charge for carbon is -0.062 au (see Figure 8), the 4-31G charge for this atom is -0.125 au (see Figure 13), but the charge resulting from the fit to the STO-3G electrostatic potential, both with and without the dipole moment constraint, is +0.096 au (see Figure 10) and +0.245 au (see Figure 9), respectively. The charge from the fit to the scaled potential is +0.287 au (see Figure 14). Similar changes in sign are to be noted for the charges on the methyl hydrogens. Obviously there is no way to simply scale the MO charges and obtain a fit to the scaled electrostatic potential as good as that found here. On the other hand, if one scales directly the charges obtained by fitting the unscaled STO-3G electrostatic potential (see Figure 9), the results are nearly identical with the results from fitting the scaled potential. This is not the case when we attempt to scale the charges obtained from fitting the unscaled STO-3G electrostatic potential with the dipole constraint included. The perturbing effect of the dipole constraint shows up most noticeably in the atomic charges of the methyl group atoms (compare the charges of Figure 10 to those of Figure 9). Returning to the discussion of the fit to the scaled potential, one observes (see Figure 14) that the charge distributions which resulted from both fitting procedures are now nearly identical. The dipole components from fitting the scaled potential without the dipole constraint are p x = 0.956, p y = 1.394, and pbt = 1.690 D, in agreement with the experimental values, and one can compare this result to the previous value of ptot= 1.42 D for the fit without constraint shown in Figure 9. Clearly, the scaled minimal basis set electrostatic potential is capable of giving a close correspondence to the experimental data for the case described here. Formic Acid. The third polar molecule studied here is formic acid. The STO-3G electrostatic potential for the unperturbed molecule is shown in Figure 15. The two regions of negative energy are similar to those found previously around the carbonyl oxygen of formamide, and

I

I

I

I

I

I

I

I

I

I

1

I

I

I

I

I

-2 -I 0 I 2 3 Figure 16. Formic acid isopotential energy difference map. A V = V,,,.,, - Vemp. The partial atomic charges used to calculate Vsmp are the STO-3G Mulliken population charges shown. The standard deviation is 6.1 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

-3

I

-3 -2 -I 0 I 2 3 Figure 17. Formic acid. Same as Figure 16. The charges are those resulting from fitting the STO-3G potential with electroneutrality. The standard deviation is 2.5 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

the hydroxyl oxygen of methanol. The repulsive region around the acid proton indicates a signifiant positive charge, while the less repulsive region around the -CH hydrogen indicates a lesser degree of positive charge in this region. The STO-3G partial charges and isoenergetic difference energies are shown in Figure 16. The dipole moment arising from the STO-3G charges is 1.19 D, while the wave functions gave 0.746 D. It is interesting that, in this case, the atomic charges give a dipole moment which is in closer agreement with experiment, 1.415 D, than the dipole moment calculated from the wavefunctions. The best fit to the electrostatic potential without the dipole moment condition is shown in Figure 17. The resulting

000

The Journal of Physical Chemistry, Vol. 82, NO. 5, 1978

1

I

I

I

I

Frank A. Momany

0 -I

I

I

I

I

I

1

1

-2 -I 0 I 2 3 Figure 18. Formic acid. Same as Figure 16. The charges are those resulting from fitting both the STO-3G potential and the experimental dipole components. The standard deviation is 4.1 kcal/mol. Energy, distance, and charge units are the same as in Figures 1 and 2.

-3

charges, while considerably different from the STO-3G charges, gave a dipole moment of 0.908 D. However, it should be noted that the energy difference map is significantly better than that obtained using the STO-3G charges (standard deviation of 2.5 kcal/mol compared to 6.1 kcal/mol for the STO-3G charges). Figure 18 shows the result after fitting the electrostatic potential with the dipole condition included. The standard deviation is considerably smaller than that for ,the STO-3G charge set (4.1 vs. 6.1 kcal/mol) but, as would be expected, worse than the deviation from the unconstrained fit. We can see that the carbonyl oxygen seems to be somewhat too negative, while the hydroxyl oxygen is not sufficiently negative (compare the charges of Figures 6 and 10 to the equivalent groups of formic acid). The scaled electrostatic results are presented in the next section. Scaled Electrostatic Potentials. The results presented for methanol, upon scaling the magnitude of the values of the STO-3G electrostatic potential, or by scaling the charges obtained without the dipole constant, by the dipole moment magnitudes, were encouraging. Thus, we have treated formamide and formic acid in the same manner. In Figure 19 is shown the energy difference map and resulting charges for formamide after scaling the STO-3G electrostatic potential for formamide by 32%. The charges obtained by fitting the scaled potential are again nearly identical with those obtained from including the dipole moment components as constraints. The dipole moment of ptot = 3.68 D obtained without constraints (except for electroneutrality) is in excellent agreement with the experimental value of 3.72 D. Similar results for formic acid, scaled by a 56% factor, are shown in Figure 20. Again, the fit to the electrostatic potential resulted in charges which reproduced the dipole moment data very well. In both of these molecules the standard deviation of the fitted scaled potential was lower than the fits including the dipole moment constraint on the unscaled STO-3G potential. The charge distributions arrived at by this scaling procedure are most probably the best set of charges that can be obtained since they are best fits to both the theoretical electrostatic potentials (excluding the change in magni-

c-0

I

I

I

1

1

I

I

I

I

I

I

I

- 2 - 1 0 I 2 3 4 Flgure 19. Formamide isopotential energy difference map. A V = V,To.,G(scaled) - Vemp.The scaling factor was 1.32 and the charges without brackets are those obtained from fitting the scaled potential with electroneutrality. The charges shown in brackets are those obtained by fitting both the scaled potential and the dipole moment components. The standard deviation without the dipole Constraint was 4.9 kcal/mol, with the dipole 5.1 kcal/mol. Energy, distance, and charge unlts are the same as in Figures 1 and 2.

-2

-

-3

I

0 ‘

H-C-*H

0

I

I

I

I

I

-3 -2 -I 0 I 2 3 Flgure 20. Formic acid isopotential energy difference map. A V = V,.,(scae l d) - Vemp.The scaling factor was 1.56 and the charges shown without brackets are those obtained from fitting the scaled potential wlth electroneutrality. The charges shown in brackets are those obtained by fitting both the scaled potential and the dipole moment components. The standard deviation without the dipole constraint was 3.9 kcal/mol, with the dipole 4.0 kcal/mol. Energy, distance, and charge units are the same as Figures 1 and 2.

tude) and the experimental dipole moments. If we further compare the charges obtained with and without the dipole moment constraint from the unscaled potentials to those obtained from the scaled potentials, we find a close correspondence in all cases in the ratio of scaled to unscaled when no dipole constraint was included. However, the ratios obtained using the unscaled results

Molecular Electrostatic Potentials

with dipole moment constraint are not in close agreement for reasons pointed out in the methanol section. This lends further support to our conclusion reached above, that the scaling procedure will result in the best set of partial atomic charges, better even than the unscaled potential with the dipole moment constraint.

Conclusion Two important results of this study need emphasizing here. First, the partial atomic charges found by fitting the electrostatic potential, with only electroneutrality as a constraint, are a better representation of the SCF unperturbed electrostatic potential than the charges obtained from a Mulliken population analysis. This result shows up for each molecule studied as a lower standard deviation between the empirical electrostatic potential and the SCF potential, when the empirical calculation is from the charges obtained in the fitting procedure. Second, experimental dipole moment components can be used to linearly scale the SCF interaction energy so that the charges which result from fitting this scaled electrostatic potential reproduce the experimental dipole moment data very closely, even when no dipole constraint is used in the fitting procedure. Equivalently, the charges obtained from fitting the unscaled MO electrostatic potential without the dipole constraint can also be linearly scaled to obtain the experimental dipole moment. The importance of this scaling procedure to obtain atom centered charges for empirical electrostatic interactions cannot be stressed enough, since no other method available to date will reliably generate acceptable atomic charges. A further advantage to using the electrostatic potential as a means of obtaining a point charge distribution arises when one wishes to expand the number of charge locations or go to an electron and nuclei approach, such as that developed by Shipman et al.21931In their potential, integral charges are maintained for the nuclei, and the electrons are integral charges placed on the bonds or in positions off the atoms which reflect lone-pair electrons. The electron positions are the variables in this method, and work in progress1’ will describe how one can extract the best fit to the “scaled” electrostatic potential, and simultaneously fit other experimental data.21i31 More recently, Bonaccorsi et a1.32133have used MO electrostatic potentials to find a system of unit point charges (not partial atomic charges) which, when combined with the LCAO representation for regions close to the molecule, gives information as to protonation sites and stability of hydrogen-bonded complexes equivalent to that found from the more rigorous SCF electrostatic potential. It is proper here to also point out a procedure one might wish to use to selectively fit certain regions of the electrostatic potential, better than other regions around the molecule. For example, if one wished to obtain the very best fit possible to the region around the hydroxyl oxygen in methanol, and did not care if the region around the methyl were fit well, one could simply use reduced weighting for the energy points around the methyl group, and the resulting charges will reflect an improved fit around the hydroxyl. One might consider using these approaches with very large molecules where the interest in the electrostatic potential was localized in a small region around the polar atoms. The authors primary interest in developing new methods for determining partial atomic charges is to improve the

The Journal of Physical Chemistry, Vol. 82, No. 5, 1978 601

Coulombic contribution to the total energy for conformational energy calculations on large biologically interesting molecules. For cases such as polypeptides and proteins, molecular orbital methods simply cannot be applied. Thus, point charges must be used to mimic as closely as possible the true electron distribution. Finally, the computational time required to obtain a charge distribution in the manner presented here is somewhat greater than that required in a single SCF calculation. In general, the calculation of the electrostatic potential will require -N/20 times as much time as a single SCF calculation where N is the number of grid points around the molecule, at which the interaction energy is to be calculated. For the molecules studied here N 100-150; thus the calculation is -5-7 times as long as a single SCF run. The time required to optimize the atomic charge distribution, once the electrostatic potential is available, is usually trivially small when compared to the SCF calculations. A future paper will deal with the effects of polarization and charge transfer,l’ and the considerably greater computational times required for these studies.

-

Acknowledgment. The author gratefully acknowledges the financial support given by Professor H.A. Scheraga of Cornel1 University to carry out this work at Cornel1 during the summer of 1976. References and Notes (1) P. 0.Lowdin, J. Chem. Phys., 21, 374 (1953). (2) R. S. Mulliken. J. Chem. Phvs.. 23. 1833 11955). E. W. Stout, Jr., and P. Politier, Theor. C h h . Acta, 4, 1 (1966). E. R. Davidson, J . Chem. Phys., 46,3320 (1967). J. A. Pople and M. S.Gordon, J. Am. Chem, Soc., 89,4253 (1967). C. A. Coulson and G. Dogptt, Int. J. Quantum Chem., 2,825 (1968). W. J. Hehre and J. A. Pople, J. Am. Chem. Soc., 92,2191 (1970). P. Politzer and R. R. Harris, J . Am. Chem. Soc., 92,6451 (1970). R. E. Christoffersen and K. A. Baker, Chem. Phys. Lett., 8,4 (1971). R. F. W. Bader, P. M. Beddall, and P. E. Cade, J. Am. Chem. SOC.,

93,3095 (1971). P. Polltzer and P. H.Reggio, J. Am. Chem. Soc., 94,8308 (1972). K. Jug, Theor. Chim. Acta (Berlln), 31, 63 (1973). E. Scrocco and J. Tomasi, Top. Current Chem., 42, 95 (1973). R. Bonaccorsi, E. Scrocco, and J. Tomasi, Thew. Chem. Acta(&r/in),

21, 17 (1971). C. Giessner-Prettre and A. Pullman, Theor. Chim. Acta (Berlin),25,

83 (1972).

R. Bonaccorsi, R. Cimiraglia, E. Scrocco, and J. Tomasi, Theor. Chim. Acta (Berlin),33, 97 (1974).

F. A. Momany, in preparation. P. Polltzer and H. Weinstein, Tetrahedron, 31, 915 (1975). A. T. Hagler and A. Lapiccirella, Biopo/ymers, 15, 1167 (1976). The Gaussian 70 program obtained from the Quantum Chemistry Program Exchange, Bloomington, Ind., was used for all molecular orbital calculations reDorted here. L. L. Shipman, A. W. Burgess, and H. A. Scheraga, Proc. Natl. Acad. Sci. U.S.A., 72,543 (1975). C. C. Costain and J. M. Dowling, J . Chem. Phys., 32, 158 (1960). P. Venkatateswarluand W. Gordv, J. Chem. fhvs.. 23, 1200 (1955). A. Almenningen, 0.Bastlansenand T. Motzfeldt,-Acta Chem. band.

23,2848 (1969). G. W. Kwei and R. F. Curl, J. Chem. Phys., 32, 1592 (1960). H. Kim, R. Keller, and W. D. Gwinn, J. Chem. Phys., 37,2748 (1962). R. Bonaccorsi, A. Pullman, E. Scrocco, and J. Tomasi, Chem. Phys. Left., 12,622 (1972). F. A. Momany, L. M. Carruthers, R. F. McGuire, and H. A. Scheraga, J. Phys. Chem., 78, 1595 (1974). D. H. Christensen and R. N. Kortzeborn, J. Chem. Phys., 53,3912

(1970). P. Politzer and K. C. Daiker, Chem. f h G . Lett., 34, 294 (1975). A. W. Burgess, L. L. Shipman, and H. A. Scheraga, Roc. Natl. Acad. Sci. U.S.A., 72,854 (1975). R. Bonaccorsi, E. Scrocco, and J. Tomasl, J. Am. Chem. Soc., 98,

4049 (1976). R. Bonaccorsl, E. Scrocco, and J. Tomasi, J. Am. Chem. Soc., 99,

4546 (1977).