J. Phys. Chem. B 2002, 106, 2779-2785
2779
Sodium Parameters for AM1 and PM3 Optimized Using a Modified Genetic Algorithm Edward N. Brothers and Kenneth M. Merz, Jr.* Department of Chemistry, 152 DaVey Laboratory, The PennsylVania State UniVersity, UniVersity Park, PennsylVania 16802 ReceiVed: July 10, 2001; In Final Form: December 17, 2001
Sodium is very important as a counterion in biology. However, when used with the most common semiempirical Hamiltonians, such as AM1 or PM3, sodium is modeled as a point charge that can accept no electron density, called a “sparkle”. To better model sodium, we derived two sets of sodium parameters, which treat sodium on the same footing as other atoms parametrized in semiempirical methods. One set is compatible with the AM1 parameter set, while the second is compatible with PM3. These parameters were derived using a modified genetic algorithm with a diverse set of 71 compounds. The average unsigned error for the heats of formation was 10.3 kcal/mol for AM1 and 10.5 kcal/mol for PM3.
Introduction Sodium chemistry has a long history1 and sodium compounds have been the focus of both experimental2-18 and computational studies.19-32 Unfortunately, quantum mechanical studies were limited to either ab initio or DFT calculations because there were no appropriate semiempirical parameters33 until 1996 when MNDO/d32 became available. There have been no published sets of sodium parameters for use with other semiempirical methods. When a sodium-containing system is studied semiempirically at the sp-basis NDDO34 level (e.g., MNDO,35,36 AM1,37 and PM338,39), a “sparkle” is used. In this approach, the sodium atom is treated as a point atom that has a defined charge, a heat of atomic formation, and an R parameter, which is involved in determining core-core repulsion interactions. All of the remaining NDDO semiempirical parameters were set to zero. (Good explanations of all of the parameters can be found in the original MNDO paper35 or in the MOPAC 6.0 paper.33) This atom, therefore, has no orbitals or electrons making it incapable of participating in electronic events such as covalent bond formation or charge-transfer interactions. The sparkle concept assumes the atom type in question is purely classical in nature, which has been shown to not be the case for sodium.40,41 Moreover, in cases where sodium serves as a counterion around biomolecules (e.g., DNA, RNA, and proteins) it would be useful to have a fully quantum mechanical counterion, especially for use in linear-scaling QM calculations,42-44 so that the sodium cation could accept charge from the biomolecule or the solvation shell of the biomolecule. Finally, AM1 and PM3 are popular enough to warrant expansion of the available atom types. Because of these considerations, a parametrization of sodium was undertaken, resulting in parameter sets compatible with both the AM1 and PM3 Hamiltonians. Method Development The fundamental idea in parametrization is that through the use of a set of points (i.e., molecules and their associated properties) on a function’s surface it is possible to create a set of parameters such that all points on the surface can be accurately represented using those parameters. In other words, the ability to get the correct values for a test set of points implies the ability
to also obtain the correct answers for the set of points that lies outside the training or test set. The problem lies in choosing a subset small enough to be computationally tractable and yet diverse enough to give a valid sampling of the space. An example of the lack of diversity can be found in the MNDO sulfur parametrization, where the initial choice of the sulfur test set lead to a set of parameters that were incapable of handling high valence sulfur compounds.45-47 The sp-basis NDDO methods were originally parametrized using a gradient-based minimization routine on the error hypersurface. The hypersurface in the parametrizations of MNDO and AM1 was defined by the sum of the squares of the differences between the calculated and actual values for physical measurements times a weighting value. Formally, this is defined as where Etotal is the total error, N is the number of compounds, N
Etotal )
Ei ∑ i)1
(1)
n
Ei )
wj(qj,experimental - qj,calculated)2 ∑ j)1
(2)
Ei is the error of an individual compound, n is the number of values to be compared, w is the weight of the value (assigned units such that error is unitless), and q is the value itself. (The method used in PM3 was a bit more complicated.38) The goal of a parametrization effort is to minimize error as efficiently as possible, which continues to be a nontrivial problem for NDDObased methods because of the complexity of the functional space and the interrelated nature of the parameters. A critical problem in semiempirical parametrization is the lack of analytical derivatives that would greatly increase the speed of the optimization of the function in gradient-based minimization routines. Also, because some of the parameters used in NDDO methods are derived from other parameters (e.g., the F parameters, which ensure correct behavior as the interatomic distance goes to zero or infinity35), there is a complex intertwining of parameters that results in multiple minima. This can create problems for methods based on multiple line minimizations to parametrize a multidimensional function such as Powell’s method.48 Thus, in all of the papers presenting
10.1021/jp012637q CCC: $22.00 © 2002 American Chemical Society Published on Web 02/13/2002
2780 J. Phys. Chem. B, Vol. 106, No. 10, 2002
Brothers and Merz
TABLE 1: Weights Utilized in the Error Function quantity
weight
unit
heat of formation dipole ionization potential atomic distance atomic angles
1 10 10 100 1/ 3
kcal/mol 1/D 1/eV 1/Å 1/deg
NDDO-based parameter sets, the time and difficulty of parametrization is highlighted. For example, this was mentioned in detail in the AM1 halogen49 and PM338 parametrization efforts, where a variety of pitfalls and potential problems were listed. One way to avoid these problems, especially the multiple minima problem, is through the use of a genetic algorithm (GA).50 Genetic algorithms are heuristic/stochastic methods that simulate evolution in the sense that the most fit (accurate) members of a population (parameter sets) are allowed to live from one generation (optimization cycle) to the next and to breed (utilized as the basis from which other parameter sets are derived). For an AM1/PM3 parametrization effort, this would mean that multiple sets of parameters would be generated, and then only the best of these sets would be kept. New parameters are then generated via crossing or mutation of the existing “good” sets. Importantly, GAs are entirely nonlinear and do not require derivatives. An excellent example of using a genetic algorithm to generate semiempirical parameters is Rossi and Truhlar’s calculation of reaction surfaces using points on a reaction surface to obtain accurate NDDO parameters.51 Another example is the recently published AM1 magnesium parameters of Hutter et al.52 Because GAs have already been demonstrated to be an acceptable method of optimizing semiempirical parameters, the question then becomes how to increase the efficiency of the GA approach. In a standard genetic algorithm, a generation is defined by taking m vectors of parameters and then removing m/2 semirandomly with a bias toward preserving the most fit (lowest error). The empty m/2 vectors are then filled with point mutations (randomly altering the value of one of the parameters) and cross breeds (selecting some parameters from one vector and using them to replace the elements of another vector) of the top m/2. The new vectors are then evaluated. This cycle or generation is then repeated as many times as desired. In this study, the standard GA was modified in a few simple ways. One modification utilized in this study was to use the average of two existing vectors as a source of new vectors as well as those produced by breeding and mutating. The reason for this is apparent if a set of points symmetrically surrounding a functional minimum is considered (e.g., (1,1), (1,-1), (-1,1), and (-1,-1) for a minimum located at the origin). No cross breed could reach the minimum, and therefore, a number of mutations would be required to reach the minimum. However, for this example, an average of two points would reach the minimum much faster. Hence, this method of generating parameter sets was added to the GA used in this study. Another modification was in the error function. An absolute value was used rather than a square term (see eq 2) to avoid the change in behavior as the difference goes from above unity to below it. Formally, N
Etotal )
Ei ∑ i)1
(3)
n
Ei )
wj|qj,experimental - qj,calculated| ∑ j)1
(4)
Figure 1. AM1 heats of formation. All values are in kcal/mol. This figure shows calculated versus desired values with the perfect line for reference.
Figure 2. PM3 heats of formation. All values are in kcal/mol. This figure shows calculated versus desired values with the perfect line for reference.
This was also done in part to allow the returned error values to be more readily interpreted. Weights utilized in this study are given in Table 1. Sodium Parametrization The next issue is how to set up the initial vector set. If a parametrization was undertaken to refine an already existing parameter set, it would be possible to simply take the initial parameter set and then generate the rest of the population as that parameter set plus a point mutation. While no PM3 or AM1 parameter set existed for sodium, the values from MNDO/d were utilized as an initial guess, and the Gaussian repulsion terms were all set to unity. Note that the number of a, b, and c Gaussian widths (used to correct core-core repulsion) was held to 6 (as in PM3) rather than the maximum 12 allowed in AM1. One hundred vectors were used. All calculations were done on an SGI Origin 200 with Irix 6.5 using a semiempirical code written in our lab.53 The evolution of the parameters was allowed to continue until the error stabilized. The compounds that made up the test set (i.e., used for actual error evaluation in the parametrization) for the parametrization were chosen via the consideration of four criteria: perceived importance (chemical intuition), error (compounds with large computational errors were added preferentially), representative nature (i.e., they were part of a class of compounds rather than a single compound), and size (i.e., small compounds can be tested much faster than larger compounds). All compounds used in this study had known heats of formation
Sodium Parameters for AM1 and PM3
J. Phys. Chem. B, Vol. 106, No. 10, 2002 2781
TABLE 2: Sodium Compound Heats of Formation compound 1H2O-Na+c 2H2O-Na+c 3H2O-Na+c 4H2O-Na+ 5H2O-Na+ 6H2O-Na+ water-Na+ complexes
actual valuea
65.4 -13.1 -88.9 -162.3 -238.0 -312.0 signed error -2.3 water-Na+ complexes unsigned error 3.4 water-Na+ complexes rms 4.9 NaFc -69.4 -64.2 c NaCl -43.4 -40.3 NaBr -34.4 -30.1 NaI -19.0 -17.1 Na2F2c -202.3 -211.2 Na2Cl2c -135.3 -138.9 Na2Br2 -116.2 -130.5 Na2I2 -84.7 -133.7 + NaF-Na 14.7 1.3 + NaCl-Na 45.1 52.4 HF-Na+ 63.1 27.4 HCl-Na+ 111.3 113.8 halogen signed error -8.4 halogen unsigned error 12.4 halogen rms 18.9 + ammonia-Na 109.0 103.3 isopropylamine-Na+ 94.2 81.6 acetonitrile-Na+ 130.8 128.2 pyridine-Na+ 147.4 134.1 methanimine-Na+ 145.0 126.1 imidazole-Na+ 143.7 147.8 1,2,4-triazole-Na+c 161.9 167.5 + tetrazole-Na 194.6 212.4 + glycine-Na 14.3 -2.8 nitrogen compound-sodium complex signed error -4.7 nitrogen compound-sodium complex unsigned error 10.8 nitrogen compound-sodium complex rms 12.4 ethene-Na+c 145.4 138.5 ethyne-Na+ 187.5 177.8 benzene-Na+ 143.6 126.3 + phenol-Na 100.3 84.6 π cloud-sodium complex signed error -12.4 π cloud-sodium complex unsigned error 12.4 π cloud-sodium complex rms 13.1 methanol-Na+ 70.9 63.5 ethanol-Na+ 61.4 54.7 2-propanol-Na+ 50.7 44.5 + ethanediol-Na 10.3 -4.9 +c methyl ether-Na 79.4 63.7 ethyl ether-Na+ 54.2 39.7 (methyl ether)2-Na+ 12.1 -15.3 dimethoxyethane-Na+ 19.3 -3.5 12-crown-4-ether-Na+ -67.2 -115.0 alcohol-sodium complex signed error -18.2 alcohol-sodium complex unsigned error 18.2 alcohol-sodium complex rms 22.1 a
63.8 -14.6 -89.0 -161.4 -232.3 -301.6
AM1a
PM3a referenceb 70.5 -5.8 -80.1 -153.3 -223.4 -291.9 8.5 8.5 8.6 -47.6 -46.3 -42.2 -25.7 -144.8 -175.9 -133.5 -85.6 47.2 32.4 69.4 99.0 1.4 18.3 24.6 111.4 89.5 137.7 142.1 136.0 137.7 157.5 198.5 1.5 -3.2 6.2 6.8 141.8 177.4 132.7 89.1 -8.9 8.9 9.5 70.2 62.4 50.4 5.7 72.2 51.3 -2.2 11.5 -90.0 -6.6 6.8 9.8
2, 56 2, 56 2, 56 2, 56 2, 56 2, 56
57, 58 57, 58 57, 58 58 57 57 57 32 21, 25, 56 21, 25, 56 21, 25, 56 21, 25, 56
20, 25, 56 25, 56 25, 56 25, 56 25, 56 25, 56 25, 56, 59 25, 56, 59 8, 56
25, 56 25, 56 25, 56 25, 56
compound CO-Na+
actual valuea
111.8 CO2-Na+ 37.9 +c formaldehyde-Na 95.0 +c acetaldehyde-Na 77.3 acetone-Na+c 62.6 formic acid-Na+ 31.4 formate-Na+c -107.2 methyl acetate-Na+ 18.7 formamide-Na+ 68.0 acetamide-Na+ 52.5 + N-methylacetamide-Na 49.2 carbonyl-sodium complex signed error carbonyl-sodium complex unsigned error carbonyl-sodium complex rms hydrogen sulfide-sodium complex 126.9 CH3SH-sodium 131.6 SO2-sodium 60.0 NaSO4 (C2V) -230.8 NaSO4 (C3V) -230.8 Na2SO4c -247.0 sulfur compound-sodium complex signed error sulfur compound-sodium complex unsigned error sulfur compound-sodium complex rms Na+c 145.6 Na2c 34.0 NaH 29.7 NaOHc -47.2 NaO-29.0 Na2(OH)2 -145.2 NaCNc 22.5 Na2(CN)2 -2.1 +c CH3-Na 26.3 methane-Na+ 121.6 Na2O -8.6 NaAlF4 -440.0 H2-Na+ 142.5 N2-Na+ 137.6 miscellaneous compound signed error miscellaneous compound unsigned error miscellaneous compound rms overall compound signederror overall compound unsigned error overall compound rms
AM1a
PM3a
referenceb
109.8 54.0 89.7 76.2 66.0 22.0 -116.1 17.5 67.0 57.6 57.3 0.3 5.6 7.2 134.6 125.2 66.6 -229.8 -227.2 -231.4 4.7 6.8 8.2 141.2 35.3 43.4 -36.2 36.1 -135.6 26.4 -10.3 24.3 122.5 -4.3 -451.6 131.4 137.6 5.2 10.5 18.9 -3.7 10.3 15.4
102.7 46.9 89.4 77.1 66.4 25.0 -116.4 20.5 72.1 59.0 57.2 0.2 5.8 6.5 121.6 115.3 55.0 -238.1 -242.4 -250.0 -8.1 8.1 9.3 153.7 59.1 54.1 -35.9 16.1 -140.5 38.6 15.9 26.7 122.6 -11.5 -426.4 127.0 164.6 12.6 15.2 19.3 1.0 10.5 14.8
25, 56, 63 25, 56 25, 56, 64 25, 56 25, 56 25, 56, 64 25, 56, 64 25, 56 17, 25, 56, 65 8, 56 8, 56
20, 25, 56 25, 56, 60 25, 56 56, 66 56, 66 56
56 56, 58 57, 58 57 56 57 57 57 32 25, 56 32 56 25, 56, 67 25, 56, 67
56, 60 25, 56 25, 56 56, 60 23, 25, 56 25, 56 23, 25, 56 24, 56, 61 24, 56, 62
All values are in kcal/mol. b Includes all references used for a compound in this study. c Denotes compounds included in the parametrization
set.
because this has been the most common form of data previous parametrizations have used.38,54 Finally, the parameters were checked for their physical reasonableness (e.g., the one-center core-electron interactions (Uss) were always negative, etc.) between runs. Results and Discussion For PM3, the average error for the heats of formation is 10.5 kcal/mol, while for AM1, it is 10.3 kcal/mol. These represent dramatic improvements over the errors observed for the complexes based on sparkles (126.5 and 149.3 kcal/mol, respectively). Hence, we observe that our sodium parameter sets represent significant improvements over the sparkle concept. It can be seen from Figures 1 and 2 that the correspondence between desired and calculated variables is relatively constant
throughout the range of heats of formation (see also Table 2), that is, there are no systematic errors at either the high or low ends of the spectrum. Moreover, the average errors (both signed and unsigned) are similar to those for other atom types in AM1 and PM3.38 Stewart makes the comment,38 which addresses the issue of the subsequent addition of atoms to AM1 or MNDO after the first parameter sets had been fixed, “A result of this serial parametrization is that only for the set of elements first parametrized could the parameters be fully optimized and internally consistent.” The present parametrization suffers from this problem, and this can be used to partially explain some of the observed problems with the present parameter sets. For instance, the AM1 sodium parameters always model the interactions of alcohols and carbonyls with sodium cations less
2782 J. Phys. Chem. B, Vol. 106, No. 10, 2002
Brothers and Merz
TABLE 3: Heats of Reaction for [Na(H2O)n-1]+ + H2O f Na(H2O)n n 1 2 3 4 5 6 signed error unsigned error rms a
experimental (kcal/mol)a
AM1 (kcal/mol)a
PM3 (kcal/mol)a
-24.0 -20.6 -16.6 -14.6 -13.1 -11.5
-16.6 -19.3 -16.5 -14.1 -16.5 -14.7 0.4 2.64 3.6
-29.8 -22.9 -20.9 -19.8 -16.6 -15.1 -4.1 4.12 4.3
Reference 2.
accurately than the PM3 parameters. However, the difficulty lies with the previously published parameter sets, because AM1 has been shown to be less accurate in predicting heats of association than PM3.38 Also, no complexes were used in the parametrization of AM1 or PM3, while this test set is made up primarily of intermolecular complexes. Thus, the sodium parameters cannot be any better than the parameter sets to which they are being added. Please note that Table 1 includes all references for parametrization values used in this study. A similar issue is present for the water-sodium complexes. The experimental heat of formation for water is -57.8 kcal/ mol, compared to the AM1 and PM3 predicted values of -53.4 and -59.2. Thus, while the heats of formation of the complexes
TABLE 4: Bond Distances for Sodium Compounds compound
actual parameter valuea AM1a PM3a referenceb
1H2O-Na+c Na-O 2.26 2.24 2.25 2H2O-Na+c Na-O 2.28 2.28 2.28 3H2O-Na+c Na-O 2.30 2.30 2.30 4H2O-Na+ Na-O 2.34 2.32 2.32 5H2O-Na+ Na-O 2.38 2.35 2.36 6H2O-Na+ Na-O 2.44 2.35 2.37 water-Na+ complexes signed error -0.03 -0.02 water-Na+ complexes unsigned error 0.03 0.02 water-Na+ complexes rms 0.04 0.03 NaFc Na-F 1.93 1.63 1.84 NaClc Na-Cl 2.36 2.15 2.22 NaBr Na-Br 2.50 2.19 2.17 NaI Na-I 2.71 2.57 2.54 NaF-Na+ Na-F 2.03 1.83 2.10 NaCl-Na+ Na-Cl 2.48 2.30 2.27 HF-Na+ Na-F 2.15 1.94 2.41 HF-Na+ H-F 0.93 0.83 0.94 HCl-Na+ Na-Cl 2.75 2.67 2.37 HCl-Na+ H-Cl 1.28 1.31 1.27 halogen signed error -0.17 -0.10 halogen unsigned error 0.18 0.17 halogen rms 0.20 0.21 glycine-Na+ Na-O 2.22 2.25 2.25 glycine-Na+ Na-N 2.46 2.30 2.34 nitrogen compound-sodium complex -0.06 -0.04 signed error nitrogen compound-sodium complex 0.09 0.07 unsigned error nitrogen compound-sodium complex rms 0.11 0.08 methyl ether-Na+c Na-O 2.18 2.21 2.28 (methyl ether)2-Na+ Na-O 2.23 2.25 2.29 dimethoxyethane-Na+ Na-O 2.26 2.27 2.32 alcohol-sodium complex signed error 0.02 0.07 alcohol-sodium complex unsigned error 0.02 0.07 alcohol-sodium complex rms 0.02 0.07 CO-Na+ Na-C 2.61 2.24 2.31 CO-Na+ C-O 1.13 1.16 1.13 CO2-Na+ Na-O 2.23 2.31 2.29 formaldehyde-Na+c Na-O 2.17 2.20 2.21 formaldehyde-Na+c C-O 1.20 1.24 1.21 acetone-Na+c Na-O 2.15 2.15 2.15 acetone-Na+c C-O 1.24 1.25 1.23 formic acid-Na+ Na-O 2.15 2.19 2.19 formic acid-Na+ C-O 1.21 1.25 1.23 formate-Na+c Na-C 2.46 2.60 2.66 formate-Na+c Na-O 2.19 2.21 2.26 formate-Na+c C-O 1.24 1.28 1.27 methyl acetate-Na+ Na-O 2.14 2.16 2.20 methyl acetate-Na+ O-C 1.24 1.25 1.24 formamide-Na+ Na-O 2.14 2.15 2.15 carbonyl-sodium complex signed error 0.01 0.02 carbonyl-sodium complex unsigned error 0.06 0.06 carbonyl-sodium complex rms 0.11 0.10
2 2 2 2 2 2
58 58 58 58 21 21 21 21 21 21
8 8
compound
parameter
actual valuea
SO2-Na+ Na-O 2.55 SO2-Na+ S-O 1.49 NaSO4 (C2V) Na-O 2.14 NaSO4 (C3V) Na-O 2.33 Na2SO4c Na-O 2.21 sulfur compound-sodium complex signed error sulfur compound-sodium complex unsigned error sulfur compound-sodium complex rms Na2c Na-Na 3.08 NaH Na-H 1.89 H2-Na+ H-H 1.40 H2-Na+ Na-H 4.58 N2-Na+ N-N 2.10 N2-Na+ N-Na 4.66 miscellaneous compound signed error miscellaneous compound unsigned error miscellaneous compound rms overall compound signed error overall compound unsigned error overall compound rms excluding N2-Na+ and H2-Na+ signed errord excluding N2-Na+ and H2-Na+ unsigned errord excluding N2-Na+ and H2-Na+ rmsd
AM1a PM3a referenceb 2.46 1.45 2.12 2.27 2.21 -0.04 0.04 0.05 2.20 1.57 0.69 2.29 1.11 2.28 -1.26 1.26 1.49 -0.20 0.23 0.54 -0.07 0.10 0.19
2.41 1.45 2.11 2.32 2.20 -0.04 0.04 0.07 2.77 1.46 0.71 2.42 1.10 5.19 -0.68 0.85 1.06 -0.11 0.18 0.40 -0.04 0.09 0.15
25 25 66 66 56
58 58 67 67 67 67
23 23 61
63 63 25 64 64 25 25 64 64 64 64 64 17 17 65
a All values are listed in Å. b References in this table pertain only to geometry. c Denotes compounds included in the parametrization set. d The reason for compiling overall statistics without the H2-Na+ and N2-Na+ complexes are given in the text.
Sodium Parameters for AM1 and PM3
J. Phys. Chem. B, Vol. 106, No. 10, 2002 2783
TABLE 5: Bond Angles for Sodium Complexes compound
parameter
actual value (deg)
NaF-Na+ Na-F-Na 180.0 NaCl-Na+ Na-Cl-Na 180.0 HF-Na+ H-F-Na 180.0 HCl-Na+ H-Cl-Na 180.0 halogen signed error halogen unsigned error halogen rms methyl ether-Na+a Na-O-C 125.0 (methyl ether)2-Na+ Na-O-C 124.9 + (methyl ether)2-Na O-Na-O 180.0 alcohol-sodium complex signed error alcohol-sodium complex unsigned error alcohol-sodium complex rms formaldehyde-Na+a C-O-Na 180.0 acetone-Na+a Na-O-C 180.0 formic acid-Na+ Na-O-C 165.5 formate-Na+a C-O-Na 87.1 methyl acetate-Na+ Na-O-C 165.4 formamide-Na+ Na-O-C 165.9 carbonyl-sodium complex signed error carbonyl-sodium complex unsigned error carbonyl-sodium complex rms SO2-Na+ O-S-O 112.9 NaSO4 (C2V) O-Na-O 69.9 Na2SO4a O-Na-O 66.2 sulfur compound-sodium complex signed error sulfur compound-sodium complex unsigned error sulfur compound-sodium complex rms H2-Na+ Na-H-H 81.2 N2-Na+ Na-N-N 180.0 miscellaneous compound signed error miscellaneous compound unsigned error miscellaneous compound rms overall compound signed error overall compound unsigned error overall compound rms a
TABLE 6: Ionization Potentials AM1 (deg)
PM3 (deg)
180.0 180.0 180.0 180.0 0.0 0.0 0.0 114.8 115.2 155.7 -14.7 14.7 16.2 180.0 178.9 149.1 92.1 138.0 151.0 -9.1 10.8 14.5 97.4 64.4 60.7 -8.8 8.8 10.0 81.3 179.8 0.0 0.2 0.2 -7.0 7.5 11.4
180.0 180.0 180.0 180.0 0.0 0.0 0.0 103.2 102.7 164.4 -19.9 19.9 20.1 180.0 178.4 123.9 93.7 112.7 123.0 -22.0 24.2 32.6 97.6 72.4 68.2 -3.6 6.6 9.1 81.6 180.0 0.2 0.2 0.3 -11.2 12.5 20.9
Denotes compounds included in the parametrization set.
are modeled well, the reaction energies listed in Table 3 show larger errors. In other words, the errors do not cancel each other out, but are additive. This is exacerbated by the relative magnitude of the two energies, i.e. an error of 2 kcal/mol. The error of the water molecule’s heat of formation compounded with the error of the complexes’ heat of formation leads to the observed errors in the modeling of reaction energies. One problem faced in this parametrization was the paucity of experimental data for geometries, as only the water-sodium ion complexes and the sodium halide complexes geometries were experimental values (Tables 4 and 5). The rest of the values were theoretically derived with a variety of methods and basis sets. (Note that the article by Hoyau et al.25 was an invaluable aid in evaluating ab initio data and was the primary source used to evaluate the quality.) One thing that should be noted here is the tendency of both parameter sets to underestimate both distances and angles, as can be seen from the signed errors of -0.11 Å and -11.2° for PM3 and -0.20 Å and -7.0° for AM1. The worst compounds were the complexes of a sodium ion with a hydrogen molecule or with a nitrogen molecule. If these two compounds are excluded, the average signed bond distance error drops to -0.04 Å for PM3 and -0.08 Å for AM1. The new parameters have difficulty in modeling these two complexes because they underestimate the H-H or N-N bond distance and also overestimate the sodium ion-molecule distance. Thus, the statistics generated without these complexes included are much more representative of the overall performance of the new parameters.
compound
actual value (eV)
AM1 (eV)
PM3 (eV)
NaCla NaBr NaI Na2a
8.92 8.31 7.64 4.89
9.67 8.84 7.74 5.15
8.62 10.51 7.07 4.93
a
Denotes compounds included in the parametrization set.
TABLE 7: Dipole Moments compound
actual value (D)
AM1 (D)
PM3 (D)
NaFa NaCla NaBr NaI
8.16 9.00 9.12 9.24
4.98 7.92 8.23 9.35
6.47 8.63 8.24 9.51
a
Denotes compounds included in the parametrization set.
The geometries of the sodium cation-water complexes or sodium cation-carbonyl oxygen compounds are much better, and because this is the important part of sodium modeling (i.e., sodium as a counterion), the computed geometries were deemed acceptable. Finally, we cannot comment unequivocally on the other physical properties such as dipole moment and ionization potential (see Tables 6 and 7). These values were limited by both the number of points and the narrowness of compound type. Many more experimental values would be required to indicate whether these values are well reproduced or not. However, for the points available, the data agrees fairly well with the available experimental values. The new parameters are comparable to the results obtained with MNDO/d, and a much larger range of compounds was used in this study than in the MNDO/d parametrization. The published MNDO/d parameter set gave an average error of 7.6 kcal/mol over 23 compounds, and these 23 compounds were almost exclusively Na-X or Na2X2 compounds. Geometrical errors of 0.12 Å over 16 compounds and 5.9° for four compounds were found for distances and angles, respectively. Additional testing would be necessary for a true comparison because there were no test cases for sodium associated with carbonyl oxygens or other small complexes in the MNDO/d paper. Table 8 lists the MNDO/d compounds and the results obtained from this parametrization for those compounds. Please note that two of the compounds from the MNDO/d sodium tests were removed because they were open-shell molecules. Conclusions It can be seen from the data described above that the new parameter sets are an improvement over the “sparkle” method for predicting the behavior of sodium because the new parameter sets realistically model a large portion of sodium chemistry. Over the same set of compounds included in this paper, sparkles produced errors of ∼100 kcal/mol for heats of formation. The errors in the new parameter sets are comparable with the errors in existing parameter sets for other atom types, but sodium is at the higher end of the spectrum. It has been noted55 that there would be difficulties encountered in adding sodium and other alkaline earth (Group 1A) metals to NDDO methods because attempts at parametrization had yielded results with larger errors than the other atom types in the method. Whether this was due to chemical reasons or methodological considerations is unknown. For example, it may be that the scant electronic density of the outer shell of an ionically bound sodium exacerbates this problem.
2784 J. Phys. Chem. B, Vol. 106, No. 10, 2002
Brothers and Merz
TABLE 8: Comparison of Heats of Formation for the MNDO/d Parametrization Set compound
desired value (kcal/mol)
145.6 Na+ Na2 34.0 NaH 29.7 CH3-Na 26.3 NaCN 22.5 NH3-Na+ 109 Na2O -8.6 NaOH -47.2 Na2(OH)2 -145.2 NaF -69.4 Na2F2 -202.3 NaCl -43.4 Na2Cl2 -135.3 NaBr -34.4 Na2Br2 -116.2 NaI -19.0 NaI2 -84.7 1H2O-Na+ 63.8 2H2O-Na+ -14.6 3H2O-Na+ -89.0 4H2O-Na+ -161.4 signed error unsigned error rms
AM1 (kcal/mol)
PM3 (kcal/mol)
MNDO/d (kcal/mol)
141.2 35.3 43.4 24.3 26.4 103.3 -4.3 -36.2 -135.6 -64.2 -211.2 -40.3 -138.9 -30.1 -130.5 -17.1 -133.7 65.4 -13.1 -88.9 -162.3 -1.3 7.2 12.4
153.7 59.1 54.1 26.7 38.6 111.4 -11.5 -35.9 -140.5 -47.6 -144.8 -46.3 -175.9 -42.2 -133.5 -25.7 -85.6 70.5 -5.8 -80.1 -153.3 6.0 13.5 19.3
145.6 30.5 27.1 23.4 27.2 119.2 -10.8 -41.5 -139.8 -68.5 -221.3 -36.1 -108.4 -29.8 -99.3 -24.9 -92.3 65.7 -13.1 -90.0 -163.5 1.9 6.3 9.2
TABLE 9: Parametersa parameter
AM1
PM3
Uss Upp ζs ζp βs βp Gss Gpp GSP GP2 HSP R Gaussian A1 Gaussian A2 Gaussian B1 Gaussian B2 Gaussian C1 Gaussian C2 D1 D2 AM AD AQ electronic atomic energy
-5.255 536 2 -2.081 278 1 0.679 779 2 1.217 046 8 -1.453 694 4 -0.229 806 4 7.345 917 8 4.113 051 6 8.404 255 0 4.037 095 7 0.248 769 9 2.248 716 4 0.532 266 8 0.922 359 8 0.480 030 4 1.907 677 6 1.168 105 5 1.153 767 0 1.589 975 5 1.374 901 9 0.269 971 3 0.161 804 7 0.186 573 8 -5.255 536 2
-5.294 592 9 -2.459 656 4 1.137 501 1 1.187 743 3 -0.151 087 0 -0.218 409 6 3.955 869 2 5.336 396 3 7.192 910 9 5.058 807 4 0.568 788 9 2.367 716 9 0.643 365 5 1.087 178 8 1.546 505 4 1.452 900 0 0.997 669 9 1.450 609 9 1.735 237 7 1.408 823 0 0.145 382 9 0.212 106 3 0.257 582 9 -5.294 592 9
a
Names correspond to those found in the MOPAC 6.0 paper33.
A final issue is the number of ab initio data points that were included in this study. Without them, there clearly is a paucity of data to confidently cover a broad range of bonding situations to produce reliable parameters. However, ab initio data “robs” the semiempirical method of one of its great strengths, namely, the implicit incorporation of electronic correlation provided by experimental data. Nonetheless, ab initio and DFT methods are a valuable resource for quantitative information that cannot be or has not been obtained from experiment, especially in terms of geometric parameters. Despite these facts, the parameter sets provided in Table 9 are good at predicting the chemical behavior of sodium in a variety of bonding types. They represent a drastic improvement over the “sparkle” methodology. The new values also fall within the range of values defined by other atom types parametrized
for NDDO methods. Also, while the AM1 parameters appear to outperform the PM3 parameters for certain classes of compounds in certain situations, the PM3 values are recommended because PM3 is a more accurate method overall. Acknowledgment. We thank the National Center for Supercomputer Applications (NCSA) for generous allocation of supercomputer time. We also thank the NSF for support of this project via a Graduate Research Training Grant, Number DGE9354958. References and Notes (1) Cotton, F. A. AdVanced Inorganic Chemistry: A ComprehensiVe Text, 3rd ed.; Interscience Publishers: New York, 1972. (2) Dzidic, I.; Kebarle, P. J. Phys. Chem. 1970, 74, 1466-1474. (3) Castleman, A. W.; Holland, P. M.; Lindsay, D. M.; Peterson, K. I. J. Am. Chem. Soc. 1978, 100, 6039. (4) Castleman, A. W.; Peterson, K. I.; Upshulte, B. L.; Schelling, F. J. Int. J. Mass Spectrom. Ion Phys. 1983, 47, 203. (5) Peterson, K. I.; Mark, T. D.; Keesee, R. G.; Castleman, A. W. J. Phys. Chem. 1984, 88, 2880. (6) Guo, B. C.; Conklin, B. J.; Castleman, A. W. J. Am. Chem. Soc. 1989, 111, 6506. (7) Guo, B. C.; Purnell, J. W.; Castleman, A. W. Chem. Phys. Lett. 1990, 168, 155. (8) Klassen, J. S.; Anderson, S. G.; Blades, A. T.; Kebarle, P. J. Phys. Chem. 1996, 100, 14218-14227. (9) Dalleska, N. F.; Tjelta, B. L.; Armentrout, P. B. J. Phys. Chem. 1994, 98, 4191. (10) Walter, D.; Seivers, M. R.; Armentrout, P. B. Int. J. Mass Spectrom. Ion Processes 1998, 175, 93. (11) More, M. B.; Ray, D.; Armentrout, P. B. J. Phys. Chem. 1997, 101, 831. (12) Rodgers, M. T.; Armentrout, P. B. Int. J. Mass Spectrom. 1999, 185/186/187, 359. (13) Cooks, R. G.; Patrick, J. S.; Kotiaho, T.; McLuckey, S. A. Mass Spectrom. ReV. 1994, 13, 287. (14) Bojesen, G.; Breindahl, T.; Andersen, U. N. Org. Mass Spectrom. 1993, 28, 1448. (15) Cerda, B. A.; Wesdemiotis, C. J. Am. Chem. Soc. 1996, 118, 11884. (16) Cerda, B. A.; Hoyau, S.; Ohanessian, G.; Wesdemiotis, C. J. Am. Chem. Soc. 1998, 120, 2437. (17) Feng, W. Y.; Gronert, S.; Lebrilla, C. B. J. Am. Chem. Soc. 1999, 121, 1365. (18) Takasu, R.; Hashimoto, K.; Okuda, R.; Fuke, K. J. Phys. Chem. A 1999, 103, 349-354. (19) Petrie, S. Chem. Phys. Lett. 1998, 283, 131. (20) Remko, M.; Sarissky, M. Chem. Phys. Lett. 1998, 282, 227-232. (21) Sannigrahi, A. B.; Nandi, P. K.; von R. Schleyer, P. J. Am. Chem. Soc. 1994, 116, 7225-7232. (22) Glendening, E. D.; Feller, D. J. Phys. Chem. 1995, 99, 3060. (23) Hill, S. E.; Glendening, E. D.; Feller, D. J. Phys. Chem. A 1997, 101, 6125-6131. (24) Hill, S. E.; Feller, D.; Glendening, E. D. J. Phys. Chem. A 1998, 102, 3813-3819. (25) Hoyau, S.; Norrman, K.; McMahon, T. B.; Ohanessian, G. J. Am. Chem. Soc. 1999, 121, 8864-8875. (26) Marrone, T. J.; Merz, K. M., Jr. J. Phys. Chem. 1993, 97, 65246529. (27) Leon, E.; Amekraz, B.; Tortajada, J.; Morizur, J. P.; Gonzalez, A. I.; Mo, O.; Yanez, M. J. Phys. Chem. A 1997, 101, 2489-2495. (28) Chandrasekhar, J.; Spellmeyer, D. C.; Jorgensen, W. L. J. Am. Chem. Soc. 1984, 106, 903-910. (29) Cieplak, P.; Kollman, P. J. Chem. Phys. 1990, 92, 6761. (30) Tongraar, A.; Liedl, K. R.; Rode, B. M. J. Phys. Chem. A 1998, 102, 10340-10347. (31) Kistenmacher, H.; Popkie, H.; Clementi, E. J. Chem. Phys. 1973, 58, 1689. (32) Thiel, W.; Voityuk, A. A. J. Phys. Chem. 1996, 100, 616-626. (33) Stewart, J. J. P. J. Comput.-Aided Mol. Des. 1990, 4, 1-105. (34) Pople, J. A.; Santry, D. P.; Segal, G. A. J. Chem. Phys. 1969, 43, S129-S135. (35) Dewar, M. J. S.; Thiel, W. J. Am. Chem. Soc. 1977, 99, 48994907. (36) Dewar, M. J. S.; Thiel, W. J. Am. Chem. Soc. 1977, 99, 49074917.
Sodium Parameters for AM1 and PM3 (37) Dewar, M. J. S.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. P. J. Am. Chem. Soc. 1985, 107, 3902-3909. (38) Stewart, J. J. P. J. Comput. Chem. 1989, 10, 209-220. (39) Stewart, J. J. P. J. Comput. Chem. 1989, 10, 221-264. (40) Mezei, M.; Beveridge, D. L. J. Chem. Phys. 1981, 74, 6902. (41) Maye, P. V.; Mezei, M. J. Mol. Struct. 1996, 362, 317. (42) Westerhoff, L. M.; van der Vaart, A.; Brothers, E. N.; Merz, K. M., Jr. Unpublised results, 2002. (43) Dixon, S. L.; Merz, K. M., Jr. J. Chem. Phys. 1996, 104, 66436649. (44) Dixon, S. L.; Merz, K. M., Jr. J. Chem. Phys. 1997, 107, 879893. (45) Dewar, M. J. S.; McKee, M. L.; Rzeps, H. S. J. Am. Chem. Soc. 1978, 100, 777. (46) Dewar, M. J. S.; McKee, M. L. J. Comput. Chem. 1983, 4, 84. (47) Dewar, M. J. S.; Reynolds, C. H. J. Comput. Chem. 1986, 7, 140143. (48) Press, W. H.; Flannery, B. P.; Tuekolsky, S. A.; Vetterling, W. T. Numerical Recipes: The Art of Scientific Computing (FORTRAN Version); Cambridge University Press: Cambridge, U.K., 1989. (49) Dewar, M. J. S.; Zoebisch, E. G. THEOCHEM 1988, 180, 1-21. (50) Goldberg, D. Genetic Algorithms in Search, Optimization and Machine Learning; Addison-Wesley: San Mateo, CA, 1989. (51) Rossi, I. a. T. D. G. Chem. Phys. Lett. 1995, 233, 231-236. (52) Hutter, M. C.; Reimers, J. R.; Hush, N. S. J. Phys. Chem. B 1998, 102, 8080-8090.
J. Phys. Chem. B, Vol. 106, No. 10, 2002 2785 (53) Dixon, S. L.; van der Vaart, A.; Gogonea, V.; Vincent, J. J.; Brothers, E. N.; Saurez, D.; Westerhoff, L. M.; Merz, K. M., Jr. DiVCon 99, The Pennsylvania State University: PA, 1999. (54) Stewart, J. J. P. In Semiempirical Molecular Orbital Methods; Lipkowitz, K. B., Boyd, D. B., Eds.; VCH: New York, 1990; Vol. 1. (55) Stewart, J. J. P. J. Comput. Chem. 1991, 12, 320-341. (56) NIST NIST Chemistry Webbook, webbook.nist.gov/chemistry, 1999. (57) Barin, I.; Platzki, G. Thermochemical Data of Pure Substances; VCH: Weinheim, Germany, 1995; Vol. 2. (58) CRC Handbook of Chemistry and Physics, 78th ed.; CRC Press: Boca Raton, FL, 1997-1998. (59) Alcami, M.; Mo, O.; Yanez, M. J. Phys. Chem. 1992, 96, 30223029. (60) Yeh, T.-S.; Su, T.-M. J. Phys. Chem. A 1999, 103, 3115-3122. (61) Yeh, T.-S.; Su, T.-M. J. Phys. Chem. A 1997, 101, 1672-1679. (62) Yeh, T.-S.; Su, T.-M. J. Phys. Chem. A 1998, 102, 6017-6024. (63) Ferrari, A. M.; Ugliengo, P.; Garrone, E. J. Chem. Phys. 1996, 105, 4129-4139. (64) Remko, M. Mol. Phys. 1997, 91, 929-936. (65) Tortajada, J.; Leon, E.; Morizur, J. P.; Luna, A.; Mo, O.; Yanez, M. J. Phys. Chem. 1995, 99, 13890-13898. (66) Wang, X.-B.; Ding, C.-F.; Nicholas, J. B.; Dixon, D. A.; Wang, L. S. J. Phys. Chem. A 1999, 103, 3423-3429. (67) Bishop, D. M.; Cybulski, S. M. Chem. Phys. Lett. 1994, 230, 177181.