Localized Orbital Corrections for the Calculation of Ionization

Sep 21, 2006 - The method significantly reduces the number of outliers and overall MAD to error levels below that achieved with G2 wave function based...
0 downloads 9 Views 140KB Size
J. Phys. Chem. B 2006, 110, 18787-18802

18787

Localized Orbital Corrections for the Calculation of Ionization Potentials and Electron Affinities in Density Functional Theory Eric H. Knoll* and Richard A. Friesner Department of Chemistry, Columbia UniVersity, HaVemeyer Hall, MC 3110, New York, New York 10025 ReceiVed: March 30, 2006; In Final Form: August 18, 2006

This paper describes the extension of a previously reported empirical localized orbital correction model to the correction of ionization potential energies (IP) and electron affinities (EA) for atoms and molecules of first and second row elements. The B3LYP localized orbital correction version of the model (B3LYP-LOC) uses 22 heuristically determined parameters that improve B3LYP DFT IP and EA energy calculations on the G2 data set of 134 molecules from a mean absolute deviation (MAD) from experiment of 0.137 to 0.039 eV. The method significantly reduces the number of outliers and overall MAD to error levels below that achieved with G2 wave function based theory; furthermore, the new model has zero additional computational cost beyond standard DFT calculations. Although the model is heuristic and is based on a multiple linear regression to experimental errors, each of the parameters is justified on physical grounds, and each provides insight into the fundamental limitations of DFT, most importantly the failure of current DFT methods to accurately account for nondynamical electron correlation.

I. Introduction The determination of ionization potentials and electron affinities is an essential objective of ab initio quantum chemical methodology. At least for first and second row atoms, highly accurate results for these quantities can be obtained by using high level wave function based approaches such as CCSD (T) and ultra large basis sets,1-3 while composite methods such as G2/G3 theory or CBS-Q can robustly achieve very good results with significantly less computational effort.2,4-7 However, such high level ab initio approaches are difficult to apply to larger molecules or to heavier atoms, due to the steep scaling of the calculations with the number of electrons. Density functional theory (DFT) offers an attractive alternative approach to the computation of electron affinities and ionization potentials for large molecules. The calculations are tractable for systems containing hundreds of atoms, and the benchmark accuracy, as assessed2,5 using a series of small molecules for which accurate experimental values have been measured, are very respectable; for example, the mean unsigned error using the B3LYP functional for the G2 ionization potential/ electron affinity data set5 is ∼0.15 eV. A significant number of applications of DFT methods to calculation of molecular electron affinities and ionization potentials to a wide range of systems, including various small molecules,2,8-11 organometallic compounds,12-15 nucleotide bases,16-18 porphyrins,19-21 fullerenes22 and other medium to large organic molecules,23,24 have been reported. While the performance discussed above is very respectable, there are significant variations among the individual members of the data set (which are not captured by the mean unsigned error), and the achievement of a higher level of accuracy and greater reliability, as is possible with ab initio methods, would be desirable. Over the past decade, a large number of new DFT functionals have been developed, and evaluation of performance for electron affinities and ionization potentials has typically been * Corresponding author.

a component of the overall assessment of the thermochemical accuracy of the functional,25-32 and some improvement over the most widely used functional, B3LYP, has been reported,32 although the test set in ref 32 is considerably smaller than that in ref 2. With regard to theoretical analysis of the errors made by DFT in computing ionization potentials and electron affinities, the focus has been primarily on the effects of self-interaction error,33-38 particularly on anions where in some cases a positive eigenvalue for the HOMO (which is nonphysical) can be observed. However, in practice, if a finite but large basis set is used (e.g., augmented triple-ζ plus polarization), calculation of energy differences between the neutral and anion can give a very reasonable representation of the experimental electron affinity (as in the results cited above). Other types of errors, in contrast, have not yet been extensively expored. In a previous paper,39 we have discussed the use of localized orbital corrections (LOCs) aimed at improving the thermochemical accuracy of DFT. That work was focused on the calculation of atomization energies, using both gradient corrected and hybrid DFT, with a particular emphasis on the B3LYP functional, which appears to be the most suitable of the currently available functionals in wide use with regard to the improvements that can be obtained with localized corrections of the type that we have been developing. The assignment of correction terms to molecules is based on a localized orbital picture of electrons. Although the method relies on identifying localized orbitals assigned by hand based on Lewis structures of bond, lone pair, and unpaired electrons, an automated algorithm for assignment of corrections could be developed from these Lewis structures. This correction scheme does not yield a smooth potential energy surface because a molecule either has or does not have a certain localized orbital for which a given empirical correction is additively applied to the DFT calculated energy. The resulting B3LYP-LOC methodology, employing 22 localized orbital correction parameters, yields an average error for atomization energies of 0.8 kcal/mol for the 222 molecules in the G3 test suite4 assembled by Pople and co-workers. This can

10.1021/jp0619888 CCC: $33.50 © 2006 American Chemical Society Published on Web 09/21/2006

18788 J. Phys. Chem. B, Vol. 110, No. 38, 2006 be compared with the 4.8 kcal/mol average error of uncorrected B3LYP, and the 1.1 kcal/mol average error of G3 theory, which involves the use of coupled cluster calculations and hence cannot easily be applied to large molecules. B3LYP-LOC represents the first DFT-based method to approach chemical accuracy for atomization energies (validated, at present, only for molecules constructed of atoms from the first two rows of the periodic table). The B3LYP-LOC methodology employs corrections for atomic hybridization states, and for different types of chemical bonds. The bonding corrections are quite specific and physically understandable (as is explained in detail in ref 39); however, the hybridization parameters in principle subsume various changes in the atom as one goes from the isolated atom to a “typical” molecular environment associated with the appropriate hybridization state, a set of far from trivial alterations in the orbitals occupied by the various electrons. Our previous work also did not address the question of how to apply corrections for charged species; all molecules in the G3 atomization test set are neutral. Finally, as discussed above, the calculation of ionization potentials and electron affinities is a fundamental quantum chemical task which arises in a wide variety of practical applications; a methodology cannot be complete if it cannot address this task at an appropriate level. All of these factors have provided motivation to examine the question as to whether empirical localized corrections can be designed which are capable of substantially reducing the errors in DFT based computation of ionization potentials and electron affinities. In the present paper, this question is answered in the affirmative, and parameters are presented based on fitting to the 146 molecules in the G2 ion test set5 of experimental data available for these quantities. Unlike the situation with atomization energies, where all major outliers were eliminated by the corrections, there are a few ions in the G2 set where the DFT wave function (as opposed to energetics) is qualitatively incorrect; a posteriori corrections of the type we develop here do not affect the wave function, and hence cannot remedy errors of this type. The three molecules eliminated from the data are CN, NCCN, and C2. For the remaining molecules, where the DFT density is close to the exact ground state density of the ion, a significant improvement as compared to the original DFT errors is achieved. Of equal importance, the analysis of the errors and the development of correction parameters provide new general insights into the source of errors in current DFT methods. These insights reaffirm the basic picture discussed in ref 39, while deepening that picture by analyzing the interactions of individual electrons within the atom. This analysis should be useful in improving the calculation of energy differences between different spin states, and in treating transition metal containing systems where the interactions of the d-electrons on the metal atom are crucial in understanding the electronic structure. As in the case of atomization energies, consistent patterns are revealed by analyzing the existing DFT errors as compared to experiment, and a combination of this analysis with a physical model of the origin of DFT errors leads to a quantitative correction scheme with an acceptable number of empirical parameters (although the ratio of experimental data to empirical parameters is not as great as in ref 39). The paper is organized as follows. In section II, we briefly review the DFT-LOC approach as presented in ref 39, emphasizing the theoretical architecture of the error analysis as well as briefly summarizing relevant results on the G3 set for atomization energies. In section III, we develop models for ionization potentials and electron affinities, and apply them to

Knoll and Friesner the G2 data from ref 5. Section IV, the discussion, provides a preliminary analysis of our findings. Finally, section V, the conclusions, summarizes our results and discusses future directions. II. Overview of the B3LYP-LOC Methodology We briefly review the LOC methodology for improving the performance of DFT functionals (with the primary focus on the B3LYP functional) discussed in ref 39. In the LOC approach, additive empirical corrections to the total energy were developed for unpaired electrons and for electron pairs which occupy localized orbitals, with the correction parameters dependent upon the local environment of the orbital. These corrections are complemented by a set of atomic corrections which depend on the hybridization state of the atom. Thus far, parameters have been developed for first and second row atoms. The methodology has been parametrized to fit atomization energies of the 222 molecules in the G3 data set of Pople and co-workers. With a total of 22 fitting parameters, the mean average deviation (MAD) of the energy in the B3LYP-LOC methodology is 0.8 kcal/mol, as compared to the 4.8 kcal/mol MAD of uncorrected B3LYP calculations, or the 1.1 kcal/mol of G3 theory. Other DFT functionals, including gradient corrected methods and even the local density approximation (LDA), are also substantially improved by LOCs. The key conceptual idea behind the DFT-LOC approach is the hypothesis that the self-interaction “error” in the treatment of localized electron pairs is used in DFT methods to represent the nondynamical correlation energy of those pairs in atoms and chemical bonds. The gradient corrections in the exchange and correlations components of modern functionals do a remarkably good job of representing dynamical (short range) correlation effects; indeed, the strength of DFT methods is that such effects are computed to a level of accuracy that, in a traditional wave function based approach (e.g., CCSD (T)) would require very large basis sets and levels of excitation. However, when correlations are on the length scale of a chemical bond, for example “in-out” correlation of an electron pair in a bonding orbitals, the molecular framework of the bond (chemical identity of the atoms involved in the bond, bond length, neighbors of the bonded atoms) play a crucial role in determining the quantitative value of the nondynamical correlation energy. The self-interaction terms can be tuned to represent nondynamical correlation in an “average” localized bonding environment; however, deviations from that average, which depend on the environment as specified above, are inherently nonlocal properties, and hence extremely difficult to capture in a localized gradient expansion, without introducing instabilities and noise into the model. In the LOC approach, the assumption is made that such deviations are controlled by the localized bonding framework, and are transferable from one molecule to another as long as the localized framework is preserved. The remarkable success of the LOC model on the G3 set demonstrates that this hypothesis is valid to the level of near-chemical accuracy. The values of the bond correction parameters in the LOC model depend on bond type, bond length, and polarity (or ionicity). Specific parameters are assigned to multiple bonds, discriminating between double and triple bonds, and polar and nonpolar atom pairs. The parametrization of single bonds is, at present, most easily amenable to physical interpretation. As the bond length increases relative to the size of the localized orbital, the nondynamical correlation energy is increasingly underestimated, because the pair of electrons in the bond have more room

LOC Model to Correct IP and EA in First and Second Row Elements to avoid each other (but this situation is difficult to ascertain via localized density or gradient calculations). The extreme case is observed for ionic bonds, where the electron pair is predominantly localized on one of the atoms in the bond. In addition to obvious cases such as NaCl, any molecule containing a formal charge transfer (e.g., a molecule such as CO whose Lewis structure contains a formal charge separation) requires a “charge transfer” correction for underbinding, which at -4.53 kcal/mol is substantially larger in magnitude than the typical localized bonding correction (ranging from +0.37 kcal/mol for a C-H bond, to -2.63 kcal/mol for an Si-Si bond). Single bonds whose constituent atoms are (otherwise) bonded to heavy atoms also are able to access a larger amount of nondynamical correlation energy than those bonded to hydrogens; an “environmental correction” of -0.49 kcal/mol is applied for each such bonded neighbor. Thus, the B3LYP error for the atomization energy of CCl4 is -14.0 kcal/mol, whereas that for CH3Cl is -0.8 kcal/mol. Employing the environmental correction scheme as discussed above, the localized correction parameter (including the environmental correction terms) for the C-Cl bond in CH3Cl is -1.96 kcal/mol, whereas that for a C-Cl bond in CCl4 is -3.43 kcal/mol. The power of this schedule of intrinsic and environmental corrections for various bond types is demonstrated in the G3 test set where the errors for data sets with molecules of very different size (e.g. hydrocarbons ranging from one to 10 heavy atoms) do not grow substantially as a function of molecule size. The B3LYP-LOC methodology provides equally accurate results for linear, branched, cyclic, and aromatic hydrocarbons, whereas all previous DFT functionals exhibit systematic errors for various classes of structures and display increasing errors (leading to errors in total atomization energies of 10-20 kcal/mol for the largest systems in the G3 data set) with increasing molecule size. Similarly, B3LYP and all other functionals systematically manifest large errors for molecules with ionic bonds (i.e., molecules with a formal charge transfer), whereas the use of a single charge transfer parameter for all such cases leads to a dramatic increase in the accuracy of the predicted atomization energy. The G3 test set contains 21 molecules to which we assign a charge transfer correction. The average error in these molecules in uncorrected B3LYP calculations is 9.8 kcal/mol, whereas in B3LYP-LOC it is reduced to 1.2 kcal/mol, with the largest error being 4 kcal/mol, and only three of the 21 molecules displaying errors larger than 2 kcal/mol. Overall, the addition of the single CT parameter reduces the average error by nearly a factor of 2 (from 1.56 kcal/mol to 0.79 kcal/mol), while inclusion of the environmental correction parameter similarly reduces the error by ∼30% (from 1.09 kcal/mol to 0.79 kcal/ mol). Singly occupied orbitals require a somewhat different physical analysis; here, one expects that when such orbitals are delocalized (as in e.g., a radical transition state in an odd electron system such as H + H2), the system will be overbound, as the localized nondynamical correlation in such an orbital will be overestimated given that the orbital contains only one electron. In the G3 data set, there are no transition states, so the observation of extreme overbinding is limited; however, even localized radical species exhibit systematic overbinding compared to the neutral atoms. To remedy this problem, we have developed simple empirical, localized corrections based on the bonded neighbors of the atom on which the radical resides, which are highly effective in reducing the errors in the G3 set; the use of three radical correction parameters reduces the overall MAD of the 222 molecule data set by 20%, despite the fact

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18789

that only a relatively small fraction of the molecules contain radical species. Further details of the B3LYP-LOC methodology, including a complete list of parameters and results for each molecule of the G3 test set, can be found in ref 39. The present paper builds on the ideas and results of ref 39, but further conceptual development and parametrization is required to extend the basic ideas to the interactions of individual electrons, as opposed to calculation of atomization energies. In the latter, one does not have to consider how the electrons in the atom reorganize themselves when bonds are made and broken; many of the effects are subsumed under the “hybridization parameters” which provide corrections as one goes from the free atom to the atom in a specific molecular environment. In contrast, electron affinities and ionization potentials cannot be understood without considering the energies of electrons in atomic orbitals, and how these energies change as more, or fewer, electrons are introduced. Analysis of these transformations deepens the underlying basis of the theory, although there remain many unresolved issues to investigate in the future, some of which are raised by the results discussed below. III. Development of Models for Ionization Potential and Electron Affinity A. Overview. As in ref 39, we shall take the neutral atoms as reference states, and apply corrections to the atomic and molecular cations and anions, and to the neutral molecules when necessary. Corrections for atomic cations and anions are specified by delineating specific changes associated with converting the neutral atom to the relevant charged species. Corrections for molecular cations and anions are defined as compared to the neutral molecule. These corrections may include elimination of molecular correction terms defined in ref 39; for example, if an unpaired electron is removed from a neutral radical, the resulting cation will lose the radical delocalization corrections that are applied to the neutral molecule according to the model developed in ref 39. The total B3LYP-LOC energy of a charged atom is then specified by adding the correction term for that atom to the total B3LYP energy computed for the charged species. The total B3LYP-LOC energy for a charged molecule is calculated by adding a full set of corrections (both those defined below, and those defined in ref 39) to the total B3LYP energy computed for the charged species. The use of the neutral atoms as a reference state has no consequence for the calculation of chemically relevant energy differences. If one wanted to generate total energy that were as accurate as possible in an absolute sense, the total energies of the neutral atoms as computed via B3LYP (and the basis set employed in the calculations) could be corrected based on accurate total energy calculations using high level wave function based methods, and the resulting differences could then be added to the total molecular B3LYP-LOC energy as defined above. In what follows, we begin by discussing the ionization potential data for atoms, followed by the electron affinity data for atoms. The striking similarity in the error patterns for both first and second row atoms, as well as for ionization potentials and electron affinities, provides convincing evidence for the underlying validity of the model. We then proceed to analyze ionization potentials for molecules, and finally electron affinities of molecules. The data for molecular systems, while exhibiting more complexity in its interpretation, continues to reveal systematic patterns, enabling the development of a parametrization scheme capable of rather high accuracy and robustness. B. Ionization Potentials of Atoms. B.1 OVerView. Examination of the errors in ionization potentials of atoms in the G2

18790 J. Phys. Chem. B, Vol. 110, No. 38, 2006 data set reveals systematic regularities in the error as a function of atomic number. In the present section, we develop a simple model for these errors, involving a small number of parameters, which reduce the errors by more than an order of magnitude and which are consistent with the physical principles governing DFT errors elucidated in ref 39. We begin with an analysis of errors in first row atoms; errors for the second row display an identical pattern, but require different parameters, which are derived after the first row errors are analyzed in detail. Removal of an electron from a first or second row atom can be qualitatively analyzed as follows. (a) If an electron is removed from an orbital containing two paired electrons (filled orbital), it leaves an unpaired electron (or half filled orbital) in the ion. (b) If an electron is removed from a half filled orbital, an empty orbital is created in the ion. (c) If the atom contains both filled and unfilled p orbitals, Hund’s rules mandate that the electron will be removed from the filled p orbital, and that the electron in the resulting singly occupied orbital will have its spin parallel with the other unpaired electrons in the atom. In ref 39, we discussed at length the correlation of errors in the DFT energy with the size and occupancy of a given localized orbital. The conclusions of this analysis can be summarized as follows. (1) Singly occupied (half filled) orbitals become more overbound as the orbital size increases. Singly occupied orbitals do not have any intraorbital nondynamical correlation, and hence our expectation is that the DFT self-interaction term, which (as argued in ref 39) primarily represents intraorbital nondynamical correlation effects, will lead to overbinding. As argued in ref 39, the overbinding increases as the orbital size increases. (2) The situation with regard to doubly occupied orbitals is more complex. Here the self-interaction term in a molecule is effectively used to represent intrapair nondynamical correlation, and the DFT error can be either positive or negative, depending upon the size of the orbital as compared to the “space” available for nondynamical correlation. In ref 39, we analyzed these errors for all bond types present in the G3 data set and developed parameters that accurately represent the errors for these bond types. An electron pair in a free atom represents a new situation not considered in ref 39 (being irrelevant to the calculation of atomization energies as formulated in that paper), and hence new parameters must be fit to experimental data. These parameters are derived below and the physical interpretation of the parameters, which is consistent with the results obtained for a bonding electron pair, is then discussed. B.2 Ionization of Electron from a Singly Occupied Orbital in an Atom. We begin with an analysis of atoms where an electron is removed from a singly occupied orbital. Ionization of Li involves removal of a 2s electron, which as we shall see, requires (quite reasonably, on physical grounds) different parameters than a 2p electron. The data in Table 2 indicate that the Li atom is substantially overbound relative to the ion. This makes sense as the singly occupied 2s orbital is quite large, and the degree of overbinding (0.23 eV, or 6.2 kcal/mol) is consistent with, for example, the analysis of self-interaction errors in the H+H2 transition state of ∼6 kcal/mol in ref 40 (this transition state also has a stretched, singly occupied orbital). We next consider the series B, C, and N. In all of these cases, an unpaired electron is being removed from a half filled orbital. The key question is why the overbinding in B, C, and N is dramatically different. A simple hypothesis is based on the idea that there is a systematic error in the DFT calculation of the interaction of two electrons, possessing parallel spins as mandated by Hund’s rules, in half filled orbitals. Note that the Becke exchange functional was parametrized to fit exact

Knoll and Friesner TABLE 1: Description of Column Headings in Table 2 column abbreviation

A B C D E F G H

value [a]

description

Intrinsic correction for the removal of an e- from the following orbital: IP_U2s_A an unpaired 2s e-0.23 IP_P2s_A a paired 2s e0.20 IP_U2p_A an unpaired 2p e-0.44 IP_P2p_A a paired 2p e-0.21 IP_U3s_A an unpaired 3s e-0.28 IP_P3s_A a paired 3s e-0.08 IP_U3p_A/M an unpaired 3p e-0.04 IP_P3p_A/M a paired 3p e-0.04

value [b]

N/A N/A -0.44 -0.25 N/A N/A -0.02 -0.02

Correction for interacting unpaired parallel spins from: SS_2p 2p orbitals in cation or neutral -0.15 -0.12 SS_3p 3p orbitals in cation or neutral -0.07 -0.07

I J (1) (2) (3)

Data Columns B3LYP deviation from Experiment (Expt. - Theory) Error after correction (B3LYP deviation - Correction idealized for atoms, Value[a]) Error after correction (B3LYP deviation - Correction from Values[b])

a Values in eV. Note that two values for the parameters are shown. [a] is the idealized parameter value when considering only the atoms. [b] is the value when the entire dataset of atoms and molecules is considered, as some of the atom only parameters from [a] also serve as molecule parameters in [b]. The parameters in [b] for the 2s and 3s electrons were removed due to overfitting from the single available data point for each of these four parameters.

TABLE 2: Parameter Assignments for Ionization Potentials in Atomsa parameter A B C D E F G H

data I

J

Li 1 . . . . . . . . . Be . 1 . . . . . . . . B . . 1 . . . . . . . C . . 1 . . . . . -1 . N . . 1 . . . . . -2 . O . . . 1 . . . . 2 . F . . . 1 . . . . 1 . Ne . . . 1 . . . . . . Na . . . . 1 . . . . . Mg . . . . . 1 . . . . Al . . . . . . 1 . . . Si . . . . . . 1 . . -1 P . . . . . . 1 . . -2 S . . . . . . . 1 . 2 Cl . . . . . . . 1 . 1 Ar . . . . . . . 1 . . a

(1)

(2)

(3)

-0.23 0.00 N/A 0.20 0.00 N/A -0.44 0.00 -0.00 -0.29 0.00 0.03 -0.14 0.00 0.06 -0.55 -0.04 -0.06 -0.34 0.02 0.03 -0.21 0.00 0.04 -0.28 0.00 N/A -0.08 0.00 N/A -0.04 0.00 -0.02 0.04 0.01 -0.01 0.11 0.01 -0.01 -0.19 -0.01 -0.03 -0.10 0.01 -0.01 -0.04 0.00 -0.02

Data in eV. See Table 1 for descriptions of column headings.

exchange energies for the neon atom, which does not contain any unpaired electrons.41 In any case, the hypothesis is certainly physically plausible; the question is whether it is validated by the data. For the simple case of the B, C, N series, the above hypothesis fits the data perfectly. The unpaired electron in the B atom is overbound by 0.44 eV; the larger value versus Li may arise from the different shape of the 2p, as compared to the 2s, orbital. We postulate that the error in interactions between singly occupied orbitals, with identical spins, is +0.15 eV (i.e., the interaction energy of such a pair is underestimated, as compared to the exact result). Note that this parameter is reduced from 0.15 eV in the idealized case involving only atoms (considered in section this section and III.C) to 0.12 eV when this parameter is used to describe 2p or hybridized 2p spin-spin interactions in atoms and molecules, in sections III.D and E). Thus for the atom only case, the C atom correction is -0.44 eV + 0.15 eV, ) -0.29 eV, in agreement with the results for B3LYP errors

LOC Model to Correct IP and EA in First and Second Row Elements TABLE 3: Description of Column Headings in Table 4a column

abbreviation

description

value[a] value[b]

Intrinsic correction for the addition of an e- to the following orbital: K EA_AO_U2s an unpaired 2s orbital 0.26 N/A L EA_AO_U3s an unpaired 3s orbital 0.11 N/A M EA_AO_U2p an unpaired 2p orbital -0.09 -0.11 N EA_AO_E2p an empty 2p orbital -0.29 -0.20 O EA_AO_U3p an unpaired 3p orbital -0.06 -0.06 P EA_AO_E3p an empty 3p oribtal -0.08 -0.09 Q I J

Correction for interacting unpaired parallel spins from: SS_2p(-) 2p orbitals in anion -0.11 -0.07 SS_2p (same parameter as for IP) -0.15 -0.12 SS_3p (same parameter as for IP) -0.07 -0.07

a Values in eV. Note that two values for the parameters are shown. [a] is the idealized parameter value when considering only the atoms. [b] is the value when the entire dataset of atoms and molecules is considered, as some of the atom only parameters from [a] also serve as molecule parameters in [b].

in ref 5, reproduced in Table 2 (by construction, of course), However, ionization of the N atom removes two (out of three) of the half-filled-orbital interactions, leading to a correction of +0.30, and an overall result of -0.44 eV + 0.30 eV ) -0.14 eV, a value that is in perfect agreement with that in Table 2 for N, and the calculation of which did not involve any fitting whatsoever. B.3 Ionization of an Electron from a Filled Orbital in an Atom. We now turn to the Ne, F, and O atoms, where an electron is removed from a filled 2p orbital. For the Ne atom, when we remove an electron from a filled orbital, the resulting unpaired electron does not have any other unpaired electrons to interact with (so that the correction term for two unpaired electrons, postulated above, is not relevant). We take the resulting error, -0.21 eV, as a measure of the intrinsic error of removing an electron from a filled p orbital of a first row atom. The negative value of the error indicates that it is the neutral atom that is overbound relative to the ion. This result can be understood by recalling that the self-interaction error in a localized electron pair is supposed to represent the non-dynamical correlation of the pair. However, in an atom, there is no nondynamical correlation (using the heuristic definition of nondynamical correlation as electron correlation of two electrons placed in a bond between an atom pair, which is the core idea behind the model in ref 39 for atomization energies), hence the selfinteraction error in the neutral species (which contains one more filled pair than the ion) can lead to overbinding. This term is opposed in sign by the self-interaction error of the singly occupied orbital of the ion. The question of which term dominates depends on the details of the orbital sizes and shapes, magnitude of the gradient of the density in the orbital at various points, etc. The -0.21 eV value for Ne can be taken to be the difference in these two terms when a filled 2p pair is converted to a singly occupied 2p orbital via ionization. With this parameter in place, we can now calculate the correction terms for F and O without the use of further adjustable parameters, employing the same unpaired spin correction term used for C and N as discussed above. The ionized F atom has two unpaired electrons, versus only one such electron in the neutral atom; thus, a -0.15 eV correction for their interaction must be added to the -0.21 eV correction for ionization of the pair. The total, -0.36 eV, is in good agreement with the B3LYP error in Table 1 of -0.34 eV. The O+ ion has three unpaired spins, and hence two more unpaired spin pairs than the neutral O atom; the resulting total correction is -0.51 eV, close to the -0.55 B3LYP error in Table 2.

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18791

TABLE 4: Parameter Assignments for Electron Affinities in Atomsa parameter K L M N O

P

data Q

I

J

Li 1 . . . . . . . . B . . . 1 . . -1 . . C . . . 1 . . -3 1 . O . . 1 . . . . 1 . F . . 1 . . . . . . Na . 1 . . . . . . . Al . . . . . 1 . . -1 Si . . . . . 1 . . -2 P . . . . 1 . . . 2 S . . . . 1 . . . 1 Cl . . . . 1 . . . .

(1) 0.26 -0.18 -0.11 -0.22 -0.13 0.11 -0.02 0.04 -0.21 -0.13 -0.06

(2)

(3)

0.00 N/A 0.00 -0.05 0.00 0.00 0.02 0.02 -0.04 -0.02 0.00 N/A -0.01 0.00 -0.01 0.00 -0.02 -0.01 0.00 0.01 0.00 0.00

a Data in eV. See Table 3 for descriptions of column headings. Columns (1), (2), and (3) are defined identically as in Table 1.

The last remaining first row ionization potential in the G3 set is for the Be atom. In this case, a 2s electron pair is converted by ionization into a singly occupied 2s orbital. As for the Ne atom, there are two competing correction terms, associated with the pair and the singly occupied orbital; in this case, the singly occupied orbital correction is (apparently) substantially larger, yielding a correction parameter of +0.20 eV. In fact, this value is comparable (in absolute value) to that for removing the unpaired 2s electron in Li, suggesting that overbinding of the 2s electron pair in the neutral Be atom is small. Further investigation will be required to understand in more depth how the self-interaction error differs for 2s, 2p, and other types of electron pairs. The above analysis yields the following parameters. (1) Two parameters for the 2s shell, one corresponding to removal of an unpaired 2s electron, the second representing removal of one electron from a filled 2s pair. (2) Three parameters for the 2p shell; the first two analogous to those for the 2s shell, and the third corresponding to a correction for the error in the interaction energy of two unpaired 2p electrons. The values of these parameters are summarized in Table 1. The 2s parameters are fit to one experimental data point each, and hence have no cross checks on their validity. The 2p parameters, in contrast, provide highly accurate results for 6 atoms, correcting quite large errors in a systematic fashion, and thus demonstrate the validity of the physical model and the robustness of the numerical parameter values. Analysis of the second row atoms proceeds in a similar fashion, yielding a further five parameters. Parameter values are presented in Table 1 while the predicted corrections emerging from use of these parameters are presented in Table 2. As in the case of first row atoms, there are no cross checks on the 3s parameters, but the 3p model again is remarkably accurate in predicting the correction terms for the ionization potentials of the second row atoms available in the G2 set. The success of the 2p/3p model for both first and second row atoms provides further confirmation that the underlying assumptions concerning the physics must be correct. The error pattern moving through Si, P, S, Cl, and Ar is identical to that for C, N, O, F, Ne, only with smaller values of the various correction terms (likely due to the smaller gradients associated with second row orbitals, although this needs to be investigated in more detail). It is worth noting that the error pattern in the ionization potential of atoms elucidated above is manifested, not only for B3LYP, but for all hybrid and gradient corrected functionals in Table 4 of ref 5; only the specific parameter values are different. Furthermore, the correction for the interaction of

18792 J. Phys. Chem. B, Vol. 110, No. 38, 2006 unpaired electrons for the BLYP functional is very close (within 0.02 eV) to that observed for B3LYP, whereas the values of this parameter for methods employing alternative correlation functionals (e.g., BP86, etc.) are quite different. The latter observation implies that the error in unpaired electron interactions is due to a problem in the correlation functional, and is more or less unaffected by the addition of a component of exact Hartree-Fock exchange. This is physically reasonable as it is the inaccurate description of the electron correlation between the two unpaired electrons that is the source of the error. In contrast, the intrinsic error associated with removing an electron from a singly or doubly occupied orbital depends on the selfinteraction terms in the functional, which in turn are strongly affected by the fraction of exact exchange that has been introduced (and hence the B3LYP and BLYP parameters for these errors are quite different). Thus, the entire set of error data for DFT ionization potentials of atoms in Table 4 of ref 5 is qualitatively consistent with the picture that we have outlined above. C. Electron Affinities of Atoms. We first briefly review some arguments in the literature concerning the applicability of DFT methods to computation of electron affinities of atoms.1 In principle, problems can arise if the DFT orbitals are assigned positive eigenvalues, which is possible if sufficiently large basis sets are used with pure DFT functionals. However, in the present study, we focus on the hybrid B3LYP method, for which positive eigenvalues have not been observed; furthermore, in practice, the use of standard quantum chemical basis sets appears to obviate the potential theoretical problem noted above, even for GGA type functionals such as BLYP.42 Hence, in what follows, we do not consider this issue further. As in the case of ionization potentials, we distinguish correction parameters for the following three types of interactions: (1) addition of an electron to an empty orbital; (2) addition of an electron to a half filled orbital; (3) interactions between unpaired spins. Anion wave functions (and, ergo, densities) are significantly different from cation and neutral wave functions, due to the additional electron repulsion introduced by the extra electron. For this reason, we do not expect parameters involving anionic energies to be identical to those developed for ionization potentials (e.g., correction of the addition of an electron to an empty orbital; computation of this parameter requires estimation of the error in the resulting anion DFT energy). However, parameters involving only the neutral atom, such as the correction for the energy of interaction of two unpaired electrons in the neutral atom, should be transferable from fitting of the ionization potential data; indeed, the validity of such transferability can be viewed as a test of the parametrization scheme. Before commencing with the parametrization, it is necessary to make sure that the calculated B3LYP deviations from experiment are converged. This is a problem only for anions, as the size of the basis set required for convergence can be substantially larger than that needed for neutrals or cations. A basis set that is too small will then lead to unbalanced results, in which the anion is underbound as compared to the neutral atom. While Pople and co-workers in ref 5 (which we have used for the ionization potential data; we have also used calculations by this group in our studies of DFT atomization energies) use a reasonably large basis set for their electron affinity calculations (6-311+G(3df,2p)), it apparently is not sufficient to obtain converged results. In ref 1, ultra large basis sets are employed, and considerably different results (differing by as much as ∼0.1 eV, a highly nontrivial difference in the present context) for

Knoll and Friesner atomic electron affinities are obtained. Galbraith and Schaefer investigated the effects of basis set on DFT computation of electron affinities for the F atom, and observed relatively rapid convergence (and, in fact, for F, the ref 42 value is in close agreement with the results of ref 1 with regard to the DFT computation of the electron affinity, although the experimental value cited there differs by 0.03 eV). However, in the F atom, the p electrons see a very high effective nuclear charge, so the contribution of diffuse basis functions is expected to be minimal. In contrast, for the B atom, there is a quite large difference between the results of refs 5 and 1, indicating a substantially greater diffuse character in the orbitals. We have recalculated all of the neutral and anionic wave functions and energies for the atomic test cases in ref 5 using both the basis set of ref 5 (6-311+G(3df,2p)), and the Dunning correlation consistent basis sets cc-pVTZ++ and cc-pVQZ(-g)++ of references 43 to 45. Our results using the 6-311+G(3df,2p) basis set agree very closely with that of ref 5 (and, consequently, exhibit nontrivial disagreement with ref 1 for a number of cases, such as the O atom). For the atoms treated in ref 1, the cc-pVTZ++ and cc-pVQZ (-g)++ both agree very well with the ultra large basis set calculations therein, with a maximum error of 0.02 eV and an average error of ∼0.01 eV for both basis sets. There are a few atoms included in the test set in ref 5 that are not examined in ref 1. Agreement between the cc-pVTZ++ and cc-vVQZ++ (-g) results is very good in all cases other than the Na and Li atoms, where the deviations are ∼0.1 eV. These atoms have highly diffuse valence orbitals (more so than any other atoms considered here) and so presumably need larger basis sets to converge the energy of the anion. However, these atoms do not appear in any of the molecules in the remainder of the test set for electron affinities in ref 5. Therefore, for reasons of computational efficiency, we employ the cc-pVTZ++ basis set for calculations of all molecular electron affinities. We begin with an analysis of first row atoms, starting with the F atom. The best fit value of the correction term to add an electron to the half filled orbital in F is -0.09 eV; no other terms contribute to this calculation. The negative sign of the correction term indicates that the anion is overbound relative to the neutral; as discussed previously, the B3LYP functional is designed to use self-interaction error in localized electron pairs to represent nondynamical correlation in molecular bonds, and when, in at atom, there is no nondynamical correlation, one can expect overbinding of the pair. The precise value depends on a complex combination of factors, which we shall not dissect further here. In the O atom, a half filled orbital is filled as in F, but, additionally, interaction between two unpaired electrons in the neutral is eliminated in the anion. The value of this interaction should be identical to what was used above in correcting ionization potentials, since it is a property of the neutral atom only; the result is to contribute a correction of -0.15 eV to the oxygen atom electron affinity. The calculated value, -0.22 eV, is in satisfactory agreement with the value of the B3LYP error of -0.22 eV in Table 5 of ref 1 (but not, it should be noted, with the error reported in ref 5 of -0.14 eV, using an insufficiently large basis set). This agreement increases confidence that the physical model for corrections discussed in the present paper has captured the essential physics correctly. Turning to the B atom, we see that two parameters should contribute to the correction: first, a parameter for adding an electron to an unfilled orbital, and second, a parameter for the interaction of two unpaired spins in the anion. For cations, we

LOC Model to Correct IP and EA in First and Second Row Elements

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18793

TABLE 5: Parameters for the Full B3LYP-LOC Modela #

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

23 24 25 26 27 28 29 30 31 32 33 34

35 36 37

38 39 40 41

abbreviation

description

IP Parameters A. Intrinsic correction for the removal of e- from the following atomic orbital: IP_U2s_A unpaired 2s in free atom IP_P2s_A paired 2s in free atom IP_U2p_A unpaired 2p in free atom IP_P2p_A paired 2p in free atom IP_U3s_A unpaired 3s in free atom IP_P3s_A paired 3s in free atom IP_U3p_A/M unpaired 3p or hybrid 3p in free atom or molecule IP_P3p_A/M paired 3p or hybrid 3p in free atom or molecule IP_U2p_M unpaired 2p hybridized in molecule IP_P2p_M paired 2p hybridized in molecule B. Intrinsic correction for the removal of e- from the following bond: IP_B_A1-H_P A-H bond (polar & single), where A)first row IP_B_A1-H_NP A-H bond (nonpolar & single), where A)first row IP_B_A2-H A-H single bond, where A) second row IP_B_A-B A-B single bond, where A,B ) first, second row, not H, F IP_B_A-F A-F single bond, where A ) first, second row atom IP_B_M_Short short multiple bond such as CtC, CdC, CtN, CdN, CdO, NtN IP_B_M_Long long multiple bond such as PtP, CdS, SitSi, C-C in aromatic rings. C. Correction for the delocalization of positive charge: IP_D_SB_A-H in single bond through neighboring e- density from A-H bonds IP_D_SB_A-F in single bond through neighboring e- density from A-F bonds IP_D_SB_A-B in single bond through neighboring e- density from A-B bonds IP_D_A_A-H on an atom through neighboring e- density from A-H bonds IP_D_A_A-B on an atom through neighboring e- density from A-B bonds EA Parameters A. Intrinsic correction for the addition of an e- to the following atomic orbital: EA_AO_U2s unpaired 2s orbital in an atom EA_AO_U3s unpaired 3s orbital in an atom EA_AO_U2p unpaired 2p orbital in an atom EA_AO_E2p empty 2p orbital in an atom EA_AO_U3p unpaired 3p orbital in an atom EA_AO_E3p empty 3p orbital in an atom B. Intrinsic correction for the addition of an e- to the following molecular orbital: EA_A1_Close closed shell first row atom EA_Ep empty p or p-hybridized shell in an atom EA_R1_noMB first row atom with localized radical and no adjacent multiple bonds EA_R2_noMB second row atom with a localized radical and no adjacent multiple bonds EA_R_MB first & second row atom with a localized radical AND adjacent multiple bond C. Negative charge delocalization: EA_RD(-) resonance delocalization of negative charge in anion PARAMETERS SHARED BETWEEN IP & EA A. Correction for interacting unpaired parallel spins from: SS_2p 2p and hybridized orbitals on atom in cation or neutral SS_3p 3p and hybridized orbitals on atom in cation, neutral or anion SS_2p(-) 2p and 2p hybridized orbitals on atom in anion NON-OPTIMIZED PARAMETERS SHARED BETWEEN IP & EA A. Values from atomization energies in ref. [39] CT charge transfer - for neutrals and cations RH R-H radical environmental correction RA R-A radical environmental correction RT RT triple bond radical environmental correction

value [a]

value [b]

N/A N/A -0.44 -0.25 N/A N/A -0.02 -0.02 -0.11 -0.20

N/A N/A -0.44 -0.25 N/A N/A N/A N/A -0.07 -0.19

-0.03 0.00 0.04 0.09 -0.03 0.10 0.19

N/A N/A N/A 0.08 N/A 0.10 0.19

0.01 0.10 0.11 0.02 0.09

0.02 0.10 0.11 N/A 0.08

N/A N/A -0.11 -0.20 -0.06 -0.09

N/A N/A -0.11 -0.20 -0.06 -0.10

-0.29 -0.05 -0.01 -0.03 -0.06

-0.29 -0.05 N/A N/A -0.06

-0.10

-0.11

-0.12 -0.07 -0.07

-0.12 -0.07 -0.07

-0.20 -0.02 -0.07 0.11

-0.20 -0.02 -0.07 0.11

a The parameters are for all ions in the G2 data set except the three molecules with quantitatively incorrect wavefunctions and except atoms and molecules with electronic structures that appear only once in the data set. “N/A” in value column [a], the extensive fit, indicates the one-off parameters for unique electronic structures are not used. Additional “N/A” in value column [b], the minimalist fit, indicates that these parameters were eliminated from the extensive fit without significantly affecting the overall fit. Values in eV.

kept the correction parameter for the interaction of two unpaired electrons equal to its value in the neutral; this was clear from the data for both first and second row atoms. However, for anions, we expect more of a change in the wave function, particularly for atoms to the left of the periodic table (e.g., B and C, vs for example F) where the effective nuclear charge seen by p electrons is substantially diminished for lighter atoms, and hence the wave functions will become more diffuse (as argued above) upon addition of an excess electron. We can get some idea of the effect of increased diffuseness in the density on the unpaired electron interaction parameter by considering values for the neutral atoms for the first and second rows. The

second row value is significantly diminished; we argued above that this was likely due to the smaller density gradient for the valence electrons. Thus it is plausible that the unpaired electron correction for anions should be smaller than the value for cations and neutrals. We cannot resolve this question by considering the B atom alone, as there are two parameters and only one data point. However, if we simultaneously fit corrections for the B and C atoms, we can determine both parameters (keeping the neutral unpaired electron parameter, which contributes to the C atom correction, at its previously defined value), and we obtain the results presented in Table 4. The value obtained for the anion

18794 J. Phys. Chem. B, Vol. 110, No. 38, 2006 unpaired electron interaction term, 0.11 eV is, as expected, diminished from that determined for first row neutrals, but not as small as that obtained for second row neutrals. Lack of adequate experimental data (e.g., one cannot add an additional electron to the N atom) prevents a rigorous cross-check on these parameters; however, they are certainly consistent with the physical picture of the problem presented in this paper. In addition, both the B3LYP calculated energy for the N atom (-0.214 eV) and the LOC corrected energy (-0.328 eV) predict a negative electron affinity in agreement with the inability to experimentally observe the addition of an electron to the N atom (data not shown in table). Hence, for the moment, we must regard these results as satisfactory. Analysis of the second row data proceeds in a similar fashion; parameters and results are presented in Tables 3 and 4, respectively. For the second row atoms, no adjustment of the unpaired electron interaction correction is required, suggesting that the in the second row, the conversion of the neutral to an ion does not affect the gradient of the wave function as significantly as it does in the first row. The repeating pattern for the left and right sides of the period are in this case manifested very precisely; the success of the same model in quantitatively reproducing these patters again increases confidence that the model has a close correspondence to physical reality. D. Ionization potentials of molecules. D.1 OVerView. Our treatment of atoms in section B has identified the following sources of error in the DFT treatment of electrons in those atoms, as manifested in discrepancies between calculated and experimental ionization potentials. (1) Intrinsic error in removing an electron from a filled, or half filled, orbital. This error is primarily due to discrepancies between the self-interaction error attributable to the electron in the orbital in question, and the nondynamical correlation it is supposed to represent (per our previous analysis). (2) Error due to interaction of two electrons in half filled orbitals with parallel spins. From the results above, this error can be accurately specified for free atoms as 0.15 eV for first row atoms, and 0.07 eV for second row atoms. These two sources of error can be expected to dominate in molecules as well. While one anticipates that the magnitude of the unpaired spin error might be modified in the molecule, due to the fact that the orbitals are different from those in the atom (e.g., spn as opposed to p), the situation with regard to the intrinsic error is significantly more complicated. The key issue is how the nondynamical correlation of the electron which is being removed depends on the local environment of the bond or atom from which it is being removed, as well as the reorganization of electron density in the ion after removal. It is the balance between nondynamical correlation and self-interaction error which determines the intrinsic error in the ionization potential. In principle, it should be possible to explore in detail accurate ab initio wave functions, as well as DFT densities, for a wide range of molecules, and achieve an understanding as to how the balance between these terms could be understood directly from the electron density. However, this is an extremely complex task, probably more difficult than achieving a similar understanding concerning the errors in bond energies. Therefore, in the present paper, we adopt a heuristic approach, in which we examine the ionization potential data for systematic errors, classify environments of the ionized electron into groups, and develop a parametrization of the errors based on systematic behavior within the various groups. As will be clear from the discussion below, the classification scheme is for the most part

Knoll and Friesner relatively straightforward, and the parameters generally admit to a physical interpretation that is consistent with the LOC theory as we have developed it in ref 39 and above. However, a deeper level of explanation is clearly desirable, and will be pursued in subsequent work. The present model nevertheless should be very useful in practical applications, given the level of accuracy and robustness that is achieved. D.2 Classification of G2 Test Cases Into Groups. The first step in calculating corrections for DFT ionization potentials is an assignment of the cation to a localized orbital description. Assignments of multiplicities of both the neutral and cationic species are provided in ref 5. However, determination as to the dominant orbital the electron is removed from, in cases where there are several different alternatives consistent with the multiplicity, can in some cases be nontrivial. As was mentioned above, lone pairs on second row atoms can have relatively low ionization energies, which compete effectively with single and even double bond orbitals. Our assignments are presented in Figure 1. While a few assignments may be open to question, the great majority seem fairly clear based on the experimental value of the ionization potential of the molecule as well as those of the constituent atoms, as well as comparisons when necessary with simpler molecules in the database. We classify the test cases in the G2 data set into the following categories. (1) Removal of an electron from a single (σ) bond in a closed shell molecule. (2) Removal of an electron from a π bond in a closed shell molecule; (3) Removal of an electron from a neutral radical. These cases further subdivide into cases where (a) the unpaired electron is removed, leaving a closed shell cation; (b) the electron is removed from the lone pair of the atom on which the radical resides, yielding a triplet species where the two unpaired spins are primarily localized on one atom; (c) the electron is removed from a bond, yielding a triplet species where on electron is predominantly localized on an atom, and the second is predominantly localized on a bond. Furthermore, for cases of this type, separate parameters need to be developed when the initial radical is localized on a first row or second row atom. (4) Unusual systems such as O2, which do not fall into any of the above categories. (5) Systems where the B3LYP wave function is qualitatively inaccurate. Each of these classes of molecules will be discussed below in what follows. D.3 RemoVal of an Electron from a Single Bond. We begin with closed shell molecules where an electron is removed from a single bond. A list of all molecules in this category is presented in Table 6B. Unlike an electron pair in a free atom, an electron in a localized bond pair has a significant amount of nondynamical correlation, hence one would not expect to see the large overstabilization of the neutral, as compared to the ionized species, that is observed for atoms as is discussed in detail above. What we in fact find is that the ion can either be overbound or underbound as compared to the neutral, depending upon the type of bond and the local environment. These observations are in full accord with the analysis of atomization energies in ref 39. There, the correction term for each bond also depended upon the bond type and local environment. As in ref 39, we classify bonds into types and determine correction factors for each type by fitting the experimental data. We use fewer bond types in the present model than in ref 39, primarily due to the lack of sufficient experimental data. A further compression of bond types would in fact result in a relatively small increase in the RMS error of the least-squares fit; however, the patterns in the errors are clear and we see no reason to reduce the number of parameters when there is a clear physical justification for the parameter types. Figure 3 plots the

LOC Model to Correct IP and EA in First and Second Row Elements

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18795

Figure 1. IP; Valence bond structures of molecular cations in the G2 dataset. The letters B-I correspond to the lettered categories in Table 6 that indicate from which localized orbital in the neutral molecule the electron is removed.

bond length vs the intrinsic correction parameter for the 5 bond types in Table 5 (IP.B parameters #11-14, 16, 17; #15 excluded because of the odd behavior of fluorine); it can be seen that the correlation is reasonable. As the bond length increases, the correction factor becomes more positive, which means that the ion is overstabilized relative to the neutral molecule. This is consistent with the idea of ref 39 that the nondynamical correlation in a single bond is underestimated by a larger amount by the DFT functional as the bond becomes longer. If the total energy of the neutral molecule is too high compared to the ion, the calculated ionization potential will be too small, and the error (reported as experiment - theory) will be increasingly positive. Thus, for this large class of examples, qualitative consistency with the atomization energy study of ref 39 has been established. The total correction factor should also depend on the local environment of the bond from which the electron is removed. We showed in ref 39 that the presence of long single bonds attached to one of the atoms defining the target bond served to increase the underestimation of the nondynamical correlation energy of the electron pair in the bond. Additionally, heavy atom neighbors were shown to overstabilize radicals. In the present case, these effects both go in the same direction (one leads to underbinding of the neutral, the other leads to overbinding of the ion). As the electron in question is being removed from a bond rather than an atom, the environment of the remaining

electron is not exactly the same as the neutral radicals we have treated previously, so, rather than attempt to employ those parameters, we redefine environmental parameters specifically for the present case. Table 5 (section IP.B, parameters #1820)) presents parameters for three different types of environmental bonds which have been optimized to fit the data in Table 6. As in the case of the intrinsic correction parameters, the values become more positive as the neighboring bonds become longer; furthermore, the magnitudes of these correction factors are of the same order (in fact, within a few tenths of a kcal/mol) of the analogous environmental correction factors in ref 39. These results are again encouraging with regard to the consistency of the physical picture implied by the present approach with the results of ref 39. An unusual system, which we have not included in the parametrization in Table 5, is removal of an electron from the σ orbital of the N2 molecule. Because there is only one example of this type in the database, it seems pointless to devise a parametrization without having any idea of its domain of applicability. However, the large negative error of -0.27 is consistent with the fact that the bond is very short considering that the atom pair are both heavy atoms (as opposed to one being a hydrogen). A systematic investigation of excitations of this type, using accurate theoretical methods, would likely enable accurate correction parameters to be developed.

18796 J. Phys. Chem. B, Vol. 110, No. 38, 2006

Knoll and Friesner

TABLE 6: Experimental Ionization Potentials and Theoretical Deviations from Experiment (expt. - theory) of the G2 Dataset for G2 Theory, B3LYP, and B3LYP-LOC in eV deviation molecule 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53

Expt.a

G2b

A. Remove e- from Atoms 8.30 0.10 -0.44 11.26 0.08 -0.29 14.54 0.06 -0.14 13.61 0.08 -0.55 17.42 0.03 -0.34 21.56 -0.05 -0.21 5.98 0.05 -0.04 8.15 0.05 0.04 10.49 0.04 0.11 10.36 0.16 -0.19 12.97 0.12 -0.10 15.76 0.07 -0.04 B. Remove e- from Single Bond CH4 12.62 -0.06 0.06 NH3 10.18 -0.01 -0.01 OH2 12.62 -0.01 0.00 FH 16.04 -0.05 -0.06 SiH4 11.00 -0.01 0.09 PH3 9.87 0.00 0.03 2 SH2 A1 cation 12.78 0.03 0.11 ClH 12.75 0.03 0.00 11.50 -0.01 0.11 Cl2 ClF 12.66 0.01 0.05 BF3 15.56 -0.05 0.10 BCl3 11.60 -0.06 0.28 B2F4 12.07 0.32 0.56 CH3OH 10.85 -0.11 0.19 CH3F 12.47 -0.23 0.11 CH3SH 9.44 -0.02 0.09 CH3Cl 11.22 -0.08 0.07 CH3OF 11.34 -0.06 0.15 B2H4 9.70 0.07 0.18 HOF 12.71 0.00 0.07 Si2H6 9.74 0.04 0.19 C2H5OH 10.47 -0.18 0.24 Thiirane 9.05 -0.02 0.11 CH3O 10.73 -0.06 0.09 C. Remove e- from Multiple Bond 11.40 -0.02 0.16 C2H2 C2H4 10.51 -0.07 0.15 2 N2 Π cation 16.70 0.03 0.09 10.53 -0.01 0.21 P2 CO2 13.77 0.07 0.12 cyclopropene 9.67 -0.09 0.24 benzene 9.25 -0.08 0.16 toluene 8.83 0.22 furan 8.83 -0.08 0.13 pyrrole 8.21 0.12 0.15 phenol 8.51 0.18 aniline 7.72 0.20 Si2H2 8.20 -0.08 0.15 Si2H4 8.09 -0.03 0.19 CH3CHO 10.23 -0.08 0.12 9.59 -0.13 -0.02 N2H2 CH2dCdCH2 9.69 -0.04 0.26 B C N O F Ne Al Si P S Cl Ar

deviation

B3LYPc B3LYP-LOCd 0.00 0.03 0.06 -0.06 0.03 0.04 -0.04 -0.03 -0.03 -0.05 -0.03 -0.04 0.01 -0.04 -0.02 -0.06 0.04 0.00 0.09 0.00 0.03 0.05 -0.10 -0.02 0.08 0.04 -0.02 -0.06 -0.07 -0.08 0.03 -0.03 0.01 0.10 -0.01 -0.04

molecule 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79

Expt.a

G2b

B3LYPc B3LYP-LOCd

D. Remove e- from First Row Radical CH3 9.84 0.06 -0.13 C2H5 8.12 0.05 -0.10 CH2OH 7.55 0.10 -0.13 CH2SH 7.54 0.12 -0.13 7.61 0.07 -0.30 N2H3 sec-C3H7 7.37 -0.03 0.00 CHO 8.14 0.04 -0.38 E. Remove e- from Second Row Radical PH 10.15 0.06 -0.02 PH2 9.82 0.10 -0.12 SH 10.37 0.06 -0.10 SiH3 8.14 0.08 -0.05 Si2H5 7.60 -0.01 -0.08 SiH2 9.15 -0.02 0.10 F. Remove e- from S 2 SH2 B1 cation 10.47 0.04 0.04 OCS 11.17 0.01 -0.02 CH2S 9.38 0.00 0.09 CS2 10.07 -0.03 0.05 G. Remove e- from an Atom Pair CO 14.01 0.00 -0.13 SC 11.33 -0.09 -0.11 OH 13.01 0.03 -0.23 NH 13.49 0.08 -0.20 NH2 11.14 -0.04 -0.20 CF2 11.42 -0.03 0.07 H. Remove e- from Triplet 10.40 0.08 -0.02 CH2 I. Strange Molecules 12.07 -0.10 -0.79 O2 9.36 0.08 -0.22 S2

0.01 0.01 -0.02 -0.02 -0.02 0.08 -0.01 -0.07 -0.08 -0.05 0.02 0.04 0.06 0.00 -0.09 0.02 -0.02 -0.03 -0.01 0.06 -0.03 0.06 -0.05 -0.02 -0.08 0.05

0.06 0.05 -0.01 0.02 0.02 0.05 -0.03 0.03 -0.06 -0.04 -0.01 0.01 -0.04 0.00 0.02 -0.12 0.07

a Experimental values as reported in ref 5. b Deviation of Gaussian-2 theory as reported in 5. c Deviation of DFT using B3LYP functional and 6-311+G(3df,2p) basis set, as reported in ref 5. d Deviation of B3LYP after corrections using the “minimalist” set of 22 parameters as defined in Table 5, column b.

The final correction factor is obtained by adding together the intrinsic and environmental correction factors. The results for this data subset of ionized single bonds are presented in Table 6B. The mean absolute error is 0.041 eV and the maximum absolute error is 0.10 eV for ethanol (C2H5OH). The average error is comparable with that of G2 theory, and a major improvement over the results of uncorrected B3LYP. B3LYPLOC has a smaller maximum error than and produces fewer outliers than that observed for G2 theory.

D.4 RemoVal of an Electron from a π Bond. We continue by examining cases where an electron is removed from a π bond. The list of test cases of this type is listed in Table 6C. In ref 39, no environmental factors were employed for multiple bonds; similarly, we have not attempted to optimize any such parameters here. We define two intrinsic parameters based on the bond length (“short” or “long”); parameters are listed in Table 5 as #16 and #17 for short and long, respectively. The errors for each test case are reported in Table 6C. The absolute mean for

LOC Model to Correct IP and EA in First and Second Row Elements

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18797

Figure 2. EA; Valence bond structures of molecular anions in the G2 dataset. The letters B-E correspond to the lettered categories in Table 7 that indicate to which localized orbital in the neutral molecule the electron is added.

this subset is 0.037 eV. Again, errors are comparable to those of G2 theory, and are physically reasonable in magnitude. There is a single large outlier, N2H2, for which the correction parameters appear to make the error worse. In this case, the G2 error of -0.13 eV is very close to the corrected B3LYP-LOC error of -0.12 eV, suggesting that the problem may lie in the precision of the experimental data. Higher level theoretical calculations could settle this issue definitively. D.5 RemoVal of an Unpaired Electron from an Atom in a Molecule. We next consider ionization of radical species in which an unpaired electron is removed from the atom on which it is localized. Most of the examples in sections D and E in Table 6 are doublet radicals, although there are a few triplets (CH2, O2, S2). We will initially discuss first row atoms, and then proceed to second row atoms, for which the parametrization is similar in structure but with different values of the parameters. We begin by considering doublet radicals where removal of the unpaired electron produces a singlet cation. First row examples of this type are listed in Table 6 section D. The first step is to calculate correction terms for the neutral radical as determined in ref 39. These terms correct for overbinding of the neutral, and hence contribute negative values to the overall correction. The second question is the intrinsic error associated with removal of the electron. The best fit value we obtain, -0.11 eV (parameter #9 in Table 5), is considerably smaller than the -0.44 eV obtained for neutral first row atoms. The dramatic reduction in the intrinsic error implies that the unpaired electrons in the molecules in Table 6 experience considerable nondynamical correlation, in contrast to the analogous unpaired electron in the free atom. The obvious explanation is that the orbital of the unpaired electron is delocalized toward atoms bonded to the atom on which the unpaired electron resides, resulting in dynamical correlation effects similar to those observed for bonded electron pairs. Detailed examination of accurate ab initio wave functions will be required to elucidate this important issue further. From an empirical point of view,

the use of a single correction factor appears to work reasonably well for the various test cases in Table 6. A third correction factor arises due to the fact that bond lengths in the cation will contract, and the electron density in the bond will increase due to the positive charge on one of the atoms comprising the bond. Both of these factors lead to an overestimation of the nondynamical correlation in the bond by the B3LYP functional (as discussed in detail in ref 39), and thus to overstabilization of the cation state. The effect is more pronounced for heavy atoms than hydrogens, and affects single bonds much more strongly than multiple bonds (which already are rather short, and have a complicated nondynamical correlation in the interatomic region due to the presence of the π as well as σ bond). We have therefore defined two positive correction parameters, one for hydrogens, and one for heavy atoms making single bonds to the cationic center. Optimized values of these parameters are given in Table 5 section IP.C as parameters #21 and 22. Finally, there are two cases in Table 6 where an unusual electronic structure is generated by formation of the radical, creating a situation that is analogous to the ionic bonding (represented by the “charge transfer” correction parameter) discussed in ref 39. Valence bond configurations of the two molecules, CHO and N2H3, are shown in Figure 4. In both cases, a lone pair on a heavy atom can achieve good overlap with a partner atom via an energetically favorable mechanism. For CHO, the sp2 hybridized lone pair on the oxygen has direct access to the empty sp2 orbital on the carbon. For N2H3, a resonance structure enables the nitrogen lone pair to approach the nitrogen cation. In general, low energy resonance structures are identified when multiple acceptable Lewis structures of the molecule can be drawn. These arguments suggest that the nondynamical correlation of the heavy atom lone pair will be substantially larger than it would be in an isolated atom, or in a normal molecular bonding situation, analogously to lone pairs in with formal charge transfer such as CO (analyzed in detail

18798 J. Phys. Chem. B, Vol. 110, No. 38, 2006

Knoll and Friesner

Figure 3. Bond length vs the intrinsic correction parameter for ionization potentials of a collection of typical bonds. Note that the parameter IP_B_A1-H_NP (intrinsic ionization potential parameter for the removal of an electron from an A-H nonpolar bond, where A is a first row element, such as C-H, or N-H) is not present in the final model because inclusion of this value so close to 0 does not improve the model. This parameter is included here to demonstrate the trend of increasing bond lengths and increasing correction parameters.

Figure 4. Valence bond structures of CHO+ and N2H3+.

in ref 39). As a crude initial approximation (due to lack of sufficient examples), we apply the same “CT” correction parameter to these cases as we did to CO, and other such molecules, in ref 39. The correction is negative because the effect implies an underestimation of the nondynamical correlation in the cation species. As can be seen from Table 6, the large negative correction term provides much improved agreement with the observed B3LYP deviations from experiment, confirming the hypotheses discussed above. Note that these are the only two molecules in the data subset where structures of this type can be found. The sum of these corrections is presented in Table 6 section D, and demonstrates uniformly good agreement with experiment, eliminating a significant number of large, systematic errors for molecules of this type. The data subset is rather small given the number of parameters required, and more test cases would clearly be valuable in assessing the robustness of the parametrization. When there are two unpaired electrons on an atom, the same parameters can be used to model the removal of one of them, with the addition of a reparametrization of the unpaired spin correction (due to e.g., the occupation of sp3 orbitals by the electrons). There is only one example of this type for a first

row atom, CH2. The resulting parameter value for the unpaired spin correction represents a significant, but reasonable, reduction from the atomic value. Finally, Table 6 parts E and F presents results for cases (in both categories discussed above) where the electron is removed from a second row atom. No new parameters are actually required to fit the second row data in the table. The intrinsic and unpaired electron interaction parameters can be taken directly from the atomic parameters (suggesting that second row radicals have significantly less nondynamical correlation than those in the first row), while the environmental correction parameters employed for the first row can be used here as well. D.6 RemoVal of an Electron from a Lone Pair of an Atom in a Molecule. The only new parameter required to treat these cases is the intrinsic correction for first row atoms. The best fit value, -0.20, is a modest reduction from the atomic value of -0.25. Two interesting cases, CO and CS, involve the elimination of a CT correction upon removal of the electron (the electron is taken from the lone pair on the carbon atom, which was created there as a result of the formal charge transfer required to satisfy the octet rule). The success in reproducing the corrections for these two cases is an encouraging confirmation of the CT parametrization. Note also that the radical correction parameter for these cases is negative (rather than positive) because having a radical on a triply bonded system leads to underbinding, as was derived in ref 39. The overall results for this challenging set of cases are satisfactory for both first and second row molecules.

LOC Model to Correct IP and EA in First and Second Row Elements D.7 Other Cases: Unusual Molecules, and Molecules where the DFT WaVe Function is QualitatiVely Incorrect. We have not included hydrogen or helium in our atomic parametrization, because these are one-off cases which have no relevance to more complex systems. We discussed the case of removing an electron from the σ bond of N2 above; this is an interesting type of data to model, but the amount of such data is not sufficient in ref 5 to warrant parametrization in the present paper. Finally, the B3LYP electron density for the CN cation is qualitatively incorrect, as is discussed in ref 5. While it is important to be aware that such cases exist, the results have no bearing on the validity of the LOC correction approach for molecules where a grossly incorrect wave function is not produced. Clearly one cannot expect post hoc corrections of the energy to succeed if the DFT electron density has converged to the wrong state. Finally, for CN-CN, there is a large discrepancy between the LOC correction and the B3LYP error; it is natural to suspect that, given the results for CN, there is a problem with the wave function for CN-CN as well. We report overall errors in the discussion below with and without the various cases discussed here; our belief is that the latter are more meaningful as a calibration of the promise of the methodology, but we provide the former for completeness anyway. E. Electron Affinities of Molecules. As in the case of ionization potentials, we divide the G2 test suite into a number of different classes. Parameters are then developed for each class, as above. Figure 2 depicts our assignments of localized orbitals for the molecular anions in the dataset. E.1 Addition of an Electron to a Closed Shell Molecule. There are two cases of interest. First, an electron can be added to a molecule with a full octet. In this case, the wave function of the additional electron is very diffuse and significant overbinding of the anion is observed in all cases. The origin of this overbinding may be related to problems that many DFT functionals such as B3LYP have with diffuse wave functions, due to incorrect asymptotic behavior, as is analyzed for example in ref 46. Improvements to hybrid functionals to remedy this defect have been proposed;46 for the moment, however, a simple uniform correction for addition to a full octet, based on averaging the errors of the molecules in Table 7 section B, provides a very substantial improvement (although there still are significant errors, e.g., for ozone). The second case is addition of an electron to the empty orbital of the p shell of an atom where the remaining electrons are paired. Here, there are two relevant correction terms; the intrinsic correction for adding an electron to an unfilled p orbital, and the radical corrections of the resulting odd electron system. For the latter, we use the values determined in ref 39 for neutral molecules; an intrinsic correction factor for first row atoms (there are no second row examples of this type) is optimized using the data in Table 7 section C and presented in Table 5 section EA.B (#30, EA_Ep). The optimized value is significantly smaller than that for addition of an electron to the unfilled p orbital in a free second row atom, indicating that the nondynamical correlation of the additional electron in the anion is significantly larger than that in the corresponding additional electron in the anion of a free atom. The observation that radical species experience nondynamical correlation in molecules is consistent with the results discussed above for the ionization potential data. E.2 Addition of an Electron to a Doublet Radical. Addition of an electron to the half filled orbital of a doublet radical creates a closed shell anion. The two most obvious effects of this addition with regard to correction terms are that the initial radical

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18799

TABLE 7: Experimental Electron Affinities and Theoretical Deviations from Experiment (expt. - theory) of the G2 Dataset for G2 Theory, B3LYP, and B3LYP-LOC in eV deviation molecule

expt.a

G2b A. Atoms -0.08 0.01 0.06 0.11 0.07 0.07 0.03 0.09 0.09

B3LYPc

B3LYP-LOCd

-0.13 -0.06 -0.22 -0.21 -0.13 -0.11 0.04 -0.18 -0.02

-0.02 0.00 0.02 -0.01 0.01 0.00 0.00 -0.05 0.00

-0.46 -0.64 -0.41 -0.15 -0.34 -0.30

-0.18 -0.15 0.07 0.13 -0.06 0.18

1 2 3 4 5 6 7 8 9

F Cl O P S C Si B Al

3.40 3.62 1.46 0.75 2.08 1.26 1.39 0.28 0.44

10 11 12 13 14 15

Cl2 O3 SO2 CH2S HNO S2O

B. Closed Shell 2.39 0.01 2.10 0.04 1.11 -0.05 0.47 0.08 0.34 0.09 1.88 -0.04

16 17 18 19

C. Add e- to Empty Orbital in Same Shell LiH 0.34 0.02 0.02 0.09 0.18 0.09 -0.24 -0.04 CF2 HCF 0.54 0.08 -0.21 -0.06 CH2dCdC 1.79 0.05 -0.11 0.01

20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45

CH3 NH2 OH SiH3 PH2 HS OF CH3O CH3S HO2 CH3CH2O CH2CHCH2 CH2CN CH2CHO NO CN PO C2H C2H3 CH2dCdCH HCO CH3CO CHCO NCO CH3CH2S NO2

D. Add e- to Doublet 0.08 0.04 0.07 0.77 0.00 0.04 1.83 -0.04 -0.01 1.41 -0.01 0.03 1.27 0.02 0.02 2.36 0.06 0.02 2.27 -0.03 -0.02 1.57 -0.05 0.07 1.87 0.00 0.04 1.08 -0.03 0.07 1.71 -0.10 -0.08 0.47 -0.05 0.02 1.54 -0.04 0.00 1.82 -0.05 -0.05 0.02 0.09 -0.29 3.86 -0.11 -0.19 1.09 0.05 -0.23 2.97 -0.12 -0.12 0.67 -0.08 0.02 0.89 -0.10 -0.08 0.31 -0.03 0.01 0.42 -0.02 0.08 2.35 -0.01 0.06 3.61 -0.01 0.07 1.95 -0.01 0.05 2.27 -0.07 0.05

0.00 0.00 -0.03 -0.04 -0.03 0.00 0.01 -0.01 -0.04 0.00 -0.15 0.01 -0.01 -0.05 -0.05 -0.02 0.01 0.05 -0.03 -0.01 -0.03 -0.01 -0.01 -0.01 -0.03 0.08

46 47 48 49 50 51 52 53 54 55

C2O CH2NC CH2 NH O2 PH S2 CH SiH SiH2

E. Unpaired e2.29 -0.04 1.06 -0.12 0.65 -0.01 0.38 0.10 0.44 -0.03 1.03 0.07 1.66 0.01 1.24 0.11 1.28 0.09 1.12 0.14

0.07 -0.02 -0.01 0.03 -0.04 -0.03 0.06 0.05 -0.04 0.00

-0.04 -0.02 -0.08 -0.07 -0.13 -0.08 -0.04 -0.10 0.01 -0.05

a Experimental values as reported in ref 5. b Deviation of Gaussian-2 theory as reported in ref 5. c Deviation of DFT using B3LYP functional and cc-pVTZ++ basis set. d Deviation of B3LYP after corrections using the “minimalist” set of 22 parameters as defined in Table 5, column b.

corrections for the neutral are eliminated, and an intrinsic correction for filling the orbital must be applied. The intrinsic

18800 J. Phys. Chem. B, Vol. 110, No. 38, 2006

Knoll and Friesner

TABLE 8: Summary of Results for IP, EA, and Both Using B3LYP Alone, G2 Theory and B3LYP-LOCa Set 1 IP EA IP & EA

MADf MADf MADf max errg outliersh

Set 2

Set 3

G2b

B3LYPc

LOCd

G2b

B3LYPc

LOCd

G2b

B3LYPc

LOCd

LOC-MINe

0.063 0.061 0.062 0.32 27

0.177 0.134 0.160 1.65 83

0.076 0.057 0.069 0.82 21

0.062 0.059 0.061 0.32 25

0.157 0.118 0.141 0.79 80

0.042 0.035 0.039 0.34 9

0.063 0.056 0.060 0.32 22

0.152 0.115 0.137 0.79 73

0.037 0.037 0.037 0.18 7

0.039 0.039 0.039 0.18 7

a Data set 1 refers to all 146 molecules from the G2 ion dataset; 88 IP and 58 EA. Set 2 is set 1, but excludes the three molecules with qualitatively incorrect DFT wavefunctions (IP: CN, NCCN; EA: C2). Set 3 is set 2, but excludes atoms and molecules which have unique electronic structures that require a correction parameter applicable only to that atom or molecule. (IP: H, He, Li, Be, Na, Mg, N2; EA: Li, Na). Tables 6 and 7 list the molecules in data set 3. b Refers to the Gaussian-2 method from ref 5. c DFT calculation with the B3LYP functional using the 6-311+G(3df,2p) basis set for IP and the cc-pVTZ++ basis set for EA. d Linear orbital correction scheme applied to the B3LYP calculated IP and EA with an extensive set of parameters. e B3LYP-LOC correction scheme with a minimum number of parameters (22), as defined in Table 5, column b. f Mean absolute deviation, in eV. g The maximum absolute error, in eV. h The number of outliers is defined as absolute energies g 0.10 eV.

corrections are small, indicating that the nondynamical correlation errors of the initial radical and the resulting electron pair in the anion are very similar. The radical correction parameters are just those of ref 39; optimized values of the intrinsic addition parameters for first and second row atoms (#31 & #32 in Table 5), and for addition to an atom that is part of a multiple bond (#33), are presented in Table 5. Note that the parameter EA_R1_noMB (#32) is not used in the final model due to its value close to zero. A number of the molecules in Table 7 display significantly larger errors than can be explained by the two corrections discussed above. We have analyzed these molecules, and have found that all have alternative low energy resonance structures that in essence delocalize the anionic electron pair to neighboring atoms. Such delocalization can lead to overestimation of the nondynamical correlation in the anionic electron pair, in turn leading to overstabilization of the anion relative to the neutral. We have optimized a single correction factor, given as parameter #34 in Table 5, based on identification of resonance structures satisfying the criteria described above. Note that a molecule like NO2- has an equal energy resonance form, but the anion is not delocalized to a neighboring atom. A similar role can be played by a bonded atom with a very high electronegativity, as in the molecule OF-, which receives an identical correction. When these corrections are applied, most of the cases so identified are substantially improved. E.3 Addition of an Electron to an Atom with Two Unpaired Electrons. If an electron is added to a half filled orbital, the only new parameters that need to be employed are the unpaired spin interaction parameters. For both first and second row atoms, the same (reduced) value is used for both the neutral and anionic interactions. If an electron is instead added to an unfilled orbital (e.g., in the CH anion), then the intrinsic correction parameter for this process, already introduced in section C above, is required in addition. The O2 and S2 anions qualify for the resonance correction, which works well for O2 but results in an overcorrection for S2. Overall, however, the results for this class of molecules are satisfactory. E.4 QualitatiVely Incorrect DFT Density. As in the case of the ionization potential test cases, there is one molecule (C2) where the DFT density is qualitatively incorrect, and the resulting electron affinity grossly disagrees with experiment. As has been explained above, one cannot expect a post hoc empirical correction scheme to handle cases of this type. IV. Discussion In Table 8, we summarize a variety of statistics for our results for ionization potentials, electron affinities, and the entire suite of test cases, comparing with uncorrected B3LYP with G2

theory. We consider three different test sets. In the first test set, we include all of the molecules in the G2 test suite. In the second set, we eliminate all of the molecules with qualitatively incorrect densities. In the third set, we further remove molecules with odd electronic structures, appearing only once in the data set, that require one-off parameter fits to make the appropriate corrections. Note that there is not necessarily anything wrong with these one-off parameters; they are in fact not unlikely to work reasonably well when confronted with another, similar test case. However, one can get a better sense of the statistical validation of the theory by examining a reduced data set in which these cases are eliminated, and in turn dispensing with the parameters required to model their corrections. For the third data set, we present two parameter fits. The first is an “extensive” fit in which we have retained all parameters which yield some measurable improvement in the average errors and RMSDs. The second set is a “minimalist” parameter set in which small increase in the errors are tolerated with the goal of reducing the number of parameters as much as possible, while still retaining a method which is competitive with G2 theory for prediction of both electron affinities and ionization potentials. In ref 39, there was a sufficient amount of data as compared to adjustable parameters (223 molecules, 22 parameters) to enable the data to be split into a “training set” (taken to the G2 set of 148 molecules) and a “test set” (taken to be the G3 addition to the G2 set, of 75 molecules), and the methodology to be validated in a statistical sense by fitting parameters to the training set and evaluating the results for the test set. While there was some degradation of the overall RMSD for the test set (as compared to fitting to the entire data set), the results were reasonably close, and provided strong evidence that overfitting was not a serious problem (although there were clearly some parameters that required fitting to the full data set to extract an optimal value, due to deficiencies of relevant molecules in the test set). In the present case, the data set is too small to permit a meaningful decomposition, hence, for the present results, there is more danger that overfitting is an issue. There is nevertheless a sufficient number of cross checks in the data set to suggest that the basic model captures much of the important physics required for correcting B3LYP errors in ionization potentials and electron affinities. To achieve rigorous statistical validation, it will be necessary to apply the model to the prediction of additional molecules. The physical implications of the parametrization that we have developed are significant for understanding problems with current DFT functionals. The unpaired spin correction for atoms is confirmed by both the ionization potential and electron affinity data, and, as mentioned above, is present consistently in all

LOC Model to Correct IP and EA in First and Second Row Elements TABLE 9: Differences in Ionization Energy Errorsa B3LYP B3PW91 B3P86 BLYP BPW91 BP86 SVWN SS_2p B-C C-N F-Ne SS_3p Al-Si Si-P Cl-Ar

-0.15 -0.15 -0.13 -0.08 -0.07 -0.06

-0.08 -0.07 -0.09 -0.05 -0.03 -0.02

-0.08 -0.08 -0.12 -0.04 -0.03 -0.03

-0.17 -0.18 -0.16 -0.10 -0.10 -0.09

-0.09 -0.09 -0.11 -0.06 -0.05 -0.05

-0.11 -0.10 -0.14 -0.07 -0.07 -0.08

0.10 0.05 0.10 0.05 0.08 0.17

a Each entry in the above table is the IP of the second atom subtracted from the IP of the first atom for the energies, in eV, calculated with the given functional. In other words, the entry in row C-N and column BP86, which is -0.10 eV, is the IP calculated with the functional BP86 for the C atom minus the calculated IP for the N atom.

gradient corrected DFT functionals examined in ref 5 (interestingly, the LDA functional does not exhibit the identified pattern). To quantify these observations in more detail, Table 9 presents differences in three ionization energy errors (B-C, C-N and F-Ne) for all of the gradient corrected functionals treated in ref 5. These differences, per the analysis above, should all be equal to the unpaired spin correction (parameter ‘SS_2p′) for the particular functional in question. In addition, Table 9 shows the same trend for the three ionization energy error differences for the other unpaired spin correction, SS_3p, as indicated by the differences Al-Si, Si-P, and Cl-Ar. As can be seen from Table 9, the data are consistent with this hypothesis, and the magnitude of the error is dependent upon the correlation functional employed, rather than whether exact exchange is incorporated into the functional. The unpaired spin corrections in molecules for both the SS_2p and SS_3p parameters benefit from a small quantitative adjustment, but are qualitatively similar to those for the atoms. The atomic corrections for adding or removing an electron from a filled, half filled, or unfilled orbital, display more dependency upon the exact exchange component, although there is some commonality among a significant fraction of the functionals in many cases. In contrast to the unpaired spin corrections, however, these terms are substantially different in molecules, and exhibit considerable variation with the local environment. This observation emphasizes the critical role that the environment plays in determining the extent of nondynamical correlation, and the fact that current DFT functionals are unable to model the nondynamical correlation with quantitative accuracy, due to its nonlocal character. Qualitatively, the deficiencies determined in the present paper are consistent with those analyzed in ref 39, as has been outlined in the various specific instances above. There are a number of electronic structures for which the present correction scheme could likely be improved via the use of additional parameters enabling further discrimination among the molecules in the relevant category. For example, there are still significant errors associated with the addition of an electron to a closed shell molecule with full octets. Our view is that the data in the present paper are not sufficient to justify such an expansion of the parametrization effort. However, as the data set is expanded, yielding a larger training set and a fully independent test set, such improvements will likely be possible. V. Conclusion We have developed empirical localized corrections which substantially improve the calculation of ionization potentials and electron affinities using the B3LYP DFT functional, for a training set of 134 experimental values. The average error obtained is more than 3× smaller than the original B3LYP errors, and is comparable to the errors obtained using G2 theory,

J. Phys. Chem. B, Vol. 110, No. 38, 2006 18801

a high level ab initio approach. The number of outliers containing large errors has been dramatically reduced. Systematic errors in components of the DFT energies have been identified in the course of development of the model. As has been discussed above, the immediate task for the future is to generate additional test cases (either from experimental data, or via benchmark ab initio calculations) to validate the training set results. However, there are also a number of promising directions in which the present efforts can be extended. First, we intend to develop parameters for transition metals and other elements of the periodic table. Second, the relative energetics of different spin states (singlet, triplet, etc.) should be amenable to the same type of analysis as has been presented herein. Successfully attacking these problems would significantly extend the applicability of the DFT-LOC methodology. The empirical parametrizations we have developed, in both the present paper and ref 39, provide insights that in principle should enable the development of a more accurate DFT functional, one that corrects the errors in unpaired spin interactions noted above and provides a better treatment of nondynamical correlation. New functional forms, which are capable of modeling the nonlocal effects that have been further elucidated in the present paper, are going to be necessary. As the empirical studies are further advanced, the constraints on the required functional forms become clearer, and significant progress in this direction should become possible. Acknowledgment. This work was supported in part by grants from the NIH (GM40526) and DOE (FG02-90ER14162). References and Notes (1) de Oliveira, G.; et al. Phys. ReV. A 1999, 60(2), 1034-1045. (2) Rienstra-Kiracofe, J. C.; et al. Chem. ReV. 2002, 102(1), 231282. (3) Martin, J. M. L.; de Oliveira, G. J. Chem. Phys. 1999, 111(5), 1843-1856. (4) Curtiss, L. A.; et al. J. Chem. Phys. 2000, 112(17), 7374-7383. (5) Curtiss, L. A.; et al. J. Chem. Phys. 1998, 109(1), 42-55. (6) Irikura, K. K.; F., D. J.; Eds. Computational Thermochemistry, Eds.; Petersson, G. A., Ed.; ACS Symposium Series 677; American Chemical Society: Washington, D. C., 1998; Vol. 13, p 176. (7) Montgomery, J. A.; et al. J. Chem. Phys. 2000, 112(15), 65326542. (8) Lemierre, V.; et al. J. Phys. Chem. A 2005, 109(37), 8348-8355. (9) Zhan, C. G.; Nichols, J. A.; Dixon, D. A. J. Phys. Chem. A 2003, 107(20), 4184-4195. (10) VanHuis, T. J.; Galbraith, J. M.; Schaefer, H. F. Mol. Phys. 1996, 89(2), 607-631. (11) Pak, C.; Rienstra-Kiracofe, J. C.; Schaefer, H. F. J. Phys. Chem. A 2000, 104(47), 11232-11242. (12) Salomon, O.; Reiher, M.; Hess, B. A. J. Chem. Phys. 2002, 117(10), 4729-4737. (13) Pavelka, M.; et al. J. Phys. Chem. A 2006, 110(14), 4795-4809. (14) Glukhovtsev, M. N.; Bach, R. D.; Nagel, C. J. J. Phys. Chem. A 1997, 101(3), 316-323. (15) Pandey, R.; et al. J. Am. Chem. Soc. 2001, 123(16), 3799-3808. (16) Wesolowski, S. S.; et al. J. Am. Chem. Soc. 2001, 123(17), 40234028. (17) Wetmore, S. D.; Boyd, R. J.; Eriksson, L. A. Chem. Phys. Lett. 2000, 322(1-2), 129-135. (18) Richardson, N. A.; Wesolowski, S. S.; Schaefer, H. F. J. Phys. Chem. B 2003, 107(3), 848-853. (19) Ghosh, A. J. Phys. Chem. 1994, 98(43), 11004-11006. (20) Liao, M. S.; Scheiner, S. J. Chem. Phys. 2002, 117(1), 205-219. (21) Ghosh, A.; Vangberg, T. Theor. Chem. Acc. 1997, 97(1-4), 143149. (22) Diaz-Tendero, S.; et al. Int. J. Mass Spectrom. 2006, 252(2), 133141. (23) Grafton, A. K.; Wheeler, R. A. J. Phys. Chem. A 1997, 101(38), 7154-7166. (24) Modelli, A.; Mussoni, L.; Fabbri, D. J. Phys. Chem. A 2006, 110(20), 6482-6486. (25) Tao, J. M.; et al. Phys. ReV. Lett. 2003, 91(14).

18802 J. Phys. Chem. B, Vol. 110, No. 38, 2006 (26) Staroverov, V. N.; et al. J. Chem. Phys. 2003, 119(23), 1212912137. (27) Xu, X.; Goddard, W. A. J. Chem. Phys. 2004, 121(9), 4068-4082. (28) Xu, X.; Goddard, W. A. J. Phys. Chem. A 2004, 108(40), 84958504. (29) Xu, X.; et al. J. Chem. Phys. 2005, 122 (1). (30) Wright, J. S.; Rowley, C. N.; Chepelev, L. L. Mol. Phys. 2005, 103(6-8), 815-823. (31) Keal, T. W.; Tozer, D. J. J. Chem. Phys. 2005, 123(12). (32) Zhao, Y.; Schultz, N. E.; Truhlar, D. G. J. Chem. Theor. Comput. 2006, 2(2), 364-382. (33) Dutoi, A. D.; Head-Gordon, M. J. Chem. Phys. 2006, 422(1-3), 230-233. (34) Rosch, N.; Trickey, S. B. J. Chem. Phys. 1997, 106(21), 89408941. (35) Baerends, E. J.; Gritsenko, O. V. J. Phys. Chem. A 1997, 101(30), 5383-5403. (36) Gritsenko, O.; et al. Phys. ReV. A 1995, 51(3), 1944-1954.

Knoll and Friesner (37) Jensen, F. J. Chem. Phys. 2002, 117(20), 9234-9240. (38) Weimer, M.; Della Sala, F.; Gorling, A. Chem. Phys. Lett. 2003, 372(3-4), 538-547. (39) Friesner, R. A.; Knoll, E. H.; Cao, Y. J. Chem. Phys. 2006., accepted. (40) Johnson, B. G.; et al. Chem. Phys. Lett. 1994, 221(1-2), 100108. (41) Becke, A. D. Phys. ReV. A 1988, 38(6), 3098-3100. (42) Galbraith, J. M.; Schaefer, H. F. J. Chem. Phys. 1996, 105(2), 862864. (43) Dunning, T. H. J. Chem. Phys. 1989, 90(2), 1007-1023. (44) Kendall, R. A.; Dunning, T. H.; Harrison, R. J. J. Chem. Phys. 1992, 96(9), 6796-6806. (45) Woon, D. E.; Dunning, T. H. J. Chem. Phys. 1993, 98(2), 13581371. (46) Tozer, D. J.; Handy, N. C. J. Chem. Phys. 1998, 109(23), 1018010189.