Method for Slater-Type Density Fitting for Intermolecular Electrostatic

Mar 25, 2016 - We call the model the fitted exponent Slater-type charge density (FEST-CD). Results on the carbon monoxide and water dimers are reporte...
0 downloads 0 Views 942KB Size
Subscriber access provided by MAHIDOL UNIVERSITY (UniNet)

Article

Method for Slater-Type Density Fitting for Intermolecular Electrostatic Interactions with Charge Overlap. I. The Model. Anders Ohrn, Jose Manuel Hermida-Ramón, and Gunnar Karlstrom J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.5b01155 • Publication Date (Web): 25 Mar 2016 Downloaded from http://pubs.acs.org on March 30, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Method for Slater-Type Density Fitting for Intermolecular Electrostatic Interactions with Charge Overlap. I. The Model. Anders Öhrn,† Jose M. Hermida-Ramon,∗,‡ and Gunnar Karlström† †Department of Theoretical Chemistry, Chemical Centre,P.O.B 124, S-221 00 Lund, Sweden ‡Departamento de Química Física, Facultade de Química,Universidade de Vigo, 36310 Vigo, Spain E-mail: [email protected] Abstract The effects of charge overlap, or charge penetration, are neglected in most force fields and interaction terms in QM/MM methods. The effects are however significant at intermolecular distances near the van der Waals minimum. In the present study we propose a method to evaluate the intermolecular Coloumb interaction using Slatertype functions, thus explicltly modeling the charge overlap. The computational cost of the method is low, which allows it to be used in large systems with most force fields as well as in QM/MM schemes. The charge distribution is modeled as a distributed multipole expansion up to quadrupole and Slater-type functions of angular momentum up to L = 1. The exponents of the Slater-type functions are obtained using a divideand-conquer method to avoid the curse of dimensionality that otherwise is present for large non-linear optimizations. A Levenberg-Marquardt algorithm is applied in the fitting process. A set of parameters is obtained for each molecule and the process is

1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

fully automated. Calculations have been performed in the carbon monoxide and the water dimers to illustrate the model. Results show a very good accuracy of the model with relative errors in the electrostatic potential lower than 3% over all reasonable separations. At very short distances where the charge overlaps is the most significant, errors are lower than 8% and lower than 3.5% at distances near the van der Waals minimum.

1

Introduction

Computer simulations have over the last decades become a tool to understand condensed systems and to quantitatively evaluate properties of proteins, polymers, solutions and other macromolecular aggregates. The limit to how accurate the simulations can become, is set by the accuracy of the interaction potential. On the other hand, given limited computational resources, the time of one energy or force evaluation has to be short in order to allow adequate sampling of the configurational space. The principle of maximum accuracy at a minimal cost applies here as well as in so many other instances. In this study we present a model of the contribution to the total intermolecular interaction from the electrostatic interaction. We aim for a moderate computational complexity in order to permit simulations of solutions to be run with this model. Furthermore, we focus on interaction distances where the molecules are packed close to each other—at and near the van der Waals minimum. For high model accuracy at these separations, it is necessary to take into account that the electron distributions of the molecules occupy a finite space, rather than an infinitesimal space as in a multipole expansion. Because of the overlapping electron charge distributions, there are contributions from the electron-electron interactions that are not associated with a force vector in the direction between the centers of the two electron distributions. Instead an increasing proportion of the repulsive force between the electrons is weakened by an attractive overlap interaction between the two centers. The total electronic force remains repulsive for any meaningful separation. However, it is reduced in magnitude 2

ACS Paragon Plus Environment

Page 2 of 45

Page 3 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

relative to a model based on a multipole expansion. The attractive intermolecular electron– nuclei attraction undergoes a modification of magnitude for the same fundamental reason, although at shorter distances due to that only one of the two charge distributions is of finite size. Consequently, at short range, there is an attractive component of the total electrostatic interaction missing in a multipole expansion model. This is the charge overlap contribution to the electrostatic interaction, also referred to as charge penetration. Charge overlap has been studied before. Pioneering work on atoms and small molecules were done by Ng et al. using the bipolar expansion. 1–3 Also in early work on the convergence of the one-center multipole expansion has the charge overlap been discussed. 4,5 The use of several distributed centers in the expansion practically reduce some of the limitations of the one-center multipole expansion. 6–11 Still, the charge overlap is not properly accounted for and the missing term in such a model has to be heuristically modeled through some other term in the force-field. There is a wealth of methods available nowadays for the construction of distributed charges or multipoles, and all lack charge overlap. 12,13 Standard quantum chemical calculations generates diffuse densities, described in terms of several Gaussian functions. If these densities are used, charge overlap is included. In molecular simulations, this is not a feasible option, though, considering the large number of Gaussians that typically are employed to describe a charge distribution to an adequate accuracy. A simplified Gaussian density, which acts as a compromise in this respect, is constructed in the studies by Hall and co-workers. 14–17 The approximate density is represented as ρ∗ =

X

qk Gk (αk , rk ),

(1)

k

where Gk (αk , rk ) is a normalized s-Gaussian with exponent αk centered at rk ; qk are constants, and represent the charge associated with Gk (αk , rk ). Hall and co-workers minimize the difference in the electric field between the target density ρ and ρ∗ , by optimizing all parameters, i.e. the linear qk and the non-linear αk and rk . For the water molecule, typically

3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

nine Gaussians are required in three different centers. 16 The non-linear fitting in this method will be plagued by the curse of dimensionality, i.e. for large molecules multiple minima will appear in the parameter space and the computational load of the optimization will be not be practical. Furthermore, beyond the region of overlap, the charge density of the water molecule described above will operate as three distributed point-charges, which for water is not an accurate description. 18,19 By using additional Gaussians, the latter problem can be overcome, but at the cost of aggravating the former problem. The GMUL method by Wheatley is similar. 20–22 It starts from the quantum chemical density, but removes several Gaussians; the ones with very large exponents are described as corresponding multipoles of infinitesimal size, while the ones with intermediate exponents are replaced with a reduced set of Gaussians. The exponents of the Gaussians are set to constants, thus the non-linear problem in the method by Hall and co-workers is removed, and Gaussians of higher angular momentum are included as well, thus the charge density outside the overlap region can in principle be modeled very well. The fitting of Gaussians is still not trivial, and some simplifications are needed in order to proceed. 22 Wheatley also observes the close connection with the distributed multipole analysis (DMA), and the Gaussians can be viewed as a smearing of the DMA multipoles. 6,20 One drawback is that with a set of fixed exponents, more Gaussians are required to achieve the same level of accuracy as in the method with flexible exponent. Also the method by Strasburger is similar in spirit to the method by Hall and co-workers. 23 Strasburger includes higher angular momentum Gaussians and make a non-linear fit of the exponents. Instead of fitting all exponents at the same time, however, they are fitted one center at a time. This is done by a partition of the total density into subdensities with a population analysis method. This divide-and-conquer procedure overcomes the curse of dimensionality. The performance of the method will on the other hand be dependent on the accuracy of the arbitrary density partitioning. To obtain physically sound fits, Strasburger had to separately fit the core and valence electrons. The method by Cummings and co-workers is also worth noting in this context. 24–26 For water, a total of three s-Gaussians

4

ACS Paragon Plus Environment

Page 4 of 45

Page 5 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

are used to smear out the charges. The exponents of the Gaussian are fitted, along with other parameters, such that simulations reproduce experimental data. Considered that the methods with theoretical fitting references of Hall, Wheatley and Strasburger have to use several Gaussians per center to achieve adequate accuracy, it is questionable if the fitted parameters in the method by Cummings have a sound physical basis. Also, the method suffers from the curse of dimensionality if applied to larger molecules. In recent years density fittings have attracted a lot of attention in other areas of research, namely in the development of large-scale quantum chemical methods. The construction of auxiliary basis sets have been pursued extensively. 27–31 A linear combination of these auxiliary basis functions are found by minimizing some abstract distance (often the Coulomb metric) to the reference density. Piquemal, Darden and co-workers have constructed a method that uses the same linear combination approach of a standard auxiliary basis functions, but with the goal of accurately and efficiently evaluate the electrostatic interaction between molecules. 32–37 In the initial studies, s-Gaussians were distributed in the molecule, not necessarily on the atoms only, and the Coulomb metric was minimized. The limitation to s-Gaussians was later lifted. 34 The minimization of the Coulomb metric is a grid-independent method, and was in the first publication cited as an advantage. In a later study, however, it is concluded that this is not the case, since the Coulomb metric puts a too large weight on the core region, where the reproduction of the electric potential is not needed for application to intermolecular interactions. 35 A minimization on a grid allows for a more efficient selection of auxiliary Gaussian basis functions, and the procedure is reported to become more robust. Like Wheatley, Piquemal, Darden and co-workers observe the close connection between the Gaussians distribution and the multipole representation. Clearly, this method has also in other respects a lot in common with the method by Wheatley, although it has evolved further, and was included in the SIBFA force-field. 37 Volkov, Coppens and co-workers use a quite different approach to the charge overlap problem. 38–43 The method is based on relations between X-ray structure factors and the

5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

charge density. A superposition of nucleus-centered pseudoatoms is constructed to a set of given structure factors. A pseudoatom consists of a spherically averaged core and an aspherical valence density, i.e. higher angular momenta are included. In contrast to the previous methods, the pseudoatom density is described in terms of Slater-type functions, not Gaussians. The integrals required for the evaluation of the electrostatic energy are computed numerically. 42 The structure factors are computed with standard quantum chemical methods, hence the method does not rely on the accuracy of X-ray experiments. With this method, a databank of pseudoatoms has been constructed, which enables a fast construction of a diffuse molecular density. It is not clear to what extent the exponents in the Slater-type functions are fitted, but at least some refinement of the exponents can be done with dimensionless expansion-contraction coefficients. 40 The number of Slater-type functions is also quite large, which together with the numerical integration, may lead to a too large computational load for a simulation (although timings are not published, to the best of our knowledge). This kind of application of X-ray structure factors has also been pursued by Spackman. 44,45 Donchev et al. have also used Slater-type functions instead of Gaussians to fit the density. 46–50 Only s-Slater functions are used, one for each atom, and the core electrons are added as a point-charge to the nuclei; the integrations are done analytically. The exponents are fitted to the quantum chemical electrostatic interaction energy for a training set of small dimers, so that, for example, all tetravalent carbon atoms in all molecules will have the same parameters. Donchev et al. do not comment on exactly how the non-linear fitting is done, or whether any problems with dimensionality or multiple minima are encountered. Since small dimers are used, this may not be a problem. Instead, the performance of the method becomes dependent on the transferability of the parameters between different molecules and dimers. The same approach is taken in the work of Wang and Truhlar. 51,52 In their model Slatertype functions are used to represent screened charges with spherical charge distributions. The electrostatic energy of several small dimers is used to fit a set of atomic parameters. Piquemal, Darden and co-workers have recently also used Slater-type functions to describe

6

ACS Paragon Plus Environment

Page 6 of 45

Page 7 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

charge penetration. 53 For each atom, the method uses a set of multipole moments (up to quadrupole) and a single Slater-type function with a single exponent. The parameters are calculated by fitting the electrostatic potential. Small dimers were used so the curse of dimensionality is in all likelihood practically circumvented. Recently, Piquemal and co-workers have proposed a method to include charge overlap in molecular mechanics force fields. 54 It uses a previous model 55 that includes exponential terms in the pure multipolar expressions of the electrostatic energy. The new suggested method uses a charge-charge correction, and a set of atomic parameters are obtained by fitting the electrostatic potential of a set of dimers. Since no large molecules are employed there are no problems with dimensionality but the question of the degree of transferability of parameters between molecular systems becomes relevant. Finally, we comment on three methods that include charge overlap, but not through an explicit fitting of a density. With the effective fragment potential (EFP) method, the charge overlap is modeled. 56,57 Kairys and Jensen derive an approximate relation between the charge density overlap of s-Gaussians and the molecular orbital overlap integrals. 56 This seems to be a computationally demanding method and the s-Gaussian assumption is likely too restrictive. A more pragmatic approach to the same problem in the EFP method is a damping of the electrostatic potential from a multipole description. 57 The charge-charge interaction between two centers is modified through a damping of the electric potential with (1 − e−αR ). The exponents are obtained by fitting the electric potential on a grid to a quantum chemical electric potential. The leading term in the charge overlap is modeled this way, which at small overlap is sufficient, but at larger overlap is not. The computational load of this method should not be that much larger than an ordinary multipole model. Rob et al. described a method to compute the electrostatic energy for large dimers. 58 It consists of only using the full integration of Coulomb integral for the parts of the monomer densities that are near one another, and using a multipole description for the other parts. This is computationally too demanding for a simulation. There also arise some complications related to the partition of

7

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

the density. The method by Gavezzotti (slightly modified by Ma and Politzer) also accounts for the charge overlap. 59–61 The charge density is numerically integrated in a set of dense blocks, which leads to that the entire density becomes approximated by a very large number of point charges around the molecule, called e-pixels. As the distributions overlap, a cut-off has to be applied to avoid the situation where two point-charges get very close and the singularity in their Coulombic interaction becomes a concern. Since the e-pixels are quite numerous, a very large number of distances have to be computed, which involves the costly operation of taking a square-root. Therefore for simulations, this method is likely not to be computationally feasible. An unambiguous conclusion of the review above is that there are several properties possible to vary in methods to include charge overlap. They include: Gaussian or Slater-type functions, low or high angular momentum functions, treatment of core versus valence electrons, linear fits with fixed exponents or non-linear fits of the exponents, experimental or theoretical reference values for the fit, and what physical quantities to fit to. The model in this article uses Slater-type functions of angular momentum up to L = 1, with distributed quadrupole moments (L = 2) included but only as infinitesimal distributions; the core and valence electrons are treated separately with the core being only a point-charge; non-linear fits are used for the exponents and the fit is made to a computed quantum chemical electric potential on a grid around the molecule of interest; to overcome the curse of dimensionality, the complete problem is divided into sub-problems like in the method by Strasburger. 23 All theoretical and computational aspects of the model are described in detail in the next section. We call the model fitted exponent slater type-charge density (FEST-CD). Results on the carbon monoxide and water dimers are reported next. This is not meant to be an exhaustive test of the method, more an illustration; the focus is on the description of the method.

8

ACS Paragon Plus Environment

Page 8 of 45

Page 9 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2

Formulation of Method

The electrostatic interaction is a pair-interaction. Therefore the formulation of the method will be, without loss of generality, in terms of two arbitrary molecules, denoted A and B. To each molecule there is a charge distribution (CD), obtained with any standard quantum chemical method. Therefore, the reference CD is fully defined by two Gaussian-type basis sets, {θiA } and {θjB }, two density matrices, DA and DB , and positive point-charges for the nuclei. Our aim is to approximate the reference CD in terms of Slater-type CDs and distributed multipoles. This constitutes a simplification in terms of computational complexity compared to the description in terms of several Gaussians. In the first section, the Slater function is introduced. This includes how the Coulomb integral is evaluated and how electric potentials, fields and field-gradients from a given Slater-type CD are computed. We adopt a spherical representation of the interacting CDs. This is different from the more common Cartesian representation, hence the key properties of the spherical representation is presented for easy reference. After this, the approach to parametrize the model is presented, in particular how the non-linear optimization of the exponents of the Slater-type CDs is done.

2.1 2.1.1

Slater-type charge distribution Coulomb integrals

Gaussian-type orbitals (GTO) were introduced by Boys in 1950. 62 The rationale for this was that on account of the Gaussian product theorem, three- and four-center two-electron integrals become substantially easier to evaluate in a GTO basis than in a Slater-type orbital (STO) basis. From basic theory and extensive studies, it is known, however, that both the wave function and the CD of a molecule have an exponential decay. 63–68 This qualitative difference in the long-range part, together with the problem near the nuclei (the nuclei cusp problem), are solved by fitting several GTOs to the wave function. As shown in great detail 9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 45

by Klopper and Kutzelnigg, this can be done quite successfully. 69 The arguments in favor of using GTOs rather than STOs in quantum chemical calculations are therefore strong. 70 On the other hand, given their exponential decay, Slater-type functions, χN,L,M (R, φ, ϕ) = NN,L RN −1 SL,M (φ, ϕ)e−αR ,

(2)

are expected to be more expedient than Gaussians in obtaining a compact and simplified representation of the CD, after the CD has been obtained from a standard quantum chemical calculation. In the definition above, NN,L is a factor that normalizes the Slater-type distribution such that it at long-range has the same behavior as a unit 2L -multipole, SL,M is a real spherical harmonic, and the integers N , L and M are the usual “quantum numbers” restricted by N ≥ L + 1 and −L ≤ M ≤ L. Only two-center integrations

E=

ZZ

χA (~r1 )χB (~r2 ) d~r1 d~r2 , |~r1 − ~r2 |

(3)

are required to evaluate the Coulomb interaction since χ is a function describing the CD not the wave function. These integrals have closed form solutions. 71–73 For example, for two 1s-functions (i.e. (N, L, M ) = (1, 0, 0)) with exponents αA and αB , αA 6= αB , the Coulomb interaction is E=

1 1 1 − (1 − κ)2 (2 + κ) + R 4 1 − (1 + κ)2 (2 − κ) + 4

 1 αA R e−αA R 8   1 αB R e−αB R , 8

(4)

where 2 2 αA + αB , κ= 2 2 αA − αB

(5)

and R is the separation between the centers. For the case where αA = αB = α, the solution is 1 E= R



   11 3 2 2 1 3 3 −αR 1 − 1 + αR + α R + α R e . 16 16 48 10

ACS Paragon Plus Environment

(6)

Page 11 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

In a finite-precision implementation, the formulas for the case αA 6= αB become ill-conditioned as the difference between the exponents of the two CDs approaches zero. This was also observed by Wang and Truhlar. 52 The evaluation turns into a difference of two large numbers, as illustrated by the example above, where the denominator in the definition of κ is the cause of the trouble. Several other analytical methods for solving the Coulomb and repulsion integrals have been developed, though the results are often quite complex and not always in a closed form. 74–77 In Figure 1 the value of the Coulomb integral, evaluated by three different methods, is plotted as a function of αB , with αA fixed and R = 3.0 a.u. The axis between the centers is the z-axis, the charge distributions are of unit magnitude, and the different interactions are 1s − 1s, 2pz − 2pz and 3dzz − 3dzz . The three different methods are:

Figure 1: Coulomb repulsion between Slater charge distributions of unit magnitude, separated by 3.0 a.u. In (a) 1s − 1s repulsion, where αA is fixed at 1.0; in (b) 2pz − 2pz , where αA is fixed at 1.0; in (c) 3dzz − 3dzz , where αA is fixed at 3.0.

1. The primary analytical method, which uses the formula for the case αA 6= αB in all sit11

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

uations except when αA = αB , where the other formulas are employed. These formulas are available in Ref. 71 (observe that Ref. 71 uses a slightly different normalization of the charge distributions). As noted above, the formulas for αA 6= αB all become ill-conditioned as αB → αA . 2. On the line αA = αB , another set of formulas holds, see Ref. 71. The extended analytical method uses these formulas, with α = 21 (αA + αB ), in a neighborhood of the line αA = αB . This method of evaluation is free from the problems the primary analytical evaluation has in this neighborhood, but is not an entirely accurate evaluation of the integral. 3. Harris has derived a different method of evaluation that does not have problems in the neighborhood of the line αA = αB . 77 The method has been implemented in the symbolic computation software Maple by Harris and is considered as the reference in Figure 1. 78 It is clear from Figure 1(a) and 1(b) that the ill-conditioned behavior of the primary analytical evaluation does not extend far from the line αA = αB in the cases of the 1s − 1s and 2pz − 2pz repulsions. It is also an excellent approximation to compute the Coulomb integral with the extended analytical method in case |αA −αB | < 10−2 , and with the primary analytical method otherwise. Figures 1(a) and 1(b) formally only supports this conclusion for the special case of R = 3.0a.u. and αA = 1.0. However, since a short separation and a small exponent is the most difficult case, the conclusion should hold for all reasonable separations and exponents. This convention of evaluation is also adequate for the 1s − 2pz , 2px − 2px and 2py − 2py repulsions, since the order of the denominator that is the cause of the ill-conditioned behavior is less than or equal to the 2pz − 2pz repulsion for these other three cases. On the other hand, the evaluation of the 3dzz − 3dzz repulsion is troublesome, see Figure 1(c). Even quite far from αA = αB , the primary analytical method is far from accurate in the finite implementation. The extended analytical method is much better, but 12

ACS Paragon Plus Environment

Page 12 of 45

Page 13 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

does not reproduce the reference curve over the entire range where the primary method fails. If αA is made smaller, the problem becomes even worse. It is not feasible to use the general analytical method by Harris for the higher angular momentum in the applications we pursue. That method is computationally more demanding than the primary and extended methods, and there is also a problem in a finite-precision implementation, although for other reasons. A numerical integration of the Coulomb integrals is possible, but is also expected to be too demanding for our applications. Therefore, as a compromise between accuracy and computational load, we limit the diffuse Slater-type CDs to the cases L ≤ 1 and set

ZZ

   extended method, |αA − αB | < 10−2

χA (~r1 )χB (~r2 ) d~r1 d~r2 =  |~r1 − ~r2 |  primary method,

.

(7)

otherwise

Distributed quadrupoles are included in our method as well, but they are treated as typical ideal multipoles (vide infra). 2.1.2

Implementation with spherical multipoles

The calculations in the previous section were along the z-axis. In the general case, two points are given, (xA , yA , zA ) and (xB , yB , zB ), in which 1s- or 2p-Slater CDs are centered, the latter with three components. We use a spherical multipole implementation to calculate the Coulomb integral in the general case. This is how the formulas in Ref. 71 are given, which are used in the primary and extend method of integration. In this section, the practical aspects of the general case calculation in the spherical representation are given. Symmetry arguments give that only Slater CDs that satisfy MA = MB can give a nonzero value of the Coulomb integral, where MX is the integer that defines the projection of the angular momentum on the interaction axis

R = (xA − xB , yA − yB , zA − zB ). 13

ACS Paragon Plus Environment

(8)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

For example, two interacting p-Slater CDs in the spherical representation consist of three terms: the σ-interaction, which is between the MA = MB = 0 components, the two πinteractions, which are between the MA = MB = 1 and MA = MB = −1 components, respectively. It should be noted that the dependence on the exponents is different for the σ- and π-interactions. In other words, the standard procedure to evaluate the interaction between Cartesian multipoles, i.e. taking the inner product of the multipole tensor and the   1 , can not just be modified through a multiplication of a scalar derelevant tensor ∇n |R| pending on αA and αB ; see Appendix A for an illustration of this point. The spherical

representation appears more natural. Therefore, the first step in computing the Coulomb integral is to transform the p-Slater distributions to one σ-component and two π-components relative the interaction axis (the s-Slater CDs have no angular dependence and thus requires no transformation). The practical procedure is as follows. The interaction axis R is considered the normal vector to a plane. One arbitrary vector V in this plane is selected. The vector U is constructed as the vector product between R and V . U is hence orthogonal to both vectors. The vectors are normalized and assembled as rows in a matrix T. This matrix transforms the Cartesian x-, y- and z-components of the two interacting p-Slater distributions to the σ- and two π-components. The formulas in Ref. 71 are then applied. In case the interaction is between an s-Slater CD and a p-Slater CD, only the σ-component contributes to the total energy. The transformation above is a simple application of a rotation. In the next section, the interaction between a Slater-type CD and ideal multipoles is discussed, among them quadrupoles. It is therefore also required that a quadrupole can be transformed into its σ-, π- and δ-components so it can interact with the Slater-type CDs in their spherical representation. The rotation of a quadrupole is not as easy as a dipole. It can be solved by any routine for rotating real spherical harmonics. 79–82 We use a different method, however, based on that quadrupoles, or second moments, can be viewed as quadratic forms. The method is easy to implement and understand, and can be generalized to higher moments, although it is

14

ACS Paragon Plus Environment

Page 14 of 45

Page 15 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

unlikely to be efficient for these higher order multipoles. A matrix or tensor representation of x2 is obtained by the outer product of the vector representations of x:       1 1 1 0 0       0 ⊗ 0 = 0 0 0 .             0 0 0 0 0

(9)

By transforming the vectors for x with T, and taking the outer product, the matrix representation of the transformed x2 component is obtained       1 T11 T11 T11 T21 T11 T31  1         ⊗ T 0 = T T  T 0 T T T T    21 11 21 21 21 31  ,         0 0 T31 T11 T31 T21 T31 T31

(10)

where Tij is the matrix element on row i and column j in T. Since each element in the matrix above uniquely corresponds to a component of a quadratic form, the transformation of x2 into x2 , xy, y 2 , etc. upon the transformation T, is evident from the matrix elements in Eq.(10). The corresponding procedure is applied for the other five unique Cartesian components. Observe that mixed components, such as xy, has to take the symmetry into account, i.e. its matrix is          1 0   0  1          ⊗ T  1  + T  1 ⊗ T  0 = T 0                 0 0 0 0   T11 T12 + T12 T11 T11 T22 + T12 T21 T11 T32 + T12 T31     T T + T T T T + T T T T + T T 22 11 21 22 22 21 21 32 22 31  .  21 12   T31 T12 + T32 T11 T31 T22 + T32 T21 T31 T32 + T32 T31

(11)

This construction allows a transformation matrix, T2 , to be constructed, which is applied to

15

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 45

a vector with the values of the second moments at the given center. The rotated quadratic form is then simply represented as spherical quadrupoles, with their one σ-component and two π- and two δ-components. 8 2.1.3

Damped interaction functions for mixed interactions

A combination of ideal multipoles and Slater-type CDs are used to represent the molecular CD in the present model. By taking the limit αB → ∞ in the relevant formulas in Ref. 71 for the interactions between Slater-type CDs, damped interaction functions consistent with the spherical multipoles, as discussed in the previous section, are obtained. The interaction functions are directly related to the electric potential, field and field-gradient of Slater distributions. All damped interaction functions relevant to this work are given in Table 1. For Table 1: Components of the Damped Interaction Functions for S- and P-Slater CDs. The Product between the Separation and Exponent is Abbreviated, αR = ρ. Ts0 Ts1 Ts2 Tp0 Tp1 Tp2

σ σ π σ π δ σ σ π σ π δ

S  1 − (1 + 21 ρ)e−ρ  − R12 1 − (1 + ρ + 21 ρ2 )e−ρ 0  1 1 2 1 − (1 + ρ + ρ + 61 ρ3 )e−ρ R3 2 0 0 P  1 1 − (1 + ρ + 12 ρ2 + 81 ρ3 )e−ρ R2  3 3 1 4 −ρ − R23 1 − (1 + ρ + 21 ρ2 + 16 ρ + 16 ρ )e 1 1 − (1 + ρ + 12 ρ2 + 81 ρ3 )e−ρ R3  3 1 4 1 5 −ρ 1√− (1 + ρ + 21 ρ2 + 61 ρ3 + 24 ρ + 72 ρ )e R4  1 4 −ρ ρ )e − R43 1 − (1 + ρ + 21 ρ2 + 16 ρ3 + 24 0 1 R

example, the interaction between a 2p-Slater distribution and a quadrupole is obtained by first transforming the 2p-Slater distribution and the quadrupole to their respective σ-, πand δ-components, as described in the previous section, followed by the operation

E = pσ Qσ (Tp2 )σ + (pπ1 Qπ1 + pπ2 Qπ2 ) (Tp2 )π , 16

ACS Paragon Plus Environment

(12)

Page 17 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

where pσ and Qσ are the magnitudes of the σ-component of the Slater-type CD and the quadrupole, respectively, and the same for the two π-components. The final type of interaction, that between two ideal multipoles, is done in a standard way, and is not further discussed. Hence, the computation of the electrostatic intermolecular interaction between two distributions with both Slater-type CDs and ideal multipoles has been fully accounted for.

2.2

Density expansion and center-by-center fitting to electric potential

Given the quantum chemical CD of a molecule, described in terms of Gaussians, the electric potential on a grid around the molecule is used in the FEST-CD method as the quantity to connect the quantum chemical CD to the approximate CD described with Slater-type CDs. The Slater-type CDs and the ideal multipoles are distributed in a number of fixed centers within the molecule. Each center is restricted to, at most, contain one 1s-Slater CD, one 2p-Slater CD, a point-charge and an ideal quadrupole. This means that for each center, two exponents are needed along with the factors to the Slater CDs (one for 1s and three for 2p) and the magnitude of the point-charge and the quadrupole. As we discussed in the Introduction, a fit of all these parameters at the same time is likely to be plagued by the curse of dimensionality. We apply a divide-and-conquer method to recast the fitting problem as several smaller problems, rather than one large problem. In the next section, we describe the method used to obtain the factors to the Slater CDs and the magnitude of the ideal multipoles. That section also sets the stage for the divideand-conquer method, which is more fully described in the section thereafter, along with the algorithm for the fitting of the exponents.

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2.2.1

Page 18 of 45

Multi-center Multipole Expansion

The multi-center multipole expansion (MME) is presented in this section in a somewhat different way than in previous publications. 9–11 This method is a standard procedure to derive distributed multipoles in a molecule given a quantum chemical density. It has been used in many previous applications. We should note that the MME method is in essence equivalent to the distributed multipole analysis by Stone. 6–8 The expectation value of a one-electron property M , such as a multipole moment, is computed as hM i = T r(DM) =

X

Dab hθa |M |θb i,

(13)

a,b

where T r denotes the trace of a matrix. By accumulating the terms in the sum above to different centers in the molecule, the MME method achieves the distribution of M . For a given term, the center is determined by which pair of atoms the basis functions θa and θb are centered on: if both θa and θb are on the same atom A, this is the center; if they are centered on different atoms, the corresponding MME center lies somewhere on the line defined by the atoms A and B. To summarize, the distributed property M in center c can be defined as

Mc =

X

Dab hθa |M |θb i,

(14)

a,b∈Ωc

where Ωc is the index set for center c constructed according to the criteria given above. In case M is origin-dependent, the origin of each matrix element has to be translated to center c. It should be noted that this distribution of M preserves the molecular expectation value. Centers between distant atom pairs often contain less density. Various methods have been constructed to reduce these centers by moving the multipoles in these centers to other centers, but still preserving the molecular moments. Söderhjelm et al. have studied how well the distributed ideal multipoles generated by the MME method reproduce the electric potential around several different molecules. 83

18

ACS Paragon Plus Environment

Page 19 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

The MME method is used to obtain the factors to each Slater CD in the linear combination. In other words, the MME method provides the magnitude of each Slater CD component, since the formulation above is normalised to unity. The MME method is also employed to evaluate the distributed quadrupole moments. The factor to the 1s-Slater CD is set equal to the electronic charge distributed in a given center, minus any charge from the core electrons. The three factors to the 2p-Slater CD are set equal to the components of the distributed dipole in the same center. The coordinates of the centers for the Slater CDs are the coordinates of the MME centers. A consequence of this construction is that at long-range, outside the region of charge-overlap, the approximate CD obtained with the FEST-CD method, has the same behavior as the distributed multipole expansion obtained with the MME method. First of all, this implies that the molecular moments up to the quadrupole moment are reproduced exactly with the FEST-CD method. For long-range electrostatic interactions, this is an important property. Second, the properties of the approximate CD beyond the region of charge overlap are well-known and invariant to how the charge overlap is modeled. In other words, we are in control of the interactions outside the region of charge overlap and do not run the risk of making these interactions described less accurately as we address the problem of accurately describing the interactions within the region of charge overlap. The number of MME centers is N (N + 1)/2, where N is the number of atoms in the molecule. The expansion centers are placed on the atomic nuclei and on the lines between atoms. Since the contributions to the electronic density from a pair of basis functions situated on different atoms will be distributed along the straight line connecting the atoms, a reassignation and translation procedure is performed in the MME method 9,11 to obtain a unique bond center for each pair of atoms. The location of these bond centers is such that as much as possible of the properties of the molecular bond-densities appear in the low-order terms of the multipole expansion. As stated above, centers between distant atoms often contain little density and can thus be reduced. In this study, we do not reduce any centers,

19

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 45

though, but keep all. In the next section, however, we will show that not every center must include a Slater-type CD. The core electrons that are included as a point-charge along with the nucleus at a given center are given a very simple definition. For atoms, such as carbon and oxygen, the two electrons that occupy the 1s-orbital are defined as being the core electrons. For these elements we believe this simple argument, based on the atomic electron configuration, is adequate. For heavier elements, a more elaborate definition may be needed. We do not pursue this question any further in this study. 2.2.2

Levenberg-Marquardt optimization of the electric potential on a grid

The exponents remain to be determined, all other quantities in the approximate CD are determined with the MME method, as described above. It is by fitting to the quantum chemical electric potential on a grid that the exponents are found. Also here the MME method is used, but now as a tool to simplify the non-linear fitting problem of the exponents by dividing it into several smaller problems. If the one-electron property operator M in Eq.(13) is the electric potential operator

M = φg =

1 , |r − rg |

(15)

where rg is a grid-point, the total electric potential on the grid is divided into separate contributions, φc,g , from subdensities, which are interpreted as being located at the specific MME center c, as defined above. To each such center the factors to the 1s- and 2p-Slater CDs have already been determined along with a nuclei-core point-charge and an ideal quadrupole. We can therefore formulate the electric potential from a given center, c, of the approximate CD in a given grid point with coordinates rg : c,g c c 0 c c 0 φc,g approx = sσ Ts (R, αs ) + pσ Tp (R, αp ) + φideal ,

20

ACS Paragon Plus Environment

(16)

Page 21 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

where scσ and pcσ are the σ-components to the Slater CDs relative to the interaction axis ~ = rg − rc , Ts0 and Tp0 are the damped interaction functions, listed in Table 1, with their R dependence on the length of the interaction axis and the exponent explicitly specified; φc,g ideal is the electric potential from the ideal multipoles in center c. We set up the following error function 2 G  c qnuc 1 X c,g c,g φapprox − φ − , ∆ = G g=1 |R| c

(17)

which is the average square difference on the grid between the electric potential from the part of the approximate CD in center c, and the distributed quantum chemical electric potential at that center along with the nuclear potential of the center (equal to zero if the center does not contain a nuclei). We proceed with a minimization of ∆c by optimizing the two nonlinear parameters αsc and αpc . This is done for all centers, one at a time. The final parameters in the approximate CD are thus obtained. The Levenberg–Marquardt algorithm (LMA) is used for the non-linear optimization. 84,85 This is a standard method for non-linear optimizations, and is known from experience to be an excellent compromise between algorithmic simplicity and accuracy. The LMA for this particular application is summarized below. For a given center c, the gradient of ∆c with respect to αsc and αpc is computed, and the vector β is defined as 



∂∆c  ∂αcs 

1 1 β c = ∇∆c =   2 2 ∂∆c

(18)

∂αcp

and the elements of the curvature matrix (essentially the Hessian) are approximated as recommended for the LMA G

γijc

1 X = G g=1



c,g ∂φc,g approx ∂φapprox ∂αic ∂αjc

21



ACS Paragon Plus Environment

,

(19)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 45

where i and j are either s or p. A matrix Γ is defined with elements Γij = γijc (1 + λδij ),

(20)

where λ is a constant and δij is a Kronecker-delta. In other words, Γ is the curvature matrix with a modified diagonal. The new exponents are computed as c c αnew = αold + Γ−1 β c ,

(21)

where αc is the vector consisting of the two exponents. If ∆c becomes larger with the new exponents, the new exponents are discarded and λ is increased by a factor of 10 for the next iteration. The step size is thus made smaller and the LMA becomes more like a steepest descent method, which is a preferred optimization method when far from the optimum. If ∆c becomes smaller with the new exponents, the exponents are accepted and for the next step λ is reduced by a factor of 10, thus making the LMA more like a second-order optimization method. A small λ is hence a sign that the optimization is close to the minimum where the surface is quite well approximated by a quadratic form. Therefore, the first criterion for successful termination of the algorithm is that λ is below a threshold, τλ . The second criterion is that ∆c is below a threshold, τ∆ . On account of the matrix inversion, the LMA will run into problems if at least one γiic is close to zero. This can happen for two reasons. The first reason is that the corresponding exponent is large, in which case no small or moderate variation of the exponent will significantly change the contribution to the electric potential on the grid. The surface is in other words flat. A large exponent means that the CD is not diffuse and acts essentially as an ideal multipole. Therefore, we add to our implementation of the LMA, a threshold τα , such that if the optimization takes an exponent over this threshold, the corresponding Slater-type CD is transformed during the optimization into an ideal multipole and the exponent is removed from the optimization. This construction solves the problem with the matrix inversion, as 22

ACS Paragon Plus Environment

Page 23 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

well as reduces the complexity of the approximate CD. The other reason for very small diagonal elements is that the factor to the Slater-type CD is small. The most obvious case where this happens is for a center that coincides with an inversion center in the molecule; the dipole components are then by symmetry zero. A more common situation is that there is only very little density associated to that center. Therefore a fourth (and final) threshold, τf , is introduced, such that if a factor is below this threshold, the corresponding Slater-type CD is directly transformed into an ideal multipole and the exponent is removed from the optimization. As discussed in the previous section, small multipoles can be moved to other centers in the molecule to further reduce the complexity. The initial value of the exponents are set to 0.5. This is a small value and can be expected to increase. The reason a small value is chosen is to avoid the problems that large exponents imply. These problems also justify a damping of the step size. In the first few steps, too large steps are sometimes taken, which on account of the properties of the surface at large exponents, may take several iterations to adjust. Hence, we apply a simple upper limit, putting all step sizes above 0.7 to 0.7. This value was found to work well for the examples described in the next sections, but may need further adjustment for other systems. Finally, we will consider the properties of the model if the factors to each Slater distribution were optimized to fit the electrostatic potential on the grid as well, rather than determined in an independent manner by the MME method, as described above. Since more parameters are made available, the fit on the (training) grid will necessarily be numerically improved. The parameters are linear, hence the optimization should not mathematically become much more difficult. A problem implied by the joint optimization of exponents and factors, though, is that the division of the fitting problem into local problems can not be used without modification. This follows from that all factors taken together are interpreted as molecular expectation values of the different multipoles. In other words, the factors should fulfill certain global constraints. A possible solution could be a nested local and global optimization loops in the fitting. Going to a completely global fitting, on the other hand, would

23

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

increase the number of non-linear parameters, which constitutes a major obstacle. The risk of not satisfying the global constraints would also be reduced if the grid was made large, such that a deviation from the long-range electrostatic potential becomes strongly penalized in the parameter optimization. A more subtle aspect of fitting factors and exponents together lies in their different physical interpretations. The exponents model the width of the CD, which is the property responsible for the charge overlap. The factors reflect the local magnitude and orientation of the CD. Any approximate model will contain errors in both these features of the CD; the errors are called the charge overlap error and the multipole error, respectively. If both factors and exponents are fitted jointly to the same reference, these two different kinds of errors will mix. For example, using just two Slater-type CDs is an approximation. Therefore, if no more exponents are available to reduce the charge overlap error at short range, the factors would become optimized to reduce this error as well. For this not to lead to an increase of the multipole error, the grid must cover a rather large space, including the long-range part where the multipole error is the dominant contribution to any deviation from an accurate reference potential. In the FEST-CD method, these errors are only weakly coupled, since only the optimization of the exponent (relevant for short range interactions) can lead to a reduction of both errors. One can object that it is only the total error that matters, and therefore the error mixing is not a problem, rather an advantage; the objection could be said to hinge on the view that a strict physical interpretation of model parameters is not adequately beneficial to justify a poorer overall performance. This is certainly true for any single term. However, the path to an increasingly improving performance of an interaction potential or force-field is made easier to track if the constituent errors of the model can be, more or less, separately characterized and thus systematically reduced. The clear relation between the FEST-CD method and the well-known MME method, along with the separation of the determination of factors and exponents, are in our opinion features that facilitate the understanding and further development of the FEST-CD method. This is a favorable methodological feature. In addition, from a practical point of view, employing a

24

ACS Paragon Plus Environment

Page 24 of 45

Page 25 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

segmented fitting procedure makes it possible to use this model in force fields like AMBER 86 or AMOEBA, 87 where a set of charges or multipoles have been calculated to describe the molecule. A common way of obtaining charges and multipoles in these and other similar force-fields is by calculating the electrostatic potential and fitting these parameters to describe the electrostatic potential. Once the electrostatic potential has been ascertained the additional computational cost of introducing the Slater exponents into the fitting procedure is small. The proposed model should therefore be applicable in large molecules as well to add the penetration correction to the aforementioned force fields.

3

Test of Method

The FEST-CD method is tested on the water and the carbon monoxide dimers. The study of these model systems will illustrate the capabilities of the FEST-CD method on structurally simple, but chemically important interactions. Tests on a larger sample of dimers are not done in the present article. In part II of this work an application of FEST-CD within a QM/MM simulation is reported. The quantum chemical CD is described in the next section. This is followed by a test of the fitting of the electrostatic potential on a grid. Finally, the Coulomb integral is evaluated for a few separations for the approximate density, which is compared to the reference evaluation including the entire Gaussian basis set used in the quantum chemical calculation.

3.1

Quantum Chemical Calculation

For water the O–H bond length is 0.9578 Å and the bond angle is 104.48◦ . Two basis-sets are tested, the aug-cc-pvtz and an atomic natural orbital (ANO) basis set with contractions 14s9p4d3f/4s3p2d1f for the oxygen atom and 8s4p3d/3s2p1d for the hydrogen atoms. 88,89 For carbon monoxide the bond length is 1.1271 Å and the (ANO) basis set with the contraction 10s6p3d/4s3p2d is used for both carbon and the oxygen atoms. The densities are computed 25

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

with the Hartree–Fock (HF) method. The HF method is of limited accuracy relative to experimental observables. As far as testing the quality of the approximate density obtained with the FEST-CD method, as long as the reference density uses a large number of GTO basis functions and the density is varied in regards to core, valence and angular momentum, we assume the test is relevant to applications to other similar molecules even for more sophisticated methods to obtain the density. All multipoles used in the FEST-CD model as factors are calculated with the MME method as described earlier. 9,10 All calculations are done with the quantum chemical package MOLCAS, where the FEST-CD method has also been implemented. 90

3.2

Fitting of Electric Potential on a Grid

The error function ∆c , Eq.(17), is minimized for all c in the studied molecules. Suitable values on the thresholds of the LMA were determined through initial testing of the method, and are given in Table 2. A number of different grid constructions are tested for the training Table 2: Threshold Values to the LMA for Water; Descriptions of the Meaning of the Thresholds are Given in the Method Section. Threshold τ∆ τλ τα τf

Value 10−5 10−4 6.0 7 · 10−2

as well as for the test, discussed further below. It is shown in Figure 2 the fitted electrostatic potential of carbon monoxide using a multipole expansion and using the FEST-CD method. In the fitting procedure the electrostatic potential was calculated for a set of 300 random points placed within distances from the atoms of 0.8 and 2.1 times the van der Waals atomic radius. The improvement of the FESTCD method relative the multipole results is evident, in particular at large magnitudes of the electrostatic potential, which are found at small separations from the molecule. 26

ACS Paragon Plus Environment

Page 26 of 45

Page 27 of 45

0.15 FEST-CD Model. Multipole Expansion. 0.10 Modelled potential (a.u.)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

0.05

0.00

-0.05 0.00

0.05 Reference potential (a.u.)

0.10

Figure 2: Result of the fitting of the electrostatic potential for carbon monoxide in a random grid of 300 points using the FEST-CD model and the multipole expansion. Local electrostatic potentials as a function of distance from each atom along the bond vector of carbon monoxide are given in Figure 3. As expected the lack of charge overlap in the multipole expansion produces too negative values. The error is still significant at distances from the atomic centers greater than 2 Å. The lack of charge overlap effects also gives the multipole expansion qualitatively incorrect behaviour as the separation approaches zero. Both shortcomings are addressed with the FEST-CD model. For the test system, the FEST-CD model that represents the charge density up to local dipoles with Slater functions, reproduces the short-range carbon reference potential very well, including at very short separations, see Figure 3 (a). If only charges are represented using Slater functions, and dipoles are kept as distributed multipoles, the optimal fit is a compromise between the depth of the potential and the repulsive part at very short separations, with neither region being perfect. The behavior of the FEST-CD model near the oxygen atom is analyzed both with and without the point quadrupole term included. For this case the difference between using Slater functions for the charge term or for both charge and dipole terms is small. When point quadrupoles are included together with FEST-CD up to dipoles, the difference 27

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

0.02

(a)

Potential

0.00

-0.02 Reference Multipole Expansion FEST-CD (dipole) FEST-CD (charge)

-0.04 1.0

1.5

2.0 Distance to C (Å)

2.5

3.0

(b)

-0.06 -0.08 Potential

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 45

-0.1 -0.12

Reference Multipole Expansion FEST-CD (dipole) (not quadr.) FEST-CD (charge) (not quadr.) FEST-CD (dipole)

-0.14 -0.16 -0.18

1

1.5

2 Distance to O (Å)

2.5

3

Figure 3: Electrostatic potential for carbon monoxide along the C-O bond. (a) In local carbon potential. (b) In local oxygen potential.

28

ACS Paragon Plus Environment

Page 29 of 45

0,6

(a)

Fitting cubic grid, 800 points

0,4 0,2

FEST-CD Multipole Expansion

0 -0,2

Electric potential (a.u.)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

0

0,1

0,2

0,3

0,2

(b) 0,1

Random_grid_0.65_2.5, 2000 points

FEST-CD Multipole Expansion

0 -0,1 -0,2 -0,05

0,05

0

0

(c)

0,05

0,1

0,15

Random_grid_0.85_2.5, 2000 points

FEST-CD Multipole Expansion

-0,05

-0,1 -0,05

0

0,05

Electric potential (a.u.)

Figure 4: Electrostatic potential of water in a cubic grid used for the fitting procedure and in two sets of random grids (see text). Results with the FEST-CD model and the multipole expansion. Ab initio values given by green lines. Note the different magnitudes of the axes. between modeled and reference electrostatic potential increases slightly at distances close to the minimum. On the other hand, at these short distances, a slight difference in curvature is observed for the different models. In particular the model with Slater type functions up to the dipole term being closer in this regard to the reference potential. This indicates there is still insufficient degrees of freedom in the model. However, the errors are small and a large improvement over the multipole model. Figure 4 shows the predicted electrostatic potential for water in three different grids with points of varying magnitude. The CD is obtained with the FEST-CD method as described above, in particular with Slater functions up to L = 1. One of the grids is created by constructing a cubic grid with a separation distance between grid points of 0.5 a.u., and where each point must be between 0.65 and 2.0 times the van der Waals atomic radius. The latter constraint is introduced in order to avoid grid points unreasonably close to the atoms, 29

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

6 12

Ano-basis, Test Grid 1 Ano-basis, Test Grid 2 Ano-basis, Test Grid 3 aug-cc-pVTZ, Test Grid 1

Ano-basis, Test Grid 1 Ano-basis, Test Grid 2 Ano- basis, Test Grid 3 aug-cc-pVTZ, Test Grid 1

aug-cc-pVTZ, Test Grid 2

aug-cc-pVTZ, Test Grid 2

aug-cc-pVTZ, Test Grid 3

aug-cc-pVTZ, Test Grid 3

9

4

Relative error (%)

Absolute error (kJ/mol)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 45

2

6

3

0

MP

1

2

4 3 5 6 Training Grid

7

0

8

MP

1

2

4 3 5 Training Grid

6

7

8

Figure 5: Electrostatic potential absolute and relative errors for water using different sets of Test grids and training grids (see text for the definition of the training grids). MP is Multipole Expansion. Parameters of the training grids are: Grid 1) 0.65, 2.0, 2, 40; Grid 2) 0.65, 2.0, 2, 20; Grid 3) 0.85, 2.0, 2, 40; Grid 4) 0.85, 2.0, 2, 20; Grid 5) 0.85, 2.0, 3, 20; Grid 6) 0.85, 3.0, 4, 40; Grid 7) 0.6, 1.5, 4, 60; Grid 8) 0.5, 2.5, 4, 60 and to avoid over representing the electrostatic potential far from the region of overlap. This grid is used in the Levenberg-Marquardt optimization of the FEST-CD exponents. Also, results from two random grids are shown that only contain grid points that were not included in the optimization of the exponents. The points in these test grids are at distances between 0.65 and 2.5 times the van der Waals atomic radii of the atoms, as well as between 0.85 and 2.5 times the van der Waals radii. For carbon monoxide the FEST-CD method generates a more accurate potential than the multipole expansion, see Figure 4(a). Excluding only a very small number of points, the FEST-CD provides an excellent description of the electrostatic potential over the entire grid, even for large values that appear at short distances. The flexibility and accuracy of the FEST-CD model is confirmed by the results using random grids, Figures 4(b) and 4(c). FEST-CD corrects the deviations of the multipole model providing better results for the overwhelming majority of tested points. In order to test the sensitivity of the modeled CD to variation to the training grid, we

30

ACS Paragon Plus Environment

Page 31 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

optimize the model for water in eight different training grids, and test the accuracy in three testing grids. The absolute and relative errors of the electrostatic potential obtained with the grids employed in the fitting procedure are given in Figure 5 for the grids used to test them. For all three test grids the points have been chosen randomly. In Test Grid 1 those points are between distances of 0.65 and 2.5 of the van der Waals atomic radius. In Test Grid 2 the limits are 0.85 and 2.5 and in Test Grid 3 they are 1.3 and 4.0 of the van der Waals radius. In other words, the test grids are progressively less short-ranged. The eight training grids uses a spherical disposition of points around the atoms, different shells farther away from the centers of the atoms. Training grids of this kind reduces the number of points compared to a cubic or random grid but is expected to include the physically most relevant points. The spherical training grids are defined by four parameters: 1) the factor the atomic van der Waals radius is multiplied to obtain the radius of the closest spherical shell, 2) the distance between shells in atomic units, 3) the number of shells used and 4) the number of points in each shell. The exact values used for the eight training grids are given in the caption to Figure 5. In all cases any point closer to any atom than the smallest factor to the van der Waals radii is removed. As expected from the previous results, when the testing grid includes points very close to the atoms (Test Grid 1), the FEST-CD method produces a major reduction of the error compared to a multipole model. This finding is independent of the training grid used. At large and medium distances (Test Grid 3) the FEST-CD method and the multipole expansion generates similar errors, which is expected as well since the FEST-CD method converges to the multipole expansion at large distances and any residual error derives predominately from the truncation of the multipole expansion. We observe a slight reduction of error in the Test Grid 1 when the training grid includes points very close to the atoms (Training Grids 7 and 8). However, the results in the three test grids are remarkably stable to the sampled variations in training grid. In Figure 5 results for both ANO and aug-cc-pVTZ basis sets are given. The multipole expansion obtained with the aug-cc-pVTZ basis always generates the smallest errors. Check-

31

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ing the expectations values of x2 , y2 and z2 for both basis sets in water, we find that the electronic cloud obtained with the ANO basis is more extended than the one obtained with the aug-cc-pVTZ basis. This means that when using the ANO basis there is a larger charge overlap and therefore presents a greater challenge to model accurately. The reason for the difference in this regard between ANO basis set and aug-cc-pVTZ is likely due to how the two basis sets are derived. Although the difference is important to consider when the most accurate reference model compared to the physical reality is pursued, we do not elaborate on this difference any further, rather conclude that the CD produced by either basis set can be modeled with FEST-CD with very high accuracy. We observe that with training and testing grids that include points at short distances, the absolute error is less than 1.5 kJ/mol and the relative error less than 3%. Even with the grids that contain a large number of points, the fitting procedure requires just a few seconds to run on a standard desktop computer. We conclude that the training effort is not a costly process in order to attain the highest accuracy allowed by the model per se. The divide-and-conquer formulation of the fitting process further implies that the computational effort scales quadratically with the size of the molecule (for very large molecules additional model adjustments should be possible to reduce and take this closer to linear scaling). We conclude that for large molecules as well, the additional required computational resources to construct the FEST-CD is minor compared to the effort already required to obtain the standard multipole expansion.

Figure 6: Water dimer conformation.

32

ACS Paragon Plus Environment

Page 32 of 45

Page 33 of 45

ANO Basis

Energy kJ/mol

0

-20 Total Intermolecular Energy Reference Multipole Expansion FEST-CD FEST-CD (8% larger exponent)

-40

-60

-80 1.5

2.0

2.5

3.0

3.5

4.0

O···H separation (Å)

Aug-cc-pVTZ Basis 0

Energy kJ/mol

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

-20

Total Intermolecular Energy Reference Multipole Expansion FEST-CD

-40

-60

1.5

2.0

2.5

3.0

3.5

4.0

O···H separation (Å)

Figure 7: Coulomb energy using ANO or aug-cc-pVTZ basis sets for water dimer with different description models (see text).

3.3

The Coulomb Integral

Calculations of the Coulomb energy have been done in the water dimer in order to illustrate the accuracy of the electrostatic energy calculated with the FEST-CD method. One configuration of the water dimer has been considered, see Figure 6. Several O···H intermolecular distances are considered along the O-H···O line. In Figure 7 is presented the intermolecular electrostatic energy of the water dimer with the ANO and the aug-cc-pVTZ basis sets. It displays the energy calculated using the electron densities of the monomers as obtained from HF quantum chemical calculations (the reference energy), the energy calculated with the multipole expansion and the energy using the FEST33

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

0 -10 -20

Energy kJ/mol

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 34 of 45

-30 -40

Total Intermolecular Energy Reference Multipole Expansion Spherical Grid 0.85_2.0_3_20 Spherical Grid 0.85_2.0_3_20 (8% larger exponent) Spherical Grid 0.6_1.5_4_60 Spherical Grid 0.5_1.5_4_60 Shperical Grid 0.5_2.5_4_60

-50 -60 -70 -80

1.6

1.8

2.0

2.2

2.4

2.6

O···H separation (Å)

Figure 8: Coulomb energy for water dimer with different description models (see text). CD method with a spherical training grid defined by parameters 0.85, 2.0, 3 and 20 (see the previous section for the definition of the grid parameters). The first observation is that the multipole expansion quite poorly models the Coulomb energy at short distances in both basis sets. The results improves considerably when Slater functions are included with the FEST-CD model. The accuracy compared to the reference interaction, is extremely good for the aug-cc-pVTZ basis. The accuracy is very good for the ANO basis, but deviates somewhat from the reference interaction. If we heuristically increase the value of the Slater exponents for the ANO case by a 8%, the interaction better matches the reference curve. Since the electronic cloud calculated with the ANO basis set is more extended, as discussed in a previous section, this finding may suggest that either larger training grids are required for very extended basis sets, or the limitation to Slater functions up to L = 1 becomes significant. The result of the electrostatic energy for the ANO basis set calculated using different training grids are given in Figure 8. We observe that increasing the number of grid points and including more points at short separations, the particular section of the water-dimer in-

34

ACS Paragon Plus Environment

Page 35 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

teraction potential we study improves relative to the reference interaction. For the Spherical Grid 0.5_2.5_4_60 (see Figure 8) we obtain a relative error in the electrostatic energy at the shortest distance (just below 1.6 Å) to 7.4% and for the second shortest distance (1.85 Å) to 3.4%, while at distances above that generates relative errors below 1%. Piquelman and co-workers 54 and Wang and Truhlar 51 report relative errors for their methods as 13.5% and 8%, respectively. The former value is an average error for all studied systems, and the latter value is the error of ten different configurations of the water dimer. In the latter case the electrostatic energy does not exceed 35 kJ/mol, thus most likely the distances are larger and the charge overlap smaller than the values studied in this work. Despite this, the errors in the present study are smaller at comparable separations and still small at separations where the electrostatic energy is almost twice as large. We conclude that the FEST-CD method has several advantages both with respect to accuracy and the likelihood of transferability of parameters compared to literature methods. The method should therefore be possible to employ in force fields like AMBER 86 or AMOEBA 87 and also to be used in QM/MM methods. The method is implement in the MOLCAS package. 90 In a second part of this work we will show how this model can be used in a QM/MM scheme to model a free electron in water.

4

Conclusion

We have formulated a computationally expedient method to very accurately model a molecular charge distribution as a set of Slater-type functions of angular momentum L ≤ 1 and distributed ideal quadrupole moments. Non-linear fits of the exponents to a reference electrostatic potential are done with a Levenberg-Marquardt algorithm. The method is robust and scales favorably with increasing system size. The parameters are readily derived from each new system, hence removing concerns about parameter transferability across different molecular systems. The method is included in the MOLCAS package and is fully automated.

35

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

With detailed studies of carbon monoxide and water dimers the effects of charge penetration and how the current method corrects the significant shortcomings of the multipole expansion are presented. The model achieves accurate electrostatic energies even at very short distances and the errors are smaller than errors of comparable methods in the literature. The computational cost of this method allows it to be used in force fields to remove one important source of error. The model can be used in QM/MM calculations as well. An example of this is presented in the second part of this work.

5

Acknowledgment

Financial support by the Swedish Research Council (VR) through the Linnaeus Center of Excellence on Organizing Molecular Matter (OMM) is gratefully acknowledged.

Appendix A In this Appendix, the evaluation of the dipole–dipole interaction in terms of σ- and πcomponents is illustrated. The relation to the standard Cartesian procedure with inner products is made clear, which further illustrates that the damped interactions can not be computed simply by adjusting the Cartesian procedure with a scalar. To simplify the illustration without any loss of generality, the example below is two-dimensional, which means that there is only one π-component. Two dipoles, µ1 and µ2 , forming the angles α and β with the interaction axis R, interacts, as in Figure 9. The dipole vectors are decomposed in a component parallel with the interaction axis (the σ-component) and a component orthogonal with the interaction axis (the π-component). It is clear that different components do not interact, hence the total

36

ACS Paragon Plus Environment

Page 36 of 45

Page 37 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

µ1

µ2

α

β

R

Figure 9: Interacting dipoles in two dimensions. interaction is (22)

E = Eσ + Eπ , |µ1 | cos α · |µ2 | cos β , |R|3 |µ1 | sin α · |µ2 | sin β = , |R|3

Eσ = −2

(23)



(24)

where standard formulas for aligned and parallel dipole–dipole interactions have been used. The trigonometric relation sin α sin β = cos(α − β) − cos α cos β is used to rewrite the total energy as E = −3

|µ1 ||µ2 | |µ1 ||µ2 | cos α cos β + cos(α − β). 3 |R| |R|3

(25)

hµ1 , Ri , |µ1 ||R|

(26)

By definition it holds that cos α =

where hµ1 , Ri denotes the scalar product between R and µ1 . A similar relation holds for cos β, and it is also trivially true that

cos(α − β) =

37

hµ1 , µ2 i . |µ1 ||µ2 |

ACS Paragon Plus Environment

(27)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 38 of 45

The total energy can hence be written as

E = −3

hµ1 , Rihµ2 , Ri hµ1 , µ2 i + , |R|5 |R|3

(28)

which is the equation obtained by the standard Cartesian procedure of taking the inner 1 product of the tensors µ1 µ2 and ∇∇ |R| . This is a step-by-step account of the relation

between the two approaches for this special case. The damping of Eσ and Eπ are different, when the dipoles are made diffuse, as discussed in the article. Therefore the two terms in the final equation above are not damped equally since they are a mixture of the two terms Eσ and Eπ . This illustrates that in a model with charge distributions of finite extension, the spherical representation of the interaction is preferred over the Cartesian method, which is mathematically convenient for a multipole model.

References (1) Ng, K. C.; Meath, W. J.; Allnatt, A. R. Mol. Phys. 1976, 32, 177–194. (2) Ng, K. C.; Meath, W. J.; Allnatt, A. R. Mol. Phys. 1977, 33, 699–715. (3) Ng, K. C.; Meath, W. J.; Allnatt, A. R. Mol. Phys. 1979, 38, 449–463. (4) Pack, G. R.; Wang, H. Y.; Rein, R. Chem. Phys. Lett. 1972, 17, 381–384. (5) Bonaccorsi, R.; Cimiraglia, R.; Scrocco, E.; Tomasi, J. Theor. Chim. Acta 1974, 33, 97–103. (6) Stone, A. J. Chem. Phys. Lett. 1981, 83, 233–239. (7) Stone, A. J.; Alderton, M. Mol. Phys. 1985, 56, 1047–1064. (8) Stone, A. J. The Theory of Intermolecular Forces, 1st ed.; Oxford University Press: Oxford, 1996. 38

ACS Paragon Plus Environment

Page 39 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(9) Karlström, G. In Proceeding of fifth seminar on Computational Methods in Quantum Chemistry; van Duijnen, P. T., Nieuwpoort, W. C., Eds.; Laboratory of Chemical Physics, University of Groningen: Groningen, The Netherlands, 1981; p 353. (10) Karlström, G.; Linse, P.; Wallqvist, A.; Jönsson, B. J. Am. Chem. Soc. 1983, 105, 3777–3782. (11) Andersson, M.; Karlström, G. J. Phys. Chem. 1985, 89, 4957–4962. (12) Bachrach, S. M. Rev. Comput. Chem. 1994, 5, 171–227. (13) Engkvist, O.; Åstrand, P.-O.; Karlström, G. Chem. Rev. 2000, 100, 4087–4108. (14) Hall, G. G.; Smith, C. M. Int. J. Quant. Chem. 1984, 25, 881–890. (15) Smith, C. M.; Hall, G. G. Theor. Chim. Acta 1986, 69, 63–69. (16) Hall, G. G.; Smith, C. M. Theor. Chim. Acta 1986, 69, 71–81. (17) Hall, G. G.; Tsujinaga, K. Theor. Chim. Acta 1986, 69, 425–436. (18) Stillinger, F. H.; Rahman, A. J. Chem. Phys. 1974, 60, 1545–1557. (19) Wallqvist, A.; Ahlström, P.; Karlström, G. J. Phys. Chem. 1990, 94, 1649–1656. (20) Wheatley, R. J. Mol. Phys. 1993, 79, 597–610. (21) Wheatley, R. J.; Mitchell, J. B. O. J. Comput. Chem. 1994, 15, 1187–1198. (22) Wheatley, R. J. Mol. Phys. 1995, 86, 443–465. (23) Strasburger, K. Computers Chem. 1998, 22, 7–12. (24) Chialvo, A. A.; Cummings, P. T. Fluid Phase Equil. 1998, 150-151, 73–81. (25) Paricaud, P.; Předota, M.; Chialvo, A. A.; Cummings, P. T. J. Chem. Phys. 2005, 122, 244511. 39

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(26) Dyer, P. J.; Cummings, P. T. J. Chem. Phys. 2006, 125, 144519. (27) Whitten, J. L. J. Chem. Phys. 1973, 58, 4496–4501. (28) Eichkorn, K.; Treutler, O.; Öm, H.; Häser, M.; Ahlrichs, R. Chem. Phys. Lett. 1995, 240, 283–289. (29) Eichkorn, K.; Weigend, F.; Treutler, O.; Ahlrichs, R. Theor. Chem. Acc. 1997, 97, 119–124. (30) Weigend, F. Phys. Chem. Chem. Phys. 2006, 8, 1057–1065. (31) Aquilante, F.; Lindh, R.; Pedersen, T. B. J. Chem. Phys. 2007, 127, 114107. (32) Cisneros, G. A.; Piquemal, J.-P.; Darden, T. A. J. Chem. Phys. 2005, 123, 044109. (33) Piquemal, J. P.; Cisneros, G. A.; Reinhardt, P.; Gresh, N.; Darden, T. A. J. Chem. Phys. 2006, 124, 104101. (34) Cisneros, G. A.; Piquemal, J.-P.; Darden, T. A. J. Chem. Phys. 2006, 125, 184101. (35) Cisneros, G. A.; Elking, D.; Piquemal, J.-P.; Darden, T. A. J. Phys. Chem. A 2007, 111, 12049–12056. (36) Elking, D.; Darden, T. A.; Woods, R. J. J. Comput. Chem. 2007, 28, 1261–1274. (37) Gresh, N.; Cisneros, G. A.; Darden, T. A.; Piquemal, J.-P. J. Chem. Theory Comput. 2007, 3, 1960–1986. (38) Volkov, A.; Coppens, P. J. Comput. Chem. 2004, 25, 921–934. (39) Volkov, A.; Koritsanszky, T.; Coppens, P. Chem. Phys. Lett. 2004, 391, 170–175. (40) Volkov, A.; Li, X.; Koritsanszky, T.; Coppens, P. J. Phys. Chem. A 2004, 108, 4283– 4300.

40

ACS Paragon Plus Environment

Page 40 of 45

Page 41 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(41) Volkov, A.; King, H. F.; Coppens, P.; Farrugia, J. Acta Crystallogr. Sect. A 2006, 62, 400–408. (42) Volkov, A.; King, H. F.; Coppens, P. J. Chem. Theory Comput. 2006, 2, 81–89. (43) Dominiak, P.; Volkov, A.; Li, X.; Messerschmidt, M.; Coppens, P. J. Chem. Theory Comput. 2007, 3, 232–247. (44) Spackman, M. A.; Maslen, E. N. J. Phys. Chem. 1986, 90, 2020–2027. (45) Spackman, M. A. Chem. Phys. Lett. 2006, 418, 158–162. (46) Donchev, A. G.; Ozrin, V. D.; Subbotin, M. V.; Tarasov, O. V.; Tarasov, V. I. Proc. Natl. Acad. Sci. USA 2005, 102, 7829–7834. (47) Donchev, A. G. Phys. Rev. B 2006, 74, 235401. (48) Donchev, A. G.; Galkin, N. G.; Pereyaslavets, L. B.; Tarasov, V. I. J. Chem. Phys. 2006, 125, 244107. (49) Donchev, A. G.; Galkin, N. G.; Tarasov, V. I. J. Chem. Phys. 2007, 126, 174307. (50) Donchev, A. G.; Galkin, N. G.; Illarionov, A. A.; Khoruzhii, O. V.; Olevanov, M. A.; Ozrin, V. D.; Pereyaslavets, L. B.; Tarasov, V. I. J. Comput. Chem. 2008, 29, 1242– 1249. (51) Wang, B.; Truhlar, D. G. J. Chem. Theory Comput. 2010, 6, 3330–3342. (52) Wang, B.; Truhlar, D. G. J. Chem. Theory Comput. 2014, 10, 4480–4487. (53) Elking, D. M.; Cisneros, G. A.; Piquemal, J.-P.; Darden, T. A.; Pedersen, L. G. J. Chem. Theory Comput. 2010, 6, 190–202. (54) Wang, Q.; Rackers, J. A.; He, C.; Qi, R.; Narth, C.; Lagardere, L.; Gresh, N.; Ponder, J. W.; Piquemal, J.-P.; Ren, P. J. Chem. Theory Comput. 2015, 11, 2609–2618. 41

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(55) Piquemal, J.-P.; Gresh, N.; Giessner-Prettre, C. J. Phys. Chem. A 2003, 107, 10353– 10359. (56) Kairys, V.; Jensen, J. H. Chem. Phys. Lett. 1999, 315, 140–144. (57) Freitag, M. A.; Gordon, M. S.; Jensen, J. H.; Stevens, W. J. J. Chem. Phys. 2000, 112, 7300–7306. (58) Rob, F.; Podeszwa, R.; Szalewicz, K. Chem. Phys. Lett. 2007, 445, 315–320. (59) Gavezzotti, A. J. Phys. Chem. B 2002, 106, 4145–4154. (60) Ma, Y. G.; Politzer, P. J. Chem. Phys. 2004, 120, 3152–3157. (61) Ma, Y. G.; Politzer, P. J. Chem. Phys. 2004, 120, 8955–8959. (62) Boys, S. F. Proc. Roy. Soc. A 1950, 200, 542–554. (63) Handy, N. C.; Marron, M. T.; Silverstone, H. J. Phys. Rev. 1969, 180, 45–48. (64) Ahlrichs, R. Chem. Phys. Lett. 1972, 15, 609–612. (65) Ahlrichs, R. Chem. Phys. Lett. 1973, 18, 521–524. (66) Morrell, M. M.; Parr, R. G.; Levy, M. J. Chem. Phys. 1975, 62, 549–554. (67) Hoffmann-Ostenhof, M.; Hoffmann-Ostenhof, T. Phys. Rev. A 1977, 16, 1782–1785. (68) Ahlrichs, R.; Hoffmann-Ostenhof, M.; Hoffmann-Ostenhof, T.; Morgan, J. D. Phys. Rev. A 1981, 23, 2106–2117. (69) Klopper, W.; Kutzelnigg, W. J. Mol. Struct. (THEOCHEM) 1986, 135, 339–356. (70) Shavitt, I. Int. J. Quant. Chem. 2004, 100, 105–108. (71) Roothaan, C. C. J. J. Chem. Phys. 1951, 19, 1445–1458.

42

ACS Paragon Plus Environment

Page 42 of 45

Page 43 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(72) Roothaan, C. C. J.; Ruedenberg, K. J. Chem. Phys. 1954, 22, 765. (73) Barnett, M. P.; Coulson, C. A. Philos. Trans. R. Soc. Lond., A 1951, 243, 221–249. (74) Harris, F. E. J. Chem. Phys. 1969, 51, 4770–4778. (75) Guseinov, I. I. J. Phys. B: At. Mol. Phys. 1970, 3, 1399–1412. (76) Tai, H. Phys. Rev. A 1986, 33, 3657–3666. (77) Harris, F. E. Int. J. Quant. Chem. 2002, 88, 701–734. (78) Harris, F. E. http://www.physics.utah.edu/∼harris/home.html/ (accessed March 8, 2016). (79) Lynden-Bell, R. M.; Stone, A. J. Mol. Sim. 1989, 3, 271–281. (80) Su, Z.; Coppens, P. Acta Crystallogr. Sect. A 1994, 50, 636–643. (81) Ivanic, J.; Ruedenberg, K. J. Phys. Chem. 1996, 100, 6342–6347. (82) Hättig, C. Chem. Phys. Lett. 1996, 260, 341–351. (83) Söderhjelm, P.; Krogh, J. W.; Karlström, G.; Ryde, U.; Lindh, R. J. Comput. Chem. 2007, 28, 1083–1090. (84) Levenberg, K. Quarterly Appl. Math. 1944, 2, 164–168. (85) Marquardt, D. W. J. Soc. Ind. Appl. Math. 1963, 11, 431–441. (86) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M. J.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179–5197. (87) Shi, Y.; Xia, Z.; Zhang, J.; Best, R.; Wu, C.; Ponder, J. W.; Ren, P. J. Chem. Theory Comput. 2013, 9, 4046–4063. 43

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(88) Widmark, P.-O.; Malmqvist, P.-Å.; Roos, B. O. Theor. Chim. Acta 1990, 77, 291–306. (89) Kendall, R. A.; Dunning, T. H.; Harrison, R. J. J. Chem. Phys. 1992, 96, 6796–6806. (90) Aquilante, F.; Vico, L. D.; Ferré, N.; Ghigo, G.; Malmqvist, P.-Å.; Neogrády, P.; Pedersen, T.; Pitonak, M.; Reiher, M.; Roos, B.; Serrano-Andrés, L.; Urban, M.; Veryazov, V.; Lindh, R. J. Comput. Chem. 2010, 31, 224–247.

44

ACS Paragon Plus Environment

Page 44 of 45

Page 45 of 45

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Graphical TOC Entry

45

ACS Paragon Plus Environment