Article pubs.acs.org/JCTC
Automatic Generation of Intermolecular Potential Energy Surfaces Michael P. Metz,* Konrad Piszczatowski, and Krzysztof Szalewicz Department of Physics and Astronomy, University of Delaware, Newark, Delaware 19716, United States S Supporting Information *
ABSTRACT: A method is developed for automatic generation of intermolecular two-body, rigid-monomer potential energy surfaces based on symmetry-adapted perturbation theory (SAPT). It is also possible to substitute SAPT interaction energies by values computed using sufficiently high-level supermolecular methods. The long-range component of the potential is obtained from a rigorous asymptotic expansion with ab initio computed coefficients which seamlessly connects to SAPT interaction energies at large separations. An accompanying software package has been developed and tested successfully on eight systems ranging in size from the Cl−−H2O dimer to the cyclotrimethylene trinitramine dimer containing 42 atoms total. The potentials have a typical fit error of about 0.2 kcal/mol in the negative energy region. The accuracy may be further improved by including off-atomic sites or increasing their number. All aspects of potential development were designed to work reliably on a broad range of systems with no human intervention.
I. INTRODUCTION Many properties of matter can be determined by performing various types of nuclear dynamics calculations on Born− Oppenheimer potential energy surfaces (PES). In particular, a broad class of matter consists of separated molecules. Properties of such systems are mainly governed by intermolecular PESs. In the past, such potentials, often called force fields, have mostly been determined by adjusting some parameters in analytic representations of PESs to reproduce experimental results in simulations of simple systems, leading to the so-called empirical force fields.1−5 Using modern electronic structure methods, it is possible to readily compute ab initio, that is, from first-principles, accurate interaction energies of molecular systems consisting of tens of atoms.6−8 One can use such energies directly in molecular dynamics calculations (the so-called on-the-fly approach), but the computational costs are so high that only the simplest, and therefore most limited in accuracy, of computational methods can be applied. The other option is to fit analytic intermolecular PESs to a set of ab initio interaction energies. Once such a PES is known, costs of dynamic calculations are reduced by several orders of magnitude compared to on-the-fly strategy. The first intermolecular PES fitted to results of ab initio calculations was the water dimer potential of Matsuoka et al.9 published in 1976, a truly ground-breaking work taking into account that no previous ab initio potential was published even for atomic dimers [note that Liu and McLean10 did not present any fit in their well-known 1973 paper on He2, although they did compute a number of interaction energies that would have been sufficient for fitting]. Since then, the number of published first-principles intermolecular potentials rapidly increased, and © 2016 American Chemical Society
currently several dozen such potentials are published each year, including some trimer potentials. A few recent examples of such work are refs 11−27. Compared to empirical PESs, the ones based on first-principles calculations have several advantages. While the former PESs usually perform poorly for properties not used in the parametrization, the latter ones should perform equally well in any nuclear dynamics calculations. Furthermore, the accuracy of the latter PESs can be systematically improved by increasing the level of electronic structure theory, the number of system configurations (grid points) used to train the fit, and the complexity of the analytic function. The main advantage of the empirical approach is that it includes in an effective way several physical effects that are difficult to compute. For example, empirical pair potentials account in an effective way for nonadditive many-body effects. If an analytic PES is available, it can be used to perform a multitude of predictions of properties of matter. For example, water potentials have been used to predict spectra of water clusters,17,18,28−32 virial coefficients,17,18,22,33−36 structure and energetics of water clusters,19 and properties of liquid water via molecular dynamics (MD) simulations,36−38 see ref 39 for a review of this subject. First-principle PESs are often used in scattering calculations, in particular for systems of astrophysical interest.12,23−25,27,40−42 Another field where such PESs are making inroads are crystal structure predictions (CSP).43−47 These applications will be important in designing novel materials with desired properties if such properties can be Received: September 16, 2016 Published: December 2, 2016 5895
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
generated by combinations of products of one-dimensional grids with points added based on physical properties of a given surface, the present approach uses a guided Monte Carlo grid generation. The interaction energies at grid points are computed using symmetry-adapted perturbation theory61 (SAPT) based on density-functional theory (DFT) description of monomers [SAPT(DFT)],62−70 but some supermolecular methods can optionally be used. One advantage of SAPT is that it naturally decomposes interaction energies into physically meaningful contributions such as the electrostatics, induction, dispersion, and exchange energies. Thus, the use of SAPT allows one to get insights into physics of a given intermolecular interaction without performing any additional calculations. The analytic fitting functions are in the form of a sum of isotropic site−site functions, the form initiated in refs 34, 71, and 72, with an optional so-called polarization term. Both the asymptotic and short-range parts of our PESs are chosen based on exact quantum mechanical properties of interaction energies. A very important element of our approach is the use of the ab initio computed asymptotic region of the potential. The asymptotic part is then used with frozen parameters in fitting the complete potential. The asymptotic calculations are very inexpensive since the asymptotic expansion is built from monomer properties: multipole moments and polarizabilities. These properties are computed at the level consistent with SAPT(DFT), so that calculations for finite R are seamlessly connected to the asymptotics, i.e., the interaction energies predicted by the two approaches agree to an arbitrary number of digits for sufficiently large R. This approach was initiated using SAPT based on the wave function (WF) description of monomers in ref 51 and further developed in refs 33 and 71. The use of ab initio computed asymptotics has two important advantages. First, it makes the potential accurate for arbitrarily large separations between monomers, an important feature for scattering calculations and crystal structure predictions. Second, it allows to greatly reduce the number of grid points since such points are not needed for R larger than about 1.5 times the radial minimum distance for each monomer orientation. We used the weighted least-squares method for all fits. Our method assumes the rigid-monomer approximation which results in PESs being at most six dimensional (6D). For the majority of small molecules, this approximation works very well provided that the monomer geometry is properly chosen.30,56,73 The reason for this is that molecular vibrations are fast, small-amplitude motions, and the effects of such motions average in slow, large-amplitude intermolecular processes. For some recent discussion of this subject see refs 24 and 27. Some large molecules may have “soft” degrees of freedom for which the rigid-monomer approximation fails. At the present time, there are no general methods of dealing with interactions involving such molecules using a first-principles approach unless there are only a couple of soft degrees of freedom per dimer. The reason is that, in the product of onedimensional (1D) grids strategy, the number of grid points scales as kD, where k is the average number of points per dimension and D is the number of dimensions. While algorithms with random generations of grid points can achieve a less steep scaling, for large D the number of necessary grid points still has to become large which makes ab initio calculations for large dimers too expensive. The PES for a system of N atoms or rigid molecules (monomers) can be represented in the form of the many-body expansion starting from the two-body (pair) PES, followed by
expeditiously predicted by computations, guiding in this way experimental work. There is no established single procedure for developing intermolecular PESs based on ab initio interactions energies, and various combinations of possible approaches at each stage of such developments are used. To generate a set of grid points, many groups use regular grids in each coordinate and direct products of such grids.23,24,48,49 Another option is a random generation of grid points.21,44 Finally, grids are often constructed based on physical information about the system obtained from universal potentials or from earlier versions of potentials for a given system. Such information may include minima and other characteristic points on a PES, tunneling paths, and snapshots of Monte Carlo or MD simulations of condensed phases.17,18,34 Next, the interaction energies are computed for all grid points. Several electronic structure methods can be used for this purpose. In the following stage, one has to choose the analytic function that will be used to fit this set of interaction energies. For small, rigid monomers most often the expansion in terms of the distance R between centers of mass (COM) and of the Euler angles describing mutual orientation of monomers is used (see, e.g., refs 33, 50, and 51): the angular part is expressed in terms of products of Wigner rotation functions,52 each term multiplied by a different function of R. The most often used analytic form of PES for rigid monomers is a sum of isotropic atom−atom functions, the form used for all universal empirical force fields. The functions can be as simple as Lennard-Jones potentials but usually are more elaborate and include a combination of several exponential and inverse powers terms. Sometimes off-atomic sites are included as well.34,53 An extension of this approach is to use anisotropic site−site functions.44,54 Another strategy is to use only very simple atom−atom functions but include also products of such functions.55 This strategy seems to work particularly well when monomer internal degrees of freedom are included.24,48 Another option in such cases is to expand parameters in rigid-monomer fits into polynomials of internal coordinates.30,56 The functional forms discussed so far are based on known physical properties of interaction PESs. Neural network57 or Kriging (Gaussian process)58,59 approaches use functional forms unrelated to such properties but rather chosen for easiness of implementation. The fitting forms discussed so far are global. One can also use local functions, in most cases various kinds of splines.21,60 The fitting functions include a number of free parameters. In most approaches, the parameter values are established by using the least-square method or interpolations. In some approaches, e.g., neural networks or Kriging, special training algorithms are used, or the parametrization is an inherent component of the method. Generation of PESs requires large amounts of human effort, and frequently physical intuition is used in the process. The aim of the present work is to develop an automatic procedure of PES generation, which would require a minimal human input. To our knowledge, only one automatic method for generation of intermolecular PESs has been published.60 It is based on interpolating moving least-squares (IMLS) fitting, rather than on a global analytic function as used in the present work. Several other aspects of this method are different from our approach, as will be discussed later on. The method developed in the present work automates procedures for generation of PESs developed by our group starting in the mid 1990s;33,50,51 however, several new elements have been introduced. While the grid points were in the past 5896
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
computational cost which is insignificant relative to the total PES generation [dominated by SAPT(DFT) calculations at the set of short-range grid points]. All these steps are described in Section III.C. The interaction energies at asymptotic grid points are fitted using the proper terms in the functional form defined in Section IV to give a distributed asymptotic expansion, i.e., an expansion in terms of inverse powers of the distances between atoms of monomer A and those of monomer B. This procedure is described in Section V. The purpose of fitting a distributed expansion to the COM-COM one is twofold: first, distributed expansions are more readily implemented in MD and CSP programs, and second, they behave better at shorter distances than the COM-COM expansion does, as demonstrated in Section V. One should mention that it is possible to obtain distributed asymptotic expansion directly (see, e.g., refs 54, 76, and 77), i.e., without the fitting step. Implementations of such methods are under development in our group. Next, parameters in the short-range parts of the PES are fit to the set of SAPT(DFT) energies, with the asymptotic parameters kept frozen. This stage is described in Section VI. The surfaces obtained in this way tend to include isolated areas at very small R where energies are nonphysically low. Such areas are called “holes”, and a procedure of removing them is described in Section VII. Finally, the quality of the resultant fit is evaluated, and if it does not meet the required accuracy criteria, then the procedure is repeated with additional ab initio interaction energies computed. The entire process is executed fully automatically. The performance of the method is evaluated on several examples in Section VIII, whereas Section IX contains conclusions. The Supporting Information (SI) provides details of each of the examples of Section VIII, including monomer geometries, functional forms and parameter values of the PESs, local minima of the PESs, and lists of grid points and energies at these points. The software package implementing the methods described here is called autoPES. The latest versions of the autoPES and SAPT software packages and the corresponding user manuals are available free of charge at the URL http://www.physics. udel.edu/~szalewic/SAPT/index.html.
the nonadditive three-body PES, and so on. The method proposed here applies to the two-body component, the most important part of the expansion. However, if the induction part of the two-body energy is fitted using the polarization model, this model can then be used to approximate the higher-body induction effects. The procedure of PES development used here is depicted schematically in Figure 1. As input, the user provides monomer
Figure 1. Schematic representation of the automatic PES generation procedure.
geometries and some control parameters such as accuracy thresholds. The procedure then branches into two independent paths. The first path starts with the generation of a set of dimer geometries (grid points). A judicious choice of dimer geometries is vital for obtaining an accurate fit without using unnecessarily large amounts of computer resources. We demonstrate that an iterative Monte Carlo grid generation procedure based on a guiding potential, described in Section II, results in reasonably small grid sets. Ab initio calculations of interaction energies are then performed at these grid points using SAPT(DFT), as described in Section III.B. The SAPT2016 set of computer codes74 is used for these calculations. Optionally, the coupled cluster method with single, double, and noniterative triple excitations,75 CCSD(T), can be used instead of SAPT. The second path is the generation of the asymptotic part of the PES. It is executed independently of the first path and starts with the calculation of electron densities and frequencydependent density susceptibility (FDDS) functions (called also density−density response functions) of each monomer. The FDDSs are computed at the coupled Kohn−Sham (CKS) level which is consistent with the SAPT(DFT) level of theory and assures the seamless connection between these two approaches. The densities and FDDSs are then used to compute the coefficients of the COM-COM asymptotic multipole expansion of interaction energy in inverse powers of R, with the expansion coefficients dependent in an analytic way on the mutual orientation of monomers. This COM-COM expansion is used to obtain interaction energies on a grid of long-range dimer configurations, i.e., configurations such that the values of R are a few times larger than the separation at the radial minimum for a given angular configuration. Since the densities and FDDSs are computed only once for each monomer and the remaining steps of computing the asymptotic interaction energies are negligible in terms of computer resources, the asymptotic calculations are performed at a
II. GEOMETRY GRID In order to obtain an analytic representation of the PES for a given dimer, one has to first calculate interaction energies on a set of grid points in the space of coordinates defining the separation, mutual orientation, and internal geometries of the two monomers. In the case of rigid monomers, the internal coordinates are fixed, so each grid point is uniquely determined by the set of up to six coordinates (defined in a dimer embedded reference frame): the intermolecular separation R, i.e., the distance between the COMs of both monomers, and up to five Euler angles describing the relative orientation of monomers. Whereas three Euler angles are needed to describe the orientation of a rigid body, the overall rotation of the dimer around the COM-COM axis does not change the interaction energy, so only five out of the six Euler angles are needed to define the dimer geometry, and the remaining one can be set to zero. We have chosen to use βA, γA, αB, βB, γB angles (in the zyz space-fixed axes convention with counterclockwise rotations). Thus, the ith grid point is specified by six numbers R i = (R i , Ωi) = (R i , βAi , γAi , αBi , βBi , γBi ) 5897
(1)
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
determine an approximate location of the repulsive wall which has to be dependent on the relative orientation Ω. Let us first define, for a given configuration of the dimer, the quantity ρ being the minimum (over all pairs of atoms from different monomers) of the ratio of rab to the sum of the covalent radii of atoms a and b, where rab is the distance between atom a from monomer A and atom b from monomer B. Then for a given Ω, there exists a mapping from R to ρ: ρ = ρ(Ω,R). Note that this mapping is not always one-to-one, because for some dimers and choices of Ω, multiple values of R may give the same ρ. Therefore, we define the partial inverse function R = R(Ω,ρ) as the largest value of R satisfying ρ = ρ(Ω,R). We will assume that the repulsive wall is located at R = R(Ω,ρc). For neutral molecules, ρc = 1.0. For ionic monomers, ρc = 1.4 is used if the monomer’s charges are of the same sign, and ρc = 0.9 is used if the monomer’s charges are of opposite sign. These values of ρc were found to predict reasonably well the location of the repulsive wall in the systems tested. Grid points are randomly generated for intermolecular separations between Ra(Ω) and Rb(Ω). The value of Ra(Ω) is chosen as Ra(Ω) = R(Ω,ρc) so that it is in the region of the repulsive wall of the interaction potential. The value of Rb(Ω) is chosen such that the asymptotic part of the potential is sufficiently accurate for R > Rb(Ω), so that no additional ab initio dimer calculations are required in that region. This separation is chosen as Rb(Ω) = Ra(Ω) + 3.5, where all distances are in angstroms. The values of Euler angles αB and γX (X = A, B) are generated randomly from the uniform distribution in the range [0,2π], whereas values of βX are generated in the range [0,π] with the distribution proportional to sin(βX). This choice results in a uniform distribution of three-dimensional random rotations78 (the choice is related to the volume element in Euler coordinates: sin β dα dβ dγ). For each Ω, the range of R is determined, and then the R coordinate is generated with the probability density function F(R) given by
To obtain a given dimer configuration in terms of Cartesian coordinates of all atoms, the coordinates of the atoms of monomers A and B are transformed by a set of rotations and a translation, starting from a reference configuration with each monomer’s COM at the origin of a space-fixed system (the initial or reference angular orientation of each monomer is arbitrarily fixed and corresponds to all Euler angles equal zero). The order of transformations is as follows: 1. Rotation of monomer X, X = A, B by an angle γX ∈ [0,2π] about the z-axis 2. Rotation of monomer X, X = A, B by an angle βX ∈ [0,π] about the y-axis of the space-fixed system 3. Rotation of monomer B by an angle αB ∈ [0,2π] about the z-axis of the space-fixed system 4. Translation of monomer B along the positive z-axis by a distance R. See Figure 1 of ref 33 for an illustration of this process. Several aspects of PES development require a measure of the “distance” between two dimer configurations. To this end, we define the metric ϵmn for any two dimer configurations Rm and Rn as ϵmn =
1 NANB
∑ ∑ ∥X ijm − X ijn∥2 i
j≥i
(2)
The summations are over symmetry-nonequivalent atom types of each dimer, e.g., for (H2O)2, symmetry-nonequivalent atom types are H and O (i.e., the sum over i,j has elements HH, HO, and OO). The elements of the vectors Xmij are the distances between all pairs of atoms from different monomers in the dimer m, where one atom is of type i and the other of type j. In the case of the (H2O)2 example, this gives an XHH of length 4, an XOH of length 4, and an XOO of length 1. The elements of each Xmij are ordered from smallest to largest. The vertical bars denote the Euclidean norm, and NA and NB are the numbers of atoms in monomers A and B, respectively. By default, the autoPES suite determines the number of grid points to use in the initial fitting set based on the functional form of the fit. The initial grid size is set to six times the number of free parameters in the short-range part of the fitting function (see Section IV). These are the parameters optimized in the final, and most sensitive, stage of fitting, described in Section VI. This ratio was found to be a reasonably reliable approximation of the number of points required to achieve convergence of the potentials and was used in all test cases in the present work except for the cyclotrimethylene trinitramine (RDX) dimer. For the latter potential, which was developed before a rule giving the initial grid size was established, 1400 points were used rather than 1248 as given by the rule. An efficient choice of the grid points has to satisfy two competing criteria. The set of such points must adequately cover all important regions of the configuration space, and at the same time the number of points should be as small as possible to avoid wasting computer resources. The procedure described here addresses these issues. The automatic grid generation is split into two stages. The first stage, giving 85% of the grid points, is a Monte Carlo procedure designed to adequately cover the bulk of the relevant configuration space. The second stage, giving the remaining 15% of the grid points, is designed to guarantee coverage in the regions near local minima on the PES. II.A. Monte Carlo Grid Generation. At the stage of grid generation and also at some later stages, it is important to
F (R ) =
1 2
1 (R b − R a)(R − R a)
(3)
This distribution is designed to give a higher proportion of grid points at shorter separations. It was found by numerical experimentation to reduce the overall number of grid points required in comparison to a uniform distribution. An excessive number of grid points in the repulsive region close to Ra is avoided by the inclusion of the guiding potential rejection criterion, which is described next. Note that each grid point Ri is obtained independently of all previous points by consecutive pseudorandom drawings of coordinates βiA, γiA, αiB, βiB, γiB, Ri. A grid point generated using the above distributions is kept or rejected by either of the following criteria: 1. The first rejection criterion uses a guiding potential to weight the grid point distribution more strongly in the more attractive regions of the potential. We use in the initial iteration of grid generation the OPLS-AA potential developed by Jorgensen and co-workers.1 In consecutive iterations, the PES of the previous iteration is used. This allows us to determine an approximate interaction energy Eguide at each grid point i. The i point i is then rejected with probability Pi given by ⎛ E guide − E guide ⎞ Pi = 1 − exp⎜⎜ − i guide min ⎟⎟ ⎝ 1.2[|Emin | + 4.0] ⎠ 5898
(4)
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
Figure 2. Scatter plots demonstrating the grid generation algorithm for two test dimers. The ab initio computed interaction energies are on the vertical axis. The variable ρ given on the horizontal axis is defined in Section II.A.
where Eguide min is the global minimum of the guiding potential, located as described in Section II.B, and all energies are in kcal/ mol. The parameters of eq 4 were optimized by trial and error. 2. The second rejection criterion uses the distance metric ϵij of eq 2 to ensure that the grid points are spread out in the sixdimensional configuration space. A randomly generated grid point i is rejected if, for any j, ϵij < ϵ0, where j goes over all the already selected grid points. The variable ϵ0 is initialized to 0.5 Å and is continually adjusted. When a point is rejected due to this criterion, ϵ0 is decreased, ϵ0 → 0.99ϵ0, whereas whenever a point is accepted, ϵ0 is increased, ϵ0 → 1.03ϵ0. Using this procedure, if ϵ0 is very small, then most points will be accepted and ϵ0 will increase. Conversely, if ϵ0 is very large, then most points will be rejected and ϵ0 will decrease. For a given set of grid points, there exists therefore some value of ϵ0 such that on average ϵ0 remains constant, and ϵ0 will tend toward this value. Given the ratios 0.99 and 1.03, this value is such that about three out of every four points are rejected. A typical value of ϵ0 after many points are added is about 0.75 Å. This fairly high rejection rate despite small value of the final rejection threshold is due to the fact that our grid selection algorithm tends to put many points near the bottom of the vdW well and on its repulsive wall. The above procedure of random generation is iterated until 85% of the desired number of grid points is acquired. II.B. Additional Grid Points at Local Minima. The remaining 15% of the desired number of grid points is placed in the vicinity of minima. The regions surrounding the local minima of the PES are of particular importance in many applications. The generation of these grid points proceeds in the following steps: 1. A grid of about 10,000 configurations Ri is generated as in Section II.A. Starting from each of these configurations, a nearby local minimum is found using the search procedure of Cerjan and Miller.79 While the minimum localization procedure used here is rudimentary compared to more advanced methods which have been recently developed,80−82 it suffices and is fast enough for the six-dimensional space. 2. Redundant minima obtained in step 1 are filtered out. Two minima Rm and Rn are considered identical if ϵmn < 0.05 Å. Additionally, any minima such that R < R(Ω,0.9) (for neutral monomers and similarly for charged monomers) are filtered out to ensure that no nonphysical minima which are in fact “holes” in the PES are found (see Section VII). For the tested systems, this resulted in between 1 and 51 local minima.
3. The grid points are then placed in the vicinity of minima as follows. First, one point is added at the location of each minimum. The remaining points are then distributed randomly around each minimum, with the number of points assigned to each minimum in proportion to the magnitude of the interaction energy at that minimum. The distribution of these points is uniform within the range of ±12 degrees around the minimum value of a given Euler angle and ±0.5 Å in R around the radial location of the given minimum. II.C. Iterative Grid Enlargement. After the initial set of grid points is generated, it is randomly split into a training set and a test set, comprised of 85% and 15% of the total number of grid points, respectively, with the constraint that points in the test set are selected only from those grid points with ab initio computed interaction energies less than 10 kcal/mol. As the default autoPES setting, if the fit contains no holes (see Section VII) and the root-mean-square error (RMSE) on the test data is less than 1.2 times the RMSE on the fit data, the fit is considered to be converged (except that if this happens already in the first iteration, the second one is performed anyway). If the fit is not converged, then additional grid points are generated, equal to 20% of the number of points used in the initial iteration, and the process is repeated. The method used to generate grid points in consecutive iterations is the same as used for the initial generation, except that the PES from the previous iteration replaces the initial guiding function. In each iteration, the set of all grid points (containing both training and test points from the previous iteration and the new points) is randomly split into training and test sets. For the systems tested in the present work, between two and six iterations were required for convergence. After the convergence is reached, the final fit is performed using the sum of the training and test sets from the last iteration as the new training set. This fit is still checked for holes (Section VII) and if any are found, the hole fixing procedure is performed. Plots of generated grid points after all iterations are shown in Figure 2 for two example systems. The ab initio computed interaction energies are shown as functions of ρ rather than R. In this way the radial interaction energy curves are scaled onto the same range of separations which gives a better picture of the grid-point distribution. The maximum value of ρ varies slightly from system to system since the maximum separation condition is set in terms of R not ρ. Figure 2 shows that the distribution of points is fairly dense in the region of the global minimum. While this effect may be expected due to the condition of eq 4, 5899
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation it may be surprising that the density of grid points on the plot initially increases as we go from the global minimum in the direction of larger energies and the most dense region is located near zero energy. The reason is that the volume of configuration space near the global minimum energy is comparatively small. For Ω farther from this volume, a randomly chosen grid point has the highest chance to be accepted if it is in the region of a radial minimum, i.e., the minimum of Eint(R,Ω) for a fixed Ω (note that with scaled radial interaction energy cross-section curves, most of the minima are at similar values of ρ ≈ 2). Since most radial minima are shallow compared to the global one, and some configurations do not have a radial minimum, this leads to the observed density patterns. Also note that the density of points on the plot does not correspond directly to density of points in configuration space. The number of points with energies above 50 kcal/mol (not shown in the figure) is very small despite the probability density function of eq 3 being strongly peaked at ρ = 1. As ρ increases beyond ρ = 2, the number of points per a range of ρ decreases, mainly due to the condition of eq 3. This is an intended behavior since the large-ρ regions are well described by the asymptotic expansion. D. Asymptotic Grid. In addition to the main grid of SAPT(DFT) calculations, a separate grid for asymptotic interaction energies is needed to fit the long-range components of the PES. It is crucial that the dimer configurations used in these calculations are in the asymptotic region of PES, i.e., the overlap between electron densities of monomers can be neglected, so that the asymptotic expansion can be applied to accurately describe the intermolecular interaction. This asymptotic grid is constructed as in the first stage of the Monte Carlo procedure described in Section II.A. Here the values Rasym (Ω) and Rasym a b (Ω) are used in place of Ra(Ω) and Rb(Ω), where Rasym (Ω) is the value of R such that the smallest a rab at the angular orientation Ω is 5 + 0.4(RA + RB) and asym Rasym (Ω), where all distances are in angstroms. b (Ω) = 2 Ra The quantity RA is the maximum distance between the COM of monomer A and any of the atoms in monomer A, so that it is a measure of the size of monomer A, and similarly for RB. This choice of Rasym (Ω) was made using trial and error to ensure a that the asymptotic expansion is converged beyond this point. There usually exists a gap in coverage between the maximum of the close-range grid Rb(Ω) and the minimum of the asymptotic grid Rasym (Ω). The size of this gap is dependent on the size of a the system. For the smallest system tested, H2O−Cl−, the gap ranges in size from 0.3 to 0.7 Å depending on the dimer orientation, while for the largest, the RDX dimer, it ranges from 2.6 to 3.4 Å. This gap is intentional since the charge-overlap effects should be negligible in the asymptotic region. We will show that the damped asymptotic expansions recover interaction energies in this region very well, so SAPT(DFT) calculations are not needed there. A uniform distribution is used for the R coordinate in place of the distribution given in eq 3. The asymptotic grid consists of 12,000 configurations.
should provide satisfactory results. These methods give geometries corresponding to the global minimum on monomer’s potential energy surface, so-called re geometry. It has been shown27,56,73 that more accurate results in dynamics calculations are produced by PESs that use the so-called ⟨r⟩0 monomer geometries, i.e., geometries averaged over the ground-state vibrational wave functions. For few-atomic monomers, such averaging can be performed if the monomer PES is known. The other option is to use experimental geometries which are in most cases averaged in some way over monomer’s vibrational motions. III.B. Short-Range Calculations. The SAPT(DFT) approach62−70 in the density fitting version66,83,84 is used to compute interaction energies for the dimer configurations generated as described in Section II. This method was shown to give interaction energies of similar accuracy as those from the CCSD(T) approach.69,85−89 While the WF-based SAPT could be used in autoPES, such an option has not yet been implemented. The ORCA electronic structure package90 is used as the front-end interface for monomer DFT calculations. While the methods described here are not restricted to any particular basis set, all of the test cases were computed using either the aug-cc-pVDZ or aug-cc-pVTZ basis set91 together with the corresponding correlation-energy fitted auxiliary bases from ref 92. Note that for improved accuracy of density-fitted SAPT(DFT) results at larger R one should use84 the so-called JK auxiliary bases93 and quadruple precision arithmetics in calculations of electrostatic energies, but these options are not chosen by default. We have checked the accuracy of the density-fitting approximation at the outer edges of our shortrange regions, and it was sufficient. However, one has to use these options if performing calculations in the asymptotic region, particularly for nonpolar monomers. In all cases, a monomer-centered “plus” basis set (MC+BS) form with midbond functions is used for monomer calculations, see ref 94. Midbond function exponents and placement are as described in ref 87, with the midbond auxiliary basis set from ref 84. Currently autoPES supports the PBE95 and PBE096 functionals, though compatibility with any functional included in ORCA may easily be added. In the case of the nonhybrid PBE functional, the fast dispersion path of SAPT(DFT) is used as the default, see ref 97. All DFT calculations are done applying the gradient-regulated asymptotic correction (GRAC),88,98 which requires the ionization potential (IP) of each monomer. The calculation of the IP is handled automatically by separate DFT computations using ORCA, with the same basis and DFT functional as are used for the calculations of interaction energies and with the neutral monomer geometry used for the ion. The SAPT(DFT) interaction energy includes all corrections up to the second order69
III. AB INITIO CALCULATIONS III.A. Monomer Geometries. On input, users have to provide geometries of the monomers. The most common practice is to optimize the geometries using a modest level of ab initio theory, for example, the second-order many-body perturbation theory based on the Møller−Plesset partition of the Hamiltonian (denoted as MBPT2 or MP2) and a double-ζquality basis set. In many cases, even a DFT optimization
(5)
SAPT(DFT) (1) (2) (1) E int = Eelst (KS) + Eexch (KS) + E ind (CKS) (2) (2) (2) + Eexch − ind (CKS) + Edisp(CKS) + Eexch − disp(CKS)
where terms with the KS label are computed from Kohn−Sham determinants. The quantities denoted by CKS are computed using the complete FDDSs in the adiabatic local density approximation (ALDA). This means that the exchangeinduction and exchange-dispersion corrections are computed from FDDS amplitudes.70 The exchange-dispersion correction cannot be computed this way when using the fast dispersion 5900
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
nected to the corresponding SAPT component. For a detailed form of eq 7, see the Appendix.
path; therefore, in such a case this correction is replaced by the scaled version of the uncoupled E(2) exch−disp(KS) correction (see ref 69). For reasons explained later on, we will often group together some second-order terms. We will denote the sum (2) E(2) ind (CKS) + Eexch−ind(CKS) as “ind+x” energy and the sum (2) (2) Edisp(CKS) + Eexch−disp(CKS) as “disp+x” energy. In addition to the standard SAPT(DFT) corrections, one 99,100 should include for polar monomers the δEHF int,resp correction
IV. FUNCTIONAL FORM OF POTENTIAL ENERGY SURFACE After computing interaction energies at each grid point, the data is fit with an analytic function V which takes the form of a sum of site−site functions uab(rab), each dependent on the distance rab between an atom or an off-atomic site a of monomer A and an analogous site b of monomer B, plus an optional polarization model Vind(A,B)
HF HF (10) (20) (20) (10) δE int ,resp = E int − Eelst − Eexch − E ind,resp − Eexch − ind,resp
(6) HF Eint
where is the supermolecular Hartree−Fock (HF) interaction energy and the WF-based SAPT corrections on the right-hand side are calculated with monomers described at the HF level, as indicated by the second superscript (zeroth order in the intramolecular correlation operator). The subscript “resp” means that the coupled HF-type response of a perturbed system is incorporated in the calculation of a given correction. The term δEHF int,resp accounts mostly for induction and exchangeinduction effects of the order higher than second and can contribute several percent to interaction energies of polar systems like the water dimer. By default, the choice to include δEHF int,resp is made automatically using the following criteria. The correction is included if either of the two following conditions are met. The first condition is that the dipole moment of either monomer is greater than 1.5 D. The second condition is that the mean ratio of the induction to dispersion energies, computed using the asymptotic COM-COM expansion on the grid of asymptotic points described in Section II.D, is at least 0.125. Note that these criteria are different from those used recently in ref 89. The evaluation of δEHF int,resp typically increases the total computer time needed for the ab initio calculations by about 50%, mainly due to the HF calculations for the dimer. The current version of autoPES allows for the use of CCSD(T) instead of SAPT(DFT) for calculations of shortrange interaction energies. In principle, any supermolecular method could be used here, but the issue is with the connection to the SAPT(DFT)-level asymptotics. Since, as mentioned earlier, CCSD(T) and SAPT(DFT) interaction energies are very close, SAPT(DFT) asymptotics should be very good approximations to the CCSD(T) asymptotics (note that the form of the latter is unknown). Such combination was successfully used in ref 12. Another supermolecular method which gives results very close to SAPT is MP4, and this method can be easily enabled in autoPES. On the other hand, methods such as MP2 may result in a poor connection to the asymptotics, but if the accuracy goals are not too high, this combination may work too (except for interactions dominated by dispersion forces, in particular with π−π contributions). III.C. Asymptotic Calculations. The interaction energy Eint(R) in the large-R region can be approximated by the COMCOM asymptotic multipole expansion61,101 E int(R ) ≈
∑ n
Cn(Ω) = Rn
∑ n
Cn ,elst(Ω) Rn
+
∑ n
Cn ,ind(Ω) Rn
+
∑ n
V=
∑ ∑ uab(rab) + Vind(A , B)
(8)
a∈A b∈B
The set of sites includes all atoms of both monomers and may also include off-atomic sites if a higher accuracy is required. The coordinates of off-atomic sites have to be provided by the user on input as a part of the geometry of each monomer. Each site−site function is of the form m
uab(rab) = [1 +
∑ aiab(rab)i ]e α
ab
− β abrab
i=1
+ f1 (δ1ab , rab)
qaqb rab
∑
−
+
ab A12
(rab)12
fn (δnab , rab)
n = 6,8,10
Cnab (rab)n
(9)
where qx are partial charges on each site determining the electrostatic interaction between sites, Cab n are distributed induction plus dispersion coefficients (constrained to be positive in our convention which due to the minus sign in eq 9 gives always negative values of these terms), and f n are TangToennies damping functions102 n
fn (δnab , rab) = 1 − e−δr
∑ m=0
(δr )m m!
(10)
Aab 12
The coefficients are constrained to be positive and are included to ensure the correct repulsive behavior of the potential at very close range. Any of the sites in the fit may optionally exclude electrostatic, polarization, dispersion, or exponential components (controlled by input parameters). In the case that one or both monomers are charged, the first term of the induction plus dispersion expansion is n = 4 rather than n = 6 in order to reproduce the correct rate of decay of the induction energy. While odd inverse powers could be used in addition to even powers and would be nonzero, our experience has been that this choice does not lead to improved fits. The default order of the polynomial in the exponential terms describing the exchange and charge-overlap effects is 2, and the maximum order is 5. The form of uab may be optionally adjusted, depending on the intended application and desired accuracy of a fit, by increasing or decreasing the number of polynomial terms, decreasing the number of asymptotic terms, and switching off the damping. If the polarization model is used, the last term in eq 9 describes mainly the dispersion energy but includes a small portion of induction interaction, see below. The polarization model defining Vind consists of a set of induced point-dipoles μind a on each site a ∈ A (and analogously for b ∈ B), defined as
Cn ,disp(Ω) Rn
asym asym asym = Eelst + E ind + Edisp
(7)
where the coefficients Cn depend analytically on the angles Ω and contain contributions from electrostatic, induction, and dispersion interactions. Each of these components has its own well-defined asymptotic expansion which is seamlessly con-
μaind = αa[,a +
∑ Tabμbind ] b∈B
5901
(11) DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
Figure 3. Comparison of electrostatic components of SAPT(DFT), COM-COM multipole expansion, and distributed multipole expansion. The total interaction energy is included for perspective. No damping is used. The radial cross sections are through the global minima of the two systems (geometries are available in the SI).
where αa is an isotropic polarizability assigned to a site a (see Section V), ,a is the damped electric field at point a due to all permanent point charges qb of monomer B q rab ,a = ∑ f1 (δ1ab , rab) b3 rab (12) b∈B
require very high multipole components and the asymptotic part becomes divergent already at R beyond physically interesting ranges for such monomers.76,77 The site−site functions used by us are isotropic, which enables their use in standard MD and crystal structure prediction (CSP) programs.45,47,104 The isotropic site−site form restricted to atomic sites has limited accuracy, especially for small monomers, and to increase accuracy one has to use off-atomic sites. Many MD programs, e.g., DL_POLY,105 allow for such sites, and they can be easily introduced in CSP programs. Whereas the representation of PESs by a sum of isotropic atom−atom functions is not complete (independent of the form of the functions), such an expanison can be made near-complete by adding several off-atomic sites. For the water dimer, it has been shown in ref 53 that by using 8 symmetry-distinct sites per monomer rather than 5 such sites as in refs 28 and 34−36, the error of the fit for negative interaction energies was reduced from 0.1 to 0.01 kcal/mol (or 0.2% of the interaction energy at the minimum). The autoPES package can automatically create input files usable by DL_POLY which specify the generated PESs in tabulated form.
and Tab is the damped dipole−dipole 3 × 3 interaction tensor Tab = f3 (δpab , rab)
rab ⊗ rab 5 rab
− f3 (δpab , rab)
1 3 rab
(13)
with rab denoting the displacement vector from site a to site b, and ⊗ denoting the Kronecker product. Note that the damping parameter in eq 12 is the same as used in damping of electrostatics. One may argue that it would have been more appropriate to use f 2 rather than f1. We justify our choice by assuming that when the electric field is calculated as the derivative of the electric potential, one may treat the damping factor as approximately constant. After the set of eqs 11 is solved iteratively (or just singly iterated when fitting the second-order induction energy), the energy contribution from the polarization model is given by Vind = −
1 2
∑ ,c·μcind c
V. ASYMPTOTIC FIT After the long-range interaction energies are calculated from the COM-COM asymptotic expansion, a distributed asymptotic expansion is fitted to these data. This step gives the parameters qa, qb, and Cab n in eq 9 and αa in eq 11. The fitting of these parameters is performed in three stages. All nonlinear optimizations are performed with an implementation of Powell’s method.106,107 In the numerical examples discussed in the present section, the same sets of parameters as described in Section VIII were used. The first stage is to optimize the values of the permanent point charges qa and qb at each of the sites a on monomer A and b on monomer B that are requested on input to be charged (by default all sites are charged). The partial charges are determined by using the nonlinear weighted least-squares optimization to fit the electrostatic component Easym elst of the set of long-range interaction energies generated by the COMCOM asymptotic multipole expansion. The form of the fit is
(14)
where the sum is over all atoms c in both monomers. The use of the explicit induction model for two-body potentials leads to only minor, if any, improvements of the accuracy of the fit. However, in a cluster on N molecules, this model can be used to approximately compute also the nonadditive three- and higher-body effects. The extension of the procedure described above to N > 2 monomers is very simple: in eqs 11 and 12 the sum has to include all monomers other than A, and in eq 14 the sum is over all N monomers. The polarization model is very effective in describing nonadditive effects in water (see, e.g., refs 19 and 103) and in other polar fluids. The form of the default fitting function in autoPES was selected based on the anticipated applications of such potentials. To ensure applicability to large monomers, the fit is in a distributed, i.e., site−site form, as COM-COM fits 5902
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
Figure 4. Comparison of ab initio computed electrostatic, induction, and dispersion energies and asymptotic PES components for (H2O)2. The curve (2) HF denoted by ‘SAPT ind+x + δ′ shows the sum of terms E(2) ind + Eexch−ind + δEint,resp and is represented by the curves denoted by ‘dist. pol (damped/ undamped)’. The curve denoted by ‘SAPT ind+x + disp+x + δ′ shows the sum of terms E(2) + δEHF int,resp and is represented by the curves denoted by ‘dist. pol + disp (damped/undamped)’. The undamped polarization model excludes the damping in eq 13, but it does retain the damping in eq 12; whereas the damped model includes both. Both polarization models are iterated to convergence. The radial cross sections are through the global minimum of the system (geometries are available in the SI).
much better for the CHFOHNH2 dimer, a system 2.6 times smaller than the RDX dimer. In the case of the RDX dimer, the distributed expansion performs better than the COM-COM one: the error at the vdW minimum is about 20%, and it does not increase as quickly as in the case of the COM-COM expansion for still smaller R. This may seem contradictory since the former expansion was fitted to the latter. The explanation of this paradox is that a general distributed expansion describes interactions between small objects, of the size of one atom, and therefore it starts to diverge badly at smaller separations than the COM-COM expansion for a given pair of molecules. In our case the distributed expansion is limited to charges, which further prevents divergent behavior. For the CHFOHNH2 dimer, the distributed expansion performs about as well as the COM-COM one. This is an expected behavior for monomers with up to about 5 atoms;76,77 the reason that it is observed for an 8-atom monomer may be the compact shape of this molecule. For both systems, the distributed expansions reproduce SAPT(DFT) electrostatic energies very well for distances about 20% larger than the vdW minimum distance, demonstrating that indeed there is no need to compute SAPT(DFT) electrostatic energies for such distances. We have not damped the electrostatic energies for the two systems shown in Figure 3 because damping of electrostatics usually has only a small benefit except for charged or highly polar molecules. Thus, the default setting of autoPES is that damping of the electrostatic component is not included. Since both the COM-COM and the distributed asymptotic expansions lie above the SAPT electrostatic energies, one may think that damping the electrostatic terms by functions f1 whose values are always in the range (0,1) can only make the agreement with SAPT values worse. The reason for this apparent paradox is that the damped asymptotic expansion is not sufficient to reproduce nonexpanded electrostatic energies. Such expansion misses the so-called “spherical” terms which decay purely exponentially rather than as inverse powers of R multiplied by exponential factors.61 In order to achieve a fit to nonexpanded electrostatic energies valid at all separations, one would have to add the exponential component from eq 9 to a damped asymptotic expansion. We have not done so, since near the minimum and closer there are sufficiently many ab initio data points in the fitting set that the exponential terms of the PES can readily cancel the errors of the electrostatic expansion.
the same as the appropriate term in eq 9, except that damping is not used, i.e., qq asym uelst = ∑∑ a b r (15) a ∈ A b ∈ B ab Starting values of qx are set according to the standard atomic electronegativity values,108 and optimization is done under the constraint that the net charge of each monomer sums to the appropriate value. The fitting weight welst of each grid point is given by welst
⎡ = ⎢∑ ⎢⎣ a ∈ A
∑ b∈B
−2 1 ⎤⎥ (rab)νelst ⎥⎦
(16)
The constant νelst is the decay rate of electrostatic energy and is equal to 3 in the case of neutral polar monomers, 2 in the case of a single charged monomer, and 1 in the case that both monomers A and B are charged. In the case that one or both monomers have zero charge and zero permanent dipole moment (computed by ORCA together with the ionization potential), the value of νelst is increased by one or two, respectively. The electrostatic distributed expansion is plotted together with the electrostatic COM-COM expansion and the electrostatic component of SAPT(DFT) for two test systems in Figure 3. For the RDX dimer, the COM-COM expansion reproduces very well the SAPT(DFT) electrostatic energies down to about 7.5 Å or about 1.16 times the van der Waals (vdW) minimum distance. For smaller R the agreement deteriorates: at the vdW minimum the error is about 40%, and it quickly becomes much larger for still smaller R. The main reason for this behavior is that asymptotic expansions are only semiconvergent, i.e., at each R as the number of terms in the expansion increases, the agreement with the nonexpanded energy initially gets better, but beginning at some term the asymptotic expansion starts to diverge. If R is small enough, even the addition of the second term in the expansion may deteriorate the agreement with the nonexpanded value. For the RDX dimer, this happens at R a few percent larger than the minimum separation. For large molecules, higher multipole moments are large, and the divergence starts at a larger R than for small molecules. This is seen in Figure 3 where the COM-COM expansion converges 5903
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
Figure 5. Comparison of induction and dispersion components of SAPT(DFT), undamped COM-COM multipole expansion, and undamped and damped distributed asymptotic expansion. The total interaction energy is included for perspective. ‘SAPT ind+x + disp+x’ is the sum of the induction, exchange-induction, dispersion, and exchange-dispersion components of SAPT(DFT). Neither the δEHF int,resp correction nor the polarization model were used. Radial cross sections are through the global minima of the two systems.
used. The distributed polarizabilites were optimized to the COM-COM induction energies on the asymptotic grid using only a single iteration of the polarization model. The curve denoted as ‘distr. pol (undamped)’ was then obtained by iterating this model to convergence. This curve is to be compared with the ‘SAPT ind+x + δ’ curve representing E(2) ind + HF E(2) exch−ind + δEint,resp. The agreement is very good, with small deviations starting to show up only very close to the vdW minimum distance. In the vdW minimum region, the pure E(2) ind curve (not shown) lies significantly below the discussed SAPT curve, so the agreement with the former curve would have been much worse. The main reason for this are the “spherical” terms, analogous to those discussed earlier for the electrostatic component, not accounted for by the asymptotic expansion (even if it is damped).61 In the case of the induction energy, such terms are larger than for any other component since the induction energy suffers most from the excessive tunneling present in the Rayleigh−Schrödinger perturbation theory (the theory which defines electrostatic, induction, and dispersion components109) due to the lack of proper permutational symmetry of wave function corrections.61,109−111 The exchange-induction correction quenches a large part of this effect, in this way improving the agreement with the asymptotic expansion. Although the undamped polarization model was constructed without any knowledge of δEHF int,resp, it partially recovers this term by performing additional iterations (beyond the first one). For R in the region of the vdW minimum, the shift resulting from the use of the fully iterated polarization model versus singly iterated recovers about 40% of δEHF int,resp, most likely due to the exchange part of the latter term. The fully damped distributed polarization model makes the agreement with the SAPT curve slightly worse despite being optimized on this curve. One might think that the damped polarization model, in contrast to the asymptotic expansion of the induction energy, can possibly lower the undamped curve since the terms in eq 14 can in principle be of either sign. However, in practice these terms are all negative since the induced dipoles are closely aligned with the direction of the electric field from the permanent charges. The reason that
Furthermore, in the case of the distributed asymptotic electrostatic expansion, damping can actually shift the curve down, as it will be seen later on in the example of the water dimer. The reason is that this expansion is obtained as a sum of a large number of negative and positive contributions. The functions f1(δab 1 ,rab) of the negative components optimize to values close to 1, whereas those of positive components become smaller than 1, so that the sum becomes more negative. Note that the same is not possible for the induction plus dispersion expansion because the coefficients Cab n are positive. In the second stage, if the polarization model is selected, the isotropic polarizabilities αi are fit to minimize the weighted (as specified below) RMSE of the polarization model Vind with respect to the induction component of the asymptotic expansion. The permanent charges used in the polarization model are those from the previous step. This step uses similar nonlinear optimization as the first stage, but under the constraint that the αi sum to the average static polarizability of each monomer, defined as the arithmetic mean of the diagonal elements αxx, αyy, and αzz of the diagonalized 3 × 3 Cartesian polariazability matrix, computed analytically at the CKS level with a separate call to ORCA using the same basis and functional as the rest of the calculations. Fitting weights are given simply by wind = (Eind)−2. Note that in the case of electrostatic energies one cannot use (Eelst)−2 weights since these energies can be of either sign and the weights would be excessive in the regions where these energies change sign. During this fitting stage, the polarization model Vind is iterated only once, rather than to convergence. This is because the second-order correction E(2) ind corresponds to a single iteration, while the infinite-order δEHF int,resp correction, not computed on the asymptotic grid, corresponds to the fully iterated model. Thus, if this correction is used in the short-range region, the polarization model is iterated to convergence in the fitting of the total potential. The undamped polarization model results for the water dimer are shown in Figure 4. In this case, all atoms are polarizable. Note that we have developed also another potential for the water dimer, discussed in Section VIII, where to enable comparisons with earlier work the polarization model is not 5904
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
Figure 6. Comparison of asymptotic expansions with SAPT(DFT). The data series “PES asym” is the sum of the damped distributed electrostatic and induction plus dispersion expansions, and the series “PES exp” is the remaining component of the PES, consisting of the exponential term multiplied by a linear polynomial factor. The asymptotic expansions are able to accurately reproduce the SAPT(DFT) data for COM-COM = 12.5 Å for the RDX dimer, separations greater than 1.3 times the vdW minimum separation. At the configurations shown, Rb = 8.5 Å and Rasym a = 9.0 Å for the CHFOHNH2 dimer. Radial cross sections are through the global minima of the two systems. whereas Rb = 6.4 Å and Rasym a
SAPT(DFT) components, are also presented for two other test systems in Figure 5. The polarization model was not used in these cases. As discussed above, in the overlap region the asymptotic induction energy is always significantly different from the unexpanded one due to the large spherical terms absent in damped asymptotic expansions,51,109−111 and these terms can be partly canceled by combining SAPT(DFT) induction and exchange induction energies. Since we fit simultaneously the induction and dispersion components, for consistency we included the E(2) exch−disp term, although the dispersion energy does not suffer from large spherical terms. Thus, we compare the sum of asymptotic expansions of induction and dispersion energies to the total second-order SAPT(DFT) energy
damping makes the agreement slightly worse is our limits on the ranges of the damping parameters. Figure 4 also shows that the damped distributed electrostatic expansion approximates SAPT electrostatic energies very well for all R shown, significantly better than the undamped expansion. As explained earlier, this is possible by the reduction of the magnitude of the positive contributions while keeping the magnitudes of negative contributions nearly unchanged. The third stage of the asymptotic fitting is to optimize the distributed induction plus dispersion coefficients Cab n of eq 9, i.e., to fit the sum of COM-COM Easym + Easym ind disp by the expression asym u ind + disp = − ∑
∑ ∑
a ∈ A b ∈ B n = 6,8,10
Cnab (rab)n
(17)
(2) (2) (2) (2) E(2) = E ind + Eexch − ind + Edisp + Eexch − disp
When the polarization model (Vind) is used, rather than fitting asym asym asym these parameters to Easym ind + Edisp , they are fitted to Edisp + Eind asym − Vind with Vind only singly iterated. Since the difference Eind − Vind is very small, we will refer to quantities obtained in this way simply as the dispersion energies. This approach partly cancels the error of the polarization model. The Cab n parameters are fitted using a nonlinear least-squares optimization with weights asym −2 ab wdisp = (Easym disp + Eind ) , under the constraint that each Cn parameter be non-negative [in accordance with the sign convention in eqs 9 and 17]. Starting values for this stage of optimization are computed from the atom−atom dispersion function developed to be used with the dlDF functional.112−114 The expansion using distributed Cab n constants obtained from first-principles typically converges down to significantly shorter distances than the corresponding COM-COM expansion.76,77,115 The performance of this stage on the water dimer for the case that the polarization model is used is shown in Figure 4. The sum of the model and the asymptotic dispersion energy is significantly below the SAPT result. If the asymptotics is damped, the agreement becomes very good. This is obviously due to the damping of the dispersion expansion since the polarization model changes very little due to the damping, as discussed above. Plots of the distributed induction plus dispersion expansion, together with the corresponding COM-COM expansion and
(18)
plotted in Figure 5 as the ‘SAPT ind+x + disp+x’ curve. This figure shows that for both dimers, the distributed undamped asymptotic expansion performs better than the COM-COM one. Thus, the relations observed for expansions distributed using first-principles76,77,115 hold also for the fitted distributed expansions, despite the fact that the fitting is made to the COM-COM generated values. The undamped distributed expansion for the RDX dimer is a very good approximation to E (2) for all R. For the CHFOHNH 2 dimer, this approximation is very good down to R about 15% larger than the vdW minimum separation, but then the agreement quickly deteriorates and at the vdW minimum the error of the asymptotic expansion is about 50%. In both cases, the use of damping, which will be discussed in detail later on, significantly improves the agreement of asymptotic expansions with the SAPT(DFT) components. The distributed expansions fit the COM-COM generated interaction energies in the asymptotic region very well, with the average unsigned relative errors of about 5%. With all components of the asymptotic expansion analyzed, we may now make quantitative assessment of the earlier statements concerning the relations between the maximum extensions of the short-range grid. As Figures 3, 4, and 5 show, the distributed asymptotic expansions reproduce very well 5905
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
linear fitting). In this way, the number of free parameters scales as the number of sites in the fit rather than as the square of the number of sites. The use of combination rules leads to a much smaller decrease of accuracy than one could expect. In particular, we found that there are no substantial benefits resulting from the use of pair-specific values for the damping parameters. In the case of the αab and βab parameters, going from combination rules to pair-specific parameters results in some increase of the flexibility of the fit, but it is not usually sufficient to warrant the large increase in the size of the parameter space, except for very small systems. Therefore, combination rules are enabled in autoPES for these parameters by default. In the case of the linear aab i parameters, accuracy of the fits were improved substantially by forgoing combination rules, and so these parameters are given pair-specific values by default in autoPES and in all test cases in the present work. In the long-range part of the potential, the electrostatic component is by definition in atom-specific form. The induction and dispersion coefficients Cab n are also constrained to use geometric combination rules by default. The parameters Aab 12 are not optimized, so there is no need to make them atomspecific. The fitting weight of grid point i is given by the formula
SAPT(DFT) components for R about 1.3 times larger than Rmin(Ω), the distance of the radial vdW minimum. For R > 1.3 Rmin(Ω), the asymptotic expansions agree with the SAPT(DFT) interaction energies to within 5%, as shown in Figure 6. The values of 1.3 Rmin(Ω) amount to about 9.0 and 5.7 Å for the RDX and CHFOHNH2 dimers, respectively, and are close to the upper limits for generation of short-range grid points, Rb(Ω) = 8.5 and 6.4 Å, respectively. Thus, by using asymptotic expansions, a large volume of the six-dimensional dimer configuration space is accurately represented by performing single ab initio calculations only for isolated monomers rather than many SAPT(DFT) calculations for dimers. This results in significant saving of computer time.
VI. FTTING THE COMPLETE POTENTIAL ENERGY SURFACE The fitting of the complete surface consists of optimization of ab the nonlinear parameters {αab, βab, δab n , δp } and of the linear parameters aab . The former set parameters are optimized using i Powell’s method,106,107 and the latter ones are optimized using the linear least-squares method. The linear optimization is performed in each step of Powell’s method. Throughout the final fitting, the partial charges, distributed Cab n van der Waals coefficients, and site polarizabilities are fixed, such that the longrange part of the PES is unaffected by the final fitting. Symmetry is imposed by restricting all parameters associated with a given symmetry-equivalent site type (or site pair type) to have the same value. Symmetry-equivalent site types are specified on input by the user. Site types which are approximately symmetrically equivalent may optionally be specified as equivalent in order to simplify the functional form of the potential. Each of the nonlinear parameters is constrained to remain within a physically reasonable range of values. The βab parameters are restricted to the range (0.9,2.2), αab to the range (−1.0,7.0), and all δab parameters, including the polarization damping ones, to the range (0.5,2.0), with all ab values given in atomic units (eα is assumed to have units of hartree). The βab parameters can be related to the exponential decay rate of electron density,116 e−2 2I r , where I is the ionization potential of the system expressed in hartrees. For a pair of atoms, assuming that the first-order exchange correction is proportional to the overlap of electron densities, one can show that this correction decays as e−2 2I r , see, e.g., ref 16. In fact, the authors of this reference proposed a method of determining the exponents βab from ionization potentials. We have not used this method, but we note that a typical value of I amounting to ∼0.5 hartree gives β = 2, within our range. We found that in practice the minimum and maximum ranges for the αab and βab parameters were never reached in any of the final fits. The purpose of the limits on these parameters is therefore to ensure that the parameter values remain physically reasonable in early stages of optimization or on early iterations of grid generation where some region of the PES may lack sufficiently many grid points. Combination rules may optionally be used to reduce the number of free parameters. In this case, each unique site type a is associated with its own set (αa, βa, aai , δan, δap). The values associated with each site pair are then given by the arithmetic mean of corresponding site-specific parameters in the case of ab αab, βab, and aab i and by the geometric mean in the case of δ ab (the arithmetic mean is required in the case of ai to enable
⎛ min(Ei , Vi ) ⎞ wi = exp⎜ − ⎟ Ew ⎠ ⎝
(19)
where Ei is the ab initio energy at the configuration i, Vi = V(Ri), and Ew is a system-dependent constant. In the case of the linear least-squares optimization of aab i parameters, where the weight cannot be a function of the PES, the value Vi is computed using the values of all free parameters, the linear and the nonlinear ones, following the previous Powell iteration. Here “Powell iteration” refers to a consecutive one-dimensional optimization along each conjugate direction, see ref 107. In fitting stages where components of the PES are optimized to individual SAPT energy components rather than to the total energy, the expression min(Ei, Vi) is simply replaced by Ei. We have not used the known value of the given component at Ri instead of Ei since we want the components fits to be most accurate at the same regions as the total interaction energy. The value of the constant Ew is given by Ew =
|Emin| + 4.0 Cw
(20)
where, by default, Cw = 6.0, Emin is the lowest ab initio computed interaction energy for the grid points in the fitting set, and all energies in eq 20 are in kcal/mol. This default value was used in all test cases in the present work, with the exception of the methyl formate−helium dimer where the much stronger weight Cw = 20.0 was used due to the very small magnitude of the interaction energy of that system. The form of eq 20 was chosen to give reasonable weights for the anticipated applications of the potentials for a wide range of systems, including systems which are entirely repulsive. The weight function (19) is used in each of the optimization stages in which a weighted least-squares fit is performed. Note that the use of this function means that we weight low energy regions of PES more than higher energy regions for the second time since the first such weighting was imposed during the selection of short-range grid. During fitting, any grid points with Eint > 100 kcal/mol are discarded, because such points lie outside the region that we 5906
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
case of Figure 4. If CCSD(T) calculations are used in the shortrange region, this stage is skipped. 4. The polarization damping parameters δab p are nonlinearly optimized to minimize the weighted RMSE of the polarization model Vind (if this model is used) against the sum of the E(2) ind , HF , and δE (if selected) components on the shortE(2) exch−ind int,resp range grid. This stage is dependent on the preceding one due to the presence of the δab 1 parameters in eq 12, which are kept fixed at this stage (if electrostatics were damped, if not, the field due to permanent charges is not damped either). As mentioned earlier, if the δEHF int,resp term is included, the polarization model is iterated to convergence, and if it is not included, it is iterated once. Again, if CCSD(T) calculations are used in the shortrange region, this stage is skipped. The damping of the polarization model is illustrated for the case of (H2O)2 in Figure 4 and was briefly discussed earlier. Polarizable point dipoles are located on all three atoms of both monomers. The undamped curve lies above the SAPT curve, and damping moves the former curve slightly up, making the agreement with SAPT results worse. Thus, in contrast to the electrostatic energy, damping is not able to shift the curve down. The net shift is due to the fact that the damping parameters are restricted to the range [0.5,2.0], and, therefore there is some damping present at the upper range. This slight damping is retained to avoid divergence of the model at very short distances. 5. The induction plus dispersion damping parameters δab n are nonlinearly optimized to minimize the weighted RMSE of the last term in eq 9 against E(2) + δEHF int,resp. If the polarization model is used, this term is optimized to the SAPT components minus Vind. Note that our approach becomes a bit inconsistent when δEHF int,resp is included and the polarization model is not used. This inconsistency is due to the fact that δEHF int,resp, as mentioned earlier, does contain a long-range component. However, this component decays as 1/R9, so at asymptotic separations it is negligible compared to the slower decaying terms and at shorter separations the inconsistency can be made up by the other terms. This stage is skipped if the CCSD(T) method is used. The performance of the damping factors is shown in Figure 5. The δEHF int,resp inconsistency is not affecting these results since this correction was not used for either system. As briefly discussed earlier, for the RDX dimer, the damping results in a near-perfect agreement of the fit curve with the SAPT curve for all R shown; however, already the undamped expansion performs very well. In contrast, for the CHFOHNH2 dimer, the use of damping results in a dramatic improvement of the agreement between the two sets of results. Although the damped curve is very close to the SAPT one, one might expect a still better agreement. However, what is shown in the figure is just one of many radial cross sections of the PES: for some other cross sections the damped curve may lie above the SAPT one. The reason is not that the damping parameters are reaching their lower limit, as this has not been observed (whereas some parameters do reach the upper limit in some cases). 6. The αab and βab parameters are nonlinearly optimized while keeping the linear parameters aab i fixed at zero and all previously optimized parameters frozen. Starting at this stage, the entire PES, V of eq 8, is fitted to the total SAPT interaction energies rather than parts of it to specific SAPT components. The χ2 functional minimized by the nonlinear optimizer is of the form
aim to reproduce. The grid generation procedure is designed to avoid such points, and so few if any grid points actually lie in this region. The fitting is performed in eight consecutive stages: 1. The repulsive Aab 12 parameters are set. These are intended primarily to alleviate the problem of “holes” in the repulsive wall (see Section VII), rather than to increase accuracy of the PES in attractive regions. Therefore, they are not fit directly to ab initio data, but the set of ab initio interaction energies is used ab to identify the location of the repulsive wall. The A12 parameters are given by ab A12 = Ewall[ρwall (racov + rbcov)]12
(21)
where rcov i is the covalent radius of atom i, Ewall is an applicationdependent parameter which is set to Ewall = 15 kcal/mol for the test cases in the present work, but can be set to a higher value for applications probing high-energy regions, and ρwall is a system-dependent dimensionless parameter which represents cov the expected location of the repulsive wall relative to rcov a + rb . ab 12 Equation 21 shows that the contribution A12/rab at rab = cov ρwall(rcov a + rb ) is equal to Ewall, the expected value of the PES for such configuration. This term increases quickly for smaller rab, counteracting the possible divergent behavior of negative 1/ rnab terms, whereas for larger rab, it quickly goes to zero. The parameter ρwall is set to the maximum value satisfying ∑i Θ(ρwall − ρi )Θ(Ei − Ewall) ∑i Θ(ρwall − ρi )
> 0.95 (22)
where the summations are over all ab initio grid points i, ρi is the distance between closest-contact atoms of grid point i divided by the sum of covalent radii of those atoms, and Θ is the Heaviside step function. The denominator in eq 22 gives the number of grid points with the normalized closest-contact distance ρi < ρwall. The numerator is the number of such points with interaction energies larger than Ewall. For large ρwall, the fraction of eq 22 is small and as ρwall decreases, the fraction becomes equal to 1 for some value of ρwall. Choosing the fraction slightly smaller than 1 makes the value of ρwall more stable with respect to the set of grid points. Typical values of ρwall are in the range of 1.1 to 1.4. The numerical constants in eqs 21 and 22 were optimized by trial and error to give the largest possible values of the Aab 12 parameters which do not cause a decrease in accuracy in the energetically important regions of the PES. The issue of holes is discussed further in Section VII. ab ab 2. The nonlinear αab, βab, δab 1 , δn , and δp parameters are constrained to be the same for all ab, and all the linear aab i parameters are set to zero. This set of five parameters is then least-squares optimized to minimize the weighted RMSE on the grid of SAPT(DFT) total interaction energies. This optimization is not prone to local minima in the χ2 functional and provides reasonable starting values for steps 3−6. Recall that all the asymptotic parameters remain frozen. 3. The electrostatic damping parameters δab 1 are nonlinearly optimized to minimize the weighted RMSE of the electrostatic term of eq 9 against the electrostatic component of the SAPT(DFT) interaction energies (if electrostatic damping is used) on the short-range grid. The effect of electrostatic damping is shown in the case of the water dimer potential in Figure 4. As discussed earlier, because of the cancellation between positive and negative terms in the electrostatic expansion, damping can actually increase the magnitude of the electrostatic energy at some configurations, as it is in the 5907
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation χ2 =
∑ wi[Ei − Vi ({α ab , β ab , a1ab , ..., amab})]2 i
parameters αab and βab which are uniformly distributed in the ab ab ab ranges (αab 0 −0.5,α0 +0.5) and (0.9 β0 ,1.10 β0 ), respectively. As before, each set of parameters is constructed by random selections from the consecutive ranges. The number of 128 sets is arbitrary and was chosen by trial and error. Clin was kept fixed at 10−5. After each set of starting parameters is optimized for 30 Powell iterations, the optimization with the smallest χ2 is selected as the final fit. The introduction of randomized starting values was found to reduce the value of χ2 by about 10% for the tested systems.
(23)
where the sum is over all Ngrid grid points i, Ei are ab initio interaction energies, Vi denotes the value of the fit function evaluated at grid point i, and wi is defined by eqs 19 and 20. Combination rules are always used for the αab and βab parameters during this stage of optimization. If CCSD(T) calculations are used in the short-range region, then the damping parameters are optimized to the total interaction energy together with the αab and βab parameters in this and the following stages. 7. The αab and βab parameters are further optimized starting from the values from stage 6, with the aab i parameters optimized linearly at each evaluation of the χ2 functional during the nonlinear least-squares optimization. At this stage, combination rules are no longer used for parameters αab and βab if this option is requested by the user. For each set of assumed values {αab,βab}, the linear parameters are obtained by solving the set of linear equations of the least-squares method. The resulting value of χ2 is used by the Powell nonlinear optimization procedure to search for the minimum of χ2 in the space of the parameters {αab,βab}. The linear fitting minimizes the functional χ2lin, given by 2 χlin = χ 2 + C lin
σ Nlin
VII. HOLE FIXING Due to the limited number of ab initio data points used in the fit, a PES optimized using the procedure described in Section VI may not feature an appropriate short-range repulsive wall at all orientations. In particular, regions of too low energies, often negative with a large magnitude, called holes, may be present. In fact, without the 1/r12 ab terms, one has to expect that for small enough R the negative inverse power terms which go to minus infinity as R decreases will be larger in magnitude than the positive exponential terms which go to a constant value. Even with 1/r12 ab terms which prevent this behavior at very small R, the unphysical holes may appear at small R. However, holes are harmless if they are behind sufficiently large barriers. Thus, a procedure is required to search the repulsive wall of the potential to ensure that it does not contain any regions where the interaction energy fails to reach a sufficiently high value, denoted by Ewall. For most applications, a value of Ewall of 15 to 20 kcal/mol is sufficient to ensure that MD simulations or CSP optimizations do not fail as a result of such holes. The holefixing procedure is iterative and consists of repeatedly scanning the PES for holes, calculating a set of additional grid points as necessary, and refitting the data, i.e., repeating the procedure of Section VI, until no further holes are found. The number of grid points added in any hole-fixing iteration is at most 10% of the number of points used in the initial iteration (if more holes are detected, the new grid points are added only for the ones with lowest energies, see below). To perform the search, the space of relative angular orientations of the monomers is divided evenly into ten increments in each of the five Euler angles for a total of 105 angular orientations Ω, and a radial scan of the PES is performed along each direction. Each radial scan consists of 200 HS points with R spaced evenly in the range (RHS a (Ω),Rb (Ω)), HS where Rb (Ω) is set as the smallest distance such that the closest-contact atoms between the monomers are separated by 7 Å. The value of RHS a (Ω) is set to R(Ω,0.8ρwall) [the latter quantity was defined in Section II.A], where ρwall is a systemdependent value defined in Section VI which represents the expected location of the repulsive wall. The value of ρwall is established using eq 22, but with a minimum ratio of 0.85 rather than 0.95, making ρwall a larger value. The value of ρwall adjusts automatically as more hole-fixing iterations are performed. For example, if in the first iteration the value of ρwall is too large and so the hole-fixing points are placed at too large separations, then in the next iteration less than 85% of grid points i with ρi ≥ ρwall (with ρwall from the first iteration) will have interaction energy Ei > Ewall, and so ρwall will decrease in the second iteration. The fit energies at these 200 points of each radial scan are then checked for any of several criteria which are considered to constitute a hole in the PES. Starting from the farthest-distance point and moving inward, the first point found which satisfies
m
∑ ∑ ∑ (aiab)2 a∈A b∈B i=1
(24)
where the second term is a penalty function used to restrict the magnitudes of the linear parameter values, Nlin is the total number of linear parameters, and the scaling factor σ is given by Ngrid
σ=
2 ∑ wE i i i=1
(25)
Note that while the functional χ2lin is used to obtain the linear parameters, only the value of χ2 computed with these parameters is returned to the Powell subroutine. The optimization of the αab and βab parameters is highly prone to local minima in χ2. To alleviate this problem, the linear parameter penalty function scaling parameter Clin is initially set to a very high value. This prevents nonlinear parameters from optimizing to values resulting in significant cancellations of terms with large values of aab i ’s of opposite signs. The value of Clin is lowered gradually before each consecutive Powell iteration. As Clin is reduced, and so the space of allowed parameter values grows, the χ2 functional becomes more complex and contains more local minima. By gradually reducing the value of Clin as the nonlinear parameters are optimized, the final optimized state is much more likely to be near the global minimum of χ2. Clin starts at 100 and is lowered by a uniform ratio over 30 Powell iterations to the final value of 10−5. A small value of Clin is retained because it was found to substantially reduce the magnitudes of the linear parameters even at this stage, without significantly affecting the RMSE of the final fit. We found that the use of the penalty function in χ2lin results in better fits for the tested systems than the functional of eq 23 alone. The αab and βab parameters obtained after this ab stage are denoted αab 0 and β0 . 8. Although one may think that the procedure of step 7 gives the global minimum of χ2 in the space of parameters, we have found that the value of χ2 can be further reduced by the following procedure. It consists in performing 128 optimizations of the same parameters as in step 7, each with starting 5908
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
If the number of unique holes is larger than 10% of the total size of the grid-point set, Ngrid, only 0.1 Ngrid SAPT(DFT) calculations are performed, for the holes with the lowest energies according to the fit (if R of a hole point was increased as described above, the hole energy is still that at the original position). The search for holes is performed after a given standard (i.e., extending the set of grid points) iteration is finished independently whether the convergence was reached or not. After interaction energies at the hole locations are computed, the whole current set of grid points, including the new points at the holes, is randomly divided into training and test sets, and the fit is performed again. Then this fit is again tested for holes, and, if any are found, additional ab initio calculations at hole locations are performed; and the procedure described above is repeated. This process is iterated until no holes are found. In practice, the first iteration usually removes all holes. After no holes are found and the ratio of RMSEs on the fit and test sets does not satisfy the convergence criteria, the consecutive standard iteration is performed. If the criterion is satisfied, one more fit is performed on the union of the fit and test sets. This step creates the final version of the fit. Note that our RMSE criterion for the convergence of fits is consistent with all the constraints used in choosing grid points. If test points were chosen without such constraints, such criterion would be less relevant from the point of view of anticipated applications of the PESs.
any of the criteria is taken as the location of the hole. The criteria for the existence of a hole are 1. The low end of the interval RHS a (Ω) is reached, and the interaction energy is less than Ewall. This criterion ensures that the repulsive wall is not located at an unphysically short distance (or there is no repulsive wall). 2. There exists a point where the evaluated interaction energy on the PES is less than 1.2 Emin, where Emin is the lowest ab initio interaction energy of any of the grid points. This ensures that the PES does not have regions with excessively negative values. 3. The evaluated interaction energy of a point is less than 0.9 Vm and Vm > 5 kcal/mol, where Vm is the largest evaluated interaction energy of any of the points already checked in the current radial scan. This criterion ensures that the repulsive wall has the proper monotonically increasing value as R decreases. The fairly large value of Vm prevents detecting false “holes” at large separation where interaction energies may have maxima. 4. If no hole is identified by the criteria described above, a second stage of hole searching is begun in which the dimer is no longer constrained to a radial scan along a fixed orientation. This stage starts at the smallest-R point along the radial scan for each angular configuration Ω such that the PES value V is less than or equal to 0.2 Ewall. A nonlinear minimization using Powell’s method is then performed in the six coordinates defined in eq 1 of the function VHS = max(V , 0.2Ewall) + σHSPHS
(26)
VIII. RESULTS We have selected eight systems of varying sizes for testing the performance of autoPES. Geometries of monomers were either taken from literature specified in the SI or optimized by us at the theory level specified in the SI. The water dimer and the RDX dimers were selected since SAPT(DFT) potentials have been developed for these systems in the past in refs 86 and 117, respectively. These fits were developed “manually” but using otherwise similar methodology (small differences will be discussed later on). These two cases allow for comparison between potentials generated by autoPES and those developed using traditional methods. The RDX dimer is a fairly large system, with 42 atoms total, and it also demonstrates the ability of autoPES to deal with such large systems. The CHFOHNH2 dimer, where the monomer is the methane molecule with three hydrogens substituted by F, OH, and NH2, was selected as a relatively small molecule with no elements of symmetry in order to test the asymptotic calculations for such molecules. The water complex with the chloride ion was chosen as a simple system with a charged monomer. The complex of helium atom with methyl formate was chosen since a potential for this system was developed12 earlier using a combination of the CCSD(T) method and SAPT(DFT) asymptotics. Nmethylacetamide is the simplest molecule containing peptide bond, so it is of biochemical interest. Similarly, the formic acid is the simplest carboxylic acid, and formamide is the simplest carboxylic acid amide. In addition to the SAPT(DFT) water dimer potential which we compare to ref 86, and the potential with the polarization model discussed earlier, we have also developed a second water dimer potential using the supermolecular CCSD(T) method. All the systems but the RDX dimer are small enough to make it possible to develop the potential in about 1 day provided that one has a few hundred compute cores at the disposal. The specification and the quality measures for all the PESs for all the systems are presented in Tables I and II.
where PHS =
⎛
∑ ∑ ln⎜0.85 + a∈A b∈B
⎝
racov
⎞ rab cov ⎟ + rb ⎠
(27)
and σHS is a constant. The scaling factor σHS is set to 0.01 Ewall/ P0HS where P0HS is the value of PHS at the starting configuration defined above. The maximum in eq 26 assures that the minimization does not enter regions where V < 0.2 Ewall which would result in localizing physical minima of the PES. The term σHS PHS is much smaller in magnitude than 0.2 Ewall and increases with intermonomer separation, which prevents the minimization from moving significantly away from the wall. The effect of minimizing VHS is therefore that the dimer configuration moves along the base of the wall near V = 0.2 Ewall, and therefore it can enter “hidden valleys” which cannot be detected in radial scans. Up to 20 Powell iterations are performed, and, if the dimer configuration Ri after any iteration satisfies the conditions V(Ri) < Ewall and ρi < 0.8 ρwall, then Ri is considered to be a hole. This procedure was found to identify holes in cases where a scan of R in any fixed orientation was insufficient. For the points which are found to constitute holes (or for a subset of such points), additional SAPT(DFT) computations are performed at the locations of the holes. However, if at this configuration the value of ρ is less than ρwall, R is increased to make ρ = ρwall. This minimum closest-contact distance is enforced so that no grid points are placed at very short distances where they would not be useful for fitting, or where the ab initio calculations may not converge. Any hole Rm found for which ϵmn < 0.15 Å, where Rn is any previously found hole in the current hole-fixing iteration, is discarded. Typically, the total number of grid points added by the hole-fixing procedure in all iterations varies from 0 to 30% of the total number of grid points. 5909
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation Table I. Comparison of Functional Forms Used in Test Potentialsa system name
Nsite
NFP
A12
Opoly
combin
water dimer water dimer [CCSD(T)] water−chloride formic acid dimer formamide dimer methyl formate−helium N-methylacetamide dimer RDX dimer CHFOHNH2 dimer
16 16 9 16 12 13 24 42 16
50 50 30 88 75 55 88 208 88
F F F T F F F F F
3 3 4 2 3 3 2 2 2
F F
ind+disp terms 6, 6, 4, 6, 6, 6, 6, 6 6,
T T T T T
8, 10 8, 10 6, 8 8 8 8 8 8
ind+disp damping
elst damping
T T T T T T T T T
T T T F F F F F
Nsite is the number of fitting sites in the dimer, including off-atomic sites. NFP is the number of free parameters in the final stage of fitting. A12 indicates whether the repulsive 1/r12 ab term of eq 9 is included, Opoly is the order of the linear polynomial, the column ‘combin’ indicates the use of combination rules for exponential parameters in the short-range part of the fit (T is the default value), and ‘ind+disp terms’ indicates the powers of 1/rab in the distributed induction plus dispersion expansion. Combination rules are used for damping parameters and Cab n parameters and not used for linear parameters in all cases. The columns ‘ind+disp damping’ and “elst damping” indicate whether the corresponding terms of eq 9 include damping factors. a
Table II. Comparison of Generated Potentialsa system name
Natom
Emin
Ngrid
Niter
fit RMSE
test RMSE
water dimer water dimer [CCSD(T)] water−chloride ion formic acid dimer formamide dimer methyl formate−helium N-methylacetamide dimer RDX dimer CHFOHNH2 dimer average
6 6 4 10 12 9 24 42 16
−4.98 −4.56 −13.7 −12.7 −9.77 −0.19 −7.14 −12.9 −5.85
test RMSE |Emin|
539 511 286 634 637 344 947 1918 821
4 4 3 2 3 6 5 4 6
0.076 0.066 0.050 0.178 0.197 0.010 0.178 0.298 0.190 0.14
0.078 0.072 0.057 0.194 0.234 0.012 0.204 0.347 0.189 0.16
1.5% 1.6% 0.4% 1.5% 2.4% 6.3% 2.9% 2.7% 3.2% 2.5%
a RMSEs are given in kcal/mol for points with negative interaction energies (the RMSEs for all points can be found in the SI). Emin is the global minimum energy of the PES, given in kcal/mol. Natom is the number of atoms in the dimer. Ngrid is the number of fitting grid points used in the final iteration (including “hole fixing” points). The value Ngrid includes points with interaction energy above 100 kcal/mol, even though these points are rejected. Niter is the total number of grid generation iterations performed (including “hole fixing” iterations). Test data sets are formed by randomly selecting 15% of the Ngrid data points and are not used in fitting.
computational cost of evaluating our methodology rather than to generate potentials of sufficient accuracy for physical applications. For several of the smaller systems, namely the water dimer, water−chloride ion dimer, formic acid dimer, and methyl formate−helium dimer, using only the atoms as fitting sites did not achieve the desired accuracy. Therefore, off-atomic sites were used to increase the adjustability of these potentials. In the case of formic acid and methyl formate−helium dimer, the offatomic sites were simply placed at midpoints between certain atoms (see the SI for coordinates). For methyl formate, offatomic sites were placed midway between the in-plane methyl hydrogen and bridge carbon, the two carbons, the two oxygens, and the bridge carbon and hydrogen attached to the carbonyl group. For formic acid, the off-atomic sites are placed between the hydrogen in the hydroxy group and the carbon and between the hydrogen bonded to the carbon and each of the oxygens. While optimized site positions would result in higher accuracy, this would spoil the automatic nature of the PES generation. In the case of water, we used the optimized offatomic site positions from ref 86 in order to directly compare the performance of the new fitting procedure to that of the manual fit. The number of free parameters per atom varies from 3.7 to 8.8 and is obviously the highest for dimers with offatomic sites. Other factors affecting this ratio is the use of
Calculations for the dimers in our test set were used to tune various parameters in autoPES. Thus, these calculations were performed with some parameters different from the final default values. For the smallest dimers in our set, we have repeated the calculations using the final defaults, but for larger systems we used the original calculations. The essential deviations are listed in Tables I and II. All details are given in the SI. None of the potentials included in Tables I and II use the polarization model. The polarizable water dimer potential discussed earlier was obtained by refitting the data from the SAPT potential presented in the tables. It would have been straightforward to use the polarization model for all systems, but it would not have improved the accuracy. The PESs for the RDX dimer, CHFOHNH2 dimer, formamide dimer, and the CCSD(T) version for the water dimer were computed using the aug-ccpVDZ basis, while all other PESs use the aug-cc-pVTZ basis.91 The SAPT(DFT) version of the water dimer, water-chloride ion dimer, formic acid dimer, and N-methylacetamide dimer potentials include the δEHF int,resp term, while all others do not. If our current automatically checked criterion for including this term were used, it would have been included for all systems except CHFOHNH2; however, for some systems we disabled it in inputs. The RDX and water dimer potentials use the PBE0 functional, while all others use PBE. Note that in some cases the size of the basis set was chosen to minimize the 5910
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
Figure 7. Scatter plots of ab initio SAPT(DFT) interaction energies versus fit energies for two of the test dimers.
the vdW minimum. This increased accuracy is mainly due to the use of off-atomic sites and high-degree polynomials. For methyl formate−helium the error is only 0.01 kcal/mol, but since the interaction energy at the vdW minimum is also very small, only −0.2 kcal/mol, the ratio of the error to the magnitude of this energy is as large as 6%. This example shows that for very weakly bound dimers one has to retune the parameters in autoPES. Finally, for the RDX dimer, the RMSE is 0.35 kcal/mol, larger than for the other systems. This is partly due to the fact that also the interaction energy at the vdW minimum amounting to −12.4 kcal/mol is largest in magnitude of all uncharged systems. As a consequence, the ratio of the error to the magnitude of this energy is 2.7%, within the desired range. This error could be reduced by using a more adjustable functional form of the PES (even without increasing the number of grid points since the ratio of this number to the number of free parameters is 9, i.e., relatively large), but the form was chosen for comparisons with a literature potential. The RMSEs on the test set are in all cases very close to those on the training sets, which is imposed by our iterative procedure which continues if the two errors are not close to each other. The number of grid points per free parameter varies between 6 and 11, with the majority of systems in the 9−11 range. The comparisons of the two fits for the water dimer presented in Table II show that the replacement of SAPT(DFT) interaction energies at short-range by the CCSD(T) ones has not resulted in any deterioration of the fit as the RMSEs are essentially the same. This demonstrates that our use of SAPT(DFT)-level asymptotics for fitting CCSD(T) calculations works very well. The smaller magnitude of the global minimum in the case of the CCSD(T) PES is due to the use of the aug-cc-pVDZ plus midbond basis set, whereas augcc-pVTZ plus midbond was used in SAPT(DFT) calculations. Figure 7 shows how the errors of fit depended in the value of the interaction energy. Due to our weighting of lowest energy regions, the fits are very accurate for energies at the global minimum and several kcal/mol above it. The accuracy steadily decreases with interaction energy, and near 10 kcal/mol the errors are quite large, sometimes a few dozen percent. There are also several outliers near E = 0. These points, however, are all located on the repulsive wall and have only a minor effect on the position of the wall. Thus, these errors should be inconsequential for the intended applications of our potentials. If the repulsive wall is of importance for some applications, the default weights can be changed to increase the weights of points located in this region.
combination rules, the power of the polynomial, and the degree of monomer symmetry (the number of symmetry nonequivalent atoms-sites). Table I shows that we have used the 1/r12 ab terms only for the formic acid dimer. The reason is that this term was added in a late stage of the potential development. Its use is not particularly important for smaller systems since it is relatively easy to remove holes for such systems without this term. For larger systems, we did not want to repeat the calculations since the potentials without this term were accurate enough. The largest power of the polynomial in the exponential part was the default value of 2 in most cases. For the water dimer, it was set to 3 for consistency with literature choices. For the formamide dimer, methyl formate− helium dimer, and water−chloride ion we increased this degree to 3, 3, and 4, respectively, to reduce RMSE. For most systems, the terms in the asymptotic expansion of the induction plus dispersion energies were those with the inverse powers of 6 and 8, the default. For compatibility with literature potentials, only n = 6 terms were used for the RDX dimer and n = 6, 8, 10 terms for the water dimer. For the water−chloride ion the summation includes also the term n = 4, the leading power in the decay of induction energy. The induction plus dispersion expansion was by default always damped, whereas the electrostatic expansion was not damped for most systems, also by default. We used damping for the water dimer for consistency with literature and for the water−chloride ion since the latter system is obviously sensitive to the electrostatic component. For methyl formate−helium there is no electrostatic asymptotic expansion. For dimers containing atoms or atomic ions, combination rules do not lead to any reduction in the number of free parameters. In the case of the RDX dimer, the grid generation used varies significantly from that described in Section II because this PES was obtained early in the development of our methodology. We did not regenerate the PES due to the large computational cost of doing so; however, it is likely that somewhat fewer grid points would be required using the latest method. VIII.A. Accuracy of PESs. The main criterion for evaluating the quality of PESs are their errors on sets of points not used in the fitting procedure. Such RMSEs are shown in Table II. For about half of the systems, the test RMSEs for negative interaction energies are about 0.2 kcal/mol, and these errors constitute about 2−3% of the interaction energy at the vdW minimum. Such values are reasonable taking into account that the accuracy of the ab initio interaction energies is of the same order of magnitude. For the two smallest dimers, the RMSEs are about 0.1 kcal/mol and 1−2% of the interaction energy at 5911
DOI: 10.1021/acs.jctc.6b00913 J. Chem. Theory Comput. 2016, 12, 5895−5919
Article
Journal of Chemical Theory and Computation
multiple times to account for the randomized aspect of the fitting procedure, and in each case there were no holes when the Aab 12 term was used, and the number of holes varied only slightly when it was not. This could be taken to indicate that the hole-fixing procedure may be unnecessary when the Aab 12 term is included; however, in earlier tests on the Nmethylacetamide dimer, the inclusion of this term was found to not remove all holes. Therefore, while the Aab 12 term is useful, it cannot replace the hole-fixing procedure. VIII.B. Comparison to Manually Generated Potentials. In the case of the water dimer, we compare to the 2006 fit published in ref 86, referred to as SDFT-5s, which was obtained by fitting to a set of grid 2510 points chosen by considering several properties of water. The interaction energies for our completely independent grid points were calculated as described in Section III.B, applying the level of an initio theory as close to that used in ref 86 as possible. The exact match could not be achieved since ref 86 used the Fermi-AmaldiTozer-Handy (FATH) asymptotic correction118 rather than GRAC, which was used here. We could not use FATH since it is not available in ORCA; the only front-end program interfaced to autoPES. This difference resulted in small differences in ab initio computed interaction energies−in particular, the value at the minimum configuration is −4.98 kcal/mol in the present work, as compared to −5.14 kcal/mol in ref 86. This small difference is irrelevant for comparing the performance of the two fitting procedures except when we evaluate one of the fits on the set of interaction energies used to fit the other PES. The maximum of the additional uncertainty resulting from the different asymptotic corrections is 0.16 kcal/ mol, but for all negative interaction energies the average uncertainty is smaller; we will assume it to be 0.05 kcal/mol. The number of grid points generated by autoPES is 539, which is far fewer than the 2510 points used in SDFT-5s. Neither surface includes a polarization model. The functional form and monomer geometry, including positions of off-atomic sites, are identical to those used in SDFT-5s. Table IV
A typical iterative process is shown in Table III for the case of the CHFOHNH2 dimer. Four standard grid generation Table III. Summary of Iterations for PES Generation for the CHFOHNH2 Dimera iteration
type
Ngrid add
Ngrid fit
Ngrid test
fit RMSE
test RMSE
Nhole
1 2 3 4 5 6
standard hole standard standard standard hole final
528 20 131 131 131 20 0
443 487 604 716 809 821 961
85 61 75 94 132 140 0
0.16 0.18 0.19 0.19 0.18 0.19 0.19
0.35 0.26 0.26 0.24 0.24 0.19 N/A
177 0 0 0 98 0 0
a
The grid generation procedure of Section II is used, except when holes are detected, in which case new grid points are added as described in Section VII. RMSEs for negative interaction energies are given in kcal/mol. The number of holes detected after performing the given fit is denoted Nhole. The number of grid points added in each iteration and the number of grid points used in the fit and test sets are denoted Ngrid add, Ngrid fit, and Ngrid test, respectively. The final fit is computed on the union of the fit and test sets (therefore Ngrid is 961 = 821 + 140) after the convergence criterion has been met. The number of test points is only approximately equal to 15% of the total number since test points are selected with 15% probability going over the whole set. The calculations presented here were performed when the default for the maximum number of hole grid points added was 20 rather than the present 0.1 Ntot grid. Also the default for the number of grid points added in a standard iteration was 25% rather than the current 20% of the number of points in the initial iteration.
iterations and two hole-fixing iterations were required in this case. The number of holes given is the number obtained after the redundant holes are removed. When a standard iteration follows another standard iteration, this means that no holes were found in the former iteration. Note that the hole-fixing iterations can cause a significant improvement in the quality even for points with negative interaction energy; this behavior is not unusual. One can also see that there is no need to add grid points at the location of all unique holes: the addition of only 20 such points was sufficient to remove 177 (98) holes located in the first (fifth) iteration. Note that the exact number of “unique” holes identified in a given iteration is somewhat arbitrary, as the criteria for distinguishing unique holes are governed by a threshold value of the ϵmn metric, as described in Section VII. The fact that the RMSEs change very little in the iterations 2−4 is due to a relatively small number of points added in each iteration (which is intentional since calculations of ab initio energies at grid points are much more timeconsuming than performing the fits). Since the fit changes very little, no new holes are created. Apparently, in iteration 5 a new local minimum was found in the parameter space which led to a set of new holes. These holes were probably the reason for the difference between fit and test RMSEs in the fifth iteration since after the holes were fixed, the fit converged. 12 To test the effectiveness of the Aab term of eq 8 at 12/(rab) removing holes, we repeated the PES fitting stage of iteration 12 number 5 in Table III, but this time including the Aab 12/(rab) ab term, with the A12 parameters set as described in Section VI. The resulting fit (included in the SI) contains zero holes. The RMSEs for this fit were 0.18 and 0.25 on the fitting and testing sets, respectively, essentially the same as for the fit without the ab Aab 12 term. The fits with and without the A12 term were repeated
Table IV. RMS Errors (in kcal/mol) for (H2O)2 Calculated on 2006 and 2016 Data Sets Using Two Different PESs, Present and from Ref 86a data set fit autoPES fit SDFT-5s a
energy range (kcal/mol) Eint Eint Eint Eint
< < <