Article pubs.acs.org/JCTC
Improved Minimum Mode Following Method for Finding First Order Saddle Points Manuel Plasencia Gutiérrez,† Carlos Argáez,† and Hannes Jónsson*,‡,¶ †
Science Institute of the University of Iceland, 107 Reykjavı ́k, Iceland Faculty of Physical Sciences, University of Iceland, 107 Reykjavı ́k, Iceland ¶ Applied Physics Department, Aalto University, FIN-00076 Espoo, Finland ‡
ABSTRACT: The minimum mode following method for finding first order saddle points on an energy surface is used, for example, in simulations of long time scale evolution of materials and surfaces of solids. Such simulations are increasingly being carried out in combination with computationally demanding electronic structure calculations of atomic interactions, so it is essential to reduce as much as possible the number of function evaluations needed to find the relevant saddle points. Several improvements to the method are presented here and tested on a benchmark system involving rearrangements of a heptamer island on a close packed crystal surface. Instead of using a uniform or Gaussian random initial displacement of the atoms, as has typically been done previously, the starting points are arranged evenly on the surface of a hypersphere and its radius is adjusted during the sampling of the saddle points. This increases the diversity of saddle points found and reduces the chances of reconverging on previously located saddle points. The minimum mode is estimated using the Davidson method, and it is shown that significant savings in the number of function evaluations can be obtained by assuming the minimum mode is unchanged until the atomic displacement exceeds a threshold value. The number of function evaluations needed for a recently published benchmark (S. T. Chill et al. J. Chem. Theory Comput. 2014, 10, 5476) is reduced to less than a third with the improved method as compared with the best previously reported results.
1. INTRODUCTION The problem of finding a first-order saddle point on a high dimensional surface, that is, a stationary point where the matrix of second derivatives, the Hessian, has one and only one negative eigenvalue, appears in many contexts.1 One example is the identification of the mechanism and estimation of the rate of atomic rearrangements in and on the surfaces of solids within harmonic transition state theory2,3 or Kramers/Langer theory.4,5 There, the task is to find saddle points on the surface describing the variation of the energy of the system as a function of atomic coordinates. The atomic rearrangements can, for example, represent diffusion events, chemical reactions or configurational changes of molecules occurring as a result of thermal activation. To simulate the time evolution of an atomic system over a long time scale, the mechanism of such transitions needs to be identified and the rate needs to be estimated. In the adaptive kinetic Monte Carlo (AKMC) approach,6,7 (sometimes referred to as “on-the-fly KMC”), the most important and computationally intensive part of the simulation is the repeated saddle point searches started near a given initial state energy minimum until adequate sampling of the saddle points on the energy rim surrounding the minimum has been obtained.6,8 The algorithm is self-learning in the sense that the table of transitions and rates is constructed for each local minimum visited by identifying the saddle points from unbiased searches. © XXXX American Chemical Society
This is unlike traditional KMC simulations for which a complete table of possible transitions needs to be constructed before the simulation is started. An essential feature of the AKMC method is the fact that a preconceived mechanism or reaction coordinate for the transitions is not assumed. Instead, the mechanism is deduced from the saddle points identified in unbiased searches. This is a crucial aspect of the method because multiple examples have been reported in which the mechanism of a thermally activated transition in an atomic system has turned out to be counterintuitive (see, for example, a review in reference7). The AKMC method has, in particular, been applied to simulations of diffusion in materials with a high degree of disorder such as hydrogen diffusion in metal grain boundaries9 and diffusion of CO10 and H2O11 molecules on ice Ih surfaces where, due to proton disorder, a broad distribution in binding energy and activation energy is present.12 In a wider context, saddle point searches can be used in global optimization methods, that is, methods for finding the global minimum of a function of many variables where multiple local minima are present. The simulated annealing algorithm mimics gradual cooling of a system, and the global minimum is reached if the cooling can be carried out slowly enough, that is, over a large enough “time” scale.13 The use of unbiased saddle Received: December 23, 2015 Published: December 1, 2016 A
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation
minimum with energy below a certain threshold. The system consists of a heptamer island on a FCC(111) surface and the transitions involve rearrangements of the island; that is, 21 degrees of freedom are primarily involved in the transition but the system includes a total of 525 degrees of freedom. The task is to find all 27 saddle points with energy below 1.5 eV.45 We present here further improvements to the MMF method to reduce the number of function evaluations and increase the success rate of the saddle point searches. First of all, the initial point of each search path is chosen to lie on a hypersphere with a radius that increases gradually based on information gained from previous searches. Second, the number of times the minimum mode is evaluated is reduced by reusing a previous estimate until the system has been displaced by a sufficiently large distance. Third, the Davidson method is used based on a preconditioner obtained from L-BFGS updates of the Hessian evaluated at the initial state local minimum. The improved method is applied to the heptamer island benchmark problem,45 and the results show a reduction in the number of function evaluations needed to less than a third compared with the best previously reported performance. In section 2, a brief review the minimum mode following method as it has been implemented previously is given and then a description of the improvements. In section 3 the heptamer benchmark problem is described. Results are presented in section 4 and the article concludes with a summary and discussions in section 5.
point searches in the simulated annealing algorithm as well as systematic coarse graining of the basins associated with the minima can improve the efficiency of the algorithm and increase the probability of finding the global minimum.14 For functions of only a few variables where the number of local minima is not too large, a systematic mapping of all minima can be carried out by moving from one local minimum to another using saddle point searches.15 Since atomic scale models of condensed matter systems typically contain several hundred degrees of freedom or more, and analytical second derivatives of the energy are not available, the full evaluation of the Hessian matrix and evaluation of all the vibrational normal modes in searches of saddle points, as has been done in quantum chemistry calculations of molecules,16−22 require an excessive number of function evaluations, that is, evaluations of the energy of the system and forces acting on the atoms. Instead, it is possible to find saddle points using information about only the lowest eigenvector of the Hessian estimated from just a few function evaluations without even constructing the Hessian matrix.23−25 Several such methods have been developed in the context of molecular systems where the number of degrees of freedom is small. These include the Newton trajectory method that generates curves concentrated near stationary points on the energy surface,26−30 as well as the scaled hypersphere31−35 and fast-marching methods36−39 where a hypersphere which initially has a small radius is propagated outward uniformly (in scaled hypersphere) or nonuniformly (in fast-marching) in order to explore the energy surface. These are powerful methods that give good results, but the computational effort increases rapidly with the number of degrees of freedom, namely proportional to the exponential growth of the area of a hypersphere with the number of dimensions. Here, the focus is on condensed phase systems so the scaling with the number of degrees of freedom is of great importance. Also, since the number of function evaluations in each saddle point search is large, and many saddle point searches need to be carried out in order to sample the saddle points sufficiently, it is important to improve the efficiency of the saddle point search method as much as possible. This is particularly important when the energy and atomic forces are estimated using computationally demanding electronic structure calculations. We focus here on improvements to the minimum mode following (MMF) method (originally referred to as the “dimer method”).23 Several studies and improvements of the MMF method have been presented recently. Mathematical analyses and comparisons of various methods for estimating the minimum mode have been presented,40,41 a method for generating starting points from dynamics simulation proposed,42 and a method for escaping quickly from the region around the initial state minimum where all eigenvalues are positive.43 In AKMC simulations the goal is to find saddle points that are connected to the local minimum corresponding to the current state of the simulation, that is, saddle points which are connected by a steepest descent path to the local minimum. An additional constraint on the saddle point search paths has been proposed to reduce the probability of converging on saddle points that are not connected to the initial state minimum.44 To quantify the performance of methods for finding saddle points, a benchmark has been defined and a web page established to compare various variants of the MMF method.45 The goal is to find all saddle points connected to an initial state
2. METHODS 2.1. Minimum Mode Following. We first review briefly the MMF method as it has typically been applied in the past. Each time the energy is evaluated, the force on the atoms, that is, the negative gradient, is also evaluated. This will be referred to as a function evaluation. If the number of movable atoms in the system is N, the configuration of the atoms is described by a 3N-dimensional vector, R, and a function evaluation provides the energy, E(R), as well as the atomic force, F = −∇E(R). Second derivatives of the energy are not needed. The force vector at points near a first order saddle point where the lowest eigenvalue, λ, of the Hessian is negative, can be transformed to give a force vector, Ft, that corresponds to the neighborhood of a local minimum on the energy surface. The calculation of λ and the corresponding eigenvector, Pλ, the so-called minimum mode, is discussed below, in section 2.3. The transformation involves the normalized eigenvector, Pλ, socalled minimum mode, corresponding to the minimal eigenvalue Ft = F − 2(F ·Pλ)Pλ
if λ ≤ 0
(1)
Close to the initial state minimum, however, the lowest eigenvalue is positive, λ > 0, and an iterative method that displaces the system according Ft given by eqn. 1 can lead to paths that do not converge onto a first order saddle point. Instead, the displacements of the atoms can tend to involve long wavelength displacements where most if not all the atoms are displaced slightly. Such a path may never exit the region where λ > 0. It is, therefore, important to displace the system in some different way from the minimum close to and preferably into the region where λ < 0. Typically, this has been accomplished in MMF calculations by using a different definition of Ft in the λ > 0 region. To reduce the probability B
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation
from the minimum. Typically, some atom in the system is selected and all atoms within a spherical region around it are displaced slightly. In some cases uniform random numbers have been used to displace the atoms,23,46 but more recently Gaussian random numbers have been used.47 More recently, a classical trajectory has been used to generate starting points.42 The N movable atoms are then displaced guided by the transformed force, Ft, using some algorithm for zeroing the force. A conjugate gradient method has typically been used without the line search,23 but more recently the limitedmemory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method has been found to perform better.45 An upper bound is applied to the length of the displacement, Δxmax in order not to advance too quickly from the minimum.23,46 This parameter is adjusted for each system so as to reach a good balance between the number of distinct saddle points found and retaining a high probability of converging onto saddle points connected to the local minimum. The value Δxmax = 0.5 Å was chosen here, as suggested by previous tests.46 The force is then reevaluated at the new geometry, etc. By repeated iterations of this algorithm, a path in configuration space, {R1, R2, ..., Rs}, is generated starting near the local minimum, R0, on the energy surface and ending at a first order saddle point. Figure 2 shows the region where λ > 0
of sliding onto soft, long wavelength modes, the component of Ft perpendicular to the minimum mode is zeroed Ft = − (F ·Pλ)Pλ
if λ > 0
(2)
Figure 1 shows the force and the transformed force for a simple two-dimensional energy surface. While the force F tends to
Figure 2. A potential energy surface showing minima (blue), high energy regions (yellow/red), and a saddle point (large black point). Black curves show saddle point searches starting from four different initial points. The red curves show where the lowest eigenvalue is zero. In the regions within those curves, all eigenvalues are positive. By starting the saddle point searches outside the region marked by the red curves, the path becomes shorter and the initial point is within a basin of attraction of a saddle point.
Figure 1. A potential energy surface with two minima (blue), high energy regions (yellow/red), and a saddle point (black point). (Top) The force vectors, F = −g, are shown with black arrows. (Bottom) The transformed force vectors, Ft, where the component along the minimum mode has been reversed, are shown with red arrows. Displacements of the system in the direction of the transformed force vectors will converge onto the first order saddle point.
on a simple two-dimensional energy surface, as well as example MMF saddle point searches starting within as well as outside the λ > 0 region. The perpendicular component included in Ft by eq 1 is necessary when λ < 0 in order to converge on a first order saddle point. But, by eliminating the perpendicular component in the region where λ > 0, the path is made to advance quickly out of that region thus reducing the risk of involving long wavelength modes. In AKMC simulations, only saddle points located on the energy rim surrounding the local minimum are important. We, therefore, place special emphasis on such saddle points here. In the context of global optimization, other saddle points are also of interest.14,15
point toward minima, the transformed force Ft tends to point toward first order saddle points. By displacing the atoms guided by Ft using some algorithm that iteratively advances the system toward a point where Ft = 0, every point R where λ < 0 maps onto a particular first order saddle point. The region around a first order saddle point where λ < 0 and each point maps onto that saddle point is referred to as the basin of attraction of the saddle point. The force transformation in eq 2 is related to the Newton trajectory method.26−30 Given an initial state minimum on the energy surface, a group of M atoms is chosen in some way and displaced away C
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation Several such saddle points searches need to be carried out using starting points R1 that differ sufficiently to give paths that converge onto different saddle points. An important aspect of the method is, therefore, the choice of starting points, which however should not in any way bias the method to find preferably saddle points corresponding to one transition mechanism rather than another. The choice of starting points is an important factor affecting the sampling of saddle points by the MMF method. By generating enough different paths, the method can be used to sample and even generate a complete list of saddle points in the low energy range.45 The algorithm lends itself well to parallel or distributed computing since a number of saddle point searches can be carried out simultaneously.48,49 Typically, the searches are carried out in batches, each batch consisting of m saddle point searches. That is, m calculations are started from m different initial points. Then, a second batch of calculations is started, etc., until enough sampling of saddle points in the energy range of interest has been obtained. The computational effort is dominated by the evaluation of the energy and atomic forces at each geometry, especially in the estimation of the minimum mode. The performance of an implementation of the MMF is, therefore, naturally measured in terms of the number of times the energy and force need to be evaluated, that is, the number of function calls. The following subsections describe the modifications we have made to previous implementations of the MMF method. 2.2. Initial displacements. A new method is presented here for generating the initial points of the saddle point search paths. Instead of uniform random or Gaussian random displacements, an even distribution of points is generated on a 3M-dimensional hypersphere. The method is illustrated and compared with a Gaussian random distribution for a twodimensional energy landscape in Figure 3 and on a threedimensional sphere in Figure 4. A large number of points is generated on the hypersphere and an even distribution
Figure 4. Comparison of regular and random distribution of 120 starting points for saddle point searches. (Left) Regular distribution obtained by introducing a repulsive interaction between points confined to the surface of a sphere and minimizing the energy. The labels indicate the order in which the points are used, each new point being as far as possible from the previously selected points. (Right) A Gaussian random distribution in three dimensions, shifted to the surface of the sphere. The regular distribution leads to a greater diversity of saddle points found in the minimum mode following searches.
obtained by introducing a repulsive interaction between the points.50 After optimizing, the points are roughly equally spaced and are then ordered in such a way that each new point in the sequence is as far from the previous points as possible (see Figure 4). The saddle point searches are then started from these points until enough sampling has been obtained. Such a sequence of points on hyperspheres of various dimensions can be calculated beforehand and stored. This choice of initial displacements resembles the scaled hypersphere method.31−35 A small radius of the hypersphere tends to generate starting points for saddle point searches that converge on connected, low energy saddle points, while a large radius leads to a larger variety of saddle points, disconnected as well as connected. It is important to realize that the basin of attraction of a connected saddle point may not share a boundary with the region around the minimum where all eigenvalues are positive. Such saddle points cannot be found using a saddle point search method which is based on a gradual climb from the vicinity of the minimum to a basin of attraction of a saddle point. An example of such a saddle point on a 2-dimensional enegy surface is given in Figure 6 of ref 23. Starting points that are significantly removed from the local minimum can, therefore, be important for finding some of the saddle points, even low energy, connected saddle points. An optimal radius can be estimated from previous saddle point searches. For example, by monitoring the distance to the point where the minimum eigenvalue changes sign (the curve shown in Figure 2) and/or distance to saddle points found, an adaptive algorithm can be formulated so as to optimize the hypersphere radius for subsequent searches. This is illustrated for the two-dimensional energy landscape in Figure 3c. By starting with a small radius and then increasing it for subsequent searches, as indicated in this example, a larger variety of saddle points and reduced occurrences of reconvergence on previously located saddle points can be achieved as compared with the previously used uniform or Gaussian random distributions. 2.3. Estimation of the Minimum Mode. There are several methods for estimating the minimum mode without constructing the Hessian matrix and solving for all the eigenvalues. The dimer,23 Raileigh-Ritz,24 and Lanczos25,46
Figure 3. A potential energy surface illustrating three different approaches to the generation of starting points (red circles) for saddle point searches (black lines). (a) Gaussian random distribution of staring points. Convergence on the same saddle point is likely to occur. In this case only two saddle points are identified. (b) Regular distribution of starting points on a ring (hypersphere when more variables are involved) of fixed radius. A larger variety of saddle points is found than in panel a for the same number of searches. (c) Starting points placed on a ring (hypersphere) of increasing radius. Initially, a few (here two) searches are carried starting from a small radius. Then, using information from these searches, the radius is increased. This improves the diversity of saddle points found and reduces the length of the search paths, see text. D
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation
similar benchmarking.45,46,59 The transitions involve rearrangements of a seven-atom island on an FCC(111) surface, as illustrated in Figure 5. The island is placed on a six layer slab
methods are three different options that have been used for estimating the minimum mode using only first derivatives of the energy, that is, atomic forces. The Lanczos method is frequently used to evaluate vibrational modes of molecules and is particularly easy to implement. A benchmark comparison of the performance of the dimer and Lanczos methods has been made46 as well as a mathematical analysis.41 In electronic structure calculations of molecules, the Davidson method51,52 has been the method of choice for a long time. Initially developed for large configuration interaction calculations, it has now also become the most frequently used method in density functional calculations of condensed matter. The Davidson method has also been applied to calculations of vibrational modes of molecules.53 The method is iterative and is similar to the Lanczos method but involves the use of a preconditioner which can increase the rate of convergence to the lowest eigenvalue, i.e. fewer function evaluations are needed to obtain an estimate of the minimum mode. But, the choice of preconditioner is more difficult for calculations of vibrational modes than for electronic structure calculations where the Hessian is dominated by the diagonal elements. A successful application of the Davidson method for finding the lowest vibrational mode, therefore, relies on a good estimate of the Hessian and, thereby, a useful preconditioner. Semiempirical calculations have, for example, been used to generate a preconditioner for Davidson calculations for higher level, ab initio calculations.53 The large number of saddle point searches carried out in AKMC simulations for each local minimum visited on the energy surface means that a significant investment can be made to obtain an accurate estimate of the Hessian at the initial state minimum. If the number of saddle point searches is of the same order of magnitude as the number of movable atoms, then a complete evaluation of the Hessian matrix at the initial state minimum is an insignificant addition to the overall computational effort. The task then is to update this estimate of the Hessian, or rather the preconditioner as the system is displaced along the saddle point search path. At each point the energy and atomic forces can be used to update the estimate, as in the BFGS method. Here, an estimate of the minimum mode at each point could also be used to get an even better estimate of the change in the preconditioner from the previous point along the path. A more detailed description of the implementation of the Davidson method in MMF calculations will be given elsewhere.58 2.4. Skipping Updates of the Minimum Mode. To reduce the number of function evaluations needed in a saddle point search, updates of the minimum mode are skipped until the atoms have been displaced by a sufficiently large distance since the last evaluation of the mode. This minimum displacement is specified in terms of a fraction of the maximum allowed displacement, Δxmax. It turns out that this can significantly reduce the computational effort, as illustrated in the benchmark results below. Also, the number of iterations used in a calculation of the minimum mode at any point is limited to a maximum value, as has been done previously.23,46 It turns out that the minimum mode does not need to be determined to high accuracy at each configuration along the path.
Figure 5. Initial state and three of the transitions in the benchmark system. (a and b) initial state, on-top and side view; (c) saddle point with E = 0.986 eV and (d) corresponding final state; (e) saddle point with E = 1.197 eV and (f) corresponding final state. (g) saddle point with E = 1.494 eV and (h) corresponding final state.
where each layer contains 56 atoms. Periodic boundary conditions are applied in the plane of the slab. The seven atoms in the island and atoms in the underlying three layers of the slab are free to move but the bottom three layers are kept fixed. The total number of movable atoms is 175 so the system includes 525 degrees of freedom. To make the implementation of the benchmark problem simple, the interaction between atoms is taken to be a pairwise additive Morse potential V (r ) = De(e−2α(r − r 0) − 2e−α(r − r 0))
(3)
with parameter values chosen to mimic roughly platinum atoms, De = 0.7102 eV, α = 1.6047 Å−1, and r0 = 2.8970 Å. The potential is cut and shifted to zero at 9.5 Å. The saddle point calculations are started by displacing the heptamer atoms so the initial displacement involves 3M = 21 degrees of freedom. The calculations are carried out in batches of m saddle point searches at a time. In tests of the various modifications of the method, a total of 504 saddle point searches were typically carried out using batches with m = 8 searches, and the average number of function evaluations per saddle point search was calculated, as well as the number of
3. BENCHMARK CALCULATIONS To test the performance of the MMF method, we use a model system which has been documented and used previously in E
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation unique saddle points found. In calculations where all 27 saddle points with energy below 1.5 eV were found, batches of m = 25 were used and the number of function calls needed to complete the search was determined. The averages for 10 independent searches were calculated. In AKMC simulations, the task is to find saddle points on the energy rim surrounding the local minimum corresponding to the current state of the long time scale simulation, so only saddle points connected with the initial state minimum are of interest. Also, since the rate drops exponentially with the energy of the saddle points, the low energy saddle points are mainly of interest. Therefore, statistics were collected especially on the number of saddle points connected to the initial state minimum. Also, the number of saddle points below 4 eV and below 1.5 eV were counted separately. A saddle point search is considered to be converged when the magnitude of each component of the atomic force vector was reduced to below 0.001 eV/Å.
4. RESULTS We first describe results of calculations where several variants of the method were tested. The method deemed optimal from these tests was then applied to the task of finding all 27 saddle points with energy below 1.5 eV as described in the following section. 4.1. Tests of Various Features. The benchmark calculations were carried out by doing 504 saddle point searches in batches of 8 searches. The initial points were generated by placing 504 points on a 21 dimensional hypersphere and optimizing in such a way as to maximize the distance between the points. Figure 6 shows results obtained in which the hypersphere radius was fixed at some value within the interval from 0.1 to 4.0 Å. The larger the hypersphere radius is, the larger is the number of saddle points found. However, the number of saddle points connected to the initial state minimum is smaller when the radius is larger than 2.3 Å. The searches then tend to converge more often on saddle points for which a steepest descent minimization does not bring the system back to the initial state minimum. Those saddle points tend to be located farther away from the minimum. This is reflected in the average number of function evaluations needed per search which increases as the hypersphere radius is made larger. This trend is even evident for the saddle points that are connected to the minimum. The increased diversity of the saddle points found when the radius is increased more than makes up for the increase in the average number of function evaluations needed per search. The lower the energy of a saddle point is, the faster the corresponding transition is, and the greater its importance is in the AKMC simulation. Figure 6 shows the number of saddle points found below 4 eV and the number of saddle points found below 1.5 eV as a function of the hypersphere radius. Only a small fraction of the saddle points found at a hypersphere radius of 2.3 Å are over 4 eV, but a significant fraction is over 1.5 eV. In fact, the number of saddle points found below 1.5 eV is the largest for a radius of 2.1 Å. When a hypersphere radius of 2.3 Å is used, some of the low energy saddle points are not found even after 504 searches. Such a large radius seems to push the initial points too far from the minimum in order to find some of the low energy saddle points near the minimum. An adaptive algorithm where the radius of the hypersphere is initially small for the first few searches but then increases based
Figure 6. Efficiency of saddle point searches started from a regular arrangement of points on a hypersphere (see Figure 4, left) as a function of the hypersphere radius. (Top) The number of saddle points found increases as the radius is increased and the number of function evaluations needed in each saddle point search also increases. (Bottom) The number of saddle points found that are connected to the initial state minimum (black filled circles) has a maximum at around 2.3 Å. The number of connected saddle points with energy ≤4.0 eV (red filled circles) has a maximum for a similar radius, and the number of low energy saddle points with ≤1.5 eV (blue filled circles) drops when the radius is larger than 2.1 Å. Each data point represents the mean value obtained from 500 saddle point searches.
on the information gained is found to perform even better. From each saddle point search, the point at which the lowest eigenvalue of the Hessian changes from positive to negative is recorded, and its distance, d0, from the minimum is calculated. After testing a few possibilities,50 an algorithm was chosen in which the value of the hypersphere radius is reassigned and set F
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation equal to the average of d0 found in the previous searches if the success rate for finding a new connected saddle point in a batch drops below 0.2. The hypersphere radius then increases gradually to a value of around 2 Å after a few hundred searches. Figure 7 top panel compares results obtained with this adaptive algorithm for the hypersphere radius to results obtained with a fixed radius of 2 Å as well as results obtained with a Gaussian random distribution of initial points. While the regular distribution of initial points on the hypersphere of fixed radius gives significant improvement over the Gaussian random distribution, the adaptive algorithm gives even better results. Almost 400 unique saddle points are found on average in 504 saddle point searches with the adaptive algorithm. Most of the saddle points found in the calculation shown in the upper panel of Figure 7 are not connected to the initial state minimum. For the AKMC simulation, only connected saddle points are of interest. The lower panel of Figure 7 gives the number of connected saddle points found. The hypersphere method again gives more saddle points than the Gaussian random distribution but there is little difference in the number of saddle points obtained using the adaptive hypersphere radius and fixed radius of 2 Å. However, some very important low energy saddle points are missed in the fixed radius calculation, so the adaptive radius is still the preferred method. Since the adaptive hypersphere radius starts with a small value and then gradually increases, the sampling of saddle points is more complete than with a fixed hypersphere radius even though the total number of connected saddle points found is similar. The number of function evaluations needed per saddle point search can be reduced by limiting the number of iterations used in the estimation of the lowest eigenvalue of the Hessian and the corresponding eigenvector. The iterative estimation is taken to be converged when successive iterations give a difference in the eigenvalue of less than 0.001 eV/Å2. But, even when convergence has not been reached, the iterations are stopped after nmax iterations have been performed. Figure 8 shows results obtained from 504 searches as a function of the value of nmax. The number of function evaluations needed per saddle point search increases by 50% as nmax is increased from 4 to 24. Interestingly, the number of unique connected saddle points found tends to drop at the same time. This means that reduced accuracy in the estimation of the minimum mode helps increase the diversity of saddle points found. There is a larger incidence of reconvergence on already found saddle points if the minimum mode is estimated to high accuracy. The search paths have a natural tendency to converge on the lower lying saddle points close to the minimum. Less than complete convergence on the minimum mode introduces an element of randomness in the search which reduces the probability of reconvergence and is thus beneficial for the sampling of saddle points. A choice of nmax between 6 and 8 is taken to be optimal in that it gives the largest number of saddle points below 1.5 eV and the average number of function evaluations per saddle points is relatively low. Further reduction in the average number of function evaluations per saddle point search can be obtained by not updating the minimum mode at each point along the search path. As seen from the results on limiting the number of iterations, the estimate of the minimum mode does not have to be highly accurate. Since the direction of the minimum mode is changing only gradually in the vicinity of a saddle point (see Figure 1), the minimum mode obtained at a previous point on the search path can be reused unless the system has been
Figure 7. Number of saddle points found as a function of the number of searches carried out. The dense black dots show average values obtained from 10 independent calculations, each involving 504 saddle point searches. The width of the colored regions indicates the standard deviation. When a Gaussian random distribution (blue) is used to generate the initial points, R0, searches often end up converging on saddle points that have already been found. When a hypersphere of radius of 2 Å is used, such reconvergence on known saddle points occurs less frequently (red). An even lower incidence of reconvergence and a larger number of saddle points is found using the adaptive hypersphere radius, where a small radius is initially used but later increased to the average distance to points where the minimum eigenvalue of the Hessian is zero (green). (Top) All saddle points included. (Bottom) Only saddle points connected by a steepest descent path to the initial state minimum are included. In this case the red curve nearly overlaps the green curve and is not shown. However, some of the lowest energy saddle points are missed when a constant hypersphere of radius of 2 Å is used.
displaced by a significant amount. The effect of making this approximation was checked systematically, and the results are shown in Figure 9. While the number of saddle points found below 1.5 eV is not affected strongly by the choice of maximum G
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation
Figure 8. Effect of limiting the number of iterations in the calculations of the minimum mode. The iterative calculation of the minimum mode is stopped after nmax iterations even if the tolerance of 0.01 eV/ Å2 has not been reached. The data was obtained from 500 saddle point searches started from evenly arranged points on a hypersphere of fixed radius of 2 Å. (Top) The average number of function evaluations per saddle point search (red squares). (Bottom) Number of saddle points found that are connected to the initial state minimum (black points), with energy less than 4.0 eV (red points), and with energy less than 1.5 eV (blue points). In the optimal set of parameters, nmax is set to 8.
Figure 9. Effect of assuming the minimum mode is unchanged as long as the displacement of the system along the search path is less than a certain fraction of the maximum allowed displacement, Δxmax = 0.5 Å. The data was obtained from 500 saddle point searches started from evenly arranged points on a hypersphere of fixed radius of 2 Å. (Top) The average number of function evaluations per saddle point search. (Bottom) Number of saddle points found that are connected to the initial state minimum (black points), with energy less than 4.0 eV (red points), and with energy less than 1.5 eV (blue points). In the optimal set of parameters, this parameter is set to 0.5 Δxmax.
displacement between minimum mode updates, it drops slightly when the distance is increased beyond 0.5 Δxmax. The reduction in the average number of function evaluations per saddle point search is substantial, 25%, as compared with updates at each point along the path. 4.2. Finding All Saddle Points below 1.5 eV. The tests described above suggest the following optimal algorithm. The
initial points for the saddle point searches are generated as evenly spaced points on a hypersphere. For the first few searches the radius of the hypersphere is taken to be small, on the order of 0.1 Å. From each search, the distance d0 to the point where the minimum eigenvalue is zero is recorded. If the success rate in finding a new, connected saddle point drops below 0.2 in a batch of searches, the hypersphere radius is H
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation
Table 1. Performance of Various Implementations of the Minimum Mode Following Method in Finding All 27 Saddle Points below 1.5 eVa method
Nf/105
ρ
Ns/103
Nu
adaptive_HSphere-lbfgs-lanczos-sm0 adaptive_HSphere-lbfgs-lanczos-sm50 adaptive_HSphere-lbfgs-davidson-sm50 eon-dimer-lbfgs-5-rotations-20-bowl_breakout
9.7 ± 3.7 7.1 ± 2.9 3.6 ± 1.4 13.5 ± 4.1
0.82 ± 0.03 0.81 ± 0.04 0.82 ± 0.02 0.85
2.4 ± 0.96 2.5 ± 1.0 1.5 ± 0.56 6.6
145 ± 12 146 ± 18 112 ± 19 126
The first three rows show results using a hypersphere with variable radius for choosing initial points and lbfgs method for optimization. In the first line, the minimum mode is calculated for every configuration of the atoms in the search path. In the second and third row, the minimum mode is only calculated when the system has been moved at least 50% of the maximum allowed displacement in a single move, Δxmax (here, Δxmax = 0.5 Å) since the previous calculation of the minimum mode. The second line gives results using the Lanczos method for finding the minimum mode, and the third line gives results using the Davidson method, all else being the same. A saddle point search is considered to be converged when the norm of the force is less than 10−3 eV/Å. Nf is the total number of function evaluations needed to find all 27 saddle points, averaged over 10 simulations, ρ is the fraction of saddle points found that are connected to the initial minimum, Ns is the total number of saddle searches performed, and Nu is the number of unique connected saddle points found. The fourth line shows previously reported best performance43 and is taken from the benchmark web page http://optbench.org/saddle-search.html in June 2016. a
5. SUMMARY AND DISCUSSIONS The improvements to the minimum mode following method described here significantly reduce the number of function evaluations, that is, calculations of the energy and atomic forces, needed to sample first order saddle points on the energy ridge surrounding a given initial state energy minimum. For a well established benchmark involving rearrangements of atoms in a heptamer island on a surface, the reduction is to less than a third compared with the best previously reported benchmark results.45,60 This improvement is first of all due to a systematic arrangement of starting points for the search paths on a hypersphere and an adaptive algorithm for incrementing the radius of the hypersphere. This increases the variety of saddle points found in a given number of saddle point searches as compared to Gaussian random distribution which has typically been used so far in AKMC simulations. The number of times a saddle point search reconverges on a previously identified saddle point is reduced significantly in this way. Second, the number of function evaluations is reduced by reusing the minimum mode calculated at a previous step along the path until the system has been displaced by a certain minimum amount. This distance was chosen to be half as large as the maximum allowed distance in a single iteration, but more generally could be determined in an adaptive way based on data from previous saddle point searches. Third, the minimum mode is calculated using the Davidson method and significant savings in computational effort are also obtained as compared with calculations using the Lanczos method. The reduction obtained here in the number of function evaluations needed to complete the benchmark is particularly important when the minimum mode following method is used in connection with electronic structure calculations for estimating the atomic interactions. There are several avenues for further improvements. Successes and failures in the saddle point searches could be used to eliminate or add new starting points on the hypersphere. When two points on the hypersphere are found to converge on the same saddle point, it is likely that points in between those also converge on the same point and should, thus, be eliminated from the set of possible initial points. Such a self-learning algorithm can help reduce searches which are not useful for constructing the transition rate table in an AKMC simulation. Furthermore, the estimate of the preconditioner in the Davidson method can be improved, thereby reducing the number of function calls needed to converge on the minimum mode. Here, the BFGS method has been used, but it is
increased to the average value of d0 from the previous saddle point searches. This gives good sampling of the saddle points, and the incidences of reconvergence on a previously found saddle point is reduced significantly as compared with Gaussian random displacements. The number of function evaluations per saddle point is reduced significantly by limiting the number of iterations in the Davidson calculation of the lowest eigenvalue of the Hessian and corresponding eigenvector (maximum of eight iterations) and this eigenvector is used as an estimate of the minimum mode until the system has been displaced by more than 0.5 Δxmax by the L-BFGS algorithm along the saddle point search path. Then, the calculation of the lowest eigenvalue and corresponding eigenvector is repeated, etc. A well documented benchmark problem involves finding all connected saddle points with energy below 1.5 eV, as has been described by Chill and co-workers.45 There are a total of 27 saddle points in this category, some equivalent by symmetry. We carried out 10 independent calculations with three different variations of the search algorithm and calculated the average number of function evaluations needed to find all 27 saddle points below 1.5 eV. Table 1 summarizes the results obtained. The relative importance of the three improvements presented here, (1) the hypersphere method for generating the starting points, (2) the recycling of previously calculated minimum mode, and (3) the Davidson algorithm can be seen by comparing the first three lines in the table with the previously reported optimal performance which is shown in the fourth line. The optimal algorithm sketched in the previous paragraph was able to complete the search using on average 3.6 × 105 function evaluations. This is less than a third of the number of function evaluations required in the best results reported from previous performance tests45 and updates entered subsequently to the benchmark web page60 (as of March 2016). We point out that it is strictly not necessary in AKMC simulations to find all saddle points, as long as there is not a bias in the searches causing a whole category of saddle points to be missing. If a transition corresponding to one of the higher energy saddle points with energy of 1.5 eV is missing from the rate table, the estimate of the increment in lapsed time in the AKMC simulation is in error by only an insignificant amount. But, this benchmark is, in any case, a good test of how well an algorithm can be used to construct a complete table of relevant transitions. I
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX
Article
Journal of Chemical Theory and Computation
(23) Henkelman, G.; Jónsson, H. J. Chem. Phys. 1999, 111, 7010. (24) Munro, L. J.; Wales, D. J. Phys. Rev. B: Condens. Matter Mater. Phys. 1999, 59, 3969. (25) Malek, R.; Mousseau, N. Phys. Rev. E: Stat. Phys., Plasmas, Fluids, Relat. Interdiscip. Top. 2000, 62, 7723. (26) Hirsch, M.; Quapp, W. Chem. Phys. Lett. 2004, 395, 150. (27) Quapp, W. J. Comput. Chem. 2007, 28, 1834. (28) Quapp, W. Theor. Chem. Acc. 2008, 121, 227. (29) Quapp, W.; Schmidt, B. Theor. Chem. Acc. 2011, 128, 47. (30) Liu, Y. L.; Burger, S. K.; Ayers, P. W. J. Math. Chem. 2011, 49, 1915. (31) Ohno, K.; Maeda, S. Chem. Phys. Lett. 2004, 384, 277. (32) Maeda, S.; Watanabe, Y.; Ohno, K. Chem. Phys. Lett. 2005, 414, 265. (33) Maeda, S.; Ohno, K. J. Phys. Chem. A 2005, 109, 5742. (34) Ohno, K.; Maeda, S. J. Phys. Chem. A 2006, 110, 8933. (35) Ohno, K.; Maeda, S. Phys. Scr. 2008, 78, 058122. (36) Dey, B. K.; Janicki, M. R.; Ayers, P. W. J. Chem. Phys. 2004, 121, 6667. (37) Dey, B. K.; Ayers, P. W. Mol. Phys. 2006, 104, 541. (38) Liu, Y. L.; Burger, S. K.; Dey, B. K.; Sarkar, U.; Janicki, M.; Ayers, P. W. In Quantum Biochemistry; Matta, C. F., Ed.; Wiley-VCH, Boston, 2010. (39) Burger, S. K.; Ayers, P. W. J. Chem. Theory Comput. 2010, 6, 1490. (40) Gould, N.; Ortner, C.; Packwood, D. arXiv:1407.2817v2 2014; https://arxiv.org/abs/1407.2817. (41) Zeng, Y.; Xiao, P.; Henkelman, G. J. Chem. Phys. 2014, 140, 044115. (42) Chill, S. T.; Henkelman, G. J. Chem. Phys. 2014, 140, 214110. (43) Pedersen, A.; Luiser, M. J. Chem. Phys. 2014, 140, 024109. (44) Xiao, P.; Wu, Q.; Henkelman, G. J. Chem. Phys. 2014, 141, 164111. (45) Chill, S. T.; Stevenson, J.; Ruhle, V.; Shang, C.; Xiao, P.; Farrell, J.; Wales, D.; Henkelman, G. J. Chem. Theory Comput. 2014, 10, 5476. (46) Olsen, R. A.; Kroes, G.-J.; Henkelman, G.; Arnaldsson, A.; Jónsson, H. J. Chem. Phys. 2004, 121, 9776. (47) Pedersen, A.; Hafstein, S. F.; Jónsson, H. SIAM Journal of Scientific Computing 2011, 33, 633. (48) Pedersen, A.; Jónsson, H. Mathematics and Computers in Simulation 2010, 80, 1487. (49) Chill, S. T.; Welborn, M.; Terrell, R.; Zhang, L.; Berthet, J.-C.; Pedersen, A.; Jónsson, H.; Henkelman, G. Modell. Simul. Mater. Sci. Eng. 2014, 22, 055002. (50) Gutiérrez, M. P. Searching for saddle points and global minima. Ph.D. Thesis, University of Iceland, 2015. (51) Davidson, E. R. J. Comput. Phys. 1975, 17, 87. (52) Murray, C. W.; Racine, S. C.; Davidson, E. R. J. Comput. Phys. 1992, 103, 382. (53) Reiher, M.; Neugebauer, J. J. Chem. Phys. 2003, 118, 1634. (54) Bofill, J. M. J. Comput. Chem. 1994, 15, 1. (55) Bofill, J. M. Chem. Phys. Lett. 1996, 260, 359. (56) Anglada, J. M.; Bofill, J. M. J. Comput. Chem. 1998, 19, 349. (57) Bofill, J. M. Int. J. Quantum Chem. 2003, 94, 324. (58) Argáez, C.; Gutiérrez, M. P.; Jónsson, H. (in preparation, 2016). (59) Henkelman, G.; Jóhannesson, G.; Jónsson, H. in Progress in Theoretical Chemistry and Physics; Schwartz, S. D., Ed.; Kluwer Academic: Dordrecht, 2000; Vol. 5, Chapter 10, pp 269−300. (60) Optbench. http://optbench.org/saddle-search.html#ptheptamer-island (accessed in June 2016). (61) Maronsson, J. B.; Jónsson, H.; Vegge, T. Phys. Chem. Chem. Phys. 2012, 14, 2884. (62) Gudmundsdóttir, S.; Skúlason, E.; Jónsson, H. Phys. Rev. Lett. 2012, 108, 156101. (63) Gudmundsdóttir, S.; Skúlason, E.; Weststrate, K. J.; Juurlink, L.; Jónsson, H. Phys. Chem. Chem. Phys. 2013, 15, 6323. (64) Henkelman, G.; Arnaldsson, A.; Jónsson, H. J. Chem. Phys. 2006, 124, 044706.
designed to preserve positive-semidefiniteness of the Hessian. The methods by Bofill avoid this and could give better estimates of the Hessian in regions near saddle points where one or more eigenvalue is negative.54−57 The adaption of the Davidson method to the minimum vibrational mode calculation will be described in more detail in a forthcoming publication.58 Finally, we mention the possibility of using the ridge tracking method61 which converges an elastic band of dimers on the energy ridge connecting two first order saddle points. This method can in principle be used to identify new first order saddle points in between known saddle points, analogous to the way the nudged elastic band method can identify new local minima between the end points (see, for example, refs 62−64).
■
AUTHOR INFORMATION
Corresponding Author
*E-mail:
[email protected]. ORCID
Hannes Jónsson: 0000-0001-8285-5421 Funding
This work was supported by the Icelandic Research Fund and by the Academy of Finland through its FiDiPro program (Grant No. 263294). Notes
The authors declare no competing financial interest.
■ ■
ACKNOWLEDGMENTS We thank Dr. Jean-Claude Berthet for helpful discussions and some of the reviewers for helpful and constructive suggestions. REFERENCES
(1) Butler, R. W. Saddlepoint approximations with applications; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 2007. (2) Pelzer, H.; Wigner, E. Z. Phys. Chem. B 1932, 15, 445. Wigner, E. Trans. Faraday Soc. 1938, 34, 29. (3) Vineyard, G. H. J. Phys. Chem. Solids 1957, 3, 121. (4) Kramers, H. A. Physica 1940, 7, 284. (5) Langer, J. S. Ann. Phys. 1969, 54, 258−275. (6) Henkelman, G.; Jónsson, H. J. Chem. Phys. 2001, 115, 9657. (7) Jónsson, H. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 944. (8) Xu, L.; Henkelman, G. J. Chem. Phys. 2008, 129, 114104. (9) Pedersen, A.; Jónsson, H. Acta Mater. 2009, 57, 4036. (10) Karssemeijer, L. J.; Pedersen, A.; Jónsson, H.; Cuppen, H. M. Phys. Chem. Chem. Phys. 2012, 14, 10844. (11) Pedersen, A.; Karssemeijer, L. J.; Cuppen, H. M.; Jónsson, H. J. Phys. Chem. C 2015, 119, 16528. (12) Batista, E. R.; Jónsson, H. Comput. Mater. Sci. 2001, 20, 325. (13) Kirkpatrick, S.; Gelatt, C. D., Jr.; Vecchi, M. P. Science 1983, 220, 671. (14) Pedersen, A.; Berthet, J.-C.; Jónsson, H. Lecture Notes in Computer Science 2012, 7134, 34. (15) Plasencia, M.; Pedersen, A.; Arnaldsson, A.; Berthet, J.-C.; Jónsson, H. Comput. Geosci. 2014, 65, 110. (16) Cerjan, C. J.; Miller, W. H. J. Chem. Phys. 1981, 75, 2800. (17) Schlegel, H. B. J. Comput. Chem. 1982, 3, 214. (18) Simons, J.; Jørgensen, P.; Taylor, H.; Ozment, J. J. Phys. Chem. 1983, 87, 2745. (19) Bell, S.; Crighton, J. S. J. Chem. Phys. 1984, 80, 2464. (20) Hoffman, D. K.; Nord, R. S.; Rudenberg, K. Theor. Chim. Acta 1986, 69, 265. (21) Jørgensen, P.; Jensen, H. J. A.; Helgaker, T. Theor. Chim. Acta 1988, 73, 55. (22) Nichols, J.; Taylor, H.; Schmidt, P.; Simons, J. J. Chem. Phys. 1990, 92, 340. J
DOI: 10.1021/acs.jctc.5b01216 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX