Firefly Algorithm for Structural Search - Journal of Chemical Theory

May 27, 2016 - The problem of computational structure prediction of materials is approached using the firefly (FF) algorithm. Starting from the chemic...
1 downloads 10 Views 3MB Size
Subscriber access provided by UNIV OF CAMBRIDGE

Article

Firefly algorithm for structural search Guillermo Avendaño-Franco, and Aldo H. Romero J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.5b01157 • Publication Date (Web): 27 May 2016 Downloaded from http://pubs.acs.org on June 1, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Firefly algorithm for structural search Guillermo Avenda˜no-Franco∗ and Aldo H. Romero∗ Department of Physics and Astronomy, West Virginia University, Morgantown, USA E-mail: [email protected]; [email protected]

Abstract The problem of computational structure prediction of materials is approached using the Firefly algorithm. Starting from the chemical composition and optionally using prior knowledge of similar structures, the Firefly method is able to predict not only known stable structures but also a variety of novel competitive metastable structures. This paper focuses on the strengths and limitations of the algorithm as a multimodal global searcher. The algorithm has been implemented in the software package PyChemia 1 , an open source python library for materials analysis. We present applications of the method to van der Waals clusters and crystal structures. The Firefly method is shown to be competitive when compared to other population-based global searchers.

1

Introduction

The discovery of new materials is crucial in our modern society. 2,3 Fundamental advances in ab-initio methods and the availability of current computational power allows materials prediction and characterization to take place from the computer before the material been actually synthesized. 4,5 In spite of technological advances, the search of energetically favorable structures remains a challenging problem. For large cells, a full exploration of the configu1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

rational space becomes very challenging and we need efficient methods to explore large and complex structures. Computational materials discovery has been approached in the literature from basically two directions: First, collecting the knowledge of decades of traditional materials discovery and characterization with the addition of predicted structures and extrapolating that knowledge to other structures with similar composition 6 . The success of this approach is proportional to the richness of the database. The database approach is limited by the lack of variety of structures with large cells or number of species and its inability to generate new structure motifs. An alternative to these limitations is to directly explore the potential energy surface (PES) of a given composition. A Structural search is about the use of methods that explore the energy landscape with a good balance between accuracy and computational speed. The methods applied for structural search generally involve the integration of a local search algorithm and a global searcher method. The local optimization usually comes integrated with one of the state-of-the-art codes for first principles electronic-structure calculations such as density functional theory (DFT) with ABINIT 7,8 and VASP, 9 and tight binding (TB) methods such as DFTB+. 10 The global searcher tries to efficiently explore the possible basins on the potential energy landscape searching for the global minima, even though in most cases only an upper bound can be formally claimed. Among the methods discussed in the recent literature for exploring the PES it is worth mentioning genetic algorithms, 11–13 particle swarm algorithm, 14,15 random search 16,17 and minima hoping method 18,19 among many others (a good reference describing many different methods could be found in Ref. 20 and references therein). This paper presents the implementation of a global searcher called the Firefly method applied to structural search, the details of implementation and applications of the method to both finite and periodic structures. The Firefly algorithm has been proposed recently 21,22 as an alternative and generalization to other particle swarm optimization algorithms. The

2

ACS Paragon Plus Environment

Page 2 of 42

Page 3 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

paper is structured as follows: first, generalities of the method are presented. We show how this method is different from other population-based approaches and its potential advantages for multimodal problems as is the case for structural search problems. Second, a description of implementation details for the context of structural search and the strategies applied to optimize the search, accelerate relaxation and expand the diversity of the structures considered. Third, the method is applied in two cases that are prototypical of the kind of challenges that a global searcher has to face: Searching global minima configurations of finite systems, where we consider the case of Lennard-Jones clusters, and in periodic structures, where we explore the case of phosphorous crystal structures. The case of Lennard-Jones clusters helps to enlighten the strengths of the method but also its limitations in cases where the important minima are too scattered and the funnels are particularly narrow. For the phosphorus search, the efficiency of the method is compared with a simplified canonical implementation of the particle swarm algorithm and some other traditional global searchers. The method applied to crystal structures is interfaced with DFTB+ for a fast evaluation of the potential energy landscape. A comparison with the DFT landscape is presented to put in perspective the advantages and limitations of the use of Tight-Binding approach instead of Density Functional Theory. The performance of the method is studied by varying several parameters and conditions. Using the Firefly algorithm, many of the known unit cells are recovered within the considered number of atoms per unit cell. Additionally, we also identify some competitive structures that will be expanded and thoroughly discussed elsewhere. Finally, we draw conclusions and perspectives about the use of this methodology as well as some other metaheuristics for structural search of materials.

2

Generalities of Firefly algorithm as a global searcher

Consider the problem of finding the global minima, or a fairly rich amount of low-value local minima for a given function, for a given configuration space that is usually high-dimensional.

3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

If the function is well-behaved, mostly continuous and derivable, we can start with a random configuration and the function value can be minimized by using a gradient-based method such as the conjugate gradient. 23 Following the gradient, the final configuration will end most likely in one of the local minima of the energy landscape. In such case, small perturbations will minimize back to the same configuration and it will not be easy to get out from this minimum, unless we apply methods against the gradient or in the case of atomic structures, dynamical methods that provide energy to the structure in order to overcome the barrier. In general, the total number of available local minima increases with the dimensionality of the configuration space. This can be easily shown if you consider many of those local minima being just defective versions of a dominant local minima. A good global searcher tries to find a balance between targeted exploration and diversity. Most of the global searchers assume some underlying structure about the landscape where the search is carried out. The general topography of the energy landscape has been studied 24,25 where several funnel configurations are characterized. Usually good local minima are surrounded by other minima, and such cluster structure is referred as “palm tree” by Wales 26 . Without some geometrical distribution of minima, any method could be as good as any other and making a random search would be the most simple approach to use. On the other hand, the structure of minima in the configuration space could be such that the space could be fragmented in meaningful subspaces that carries some information about the basin on which the structure is build up. Changing one configuration (mutating) or crossing fragments from different configurations is the base of the genetic algorithms (GA). The success of GA relies on an effective identification of such meaningful fragments and a clear method to define the mutation and crossing rules. The Firefly method (FF) that is presented in this paper belongs to the class of stochastic optimization (SO) methods, that rely on some randomness in contrast to the purely deterministic minimization of the gradient algorithms. Due to the higher-level heuristic designed

4

ACS Paragon Plus Environment

Page 4 of 42

Page 5 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

to find the optimal candidate, FF belongs to a class called metaheuristics. The common characteristic of metaheuristic methods is to avoid “heuristics” about the space where it is applied even if the low level details of about how structures changes could use heuristics. An important subclass of metaheuristics are population based algorithms where a set of candidates (population elements), evolve on each iteration exploring the configuration space. The practical advantage of this kind of algorithms is that in general each element of the population could be evaluated concurrently. Going deeper with this taxonomy, the Firefly method takes the active candidates of a given iteration and makes them evolve according to their present and eventually past locations. This behavior type is built on the so-called swarm based-algorithms. The basic idea behind swarm-based algorithms is to modify candidates based on the best minima found up to now. Three basic ingredients are needed for swarm-based algorithms to work: First, the definition of a target function to minimize, usually the energy per atom, or more generally the enthalpy per atom. Second, a real-valuated function to use as a measure of distance between two given candidates. Such a function should evaluate to zero for two identical candidates and the value of the distance should be somehow proportional to the dissimilarity of those structures. We cannot claim in rigor that such function is a metric, but for the purpose of the method such condition is not necessary. The final ingredient is a parametric function that effectively defines a path on configuration space and allows us to transform one candidate in the direction of another candidate. The main idea behind swarm algorithms is to redirect interesting candidates close to promising regions of configuration space. If the distribution of local minima is clustered in funnels, ie, good minima are surrounded by other local minima, moving close to promising areas could increase the chances of finding a better candidate compared to those already found. Such movement and subsequent relaxation could also create situations where two or more candidates end on the same basin. When such situation happens only one candidate should remain replacing the others by fresh candidates. This procedure helps keeping the

5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 42

diversity of the population and avoid bias towards several equivalent minima. In the literature, many models have been proposed for swarm behavior and consequently a similar number of models for swarm based optimization. 27 One widely used is particle swarm optimization (PSO) and it has been recently implemented for structural search. 14,15 PSO scans over the configurational space by considering a population of individuals or agents that adjust their locations based on the value of the local best fit candidate as well as the locations of the entire swarm. There are many variants of PSO. 28 One basic version, a candidate (particle) is attracted toward the “position” of the global best in the current swarm and some random movement is added to the deterministic component of the movement. 27 For our purpose, “position” is defined as the configuration of a given structure: atomic positions and cell parameters. The position of a particle is updated as

xt+1 = xti + β(xt∗ − xti ) + αt ǫti i

(1)

where x∗ is the best particle in the swarm. This individual will always be promoted from generation to generation until defeated by a better individual. Using this approach the search is concentrated around the best candidate and diversity is provided by the randomness of the third term in the Eq.1. The Firefly algorithm was proposed recently as a promising alternative for multimodal optimization problems. 21,22 The Firefly algorithm shares several similarities with other swarm based algorithms. Indeed, it can be shown that Firefly method constitutes a general class of swarm algorithms, covering some traditional particle swarm algorithms as restricted versions. Firefly algorithm (FF) is based on the flashing patterns and behavior of fireflies. In essence, FF uses the following three idealized rules: 1. Fireflies are unisex so that one firefly will be attracted to other fireflies regardless of their sex. 2. The attractiveness is proportional to the brightness, and they both decrease as their 6

ACS Paragon Plus Environment

Page 7 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

distance increases, typically brightness decreases exponentially with distance but other functional forms can also be used. 3. For any two fireflies, the brighter one will attract the fainter. If there is no brighter one than a particular firefly, it will move randomly or not move at all. The brightness of each firefly is proportional or equal to the objective function. Each individual firefly i is attracted towards all brighter fireflies. The equation for their movement is determined by:

xt+1 = xti + i

X

2

β0 e−γrij (xtj − xti ) + αt ǫti .

(2)

j

The first term on the right hand side represents the current configuration of the candidate (positions and cell parameters for periodic structures). The sum on the second term runs over all the candidates that are brighter than i (lower in energy/enthalpy). This exponential term is the characteristic dependency of attractiveness with relative distance between the fireflies. The third term introduces some randomness in the final configuration with a value that decreases with each iteration (for α < 1). Two important cases result from extreme values for β0 and γ: If β0 = 0, a random search method is obtained and when γ = 0 we get a variant of the particle swarm algorithm. The Firefly method is based on a swarm attraction which decreases with distance. Under some circumstances the whole population can automatically subdivide into sub-swarms, each of them exploring close to competitive minima, which makes it more suitable for multimodal problems as it is the case of structural search on a multi-funnel energy landscape. This subdivision naturally allocates candidates around interesting regions. The effect of subswarming could vary accordingly with the number of candidates of each generation and the size of the different basins and funnels. In general problems of global optimization, the candidates are directly evaluated when a gradient minimization cannot be applied. For structural search we can take advantage of 7

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

a local minimization to concentrate candidates close to local minima, by doing this we have to take an extra step, removing candidates that minimize on the same basin using a criteria based on the distance. The local minimization and evaluation of the objective function is the most time-consuming part of the global searcher. Even for Lennard-Jones clusters, the local minimization takes most of the actual time of the algorithm. For our second example using tight-binding method or first-principles calculations the evaluation and local minimization is even more computationally demanding. Taking advantage of concurrency is critical to effectively advance the exploration using a global searcher.

3

Specifics of Firefly algorithm for structural search

Metaheuristic methods usually are described and tested on functions defined on RN . This space offers trivial definitions for configuration distance, vector displacement and random generation of new individuals. An equation such as Eq.2 offers no ambiguities for simple, continuous, derivable, i.e., academic examples used in the literature to test stochastic optimization 29 . In the case of structural search the same quantities can be defined but their definitions are far from trivial and in some cases some acceleration of convergence can be achieved by an appropriated selection of such quantities. Three elements deserve attentive discussion in the context of the Firefly algorithm: the generation of new individuals, the definition of a distance between structures and finally, the parametric displacement between two structures. For the creation of new structures several methods are considered. One simple option is starting from purely random structures: for a given composition we can associate random positions for the atoms and for crystals, in addition, random lattice vectors to describe the cell. Without further constraints this would be, in general, a very bad starting point. Randomly locating atoms will create structures where atoms, too close or too far from each other makes local optimization slower and ineffective. To avoid these issues for randomly

8

ACS Paragon Plus Environment

Page 8 of 42

Page 9 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

generated structures, several strategies are implemented. Each time a new random structure is generated its lattice vectors are scaled until the distance between two atoms is never lower than the sum of their covalent radii. Another option is stretching the lattice using a stress matrix with appropriated eigenvector in the direction of the interatomic distance that needs to be separated. The procedure is applied to every atom pair making the entire structure compatible with the premise of all the distances between atoms being at least their sum of covalent radius. Several instances are tested and the procedure is stopped when the best structure survives several iterations. Another method to generate new structures is by using similar structures which are already known or predicted. The compatible structures to consider we mean those than contain the same number of atoms and the same basic composition, even cases with different species. Using a source database has several advantages: good structures are already there, a large database will return a good representation of the symmetry groups where structures for other compositions have been found. The structures with an appropriate scaling will eventually be close to one of the minima for the composition to be searched. This method is close to the idea of creating new structures using a random symmetry group as a template. The advantage comes from having already a good statistical sample about which symmetry groups are more likely to be found for a given abstract composition and the actual locations for certain atoms in the lattice. An appropriate definition of distance is the second element to consider in the Firefly method. There are several proposed methodologies for a distance in configuration space. 30 One natural solution is to use a quantity related to the pair correlation function 31,32 . The usual procedure is counting distances in the crystal by separating the components of the fingerprint function coming from different species-type pairs 33 . For each pair, the fingerprint the peaks are smoothed by gaussians and the resulting function discretize in bins to create an N dimensional vector. The equation of the fingerprint for each pair is given as:

9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

FAB (R) =

X X δ(R − Rij) −1 2 NA N B ∆ 4πR ij V A ,cell B

Page 10 of 42

(3)

j

i

To create a fingerprint, Eq. 3 is discretized and converted into a long vector of numbers. For practical applications, the fingerprint in Eq. 3 is simple to compute and store. The distance between two structures is given by

Dcos

1 = 2



F1 · F2 1− |F1 ||F2 |



(4)

The evaluation of the distance is relatively inexpensive, and provide a criteria for evaluating the similarity between two structures. For non periodic structures the direct calculation of a pair correlation function with an appropriate discretization provides the same capabilities of computing distances between structures. To get an idea about how the definition of distance applies for structures, Fig. 1 shows a comparison performed for local minima structures of pure phosphorus crystals as described in more detail below. There are two important features in this figure: First, the maximal distance between two basins is around 0.57 with most of the distances lower than 0.53. All the structures with different space groups are separated at least 0.15 one from another. Lower distances have almost zero difference in energy and no difference in space group. They correspond, with good confidence, to the same crystal structure. The last ingredient to specify for the Firefly algorithm is how to change one structure in the direction of another structure. One of the peculiarities of the Firefly algorithm and other similar swarm-based global optimization methods is the idea of changing candidates in the direction of one or more leader candidates. In order to reproduce that for the case of structural search, a path in the configuration space needs to be defined. For finite systems, a configuration is the set of positions in Cartesian coordinates ({~ ri }). For periodic structures the configuration is the set of positions in reduced coordinates {~ ri } plus the cell vectors as columns of a matrix h. The representation for a given crystal is not unique and depend on

10

ACS Paragon Plus Environment

Page 11 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 42

the unit cell definition {{~ ri }, h}. We try to avoid these ambiguities by transforming the cell vectors in such a way that the matrix h is lower triangular and the lattice vectors are sorted on decreasing order of length. A configuration path is a continuous map from structure A towards the configuration of structure B. We can define the path by an association one to one between the atoms on both structures and a continuous function that moves the positions in A into the positions in B by using one parameter λ in the closed [0, 1]. The lattice vectors can be transformed by

I3 + λ(T − I3 ),

(5)

where I3 is the identity matrix 3 × 3. The matrix T is defined by

T = htA · (htB )−1 ,

(6)

where hA and hB are crystal matrices as defined before. This matrix describes a transformation such that when applied to the cell matrix hA will be continuously transformed from hA to hB with the parameter λ in the range [0, 1]. Before the structures are modified according to the FF method, each structure is locally minimized. In cases where two structures go to the same minima, only one is left and the others are replaced by new ones. Using α0 = 0, the lowest energy structure is passed to the next generation unchanged (similar to the elitist approach in genetic algorithms). The other structures are moved following Eq. 2. A diagram of the implementation of this method inside the PyChemia 1 code is shown in Fig. 2. Traditional implementations of algorithms for structural search preserve the number of atoms in the unit cell. Our implementation lift this constrain and elements in the population can have different number of atoms per unit cell. This offers advantages, when working with small numbers of atoms per unit cell, because the computations can be done in the

12

ACS Paragon Plus Environment

Page 13 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

primitive cell. However, structures with different numbers of atoms need to have a way of being comparable and their atoms and positions matched accordingly. This is particularly important for a method such as Firefly where one structure is moved in the direction of another. By matching two structures, we mean create a one-to-one association between the atoms of both structures. The matching algorithm try to associate atoms in both structures, minimizing the displacement needed to move the atoms in one structure towards the other.

4

Applications

The Firefly method in two different contexts. Finding global minima for Lennard-Jones clusters is an important and intensively studied global optimization problem 34 . Modeling even these simple finite systems proves to be a rather complex problem. As a second case, the Firefly method is applied to crystalline phosphorus structures.

4.1

Lennard-Jones Clusters

The determination of the global minima of Lennard-Jones (LJ) clusters by numerical global optimization techniques has been been intensely studied for several decades now. 35,36 Despite of their apparent simplicity, finding the the global minima is a challenging problem for a global search algorithm. The Lennard-Jones potential is given by " # X  σ 12  σ 6 E = 4ǫ − , r r ij ij i 26 when the global minima are found after a number of candidates and several generations. Considering that the algorithm is working with 16 candidates per generation and the current lowest structure is automatically promoted to the next generation the number of candidates considered after m generations is 16m − m. We can see that in many cases the known global minima is found with less than 50 generations. It is true, in general, that the number of local minima increases with the cluster size. However, the chances of finding the global minima for a particular cluster size is not proportional to the inverse of cluster size. Consider for example the region 51 ≤ N ≤ 57. The number of generations needed to find the reported local minima are far less than for its predecessors 41 ≤ N ≤ 50. The clusters N ≥ 58 requires a particularly large number of generations to converge. For the particular case where N = 75 the method fails to find the global minima after 47 thousand evaluations. We attribute the limitations of the method for

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 18 of 42

Page 19 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

these cases to a very narrow funnel for the global minima 56 . Random generation will have very little chances of relaxing into the global minimum. Once you reach the second or third lowest minima, the other minima will be attracted to it favoring the search of structures close to those minima, and far from the actual global one. It is important to notice that for some cluster sizes, there is basically a continuum of local minima, all of them considered different according to our discussed criteria. Further analysis can be done using a network analysis to the exploration using the Firefly algorithm. 4.1.1

Network analysis for searching LJ clusters

Most global searchers include some exploration of the space based on a purely random generation of candidates. For the case of Firefly the random exploration occurs for the first generation of random candidates or when a candidate is replaced to avoid duplicates. A fair analysis of any global search should be able to account the rate of success due to a purely random generation compared to the target exploration enforced by the method. Population methods such as Firefly change the candidates in such a way that a particular candidate can be trace its origin to a random generated one or as the evolution of a previous one. The set of relations can be accounted in a graph and we can study the properties of those graphs to get a global perspective of the actuation of the method during a the search. There are two kinds of graphs that we will consider here. The first one is just a graph that connects candidates as they evolve from generation to generation. If considered as a directed graph, each candidate is a node and the nodes are connected either to the candidate after being moved and relaxed or to another candidate in the same generation if the node is merged as duplicate. This graph serves more to the purpose of tuning the different parameters of the algorithm, if β0 is too small the candidates remain in the same basin, the networks connect equivalent candidates and the exploration is limited. Another graph can be created by considering how equivalent structures are connected. This graph provides a view about how the different basins are connected. Distant basins will

19

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

have little chance of having candidates that connect one to each other. Some nodes could have many edges indicating low energy clusters that attract easily other non-equivalent clusters to its basin. From the graphs generated for each cluster size, we observe that for all the small clusters with N < 26, the global minimum is found just from random candidates. This kind of graphs offers an alternative perspective to the connectivity graphs formulated by Doye 57 . For the contrary, most of the global minima for larger clusters are found as product of the evolution of existing candidates. In the range from 30 to 60 only in 7 cases the global minima was found randomly.

4.2

Pristine Phosphorus Crystals

As a proof of principle we apply the method to pure phosphorus crystals as a example of searching periodic structures. Phosphorus is one group-V element than can exist in several allotropes exhibiting a wide range of properties. 58–60 White phosphorus (WP) or yellow phosphorus, exist as arrangements of tetrahedral P4 clusters. 61–64 Black phosphorus (BP) is a layered material similar to graphite but with non-flat layers due to an sp3 hybridization. 65,66 BP is a semiconductor with a direct gap of about 0.3 eV 67 . Black phosphorus is the thermodynamically stable form of phosphorus at room temperature and pressure and has attracted interest in the last few years 68 due to the similarities with graphite and as a possible anode material for Lithium-based batteries. 69 Red Phosphorus (RP) is created from white phosphorus, it is an amorphous network of chained tetrahedrals. 70 Amorphous red phosphorus is studied as a suitable anode material for Sodium-based batteries. 71 Hittorf’s violet phosphorus (VP) is another allotrope created by exposing red phosphorus at 530o C. The structure is monoclinic or rhombohedral. 72 Other possible allotropes are cages and ring-shaped structures 73 and fibrous forms. 74 For the purpose of this work, structures with at most 16 atoms in the cell are considered. Larger structures with 21, 42 and 84 have been explored but the computational cost to relax each candidate is an strong limitation to 20

ACS Paragon Plus Environment

Page 20 of 42

Page 21 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

is found or all the compatible structures are exhausted. By simply choosing structures from a raw list of compatible structures we bias the search towards those with the highest representation on the database. Each time one structure is chosen from the database, that structure is marked to avoid recurrence. This procedure is similar to the creation of a taboo list that implicitly avoids revisiting configurations. To evaluate each structure we have used the Density-Functional Tight-binding method (DFTB). DFTB is based on an expansion of the Kohn-Sham total energy in DensityFunctional Theory (DFT) up to second-order with respect to charge density fluctuations. In the limit of zeroth-order, this method is equivalent to a non-self-consistent Tight binding (TB) Scheme. DFTB relies on parameters accounting for the electronic part of the DFTB model and repulsive energy contributions, similar to a two-body force field term. For the purpose of this work we use the DFTB method as it was implemented on the DFTB+ software package. 10 Each structure was relaxed using DFTB+ with Slater-Koster files for P taken from the matsci-0.3 package. 83,84 The Slater-Koster files for phosphorus on the matsci-0.3 dataset were created fitted and used to represent Phosponic acids on alumina Al2 O3 83 and phosphonic acids on titania TiO2 (rutile, anatase). 84 Even if the Slater-Koster files for phosphorus have been tested in coordination conditions of finite systems, their use for crystals is unclear, where it is possible to find structures with unusual coordination. We start by doing a simple analysis on the reliability of DFTB+ with the aforementioned Slater-Koster files to simulate the known structures of phosphorus with less than 16 atoms per unit cell. The ICSD database collects 13 entries for pure phosphorus structures, they are presented in the Table 1. Structures with less than 16 atoms per unit cell were relaxed using DFTB+. We can notice that DFTB with the corresponding the Slater-Koster files was able to preserve the space group of four structures, one with space group 221 (Pm3m) and three structures under 64. The final structures are systematically lower in density. However, in some other cases the relaxation conducts the structure to a very different crystal group. That is particularly evident for the two high density structures with space group 65 (Cmmm).

22

ACS Paragon Plus Environment

Page 22 of 42

Page 23 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 1: Reported pure phosphorus structures from the ICSD 81,82 database. The last column shows the space group of the structure after being relaxed by DFTB+ using the Slater-Koster files on matsci-0-3 83,84 ICSD 98121 78 98120 78 162242 76 98122 78 600019 77 162243 76 647886 70 27847 70 98119 78 169539 85 154318 79 391323 74 29273 75

natom 1 2 2 4 6 8 8 8 8 8 16 42 84

Space Group original relaxed 221 (Pm3m) 221 (Pm3m) 166 (R3m) 2 (P1) 65 (Cmmm) 2 (P1) 74 (Imma) 10 (P2/m) 166 (R3m) 2 (P1) 65 (Cmmm) 10 (P2/m) 64 (Cmca) 12 (C2/m) 64 (Cmca) 64 (Cmca) 64 (Cmca) 64 (Cmca) 64 (Cmca) 64 (Cmca) 12 (C2/m) 2 (P1) 13 (P2/c)

Density original relaxed 2.18 1.40 2.05 1.50 3.39 1.40 2.17 1.43 2.16 1.43 3.39 1.40 1.62 1.56 1.62 1.56 1.77 1.56 1.58 1.56 1.19 1.43 1.42

DFTB is fast route to evaluate a large set of structures and with the appropriate warnings can accelerate the search in comparison with DFT, which happens to be more computationally demanding. To improve reliability, the different structures found by DFTB are additionally relaxed using DFT as it is implemented in the VASP. From this point the method follows a similar path as we already described for LennardJones clusters. The search of phosphorus crystals was executed with 32 candidates in each generation. Structures were relaxed until forces and stresses were below 10−4 eV/˚ A. After each relaxation the structure is further refined if the structure is very close to a symmetrical configuration. The cell refining could create a smaller cell for its primitive representation. Once all structures are relaxed, duplicated structures are excluded using the PCF fingerprinting scheme. 33 The energetic criteria to consider two structures as similar is 10meV. The “distance” criteria to consider two similar structures equivalent is 0.2. The global search algorithm stops when the global minimum is unchanged for at least 10 generations.

23

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

One of the advantages of the Firefly method is its favorablility towards multimodal problems and clustered minima. To quantify the effectiveness of the new methodology, we measure the amount of different low-energy structures found as a function of the total number of evaluated structures. Low-energy structures are those up to 1eV above the global minimun, constrained to the predefined number of atoms per unit cell. For the next comparisons a fixed number of 8 atoms per unit cell were considered. Panels (upper-right and left) in Fig. 5 compares Firefly with other methods also implemented in PyChemia 1 . The methods compared include the Harmony Search (HS) method 86 , the particle swarm algorithm (PS) with one leader mentioned before and a genetic algorithm (GA) with pure crossing. Firefly outperforms all other methods finding a variety of very good local minima. When counted the total number of different structures (GA) offers an advantage, but only because many of the structures produced by GA are high energy configurations. One of the important ingredients for swarm based optimization is the proper association of atoms. Two structures with a different association of atoms could take very different paths, eventually moving structures in unintended directions. Above we discuss how the representation structures is normalize to facilitate the association of atoms in two structures. Upper figures are similar executions where in (upper-right) the matching of atoms has been disabled and atoms are associated in their original ordering they where stored. We notice that a good matching has an important effect on improving the efficiency of the method. A second set of comparisons are related to two of the parameters to tune in the Firefly method, β0 and γ Assuming γ = 0, β0 will determine how much one candidate moves toward some other candidates. If the β0 is close to 0, the structure will remain too close to the original structure, maybe inside its own basin, so no new structures will be created. For β0 close to 1 the structure will move closer to the more promising candidates and the again the search is not effective. The figure (lower-right) in Fig. 5 compare Firefly for several values of β0 and we found that a value of β0 = 2 offers the best performance in discovery of new basins

24

ACS Paragon Plus Environment

Page 24 of 42

Page 25 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 26 of 42

Page 27 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

4.2.1

Network structure of the phosphorus energy landscape

Similarly to the network analysis for the Lennard-Jones clusters, connectivity relations will be extracted from different basins. One particular feature that helps in this endeavor is that population evolves on the potential energy surface, therefore, each candidate is linked to other previously visited structural minima. This behavior contrasts with genetic algorithms were candidates are not only mutated but also mixed with a more complex historical path. It is also different from random search were candidates are discarded in each generation and no heritage exists between candidates from successive generations. Having a single connected evolution means that successive generations can be connected with their ancestors, the population of candidates from previous generations. The interest in getting this is two fold: the configuration space is typically too large to be represented visually and a network representation of the basins of different configurations could help to identify structures that are somehow related to each other. Consider a particular candidate A for a given iteration, this candidate will be moved according to the rules of the Firefly method to a new position B and relaxed to a position C. The initial and final positions are local minima, there is a parametric function as was defined for the Firefly method that moves A to B and B belongs to the basin where C is the attractor. On Fig. 9 we have an idea about how the method improuve the generations with time. The inner circles are represents the firsts generations and they are mostly high energy structures. As the method evolves, lower energy structures are found and the entire population follow the general energetic improvement. The lowest energy structure (the dark blue structure at 0 degrees) with space group 64 succeed to remain as the lower energy structure for a number of generations and the iteration stops. The higher energy structures on the outer shells of the figure are mostly due to fresh structures created randomly when their ancestors were removed as duplicates. Fig. 10 depicts the network representation of the evolution of the configurations for the case of up to 6 atoms per unit cell. The dots represent the relaxed structures that were 28

ACS Paragon Plus Environment

Page 28 of 42

Page 29 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

generated during the entire run of the search. Two structures with the same space group are plotted with the same color. The links represent a historical link between two structures. A structure could be modified by the algorithm and relaxed to the same or different attractor. The link width is proportional to the distance between the two structures. Dots connected by very slim lines and with the same color indicate that very likely the structure have remain trapped. This configuration can survive for several generations until conditions make favorable to escape to a new basin. Two or more dots could converge onto the same dot: these correspond to structures that were discarded in a given generation for duplicity. If for each iteration, all structures are relaxed and their energy is computed, it means that the network will be fully connected. However, in some cases just a few structures can take a large amount of time to converge, and the waiting time to step to the next generation can damage the algorithm performance. In order to avoid that, the algorithm wait until a large percentage of structures are converged. We use a criteria of 95% of structures relaxed, or at most missing two structures and at least they all have passed a minimum iteration threshold. When the criteria limit is reached, non-relaxed structures are discarded, the algorithm is applied with the subset of converged structures and those discarded candidates are replaced by new structures in the next generation. Fig. 10 offers a pictorial representation of the evolution of the Firefly method, in particular it shows how the different visited minima are connected to each other. Line thickness represent how similar two structures are by measuring their distance and the color gives the space group symmetry. It is clear the appearance of long branches, which represent a single walk of connected structures in the PES. Two or more branches could eventually merge, this happens when some structures are within the same funnel and found the lowest energy structure in that basin. This plot summarizes the candidates connectivity for the performed structural search over a maximum number of 6 atoms in unit cell. On Fig. 10, during a single generation several structures were trapped by the local dominant minimum, and they were effectively discarded.

30

ACS Paragon Plus Environment

Page 30 of 42

Page 31 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

We can get an alternative image about the relations not between candidates, but between the basins of attraction. By creating a class of equivalence between structures whose distance is lower than a given value. Therefore, we can get a picture not only about the method performance and how the different minima are visited, but about the network structure of the potential energy landscape. The links are now showing the different paths connecting nearby local minima as represented on Fig. 11. This offers a basin connectivity for the potential energy landscape. From the figure we observe that not all basins are connected with every other basin. There are relatively well defined paths connecting the different minima. Some minima are more connected and some others barely linked to the rest of the network. Structures with space group 139 (I4/mmm) and 221 (Pm3m) are relatively close to each other on the network but linked through a high energy structure with space group 123. Many structures with high symmetry are agglomerated close to those with space group 221 (Pm3m). One basin with several edges such as the space group 139 (I4/mmm) shows the tendency of low energy structures to move towards better candidates and the overall advantage of Firefly to improve the candidates for each new iteration. On Fig. 11 the links between several minima happen as the result of one evolution that moves one structure into the basin of another one and due to a proximity of basins making the transition favorable. In the center of the figure there is one structure that despite of its lower symmetry, P1, it attracts many other structures that merge with it before moving towards other structures such as 51 and 221 (Pm3m). The correlation of network connectivity with group-subgroup relations or physical considerations to measure how different structures are connected is out of the scope for this work but network analysis like this could be very useful in pursuing such analysis.

32

ACS Paragon Plus Environment

Page 32 of 42

Page 33 of 42

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

5

Conclusions

We have introduced a powerful algorithm for structural search of materials and a set of tools that could accelerate the convergence and discovery of favorable candidates. The Firefly algorithm has been implemented within the PyChemia 1 code and used in the context of structural search of materials. This method is a metaheuristic population method that relies on defining a population size and some parameters that make the method more or less random, as well as defining how close one structure moves in the direction of another. The implementation has proven to be efficient when applied to both periodic and non-periodic structures. We show that at least in the case presented here, the Firefly method outperforms a single leader particle swarm optimization and some other metaheuristic methods implemented. The Firefly algorithm relies on a real-valued function that serves as the distance to describe how the structures are close to each other. By defining a parametric path connecting two structures, the method promotes higher energy candidates towards low energy ones. The pair correlation function was used to define fingerprints and as a way to compute the structural distance between candidates. We show that the use of a source database could effectively accelerate the discovery of different candidates, but the method still performs well without its use. In particular, it is shown that using structures taken from a database improves the search compared to simple random structure generation. Methods to compare structures with different atoms in the primitive cell have been considered, which allow a parallel search over different number of atoms per unit cell. Network-based tools have been introduced to rationalize the behavior of swarm based methods and get a better understanding of the complex structure of the potential energy landscape. For Lennard-Jones clusters we use the BFGS algorithm to relax each candidate. For phosphorus crystals we use DFTB combined with DFT to compute energies and the internal gradient-based algorithms incuded in DFTB+ and VASP respectively. We show that DFTB, using the Slater-Koster files for P, gives a good account for most reported phosphorous 34

ACS Paragon Plus Environment

Page 34 of 42

Page 35 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

crystal structures but the complete landscape offered by DFTB could be very different from the landscape that DFT shows. The DFTB method is fast enough to be useful for a first screening of structures, and a second relaxation using DFT could solve some of the misses. The Firefly method is successful in finding the basic structures known for phosphorus under the limitations imposed to the number of atoms in the cell and the corresponding method used to described the atomic interactions. A large diversity of structures was found and even for the monoatomic case, we were able to find all reported crystal structures up to 16 atoms per unit cell as well as some other metastable crystal structures. The method described here is fairly general and can be applied to crystal structures using different levels of theory and applied also to the search of finite systems and surfaces. In general, swarm methods such as Firefly offers novel perspectives to access the problem of structural search and network representation, which are very useful tools to observe an otherwise highly dimensional configuration space. Specifically, we propose hybrid methods that would accelerate the search as well as to increase the possible number of atoms to be considered.

6

Acknowledgements

We acknowledge the support from the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation Grant No. ACI-1053575 as well as the support of NSF under project 1434897 and the Donors of the American Chemical Society Petroleum Research Fund for partial support of this research under contract 54075-ND10.

References (1) PyChemia. https://github.com/MaterialsDiscovery/PyChemia. 35

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(2) Crabtree, G.; Glotzer, S.; McCrdy, B.; Roberto, J. Computational materials science and chemistry: accelerating discovery and innovation through simulation-based engineering and science. Rep. US Dep. Energy Work. Comput. Mater. Sci. Chem. Innov. 2010. (3) Moskowitz, S. L. The advanced materials revolution: technology and economic growth in the age of globalization; John Wiley and Sons, 2014. (4) Woodley, S. M.; Catlow, R. Nat. Mater. 2008, 7, 937–946. (5) Sch¨on, J. C.; Doll, K.; Jansen, M. Phys. status solidi 2010, 247, 23–39. (6) Ceder, G.; Morgan, D.; Fischer, C.; Tibbetts, K.; Curtarolo, S. MRS Bull. 2006, 31, 981–985. (7) Gonze, X.; Beuken, J.-M.; Caracas, R.; Detraux, F.; Fuchs, M.; Rignanese, G.-M.; Sindic, L.; Verstraete, M.; Zerah, G.; Jollet, F.; Torrent, M.; Roy, A.; Mikami, M.; Ghosez, P.; Raty, J.-Y.; Allan, D. Comput. Mater. Sci. 2002, 25, 478–492. (8) Gonze, X.; Amadon, B.; Anglade, P.-M.; Beuken, J.-M.; Bottin, F.; Boulanger, P.; Bruneval, F.; Caliste, D.; Caracas, R.; Cˆot´e, M.; Deutsch, T.; Genovese, L.; Ghosez, P.; Giantomassi, M.; Goedecker, S.; Hamann, D.; Hermet, P.; Jollet, F.; Jomard, G.; Leroux, S.; Mancini, M.; Mazevet, S.; Oliveira, M.; Onida, G.; Pouillon, Y.; Rangel, T.; Rignanese, G.-M.; Sangalli, D.; Shaltaf, R.; Torrent, M.; Verstraete, M.; Zerah, G.; Zwanziger, J. Comput. Phys. Commun. 2009, 180, 2582–2615. (9) Kresse, G.; Furthm¨ uller, J. Comput. Mater. Sci. 1996, 6, 15–50. (10) Aradi, B.; Hourahine, B.; Frauenheim, T. J. Phys. Chem. A 2007, 111, 5678–5684. (11) Glass, C. W.; Oganov, A. R.; Hansen, N. Comput. Phys. Commun. 2006, 175, 713–720. (12) Oganov, A. R.; Lyakhov, A. O.; Valle, M. Acc. Chem. Res. 2011, 44, 227–237.

36

ACS Paragon Plus Environment

Page 36 of 42

Page 37 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(13) Lyakhov, A. O.; Oganov, A. R.; Stokes, H. T.; Zhu, Q. Comput. Phys. Commun. 2013, 184, 1172–1182. (14) Wang, Y.; Lv, J.; Zhu, L.; Ma, Y. Phys. Rev. B - Condens. Matter Mater. Phys. 2010, 82, 1–8. (15) Wang, Y.; Lv, J.; Zhu, L.; Ma, Y. Comput. Phys. Commun. 2012, 183, 2063–2070. (16) Pickard, C. J.; Needs, R. J. Phys. status solidi 2009, 246, 536–540. (17) Pickard, C. J.; Needs, R. J. J. Phys. Condens. Matter 2011, 23, 053201. (18) Goedecker, S. J. Chem. Phys. 2004, 120, 9911. (19) Amsler, M.; Goedecker, S. J. Chem. Phys. 2010, 133, 224104. (20) Oganov, A. R. Modern Methods of Crystal Structure Prediction; John Wiley and Sons, 2011. (21) Yang, X.-S. Int. J. Bio-Inspired Comput. 2010, 2, 78–84. (22) Yang, X. S.; He, X. Int. J. Swarm Intell. 2013, 1, 36. (23) Hestenes, M. R.; Stiefel, E. 1952, (24) Doye, J. P. K.; Miller, M. A.; Wales, D. J. J. Chem. Phys. 1999, 110, 6896. (25) Doye, J. P. K. Phys. Rev. E 2000, 62, 8753–8761. (26) Wales, D. J.; Miller, M. A.; Walsh, T. R. Nature 1998, 394, 758–760. (27) Kennedy, J.; Eberhart, R. Particle swarm optimization. Proc. ICNN’95 - Int. Conf. Neural Networks. 1995; pp 1942–1948. (28) Poli, R. J. Artif. Evol. Appl. 2008, 2008, 3.

37

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(29) Cavazzuti, M. Optim. Methods; Springer Berlin Heidelberg: Berlin, Heidelberg, 2013; Chapter Stochastic, pp 103–130. (30) Sadeghi, A.; Ghasemi, S. A.; Schaefer, B.; Mohr, S.; Lill, M. A.; Goedecker, S. J. Chem. Phys. 2013, 139 . (31) Van Eijck, B. P.; Kroon, J. J. Comput. Chem. 1997, 18, 1036–1042. (32) Verwer, P.; Leusen, F. J. J. Rev. Comput. Chem.; 2007; Vol. 12; pp 327–365. (33) Oganov, A. R.; Valle, M. J. Chem. Phys. 2009, 130 . (34) Doye, J. P. K. In Glob. Optim.; Pint´er, J. D., Ed.; Springer US: Boston, MA, 2006; Chapter Physical P, pp 103–139. (35) Wales, D. J. Science (80-. ). 1999, 285, 1368–1372. (36) Wille, L. T. Annu. Rev. Comput. Phys. 2000, 7, 25–60. (37) Leary, R. H. J. Glob. Optim. 1997, 11, 35–53. (38) Locatelli, M.; Schoen, F. Comput. Optim. Appl. 2003, 26, 173–190. (39) Wille, L. T. Chem. Phys. Lett. 1987, 133, 405–410. (40) Niesse, J. A.; Mayne, H. R. J. Chem. Phys. 1996, 105, 4700–4706. (41) Daven, D.; Tit, N.; Morris, J.; Ho, K. Chem. Phys. Lett. 1996, 256, 195–200. (42) Wolf, M. D.; and Uzi Landman, J. Phys. Chem. A 1998, 102, 6129–6137. (43) Romero, D.; Barr´on, C.; G´omez, S. Comput. Phys. Commun. 1999, 123, 87–96. (44) Barr´on, C.; G´omez, S.; Romero, D.; Saavedra, A. Appl. Math. Lett. 1999, 12, 85–90. (45) Northby, J. A. J. Chem. Phys. 1987, 87, 6166.

38

ACS Paragon Plus Environment

Page 38 of 42

Page 39 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(46) Hoare, M.; Pal, P. Adv. Phys. 1971, 20, 161–196. (47) Barr´on, C.; G´omez, S.; Romero, D. Appl. Math. Lett. 1996, 9, 75–78. (48) Wales, D. J.; Doye, J. P. K. J. Phys. Chem. A 1997, 101, 5111–5116. (49) Broyden, C. G. IMA J. Appl. Math. (Institute Math. Its Appl. 1970, 6, 76–90. (50) Broyden, G. C. J. Inst. Math. Its Appl. 1970, 6, 76–231. (51) Fletcher, R. Comput. J. 1970, 13, 317–322. (52) Goldfarb, D. Math. Comput. 1970, 24, 23–23. (53) Shanno, D. F. Math. Comput. 1970, 24, 647–647. (54) Shanno, D. F.; Kettler, P. C. Math. Comp. 1970, 24, 657–664. (55) Lennard-Jones Clusters. http://doye.chem.ox.ac.uk/jon/structures/LJ.html. (56) Leary, R. H. J. Glob. Optim. 2000, 18, 367–383. (57) Doye, J. P. K.; Miller, M. A.; Wales, D. J. J. Chem. Phys. 1999, 111, 8417. (58) Peruzzini, M.; Gonsalvi, L.; Romerosa, A. Chem. Soc. Rev. 2005, 34, 1038–1047. (59) Pfitzner, A. Angew. Chemie - Int. Ed. 2006, 45, 699–700. (60) Bachhuber, F.; von Appen, J.; Dronskowski, R.; Schmidt, P.; Nilges, T.; Pfitzner, A.; Weihrich, R. Angew. Chemie Int. Ed. 2014, 53, 11629–11633. (61) Natta, G.; Passerini, L. Nature 1930, 125, 707–708. (62) Corbridge, D. E. C.; Lowe, E. J. Nature 1952, 170, 629–629. (63) Simon, A.; Borrmann, H.; Craubner, H. Phosphorus Sulfur Relat. Elem. 1987, 30, 507–510.

39

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(64) Simon, A.; Borrmann, H.; Horakh, J. Chem. Ber. 1997, 130, 1235–1240. (65) Brown, A.; Rundqvist, S. Acta Crystallogr. 1965, 19, 684–685. (66) Jamieson, J. C. Science (80-. ). 1963, 139, 1291–1292. (67) Low, T.; Rodin, A. S.; Carvalho, A.; Jiang, Y.; Wang, H.; Xia, F.; Castro Neto, A. H. Phys. Rev. B 2014, 90, 75434–75435. (68) Ling, X.; Wang, H.; Huang, S.; Xia, F.; Dresselhaus, M. S. Proc. Natl. Acad. Sci. 2015, 112, 4523–4530. (69) Sun, L.-Q.; Li, M.-J.; Sun, K.; Yu, S.-H.; Wang, R.-S.; Xie, H.-M. J. Phys. Chem. C 2012, 116, 14772–14779. (70) Hultgren, R.; Gingrich, N. S.; Warren, B. E. J. Chem. Phys. 1935, 3, 351. (71) Kim, Y.; Park, Y.; Choi, A.; Choi, N.-S.; Kim, J.; Lee, J.; Ryu, J. H.; Oh, S. M.; Lee, K. T. Adv. Mater. 2013, 25, 3045–3049. (72) Thurn, H.; Kerbs, H. Angew. Chemie Int. Ed. English 1966, 5, 1047–1048. (73) Karttunen, A. J.; Linnolahti, M.; Pakkanen, T. A. ChemPhysChem 2007, 8, 2373–2378. (74) Ruck, M.; Hoppe, D.; Wahl, B.; Simon, P.; Wang, Y.; Seifert, G. Angew. Chemie Int. Ed. 2005, 44, 7616–7619. (75) Thurn, H.; Krebs, H. Acta Crystallogr. Sect. B Struct. Crystallogr. Cryst. Chem. 1969, 25, 125–135. (76) Marqu´es, M.; Ackland, G. J.; Lundegaard, L. F.; Falconi, S.; Hejny, C.; McMahon, M. I.; Contreras-Garc´ıa, J.; Hanfland, M. Phys. Rev. B 2008, 78, 054120. (77) Kikegawa, T.; Iwasaki, H. Acta Crystallogr. Sect. B Struct. Sci. 1983, 39, 158–164. (78) Ahuja, R. Phys. status solidi 2003, 235, 282–287. 40

ACS Paragon Plus Environment

Page 40 of 42

Page 41 of 42

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(79) Okudera, H.; Dinnebier, R. E.; Simon, A. Zeitschrift f¨ ur Krist. J. Struct. Phys. Chem. Asp. Cryst. Mater. 2005, 220, 259–264. (80) Saal, J.; Kirklin, S.; Aykol, M.; Meredig, B.; Wolverton, C. JOM 2013, 65, 1501–1509. (81) Bergerhoff, G.; Brown, I. D.; et al Allen, F. H. Int. Union Crystallogr. Chester 1987, 77–95. (82) Belsky, A.; Hellenbrandt, M.; Karen, V. L.; Luksch, P. Acta Crystallogr. Sect. B Struct. Sci. 2002, 58, 364–369. (83) Luschtinetz, R.; Oliveira, A. F.; Frenzel, J.; Joswig, J.-O.; Seifert, G.; Duarte, H. A. Surf. Sci. 2008, 602, 1347–1359. (84) Luschtinetz, R.; Frenzel, J.; Milek, T.; Seifert, G. J. Phys. Chem. C 2009, 113, 5730– 5740. (85) Du, Y.; Ouyang, C.; Shi, S.; Lei, M. J. Appl. Phys. 2010, 107, 093718. (86) Lee, K. S.; Geem, Z. W. Comput. Methods Appl. Mech. Eng. 2005, 194, 3902–3933. (87) Bl¨ochl, P. E. Phys. Rev. B 1994, 50, 17953–17979. (88) Perdew, J. P.; Ernzerhof, M.; Burke, K. J. Chem. Phys. 1996, 105, 9982.

41

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 42 of 42