Fully Automated Molecular Design with Atomic Resolution for Desired

Keywords: genetic algorithm, COSMO-SAC, octanol-water partition coefficient, ... 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 4...
2 downloads 0 Views 846KB Size
Subscriber access provided by TUFTS UNIV

Thermodynamics, Transport, and Fluid Mechanics

Fully Automated Molecular Design with Atomic Resolution for Desired Thermophysical Properties Hsuan-Hao Hsu, Chen-Hsuan Huang, and Shiang-Tai Lin Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.8b01004 • Publication Date (Web): 28 Jun 2018 Downloaded from http://pubs.acs.org on July 1, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Fully Automated Molecular Design with Atomic Resolution for Desired Thermophysical Properties Hsuan-Hao Hsu,† Chen-Hsuan Huang† and Shiang-Tai Lin* Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan

Abstract The value of fine and specialty chemicals is often determined by the specific requirements in their physical and chemical properties. Therefore, it is most desirable to design the structure of chemicals to meet some targeted material properties. In the past, the design of specialty chemicals has been based largely on experience and trial-and-error. However, recent advances in computational chemistry and machine learning can offer a new path to this problem. In this presentation, we demonstrate a successful example where the structure of a chemical of specified value of octanol-water partition coefficient (Kow) can be predicted by computers. This method consists of two parts, a robust method, the COSMO-SAC activity coefficient model, that predicts the activity coefficient with input of only the molecular structure. The second component of this method is a derivative-free optimization algorithm that searches in the multidimensional structure space for the desired value of Kow. In particular, the genetic algorithm (GA), based on the Darwinian theory of evolution and natural selection, combined with simulated annealing (SA) is adopted for this purpose. Compared to other optimization algorithms, GA can overcome the problem of being trapped in local minima and SA can help improve the convergence. Therefore, the GA-SA combination has been found to be very suitable for molecular design. We show that the value of Kow can be achieved within 1% of the target in 30 generations with a proper set of evolution parameters (including the size of the population, the probability of selection, and the rate of temperature annealing, etc.). The same method can be applied to the search for chemicals with other desired properties, such as vapor pressure and solubility.

Keywords: genetic algorithm, COSMO-SAC, octanol-water partition coefficient, molecular design, simulated annealing †Both authors contributed equally to the work *to whom all correspondence should be addressed. E-mail: [email protected]

ACS Paragon Plus Environment 1

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 27

1. Introduction Physical properties of a chemical and its fluid phase behavior when mixed with other chemicals are necessary data for determining the energy efficiency of a manufacturing process and the selection of suitable separation and purification units.1 Conventionally such data are obtained from experimental measurements. However, experimental approaches become ineffective for problems such as screening for solvents for CO2 capture, replacement of chlorofluorocarbons (CFC) for refrigeration, suitable solvents for developing new drugs, etc.,2 where one needs data for many new chemicals and their mixtures with other chemicals. Without reliable data, the search for candidate chemicals must be based on chemical intuition, experiences from past research, or on trial and error. Fortunately, with advances in molecular theory and computer sciences, the search for chemicals with targeted function and features using computers is becoming possible. Computer-Aided Molecular Design, or CAMD, is a computational approach that finds chemicals with specified properties and constraints, such as solubility in water, vapor pressures, range of molecular weight, thermal stability. The search algorithm typically consists of two parts,3 a forward algorithm that determines the property of a chemical based on its molecular structure (i.e. property estimation), and an inverse algorithm that searches for the chemical structure that results in the target properties (i.e., structure search). The forward algorithm usually relies on methods for property estimation, such as group contribution (GC),4,

5

quantitative structure-property relationship (QSPR),6,

7

and more

8

recently molecular simulations. With the property estimation given by the forward algorithm, the search for structures with target properties is essentially an optimization problem, i.e., f(x)=property and find x, with x representing the molecular structure and f(x) representing the forward algorithm. Since molecular structures cannot be represented by continuous variables, the inverse problem belongs to the mixed integer non-linear programming (MINLP) problem in mathematics.9 The key feature of such types of problems is that the function is not differentiable. Therefore, these problems must be solved by derivative-free optimization algorithms.10 Figure 1 illustrates the general framework of CAMD.

ACS Paragon Plus Environment 2

Page 3 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Output: chemical candidates

Input: target properties

property estimation (forward algorithm) molecular structures

structure search (inverse algorithm)

properties of fluids

Figure 1. Illustration of Computer-Aided Molecular Design (CAMD) for searching for chemicals possessing target properties.

A pioneering example of CAMD is the work of Gani and Brignole in 1983 who demonstrated the search for optimal solvents for liquid extraction.11 In their work, the group contribution (GC) method, UNIFAC,12 was used for the prediction of activity coefficient (forward algorithm), and an exhaustive search for all possible combination of functional groups was conducted (inverse algorithm). Constraints were imposed to reduce the number of possible combinations. However, since group contribution methods do not differentiate compounds with different connectivities between the functional groups, the output from the search is not a specific chemical species but any species with the same content of functional groups. The GC approach was later developed into a multi-level CAMD method where the connectivity of functional groups and 3-dimensional structure are taken into account after the initial functional group search.13 Since then, GC based CAMD has been applied toward the search for alternative refrigerants,14 and gas absorbents,15 in addition to the design of extraction solvents.16-26 The QSPR methods determine the properties of a chemical based on contributions from molecular descriptors that are calculated based on structure information of the chemical. Therefore, the CAMD problem can be resolved without reconstructing the 3D structure as in the GC approach. QSPR based CAMD has been applied for chemicals with desired surface tension,27, 28 toxicity,29 octanol-water partition coefficient,30 and drug design.31 It is useful to note that the focus of research for CAMD based on QSPR was often on the development of search algorithms. 29, 32 Given that QSPR methods usually provide properties at some fixed conditions (a specific temperature and solution concentration), they are not applicable for design problems that

ACS Paragon Plus Environment 3

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

require varying operation conditions (for example, extraction). Recently, there are emerging theories of property estimation based on quantum-mechanical solvation calculations. The outstanding examples are the COSMO-based activity coefficient models, such as COSMO-RS33,

34

and COSMO-SAC.35-37 In such methods, the properties of a fluid are

determined based on molecular interactions quantified through the screening charges on the molecular surface when the molecule is embedded in a perfect conductor. These methods do not contain any species dependent parameters and thus are applicable for, in principle, any chemical species. Recently, extensive studies were conducted on the performance of COSMO-SAC on various fluid properties.38 It was found that the COSMO-SAC model outperforms group contribution based modified UNIFAC models for liquid-liquid equilibrium and activity coefficients at infinite dilution conditions. Its prediction inaccuracy is about 7% for the pressure at vapor-liquid equilibrium, which is about 2 to 3 times higher than that from modified UNIFAC. Nonetheless, the predictions from COSMO-SAC are generally satisfactory. There are recent examples of incorporation of COSMO-based methods for property estimation in CAMD.39-41 Qi and coworkers developed a COSMO-GC-IL model39 for efficient prediction of activity coefficient of ionic liquids and demonstrated its use in IL design for solvent extraction.39, 40 Scheffczyk et al.42 generated the chemical structures with LEA3D30 and used COSMO-RS for the design of extraction solvents. It should be noted that these examples use either functional groups or fragments as the building blocks of molecules. This is different from this work where the atoms are taken as the building blocks. Many MINLP algorithms have been used for the search for the chemical structure with desired properties. Among them the genetic algorithm (GA)43-48 based on Darwinian theory of evolution and natural selection may be the most widely used method. In this method, new chemical species are created via crossover/hybridization and mutation of an existing population. The species that better fit the desired criteria are selected and others are discarded. Therefore, the new generation of species would, on average, show a better fit to the target criteria, and the repetition of the creation and selection process could eventually produce a collection of species that meets (or are very close to meeting) all the desired criteria. More recently, the combination of simulated annealing (SA)49-54 with GA was found to effectively prevent the search from being trapped in local optima. Therefore, GA-SA55 has becoming a popular algorithm for CAMD. Table S1 in Supporting Information summarizes the recent works on CAMD, categorizing in terms of the property estimation and search algorithm. Interested readers are referred to many excellent reviews for further details.56-58

ACS Paragon Plus Environment 4

Page 4 of 27

Page 5 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

In this work, we combine COSMO-SAC, GA and GA-SA as a way for CAMD. Unlike the previous works, we propose a new molecular encoding scheme that allows for design of new chemical species at the atomic level. The effectiveness and parametric sensitivity of this scheme are tested with different target ln(Kow). Our results show that COSMO-SAC/GA or GA-SA does provide an effective means for the search for target properties and could be of great potential for molecular design for a variety of applications.

2. Theory 2.1 The Octanol-Water Partition Coefficient (Kow) Water and 1-octanol are partially miscible under ambient conditions and form two liquid phases, one is water-rich (nearly pure water) and the other is octanol-rich (containing about 27.5 mol% of water).1 When a third component is introduced to the mixture, its equilibrium distribution among the two phases, also known as the octanol-water partition coefficient, Kow,i, can be used to quantify the hydrophobicity of the chemical species i K , = lim →





= lim →

  

  

. ,

=(

. ,

)

(1)

where c, x and γ indicate the molar concentration, mole fractions, and activity coefficient, respectively. The superscripts O, W, and ∞ represent the octanol-rich, water-rich phase, infinite dilution, respectively. The amphiphilic character of 1-octanol makes it a good surrogate of lipids in animals and organic matters in sediments. Therefore, Kow can be used for the estimation of fate of a chemical when discharged to the environment59, 60 and the likelihood of its accumulation in animals through the food chain.61 Kow is also used as a molecular descriptor for toxicity.62 In this work, we use Kow as a target property for molecular design.

2.2 Property Estimation from the COSMO-SAC Model The COSMO-SAC model provides activity coefficient of a chemical in liquid mixtures with input of only connectivity between atoms. In this model, the activity coefficient is determined based on the solvation free energy, i.e., the work required to transfer a molecule from the ideal gas phase to the solution under constant temperature and pressure.63 In the COSMO theory,33 the solvation process is modified to transferring the molecule to a perfect conductor (molecule is perfectly screened by apparent charges on the molecular surface) and

ACS Paragon Plus Environment 5

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 27

then transforming the conductor to the real solution (removal of screening charges). In COSMO-SAC model, the activity coefficient of a species i in solution j is calculated as64 ln / =

∗#$! ∗#$! ∆ /! %∆ / &'

+ + )*/

(2)

+ where )*/ is the Staverman-Guggenheim model65, 66 accounting for the size and shape

effects to the activity coefficient ∗-./ ∗-./ ∆ /! %∆ / &'

+ )*

!



1

4

7

+ = )*/ = ln 0 2 3 + 5 6 ln 0 3 + ) − 

8

12 

∑< ;< )
 = ; ? / ∑ ; ? , ) = (z/2)(1 − 6 ), ; is the mole fraction of compound i, ? is the normalized volume for i, 6 is the surface area parameters for i, and z ∗DE is the coordination number. ∆C/< is the work associated with the transform the perfect

conductor to solution j FG∗HIJ /! KL

= * ∑QR M (NO ))*Γ (NO )

(4)

In eq. (4) * =Ai/aeff is the number of segments in molecule i, with aeff being the area of a segment and Ai the total surface area of species i. pi(σ) is the σ-profile, i.e., the distribution of the screening charges on the molecular surface. The σ-profile can be obtained from first principle COSMO calculations33 with input of atomic coordinates and atomic radii for defining the solvation cavity. The segment activity coefficient in the last time in eq. (4) is determined based on segment interactions ΔT(NO , NU ) lnΓV (N ) = −ln W∑QX M ( NU )ΓV (NU )exp [−

F](QR ,QX ) ^'

]`

(5)

The coefficients in calculating segment interactions and aeff are the only needed parameters in COSMO-SAC model. Their values can be found in previous works35,

36

and are not

reproduced here. It should be noted that the COSMO-SAC model has been shown to provide satisfactory predictions for fluid phase behaviors of liquid mixtures38 and Kow.67

2.3 The Search for Chemical Structures through Genetic Algorithm The COSMO-SAC model provides the needed activity coefficients for calculating Kow with only the information of the chemical structure of the molecule of interest. The search for structure of some desired value of Kow is then an optimization problem, i.e., finding the structure that minimizes the difference between the Kow of corresponding chemical and the desired value, for example, objFunc(structure) = ∑ |(kl)k M?mMn?op) − (ol?qno rl)sn) |

ACS Paragon Plus Environment 6

(6)

Page 7 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

where the summation is over all requirements, including Kow, size of molecule, etc. The genetic algorithm (GA) is a powerful approach for such types of problems. The GA follows the idea of Darwin’s theory of evolution and natural selection. A population of different species should first be generated. New generations are then created from the previous one (parent generation) through crossover (hybridization of genes) and mutation. The size of population is restricted by selections based on fitness, i.e., how close the property of a species is to the target value. The GA has been found to be very effective in problems containing multiple local minima.68 To apply this method for molecular design, it is necessary to express the three dimensional molecular structures in terms of genes and develop a data structure allowing for crossover and mutation operations.

2.3.1 Molecular DNA and data structure The simplified molecular-input line-entry system (SMILES)69 provides a line notation of a molecule and is used as the molecular gene in this work. The basic gene elements considered so far are listed in Table 1. Hydrogen atoms are not explicitly included in SMILES and are assumed to fill all the unsaturated bonds. For example, the SMILES representation of water is O; 1-octanol is CCCCCCCCO; ketone is CC(=O)C. Note that the 3D structure of a molecule can be generated from SMILES with the help of open source program OpenBabel.70 OpenBabel can also generate unique, canonical SMILE representation for the same molecule.

Table 1. The basic elements of molecular genes Name

Carbon (C)

Oxygen (O)

Nitrogen (N)

Orbital hybridization sp3

Probability†

1

SMILES representation C(-)(-)(-)(-)

sp2

2

C(=)(-)(-)

0.24

sp sp

3 4

C(#)(-) C(=)(=)

0.08 0.008

sp3

5

O(-)(-)

0.12

sp3

6

O(-)

0.12

sp

2

7

O(=)

0.075

sp

3

8

N(-)(-)(-)

0.05

sp

2

9

N(=)(-)

0.05

10 11 12

N(#) F(-) Cl(-)

0.02 0.005 0.005

sp Fluorine (F) Chloride (Cl)

Gene ID

ACS Paragon Plus Environment 7

0.24

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 27

Bromine (Br) 13 Br(-) 0.005 †probability of a gene being selected in the creation of new spices (initial structure and alienization) or in the mutation operation New species can be generated from a pool of existing ones through gene operations: hybridization and mutation. To handle the gene operations, it is necessary to establish a data structure to store the connectivity between the atoms. We propose a data structure containing 4 types of information for each element in the molecule: index, gene ID, parent index, and bond order. Atom index is a sequential integer assigned to each element in the SMILES string. Gene ID records the ID of that element as indicated in Table 1. The connectivity between elements is stored in the parent index, indicating which element it is connected to on the left-hand-side of the SMILES string. The bond order represents the type of bond (single, double, triple) between the present atom and its parent atom. (Note that the parent index and bond order of the first element in the SMILES string is assigned to be 0.) An example of the data structure for acetone is given in Table 2. The 4 non-hydrogen atoms in acetone CC(=O)C are indexed sequentially. The first carbon atom (gene ID=1 for sp3 C) does not have any other elements on its left, therefore its parent index and bond order are both zero. The second carbon atom (sp2 C with gene ID=2) is connected to the first carbon atom through a single bond. Therefore, its parent index is 1 (index of first carbon) and bond order is 1. The 3rd oxygen atom (sp2 O with gene ID=7) is connected to the second carbon through a double bond. Therefore, its parent index is 2 (index of second carbon) and bond order is 2. The last (4th) carbon atom (sp3 C with gene ID=1) is connected to the second carbon through a single bond. Therefore, its parent index is 2 (index of second carbon) and bond order is 1.

Table 2. Example data structure for element connectivity Name

Stored information

Example: CC(=O)C

Index

index of element in the molecule.

1,2,3,4

Gene ID

gene ID of the atom (as in Table 1)

1,2,7,1

Parent index

index of the parent element

0,1,2,2

Bond order

bond order with the parent element

0,1,2,1

2.3.2 Crossover and mutation for creating new species

ACS Paragon Plus Environment 8

Page 9 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

All molecules in one generation are subjected to gene hybridization (crossover) operation with a probability of (Pc). Therefore, for a generation of N molecules, N×Pc of them will be selected and randomly paired for crossover operation. For a selected pair, one of the bonds in each molecule will be selected and checked if their bond orders match for crossover operation. If the bond orders are different, bond selection would be repeated at least 3 times to find a possible match. Figure 2 shows an example of crossover operation between acetone and propanal to form propane and methylglyoxal.

propane

acetone

methylglyoxal

propanal

Figure 2. Illustration of generation of new species via crossover operation. Two new species, propane (CCC) and methylglyoxal (CC(=O)C=O), on the right of the figure are generated from the parent species, acetone (CC(=O)C) and propanal (CCC(=O)).

In addition, all molecules are also subjected to mutation operation with a probability of (Pm). In other words, in each generation, N×Pm molecules will be selected for the subsequent mutation operation. We propose four types of mutation operations: exchange, addition, subtraction, and alienization. Figure 3 illustrates the mutation of propanal to propane through subtraction of (=O), to propanamide through addition of (-N), to propanol through exchange of (=O to -O), or to 1-butene (CCC=C) through alienization. The subtraction, addition, and exchange mutations have equal probability of occurrence. The alienization operation is to transform the original molecule to a completely new species. This operation takes once place every 25 generations and only if the fitness of the original molecule is less than 100. The purpose of this operation is to introduce new genes to the population and reduce the probability of the evolution being trapped in a local optimum. Table S2 provides the data structures of molecules generated from acetone and propanol through crossover and mutation.

ACS Paragon Plus Environment 9

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 27

Subtraction propane Addition propanamide Exchange 1-propanol propanal Alienization

1-butene

Figure 3. Illustration of mutation operation. Four possible new species, propane (CCC), (CCC(=O)N), propanol (CCCO), and 1-butene (CCC=C) are generated from 1-propionaldehyde (CCC(=O)) with different types of mutations.

2.3.3 Fitness for selection of better species The value of fitness of a species represents its ability of survival to the next generation. In general, molecules with properties closer to the target value should have a higher fitness. In this work, the fitness of a species i is defined as: F = t + u expv− ∑< ;,< w + r(k )

(7)

where A and B are coefficients that affects the convergence, ;,< is the absolute difference between the calculated and target property j of species i. The summation runs over all desired target properties. The r(k ) is some penalty function for additional constraints. In this work, the ln(Kow) is used as the target property and the deviation of size of species from desired value is considered as a penalty, i.e., }{D~E}

z{|z F = t + u exp(−|lnxy], − lnxy],

|) − C €V %V.#‚$ €

(8)

where si is the size (number of non-hydrogen atoms) of molecule i and starget is the target molecular size. The fitness function has a strong influence on the convergence of GA. Equation. (12) gives an upper bound value, A+B in fitness. If the fitness falls below zero (due to the penalty of size), then the fitness is set to zero. We found that setting A to 40, B to 160, and C to 3 provides satisfying performances. After new species are created (via crossover and mutation), the probability of the species to be selected to stay in the next generation (survival probability) can be determined according to roulette wheel selection method,71 i.e., ƒ

„

…

=∑

(9)

…

ACS Paragon Plus Environment 10

Page 11 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

2.4 Simulated Annealing for better convergence Simulated annealing (SA) is a technique to prevent premature convergence or being trapped in local minimum in an optimization process. The idea of SA comes from annealing in metallurgy, a technique involving heating and cooling of a material to control the size of its crystals. At high temperatures, the acceptance for good and poor solutions (evaluated by fitness) are equally probable. At low temperatures, the good solutions have a much higher probability to be accepted compared to the poor ones. Therefore, accepting poor solutions in the early stages of GA could allow for the exploration of more possibilities, thus reduce the likelihood of being trapped in local minima. When enough species are sampled, it is desired to differentiate the good and poor species to accelerate convergence. Therefore, an annealing rate α is used to control the annealing temperature at different generations ƒ

„%+„

†‡ˆ

%(…

= exp(

†

%… )

‰ † 'Š

)

(10)