3788
Ind. Eng. Chem. Res. 1996, 35, 3788-3794
Novel Mathematical Programming Model for Computer Aided Molecular Design Nachiket Churi and Luke E. K. Achenie* Department of Chemical Engineering, U-222, University of Connecticut, 191 Auditorium Road, Storrs, Connecticut 06269
The synthesis of new compounds by computer aided molecular design approaches has received considerable attention recently due mainly to the promise of reducing the time and effort required using traditional empirical approaches. These conventional approaches have involved trial and error methods in which a large number of compounds are synthesized and tested in a laboratory. This paper describes a general mathematical programming model for designing compounds that have prespecified performance characteristics. This formulation is novel in that it gives nearly complete information about the molecular structure and is therefore able to exploit accurate property prediction methods that require such information. The computer aided molecular design strategy is able to generate a set of promising candidate compounds for experimental evaluation. The new formulation is illustrated with a refrigerant design case study. 1. Introduction There is an ever-increasing need to design and utilize new and innovative compounds to replace a number of existing compounds that are deemed to be unsuitable for industrial use for various environmental and economic reasons. The search for new compounds usually involves a combination of design heuristics, prior experience, and direct experimental studies. While there is no substitute for experimental study, there is a definite need for a systematic methodology which will generate new molecules that are promising enough to be considered for experimental verification. computer aided molecular design is one such methodology. Computer aided molecular design (CAMD) is a reverse engineering procedure which generates molecules that conform to various physical property requirements that are specified a priori. Various investigators have successfully developed CAMD approaches using group contribution methods, notably Macchietto et al. (1990), Gani et al. (1991), Venkatasubramanian et al. (1994), Vaidyanathan and El-Halwagi (1996) and Duvedi and Achenie (1996) among others. Macchietto et al. (1990) approached the CAMD problem by evaluating feasible molecules formed from a starting set of groups implicitly within the framework of a constrained optimization problem. Gani et al. (1991) have outlined a groupcontribution-based CAMD approach involving preselection of groups, generation of feasible molecules, and, finally, rating the generated molecules in terms of a performance index. Venkatasubramanian et al. (1994) applied a combinatorial optimization method to polymer design using genetic algorithms for property prediction. Vaidyanathan and El-Halwagi (1996) have applied a CAMD procedure to polymer design using various structural and thermodynamic constraints for monomers and the corresponding addition and condensation polymers. This procedure uses the number-occurrence of UNIFAC groups in the monomer and the repeat unit to predict the polymer’s properties. Duvedi and Achenie (1996) developed a mixed-integer nonlinear programming approach to CAMD in which integer variables are * Author to whom correspondence should be addressed. Phone: (860) 486 2756. Fax: (860) 486 2959. E-mail address:
[email protected].
S0888-5885(96)00192-3 CCC: $12.00
employed to give the occurrence-frequency of a particular group in the molecule. This approach is limited in terms of the amount of information that can be obtained. For example, it is unable to specify connectivity between groups, which is required to fully specify a molecule and to take advantage of various accurate property prediction methods. CAMD has been hampered by the availability of accurate physical property prediction models. In recent years, however, a number of group contribution techniques have been developed that are accurate enough to be used in CAMD (Reid et al., 1987; Joback and Reid, 1987; Lai et al., 1987). Some of these methods are able to predict properties to within 5% of the experimental values. The method proposed by Constantinou and Gani (1994) is also able to predict property differences in isomers. This method uses second-order groups that are formed by combinations of first-order groups and whose contributions act as corrections to first-order contributions. In this paper, a systematic mixed integer nonlinear programming (MINLP) approach to CAMD is described that is able to take advantage of various new group contribution techniques such as the one by Constantinou and Gani (1994). The new formulation uses discrete variables that give structural and connectivity information and is thus able to give a nearly complete description of the molecule. This method is able to distinguish between isomers and has the advantage of being able to take advantage of accurate property prediction methods and of providing the designer complete control over the type of constraints to be introduced for the application of interest. This novel methodology is described here with reference to refrigerant design. 2. MINLP Formulation 2.1. Basis Set. Initially, we consider a starting set of distinct structural groups. By structural groups we mean groups of connected atoms formed by a combination of functional groups and whose net valence is at least 1. Examples of structural groups include CH3-, CH2F-, CHCld, etc. The selection of the basis set depends on the intended application and availability of accurate group contribution techniques for predicting the properties of interest. © 1996 American Chemical Society
Ind. Eng. Chem. Res., Vol. 35, No. 10, 1996 3789
available from the z variables. The z variables are interpreted in a manner similar to those of u. A nonzero z713 implies that the seventh group’s first site is attached to the third group. Since u73 and u35 are nonzero, we know that the seventh group is of type 3 (CH) and the third groups is of type 5 (F). Thus a nonzero z713 indicates a bond between CH and F. Since there are a total of seven groups in the molecule, n ) 7, and the first seven terms of w are zero and the remaining terms (w8...wnmax) are nonzero. In this case, nmax, which is specified a priori, is at least 7. In summary, the parameters for this example are
m)6 Figure 1. Example molecule.
v ) [1 2 3 4 1 1]
2.2. Variables. We define the following parameters:
m ) number of structural groups in the basis set
smax ) 4 ) max{vk}
vk ) valence of the kth group in the basis set
n)7
smax ) maximum valence of all the groups in the basis set n ) number of groups in the designed molecule nmax ) maximum number of structural groups allowed in a molecule We then specify a basis set of m structural groups having valencies {vk}, with maximal valency smax. The maximum number of groups allowed in a molecule is nmax. The actual number of groups, n, is obtained from the solution of the mathematical programming model. Throughout the formulation, the index k specifies a structural group’s position in the basis set while indices i and p specify its position in the designed molecule. In addition, we define the following discrete variables:
k
2.3. Structural Feasibility Constraints. The purpose of the structural feasibility constraints is to generate molecules that do not violate basic feasibility criterion such as the octet rule. The molecules should not have any unattached sites or multiple bonds attached to the same site. Our basic philosophy here is to develop a model that is linear in the integer variables. Using the variables defined above, these criteria are expressed as follows: nmax m
∑ ∑ uik e nmax i)1 k)1
nmaxsmax
∑∑
{
zijp )
∑ uikvk
i ) 1...nmax
(2)
k)1
i-1 smax
∑ ∑ zijp g -wi
i ) 2...nmax
(3)
p)1 j)1
{
1 if the ith group’s jth site is attached to the pth group 0 otherwise
Finally, we define w to be a binary vector of length nmax. Its first n terms are zero, while the remaining terms are one. This vector is introduced in the formulation to ensure that a connected molecule is formed from the structural groups. Figure 1 explains how these variables and parameters are to be interpreted for 1,1,1-trichloro-2,2-difluoroethane. The bold numbers next to the groups are the group numbers in the molecule, and the numbers next to the bonds are the site numbers for the individual groups. A basis set of six groups (m ) 6), namely, CH3, CH2, CH, F, and Cl is chosen for illustration, and only the nonzero terms of the variables are given. For example, the nonzero terms of u are u16, u24, and u35; a nonzero u16 implies that the first group in the molecule is of type 6. From the basis set we can see that the sixth group is Cl. Similarly, u24 implies that the second group is a C. At this stage, we do not distinguish between the various ways in which C can be bonded. Distinction between the various types of bonding (>CCd) is done within the property prediction procedures since bonding information is
m
zijp )
p)1 j)1
uik ) 1 if the ith group in the molecule is of the kth kind 0 otherwise
(1)
nmax m
∑ ∑ i)1 k)1
nmax
uik +
wi ) nmax ∑ i)1
w1 ) 0 wi e wi+1 smax nmax
∑∑
(4) (5)
i ) 1...(nmax - 1)
(6)
nmax
zijp -
j)vk p)1
∑ ziv p + Muik e M
p)1
k
i ) 1...nmax, k ) 1...m (7) smax
∑ j)1
smax
zijp )
zpji ∑ j)1
i ) 1...(nmax - 1), p ) (i + 1)...nmax (8)
nmax
∑ zijp e 1
i ) 1...nmax, j ) 1...smax
(9)
p)1 m
∑
k)1
m
uik -
∑ ui-1,k e 0
k)1
i ) 2...nmax
(10)
3790 Ind. Eng. Chem. Res., Vol. 35, No. 10, 1996
Equation 1 puts a limit on the number of structural groups that can constitute a molecule. The double summation on the left-hand side is equal to n, the actual number of groups present in the molecule. Since a minimum of two groups are required to form a molecule and an upper limit of nmax is specified, 2 e n e nmax. Equation 2 is an implementation of the octet rule. The expression on the left gives the number of bonds attached to the ith group, while that on the right states that if the ith group is of the kth type, its valence is vk. This ensures that the number of bonds attached to a group equals the valence of the group. Equations 3-6 ensure that only one molecule is formed. This is realized by constraining the ith group to be attached to one of the groups before it, that is, groups 1 to (i - 1) (eq 3). Thus, the second group has to be attached to the first group, the third group has to be attached to either of the first two groups, and so on. Note that the presence of eq 4 makes eq 1 redundant. The first group is always present (eq 5), and the (i + 1)th group is present only if the ith group is present (eq 6). Equation 7 has to be introduced to account for different valencies of the groups. This equation is a linear analogue of the nonlinear equation smax
uik
Figure 2. Loops and submolecules.
or polymerize. In many cases, polymerization can be eliminated by ensuring that the refrigerant does not have double or triple bonds. There is no systematic way of introducing this constraint mathematically in the absence of detailed structural information. With the new formulation, the constraint that has to be added to eliminate multiple bonds is smax
zijp e 1 ∑ j)1
i ) 1...(nmax - 1), p ) (i + 1)...nmax (12)
nmax
∑ ∑ zijp ) 0
i ) 1...nmax, k ) 1...m
(11)
j)vk+1 p)1
which states that if the ith group is of the kth kind, then that group should not have any connections for its sites (vk + 1) to smax which are nonexistent. M is a number that is significantly larger than all other terms in the equation. Equation 7 is written in a form that is convenient from a computer programming point of view. Equation 8 is the symmetry constraint. Hence, if a group is attached to a second group, the second group is automatically attached to the first one. Equation 9 ensures that a group’s site can be attached at most once to some other group. Equation 10 is applied to force existence of the ith group if the (i - 1)th group is present. The structural feasibility constraints (eqs 1-10) are linear; hence, they form a convex hull, separating feasible molecular structures from infeasible ones. 2.4. Characteristics of the Formulation. The variables used to describe the molecule give a nearly complete information about the molecule’s composition and connectivity. One of the advantages of this new formulation becomes apparent when one looks at the various group contribution methods. Quite a few of these methods make use of a certain amount of structural information in giving accurate predictions. For example, the group contribution method for liquid specific heat (Chueh and Swanson, 1973) requires the addition of 4.5 to the value of Cpl for “...any carbon group which fulfills the following criterion: ‘A carbon group which is joined by a single bond to a carbon group which is connected by a double or triple bond with a third carbon atom.’” It is obvious that this correction cannot be added without any connectivity information. A search of the open literature gave no previous CAMD formulation that was able to give the level of detailed molecular information required for the more accurates and more complexsgroup contribution techniques. Another way in which structural information proves useful is in incorporating bonding constraints. One of the desired properties of a refrigerant is its stability; i.e., the compound should not spontaneously decompose
If for some reason we do not want cyclic molecules, we can introduce a constraint based on the following relation which is a variation of the well-known Euler equation in graph theory:
l)b+a-n
(13)
In this equation, l is the total number of independent loops, b is the number of “submolecules” being designed, a is the total number of attachments, and n is the total number of groups. Obviously, b ) 1 since we want a single connected molecule. A loop is a series of distinct groups that are attached in such a manner that one can go from one group to others, and return to the first group via a different route. A loop is independent if it cannot be described by a combination of other loops. Two groups are said to be attached if there exists at least one bond between them. These terms are explained with examples in Figure 2. The total number of attachments (a) and the number of groups (n) are given by nmax-1 nmax
a)
∑ ∑ cip i)1 p)i+1
(14)
where smax
cip ) 0
if
zijp ) 0, ∑ j)1
i ) 1...(nmax - 1), p ) (i + 1)...nmax
smax
)1
if
zijp > 0, ∑ j)1
i ) 1...(nmax - 1), p ) (i + 1)...nmax nmax m
n)
∑ ∑ uik i)1 k)1
(15)
By using these relations in eq 13 and setting l ) 0, the
Ind. Eng. Chem. Res., Vol. 35, No. 10, 1996 3791
yi - ∑yi e |B| - 1 ∑ i∈B i∈N
Figure 3. Indistinguishable molecules.
Figure 4. Multiplicity.
resulting molecule will not have any loops and will thus be acyclic. In summary, the formulation gives us a lot of control over the features of the target molecules. We can distinguish between isomers, control multiple bonds, and also specify whether or not we desire a cyclic molecule. One small limitation to the connectivity information is that one cannot distinguish between the individual bonds in the case of multiple bonds. For example, the two structures in Figure 3 cannot be distinguished under the formulation. However, this is not a serious drawback since such cases are not common. In any case, no existing group contribution method is able to distinguish between such structures in a systematic manner. The formulation also does not consider any constraints based on bond angles. While it is possible to incorporate bond angles, we could not find any property prediction methods that can take advantage of this additional information. Figure 4 shows two possible ways in which the variables can describe a molecule. In general, there are several such multiplicities in specifying a molecule. This can lead to increased computational expense. The degree of multiplicity depends on the number of groups present in the molecule and also their valencies. Since these different solutions describe the same molecule, they have identical objective function values. This fact can be used to eliminate multiplicities by applying a constraint to the objective function value of the type fnew e fold - . However, such a constraint can cause problems with the termination criterion which depends on an increase in the objective value to detect the minima. Also, since other structures may have the same objective function value, there is a risk of eliminating potentially good molecules by this method. The method used in this work introduces an integer cut (constraint) every time a new solution has the same objective function value as that obtained in the previous iteration. The integer cut given below eliminates y from the search space
(16)
where B ) {i|yi ) 1}, N ) {i|yi ) 0}, and |B| is the cardinality of B. For the rth iteration, there will be (r - 1) such constraints, each eliminating one of the previous solutions from the search space. These integer cuts are in addition to those introduced at each major iteration when a new molecule is obtained. A more rigorous approach is to introduce constraints that force the sites to be ordered; however, this approach was not used in the case study considered in this paper. 2.5. Property Constraints and the Objective Function. In addition to the structural constraints given in the previous section, property constraints have to be enforced to obtain desired molecules. The structural constraints ensure that the molecules formed are physically realizable, while the property constraints eliminate all those molecules that do not possess the desired properties. Thus, the structural constraints act as the core of the MINLP formulation to which various property constraints can be added to get molecules of interest. Selection of property constraints can be quite difficult if the desired properties cannot be expressed mathematically. For example, “low volatility” can be expressed in terms of the boiling point, while “toxicity” cannot be easily interpreted in terms of any fundamental physical property. In such cases, correlations have to be developed using a large number of existing compounds. The selection of appropriate property constraints is very important to the success of the formulation. These constraints are, in general, nonlinear and nonconvex, making it more difficult to reach global optimality. One property constraint that has to be incorporated in all cases is thermodynamic feasibility. A molecule is thermodynamically feasible at a particular temperature T if its Gibbs’ free energy of formation Gf at that temperature is negative; i.e.,
Gf(T) < 0
(17)
While this constraint ensures stability for the generated molecule, it can also eliminate some potentially useful molecules. This may happen if a thermodynamically unstable molecule’s dissociation kinetics are such that it can exist for time periods that are long enough for commercial use. The study of dissociation kinetics is beyond the scope of this paper. Solution of the MINLP gives an optimal molecule based on some specified performance objective that can be a function of an appropriate set of physical properties. As in the case of property constraints, a nonlinear objective function makes the problem nonconvex in general. 3. Property Prediction Methods All CAMD approaches are heavily dependent on the accuracy of the property prediction methods used. Therefore it is of utmost importance to use the most accurate prediction technique available for the properties of interest. Many group contribution techniques are available in the open literature that are reasonably accurate within their domain of applicability. Since these domains frequently do not cover all the molecules that may be of interest, one can use a hybrid of various methods to get the most accurate estimate of a physical
3792 Ind. Eng. Chem. Res., Vol. 35, No. 10, 1996
property. One such method by Duvedi and Achenie (1996) uses a hybrid of various group contribution techniques to get accurate estimates for the boiling point, critical temperature, and critical pressure of a compound. Since all group contribution techniques have an error margin, it is quite possible that the property constraints may eliminate feasible molecules due to errors in a predicted property. This problem can be tackled by performing flexibility and sensitivity analysis. A less rigorous way of countering this problem is by relaxing the property constraints by an extent equal to the error margin of the group contribution method used for that property. For example, the heat of vaporization Hv of Freon 12 as predicted by a group contribution method (Joback and Reid, 1987) is 18.74 kJ/mol. This group contribution method has an average error margin of about 4%. If the new molecule desired should have Hv greater than that of Freon 12, the relaxed constraint that can be employed is Hv g 17.99 kJ/mol. The value of 17.99 kJ/mol is obtained by relaxing the predicted value of Hv by 4%. A similar procedure can be followed for all property constraints. The aim of this relaxation is to minimize elimination of feasible molecules due to errors in prediction of their properties. By the same token, this relaxation can also declare feasible molecules that do not satisfy the property constraints. As mentioned before, the impact of inaccurate estimation techniques can be studied in more detail by performing sensitivity and flexibility analysis. See, for example, Fiacco (1983) or Straub and Grossmann (1993) for an explanation of these concepts. This is beyond the scope of the present paper. 4. Case Study The new formulation is applied to the problem of refrigerant design to obtain substitutes for Freon 12. Refrigerant design has been extensively studied in recent years; see, for example, Joback and Stephanopoulos (1989), Gani et al. (1991), and Duvedi and Achenie (1996). Environmental constraints in the form of ozone depletion potential has been explicitly considered by Duvedi and Achenie (1996) and Churi and Achenie (1996). The case study is used only as a proof-ofconcept. Commercial refrigerant performance requires optimization of the pressure ratio, discharge temperature, COP sensitivity to cycle variability, and a number of other parameters in addition to those considered here. In addition, material and lubricant compatibility will also be needed. 4.1. Solution Methodology. A MINLP is usually difficult to solve, especially if the problem is nonconvex. The CAMD problem discussed in this paper can be expressed as follows:
min f(x)
(18)
0 e Ax e b
(19)
g(x) e 0
(20)
x ≡ binary[0,1]
(21)
subject to
where
The linear constraints (eq 19) are structural constraints, while the nonlinear constraints (eq 20) are property
constraints. The augmented penalty-equality relaxation-outer approximation (AP/ER/OA) algorithm (Viswanathan and Grossman, 1990) is used to solve this problem. The algorithm involves successive solution of mixed integer linear programs (MILPs) and nonlinear programs (NLPs) until the optimum is reached. In this case study, the MILP was solved using IBM’s OSL (optimization subroutine library) and the NLP was solved using NPSOL (Systems Optimization Laboratory, Stanford University). The starting point for the MILPNLP sequence is obtained by solving the relaxed version of the MINLP formulation. The relaxed problemswhich is a NLPsis the same as the MINLP except that the variables are continuous with lower and upper bounds of 0 and 1, respectively. Since the starting point for the MINLP is a local optimum of the relaxed MINLP, the binary optimum point is found quickly. This is important since MINLPs are typically difficult to solve. 4.2. Variables. For the case study considered, the physical properties of interest are the enthalpy of vaporization Hv, liquid specific heat Cpl vapor pressure Pvp, and the ozone depletion potential (ODP). The heat of vaporization at the boiling point is obtained using Vetere’s modification of the Kistiakowsky equation (Vetere, 1973). This is used to get Hv at the temperature of interest using the generalized Watson’s relation proposed by Viswanath and Kuloor (1967). The liquid heat capacity is predicted using the group contribution method developed by Chueh and Swanson (1973). The vapor pressure is determined using the Riedel-PlankMiller equation (Reid et al., 1977), while the ozone depletion potential is obtained from the equations developed by Duvedi and Achenie (1996). One small change made in the tabulated ODP correlations is that the ODP for high carbon compounds is assumed to be high only if the molecule contains any of the halides except for fluorine. For the thermodynamic feasibility constraint, the Gibbs’ free energy of formation is calculated using the group contribution method described by Joback and Reid (1987). Following Duvedi and Achenie (1996), the following conditions are used for the refrigeration cycle: (i) evaporating temperature Te ) -1.1 °C; (ii) condensing temperature Tcd ) 43.3 °C; (iii) average temperature Tmean ) 21.1 °C; and (iv) saturated conditions, assumed so that superheat temperature Ts equals Te. The relevant physical properties are hv(Te), Cpl(Tmean), Pvp(Te), and Pvp(Tcd). Selection of parameter values depends on the application to which the designed compounds are to be applied to. For the case study, the aim is to identify hydrochlorofluorocarbons (HCFCs) that have properties comparable to, or better than, Freon 12, but with significantly lower ODP. The basis set for this case study consists of CH3, CH2, CH, F, and Cl. This gives m ) 5, smax ) 3, and v ) [1 2 3 1 1]. The maximum number of groups in the molecule (nmax) is specified as 8, which corresponds to the highest substitution possible for propane from the groups in the basis set. The limit comes from the observation that higher molecular weight molecules do not have vapor pressures in the range desired for refrigerants. 4.3. Property Constraints. The object of the case studies was to obtain refrigerants that have properties comparable to Freon 12, but with a lower ozone depletion potential. Hence, the property constraints are
Ind. Eng. Chem. Res., Vol. 35, No. 10, 1996 3793 Table 1. Results of Case Study iteration 0 1 2
molecule
objective Cpl/hv
hv(Te), kJ/mol
Cpl(Tmean), cal/mol
Pvp(Te), bar
Pvp(Tcd), bar
ODP
CH3sCl FsHCdCHsF
0.5726 0.8513 0.8911
19.23 20.44 20.20
11.01 17.40 18.00
3.77 2.44 2.37
14.00 9.17 9.92
0.01 0.00
Table 2. Series of Feasible Molecules no.
molecule
objective Cpl/hv
hv(Te), kJ/mol
Cpl(Tmean), cal/mol
Pvp(Te), bar
Pvp(Tcd), bar
ODP
1 2 3 4
CH3-Cl F-HCdCH-F CH3-CH2-F F-CH(CH2)2
0.8513 0.8911 1.1282 1.2458
20.44 20.20 17.78 18.88
17.40 18.00 20.06 23.52
2.44 2.37 4.19 3.51
09.17 09.92 13.07 13.72
0.01 0.00 0.00 0.00
based on the physical property values of Freon 12.
hv(Te) g 15.0 kJ/mol
(22)
A larger enthalpy of vaporization reduces the amount of refrigerant required. The heat of vaporization of Freon 12 is 18.40 kJ/mol (ASHRAE, 1972). By giving a lower limit of 15.0 kJ/mol, the generated molecule will not be significantly worse than Freon 12 in terms of Hv. This relaxation is not done to account for errors in the group contribution method but to prevent elimination of potentially good molecules whose Hv is only slightly lower than that of Freon 12.
Cpl(Tmean) e 30.0 cal/mol
(23)
A low liquid specific heat reduces the amount of refrigerant that flashes upon passage through the expansion valve. This is evaluated at Tmean. The value of the specific heat for Freon 12 obtained from a group contribution method at this temperature is 27.12 cal/ mol. As above, the limit is a relaxation of this value.
Pvp(Te) g 1.4 bar
(24)
The lowest pressure in the refrigeration cycle is kept above atmospheric pressure to reduce the possibility of air and moisture leaking into the system.
Pvp(Tcd) e 14.0 bar
(25)
A high-pressure system increases the size, weight, and cost of the system (Dossat, 1981). A pressure ratio of 10 is considered the upper limit for a refrigeration cycle (Perry and Chilton, 1973)
ODP e 0.2
(26)
By the Clean Air Act of 1995, refrigerants should have ODPs below 0.2 by the year 2000. Hence, this value is used as the upper limit for ODP. The ODP of Freon 12 is 1.0. In addition to the above constraints, thermodynamic stability of the generated molecule is ensured by using Gibbs’ free energy of formation (eq 17). For simplicity, Gf is calculated at standard temperature and pressure. 4.4. Performance Objective. While many types of performance objectives can be proposed to define a good refrigerant, the function used in this case study is
f ) Cpl(Tmean)/Hv(Te)
Table 1 shows the results obtained for the problem described above. The values of the property constraints for the molecules formed is also given. The first row gives the result of the relaxed NLP, which is the best possible solution for the given problem. However, it has no physical significance since the solution variables are not integers. The first feasible molecule formed is CH3Cl with an objective function value of 0.8513. The next iteration gives FsHCdCHsF with an objective function value of 0.8911. Since there is an increase in the objective value, iterations stop at this point with CH3sCl as the “best” molecule. Once an optimal solution is found, the next best solution is found by introducing an integer cut in the formulation. In this manner, it is possible to get a series of solutions in decreasing order of optimality as given in Table 2. As can be seen from Table 2, not all molecules may be suitable for practical purposes. Molecules like FsHCdCHsF may polymerize over the long run and hence cannot be used as refrigerants. It is possible to eliminate such molecules by introducing bonding constraints (eq 12) into the formulation. In any case, the formulation can give a series of molecules that conform to all applied constraints and which can be transferred to the experimental stage for further evaluation. As explained before, one major reason for high computation times is the existence of multiplicities. Constraints to the objective function value were not used to eliminate multiplicities mainly to prevent inadvertent elimination of feasible solutions. Instead, an integer cut was introduced every time a multiple solution was obtained. While it is possible to eliminate multiple solutions, the additional constraints are either nonlinear, or if linear, result in an increase in the problem size. In any case, fast computation is not too important when the alternative is a very time-consuming and expensive experimental search procedure. Another issue of interest is the stability of the molecules obtained from this method. As mentioned earlier, the use of Gibbs’ free energy of formation has some limitations. Another less rigorous way of eliminating infeasible molecules would be to introduce constraints based on heuristics. For example, constraints can be put to limit the number of amino groups in a molecule or to eliminate connections like -OsF. This approach has the potential of increasing the problem size and complexity since every additional heuristic results in an additional nonlinear constraint.
(27)
which is to be minimized. This arises from the desire to have a high Hv(Te) and a low Cpl(Tmean) for reasons given above. The property prediction methods for Cpl and Hv are nonlinear, and hence there is no way of making the objective linear.
5. Conclusion Many accurate group contribution methods require connectivity information in addition to the number of times various structural groups are present in a molecule. There appears to be no CAMD procedure in the
3794 Ind. Eng. Chem. Res., Vol. 35, No. 10, 1996
open literature that gives such detailed molecular information at the computational stage itself. Hence, such procedures are unable to take full advantage of the modern group contribution techniques. This work presents a new molecular representation technique based on binary variables that can be used in CAMD. The new formulation gives nearly complete molecular information and is thus amenable to application of any future property prediction techniques that may require detailed molecular information. The formulation is flexible enough so that it can be used to design any kind of compound by incorporating appropriate property constraints and by defining a suitable performance criterion. By giving complete molecular information, the new formulation opens up a wide range of applications which can be analyzed in a systematic manner. Literature Cited ASHRAE. American Society of Heating, Refrigeration and AirConditioning Engineers, Inc. Handbook of Fundamentals; ASHRAE: New York, 1972. Chueh, C. F.; Swanson, A. C. Estimating Liquid Heat Capacity. Chem. Eng. Prog. 1973, 69, 83. Churi, N.; Achenie, L. E. K. Design of refrigerants and refrigerant mixtures: Incorporating environmental constraints. AIChE 1996 Spring National Meeting, New Orleans, LA, 1996; AIChE: New York, 1996; Paper 88b. Constantinou L.; Gani, R. New Group Contribution Method for Estimating Properties of Pure Compounds. AICHE J. 1994, 40 (10), 1697. Dossat, R. J. Principles of Refrigeration; John Wiley & Sons: New York, 1981. Duvedi, A.; Achenie, L. E. K. Designing Environmentally Safe Refrigerants Using Mathematical Programming. Chem. Eng. Sci., 1996, in press. Fiacco, A. V. Introduction to Sensitivity and Stability Analysis in Nonlinear Programming; Academic Press: New York, 1983. Gani, R.; Nielsen, B.; Fredenslund, Aa. A Group Contribution Approach to Computer-Aided Molecular Design. AICHE J. 1991, 37 (9), 1318. Joback, K. G.; Reid, R. C. Estimation of Pure-Component Properties from Group-Contributions. Chem. Eng. Commun. 1987, 57, 233.
Joback, K. G.; Stephanopoulos, G. Designing molecules possessing desired physical property values. In Proceedings of the Foundations of Computer Aided Process Design (FOCAPD), Snowmass, CO, July, 1989; p 363. Lai, W. Y.; Chen, D. H.; Maddox, R. N. Application of a Nonlinear Group-Contribution Model to the Prediction of Physical Constants. 1. Predicting Normal Boiling Points with Molecular Structure. Ind. Eng. Chem. Res. 1987, 26, 1072. Macchietto, S.; Odele, O.; Omatsone, O. Design of Optimal Solvents for Liquid-Liquid Extraction and Gas Absorption Processes. Trans. the Inst. Chem. Eng. 1990, 68, 429. Perry R. H.; Chilton, C. H. Perry’s Chemical Engineers’ Handbook, 5th ed.; McGraw-Hill, Inc.: New York, 1973. Reid, R. C.; Prausnitz, J. M.; Poling, B. E. The Properties of Gases and Liquids, 4th ed., McGraw-Hill, Inc.: New York, 1987. Straub, D. A.; Grossmann, I. E. Design Optimization of Stochastic Flexibility. Comput. Chem. Eng. 1993, 17 (4), 339. Vaidyanathan, R.; El-Halwagi, M. Computer-Aided Synthesis of Polymers and Blends with Target Properties. Ind. Eng. Chem. Res. 1996, 35, 627. Venkatasubramanian, V.; Chan K.; Caruthers, J. Computer Aided Molecular Design using Genetic Algorithms. Comput. Chem. Eng. 1994, 18, 833. Vetere, A. Modification of the Kistiakowsky Equation for the Calculation of the Enthalpies of Vaporization of Pure Compounds; Technical report; Laboratori Ricerche Chimica Industriale, SNAM PROGETTI: San Donato Milanese, Italy, 1973. Viswanath, D. S.; Kuloor, N. R. On a Generalized Watson’s Relation for Latent Heat of Vaporization. Can. J. Chem. Eng. 1967, 45, 29. Viswanathan, J.; Grossmann, I. E. A Combined Penalty Function and Outer Approximation Method for MINLP Optimization. Comput. Chem. Eng. 1990, 14, 769.
Received for review April 3, 1996 Revised manuscript received June 17, 1996 Accepted June 18, 1996X IE9601920
X Abstract published in Advance ACS Abstracts, September 1, 1996.