Biomimetic Adaptation of the Evolutionary Algorithm, NSGA-II-aJG

Feb 17, 2009 - Page 1 ... Biogenetic Law of Embryology for Intelligent Optimization ... Several of the recent optimization techniques have been adapte...
0 downloads 0 Views 509KB Size
8054

Ind. Eng. Chem. Res. 2009, 48, 8054–8067

Biomimetic Adaptation of the Evolutionary Algorithm, NSGA-II-aJG, Using the Biogenetic Law of Embryology for Intelligent Optimization Manojkumar Ramteke† and Santosh K. Gupta* Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India

Several of the recent optimization techniques have been adapted from nature. The elitist nondominated sorting genetic algorithm with the adapted jumping gene operator (NSGA-II-aJG) is one such evolutionary technique inspired by genetics. This algorithm is quite useful for solving multiobjective optimization problems. The drawback of these techniques is the inordinately large amount of computational effort required for solving real-life problems, even though these techniques are quite robust as compared to conventional techniques. Their use for online optimization is particularly limited. Many industrial optimization problems require frequent changes in the objective functions as well as the decision variables, even though the system itself remains the same. Surprisingly, no algorithm has been developed which makes use of previous information for solving a different problem for the same system in a comparatively short (computational) time. The algorithm developed in this study, namely, B-NSGA-II-aJG, carries this out more intelligently using the biogenetic law of embryology. This algorithm increases the speed of convergence significantly. Introduction Several evolutionary algorithms mimicking natural processes have been developed for solving multiobjective optimization problems, e.g., simulated annealing,1 ant-colony optimization,2 particle swarm optimization,3 genetic algorithm (GA),4-6 etc. One of these, inspired by the process of evolution, is the nondominated sorting genetic algorithm with the fixed-length jumping gene (JG) adaptation7 (NSGA-II-aJG). In the conventional genetic algorithm, the initial population (of several solutions, called chromosomes) is generated randomly. These then undergo an evolutionary process with the crossover, mutation, jumping gene, and other operations, until the population reaches the fittest (converged, optimum) stage. This algorithmic process is akin to the production of single-cell animals randomly, following its entire life cycle, and continuing until the evolutionary process leads to the formation of multicellular organisms, all the way up to mammals. It took ages for protozoa to evolve into mammals. But, in real life, if a mammal produces an offspring, these evolutionary steps are repeated at the developmental stage of the embryo [in humans, 9 months are required for the zygote (single cell, similar to protozoa) to take the form of a child]. Thus, the time period associated with evolution is represented by a very small time period comprising the developmental stage of an embryo. The same concept is used in the nondominated sorting genetic algorithm with the fixed-length jumping gene adaptation (NSGAII-aJG). In the present study, a new adaptation of NSGA-IIaJG, namely, B-NSGA-II-aJG, is developed based on the biogenetic law of embryology.8 This law states that ontogeny repeats phylogeny. In other words, the developmental stages of the embryo of a particular species show all the steps of evolution prior to that species (not exactly, but quite reasonably). This has been an evolutionary perspective of Baer’s law8 by Ernst Haeckel. This important biological concept is used to add intelligence in an evolutionary optimization algorithm. The optimization problems encountered in industries mostly involve multiple objective functions. Quite frequently, seVeral * To whom correspondence should be addressed. On leave from IIT Kanpur. E-mail: [email protected]. Tel.: 91-22-2576 7256. Fax: 91-22-2572 6895. † Department of Chemical Engineering, IIT Kanpur 208016.

different multiobjective optimization (MOO) problems need to be solved for the same system. In the petroleum industry, the demands of various products like kerosene, diesel, petrol, etc., as well as the quality of crude (cost of the crude also changes frequently these days), changes from time to time. Thus, the objective functions, constraints, and the decision variables also change, though the system remains essentially the same. Sometimes the existing configuration of reactors needs to be changed due to technological improvements. For such cases, the results of the original MOO problem are available. For example, in the MOO of an industrial phthalic anhydride reactor,9 the number of catalyst zones/beds can be increased since the larger the number of beds in this system, the better its performance. In this case, results of a simpler original optimization problem involving a lower number of catalyst beds are available. Sometimes, the number of objective functions increases for the same system to meet newer and stiffer standards set by governments, e.g., the permissible limits of pollutants become more stringent and the consumption of scarce natural resources, even water, needs to be reduced. An increase in the number of decision variables also comes up quite frequently in the automobile industry with technological advancements. In addition to these, one may have to solve a different problem (arising due to different kinds of failures) for the same system a few times for online optimizing control, and this must be done quickly in real time (no such restriction on the computational time may exist for offline optimization). Though several studies10-16 have been reported on online optimizing control, the use of evolutionary algorithms is quite cumbersome because of the high computational effort required. In all such cases the results of an optimization problem solved previously, often a simpler (original) problem can be used intelligently to solve the new (modified) optimization problem with lower computational effort than required otherwise. The modified problem could be completely different from the original one but should involve the same system (model equations; though minor changes in these can also be handled by this adapted algorithm) and could involve an increase in the number of decision variables, a change in the bounds, a change in the objective functions or their number, addition of extra constraints, etc. In

10.1021/ie801592c CCC: $40.75  2009 American Chemical Society Published on Web 02/17/2009

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

this study, an attempt is made to develop a strategy to achieve this goal and a new algorithm, B-NSGA-II-aJG, is presented. Four different sample MOO problems are solved to illustrate the use of B-NSGA-II-aJG in the present study. Problems 1 and 2 are simple benchmark problems while the other two are computationally more intense and industrially relevant problems. Problem 3 deals with an industrial phthalic anhydride reactor whereas problem 4 deals with the online optimizing control of an industrial semibatch nylon-6 reactor to negate the effect of a disturbance, e.g., temporary heater failure. In all these problems, it has been assumed that the original problem has already been optimized (converged) and the existing optimal solutions are used to optimize the newer, modified problem using the biogenetic law of embryology. Summary of NSGA-II-aJG A short summary7,17 of the binary-coded NSGA-II-aJG is first presented. In this algorithm, the decision variables are taken in the form of a set of lstr binary numbers (describing the complete set of decision variables). These constitute a chromosome. The binaries for each chromosome are generated randomly at the beginning (first generation). A population of Np chromosomes is generated in this manner. The binaries corresponding to each decision variable are then coded/mapped into decimal values so that they all lie within their bounds. The decoded values of each variable are used in the model equations to calculate all the objective functions in an MOO problem. The chromosomes are then classified into “ranks” or “fronts” based on the concept of nondomination.5,6 Also, the Euclidean distance in the objective-function space between nearest neighbors, referred to as the “crowding distance”, is evaluated for all the chromosomes. Two chromosomes are picked randomly from the population of parent chromosomes, and the better of the two, based on the rank and the crowding distance, is copied into a mating pool. This is known as tournament selection. Np chromosomes are, thus, copied for future operations. It may be mentioned that this procedure may lead to some inferior chromosomes also finding their way into the mating pool (this is a characteristic of GA, since it is possible that a poor chromosome may give a good daughter). These are the better “parents”. Two chromosomes are picked randomly from these Np in the mating pool to undergo crossover, mutation, and the jumping gene operation5-7,17 to produce two (initial) daughter chromosomes. The Np daughters so produced are mixed with the Np better parents. These 2Np chromosomes are reranked and the best Np (final) daughter chromosomes are selected from these. The process of picking up Np (final) daughter chromosomes from among the (2Np) better parents and the initial daughters is called elitism since some of the better (elite) parents pass on to the next generation. The newly selected daughter chromosomes are then used as the starting population for the next generation. This process continues until the number of generations exceeds a user-specified maximum number, maxgen, after which the Pareto front does not change much. A general mathematical approach is proposed to characterize the convergence of the Pareto set of optimal solutions in any generation. The MOO problem involving N objective functions, I1, I2,..., IN, is solved over a large number, maxgen, of generations until Np fully-converged solutions, I1,i, I2,i,..., IN,i, i ) 1, 2,..., Np, are obtained. The stored results of this problem, including the fully converged ones, are then arranged in ascending order of values of, say, I1. These are then used to calculate the sum-of-squared deviation, σ2, of the Np set of points in any generation (with respect to the fully converged set) using

N

σ2 )

Np

∑∑ j)2 i)1

(

Ij,i - Ij,opt,i range of Ij,opt

(N - 1)Np

)

8055

2

(1)

In eq 1, Ij,opt,i is the interpolated value of the jth objective function in any (earlier) generation corresponding to I1,i. The “range of Ij,opt” is the range of the jth objective function in the fully-converged Pareto front and is used for normalization. Thus, σ2 gives the normalized mean square deviation of the Pareto set of solutions at any generation with respect to the fully converged Pareto front. The value of σ2 decreases with an increase in the number of generations, and at some point, it becomes smaller than 0.1. This point, the maxgen1th generation (maxgen1 < maxgen), is assumed to correspond to the nearconverged solution. Higher values of σ2 reflect either initial unconverged solutions or final convergence to local Pareto sets. At times, maxgen1 is taken slightly beyond the point where σ2 e 0.1 if this leads to a slightly better spread of the Pareto solutions. The value of σ2 can be used to check the level of convergence of a set of solutions obtained from any multiobjective algorithm. It is simpler and more informative as compared to other methods used in the literature.5,6,17 B-NSGA-II-aJG In the proposed algorithm, B-NSGA-II-aJG, we first solve the original optimization problem, a relatively simple one (optimization results are often available for similar and simpler problems on the same system), using the conventional NSGAII-aJG. The results of the maxgenth generation can be used for calculating values of σ2 for the solutions of earlier generations. Results (nonoptimal; associated with σ2 g 0.1) from the first to the maxgen1th generation of the original problem are then selected randomly to give Np chromosomes as the initial population (seed) for the modified problem. This mimics the biogenetic law, as explained later. For example, if nearconvergence is attained in maxgen1 ) 200 generations when Np is 100, then one member is picked randomly from every two generations so that the total number of chromosomes selected by the end is Np. These Np solutions are the initial chromosomes of the modified problem (in contrast to what is done in NSGA-II-aJG, where all the initial binaries in the chromosomes are generated randomly). Each of the decision variables in the initial Np chromosomes for the modified problem is available both in the form of a real (decimal) number and as a set of binary numbers. Since the bounds of the decision variables could be different in the modified problem, we choose the former. The real value (say, x ) 2.0 for the case where the bounds are 0 e x e 6.0 and where two binaries describe each decision variable, so x can take on values of 0, 2.0, 4.0, and 6.0) of the decision variable in the initial chromosome is taken to be the same in the modified problem (i.e., x ) 2.0, where the new bounds are 1.0 e x e 10 and where two binaries again describe each decision variable). However, if the bounds in the new problem are different, the nearest permissible real value (in this example, x ) 1.0 from among the choices: x ) 1.0, 4.0, 7.0, and 10.0) of that decision variable is taken (note that the entire range of the decision variable is digitized into 2lstring points or 2lstring - 1 intervals and, so, the real value of the decision variable of the original problem may not match any of the digitized locations in the modified problem; however, since lstr is usually taken above about ten, this procedure ensures that the value of x in the modified problem is quite close to that in the original problem). The

8056

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

binary numbers corresponding to this modified real number are then obtained (00 in the above modified problem, as opposed to 01 in the original problem; the accuracy improves as we increase lstr; in all problems, we have used lstr g 10). The binaries so generated may differ from those in the original problem, as in the example above. In case the real number in the initial chromosome falls outside the bounds in the modified problem, the initial chromosome is rejected and is replaced by a randomly generated chromosome. The starting population for the modified problem as per the biogenetic law, thus, incorporates solutions from all the previous generations of the original problem until maxgen1. To understand this in more detail, let us assume that each generation in the optimization of the original problem is one individual evolution step (or one species like “protozoa”). In general, while going from one species to the next (like from protozoa to sponge) of the evolutionary ladder, some information of the previous step (protozoa) gets encoded in the genetic content (of the sponge), and thus in the developmental stage, it follows the same path as that of evolution. The same thing is done in the present study to mimic the biogenetic law as some chromosomes (number based on maxgen1) of each generation of the original problem are used to constitute the starting population, Np, of the modified problem (0th generation). This is akin to encoding the information of all the evolutionary steps prior to that species to constitute its embryo (instead of generating it randomly as in NSGA-II-aJG). A few important comments must be mentioned at this point. If the modified MOO problem has an extra constraint than in the original problem, it is possible that the conVerged optimal solution of the original problem lies beyond the feasible reason of the modified problem. In such a case, if we decide to use all the starting chromosomes for the modified problem as those of the maxgen1th generation in the original problem, several of the starting chromosomes of the modified problem will have to be discarded since they will be violating the constraint and regenerated randomly. This is as good as using NSGA-II-aJG. Use of the biogenetic law avoids this difficulty since it uses a starting population from all the generations (1st to maxgen1th). Several of these are nonoptimal for the original problem anyway, yet carry some amount of embryonic information (and so are better than the chromosomes generated randomly). In general, the biogenetic law adaptation does not preserve only the best points of the embryos of the original problem but also some of the bad points because of random selection. The latter may turn out to be close to the optimal solution of the modified MOO problem or may turn out to be good parents for the modified MOO problem. Indeed, this is akin to not rejecting outright bad chromosomes in GA, since it is possible that the worst chromosomes may produce great daughters after crossover, etc. Use of the biogenetic law makes the new algorithm much more general and enables it to take care of changes in the constraints, objective functions, bounds, and, at times, even model equations, as in biological evolution. It is observed that this procedure leads to a considerable decrease in the computational time required for convergence of the modified problem, not only for the four sample problems discussed in this study, but a few others studied as well. It may be added that it is possible that the starting chromosomes for the modified problem do not have information on all the decision variables. For example, if the modified problem has more decision variables than the original one, the starting chromosomes for the missing decision variables need to be generated randomly. Similarly, if the modified problem has fewer decision variables, all we need to do is to drop an

appropriate number of binaries in the starting chromosomes, etc. The other changes in the modified problem (e.g., changes in the bounds or in the objective functions, addition of extra constraints, etc.) can be taken care of quite easily. Four sample problems are now studied. These are described below. Problem 1 The first problem is a simple benchmark5,6 MOO problem. It has been solved primarily to illustrate the use of the adapted algorithm. The original and modified problems are described below: Original. min I1 ) X2 min I2 ) (1 - X)

(2a) 2

(2b)

s.t.: - 1000 e X e 1000 (2c) It is difficult to obtain the Pareto set for this problem5 for the large range of X given in eq 2c. Modified. The second objective function, as well as the bounds, is slightly different in this case: min I1 ) X2

(3a)

min I2 ) (2 - X )2

(3b)

s.t.: - 5000 e X e 5000 (3c) The range of the Pareto set of optimal solutions is quite different for the modified problem as compared to that for the original problem. Problem 2 The second problem is also a benchmark5,6 MOO problem. It has been solved primarily to illustrate the use of the new algorithm when the number of objective functions is larger in the modified problem. The original and modified problems are described below: Original. min I1 ) X12 + X22

(4a)

min I2 ) (X1 + 1)2 + X22

(4b)

s.t.: - 50 e Xi e 50 for i ) 1,2

(4c)

Modified. An extra objective function is present in the modified problem. In addition, the range of the Pareto front is different for the modified problem: min I1 ) X12 + X22

(5a)

min I2 ) (X1 + 2)2 + X22

(5b)

min I3 ) X12 + (X2 + 2)2

(5c)

s.t.: - 50 e Xi e 50 for i ) 1,2

(5d)

Problem 3 In this problem, the two-objective optimization of an industrial phthalic anhydride (PA) reactor9 is studied. PA is used widely in industry for the production of vinyl chloride-based plastics and for the production of polyester resins, plasticizers, dyes, insect repellants, etc. Several zones of catalyst are used in a multitubular reactor for the gas-phase catalytic oxidation of o-xylene, the intermediate regions between the catalyst packings

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

8057

Figure 1. Reaction scheme18 for phthalic anhydride production. Reaction numbers are indicated on the arrows.

being used for cooling the flowing gas and removing the exothermic heat of reaction. Such a configuration helps avoid the temperature of the gas from becoming too high. Two “switch” condensers, used alternately, are used to treat the gas leaving the reactor. The exit gas from the reactor is sent to one of the cooled finned condensers where PA solidifies on the metal surface. The treated gas from the condenser is then scrubbed with water or incinerated catalytically or thermally. Once a sufficient amount of PA has solidified on the condenser surface, the exit gas from the reactor is sent to the second switch condenser. During this time, PA is removed from the previous condenser by melting. The reaction scheme suggested by Skrzypek et al.18 (see Figure 1) is used for modeling the reactor. The LangmuirHinshelwood-Hougen-Watson (LHHW) rate expressions corresponding to this reaction scheme, and the model equations, kinetic, and adsorption parameters are given in refs 9 and 19 and are not provided here. Steady-state mass and energy balance equations9 between the bulk gas and the outer surface of the imperVious catalyst are solved using the DNEQNF program in the IMSL library. The differential equations corresponding to the steady-state balance9 for the bulk gas phase are solved using the code, DIVPAG, in the IMSL library. The simulation results are then used to maximize the yield, XPA, of PA and to minimize the total length of the catalyst bed. The original problem is simpler and involves seven catalyst zones (16 decision variables), whereas there are nine catalyst zones (20 decision variables) in the modified problem. These problems are described below. Original. max I1(u) ) XPA

(6a)

7

min I2(u) ) Lcat )

∑L

i

(6b)

i)1

Figure 2. Reactor systems: (a) original with seven beds and (b) modified with nine beds.

The upper bound on the feed temperature, TF,in, is taken to be 307 °C since several optimal solutions for the modified problem occur in the range 247 e TF,in e 297 °C. Constraints have been put on the spacings, Si (i ) 1, 2,..., 6), between the seven catalyst zones (see Figure 2a) as well as on the lengths, Li (i ) 1, 2,..., 6) of the catalyst zones [note that L7 ) (3.25 6 Li - ∑6i Si)]. In order to ensure near-complete conversion ∑i)1 of o-xylene and to take care of the constraints, the objective functions are reformulated using penalty functions (the problem is converted to a pure maximization problem by using6,20 max I ) 1/(1 + min I)6,20 for the second objective function in eq 6b). Equation 6a is rewritten as:

[

max I1(u) ) XPA + w1 1 -

max I2(u) )

(

Tmax e 510 °C

(6c)

total length of the reactor tube, Ltube ) 3.25 m

(6d)

Bounds. 65 e cin e 85 g OX/(m3 air at NTP)

(6e)

147 °C e TF,in e 307 °C

(6f)

337 °C e Tc,in e 447 °C

(6g)

0.001 e m ˙ e 0.005 (kg coolant)/s 0.2 e Si e 0.55 m; i ) 1, 2, ..., 6

(6h) (6i)

0.05 e L1 e 0.9 m

(6j)

0.01 e Li e 0.2 m; i ) 2, 3, ..., 6

(6k)

] [ 2

) [

+ w2 1 -

XPA 1 + w1 1 1 + Lcat 1.2

Lcat 3.6

] [ 2

+ w2 1 -

]

2

+ w3

Lcat 3.6

(7a)

]

2

+ w3 (7b)

(values of XPA ) 1.2 and Lcat ) 3.6 m in the penalty functions are somewhat arbitrary and are used to ensure that these two variables do not exceed expected or real values) subject to

(

L7 ) Ltube -

subject to Constraints.

XPA 1.2

6

∑ i)1

6

Li -

∑S

)

i

i

g0

(7c)

eqs 6e-6k (7d) The weighting functions in the penalty terms are given by If XPA e 1.1, w1 ) -500; else w1 ) 0 (forces XPA to be larger than 1.1) (8a) If L7 e 0 m, w2 ) -3000; else w2 ) 0 (forces L7 to be larger than 0) (8b) If Tmax e 510 °C in bed i; i ) 1, 2, ..., 7; w3 ) 0; else w3 ) -3000 + 250(i - 1) (8c) If Li e 0.01 m; i ) 1, 2, ...,7; w3 ) 0; else w3 ) -300 (forces Li e 0.01 m) (8d) The decision variables for this problem are

8058

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

u ) [cin, TF,in, Tc,in, m ˙ , S1, S2, ..., S6, L1, L2, ..., L6]T

(9)

In eq 9, cin is the mass of OX per cubic meter of air at NTP. The values of L1-L6 and S1-S6 are selected optimally by the optimization algorithm. L7 is then computed using Ltube () 3.25 m). The values of wi have been decided using trial and error. They should not be too low (so as to be ineffective in forcing the solutions toward satisfying the constraints), nor too large (so as to distort the contours of the modified objective functions too much and make it difficult to obtain the optimal solutions). The catalyst of Skrzypek et al.18 is used with the diameter of the catalyst particle as 3 mm. Other details of this reactor are the following: diameter of each reactor tube ) 25 mm and the mass flux G ) 19 455 kg/(m2 h). Additional details are given in the work of Bhat and Gupta.9 Modified. In the modified problem there are nine catalyst zones (instead of seven in the original problem). The two-objective optimization problem solved for this system is described by max I1(u) ) XPA

(10a)

9

∑L

min I2(u) ) Lcat )

(10b)

i

i)1

subject to Constraints. Tmax e 510 °C

(10c)

Ltube ) 3.5 m

(10d)

65 e cin e 85 g OX/(m3 air at NTP)

(10e)

147 °C e TF,in e 287 °C

(10f)

337 °C e Tc,in e 447 °C

(10g)

0.001 e m ˙ e 0.005 (kg coolant)/s 0.2 e Si e 0.45 m; i ) 1, 2, ..., 7

(10h) (10i)

0.1 e S8 e 0.45 m

(10j)

0.05 e L1 e 0.9 m

(10k)

0.01 e Li e 0.2 m; i ) 2, 3, ..., 8

(10l)

Bounds.

If Li g 0.01 m; i ) 1, 2, ..., 9; w3 ) 0; else w3 ) -300 (12d) The decision variables for this problem are u ) [cin, TF,in, Tc,in, m ˙ , S1, S2, ..., S8, L1, L2, ..., L8]T Problem 4 One of the problems studied in the area of polymer reaction engineering is the MOO of nylon-6 polymerization.20 Nylon-6 is an important commodity polymer. Most of the polymer produced is used for the preparation of fibers, fishing nets, supports to electronic materials, mechanical parts, etc. Nylon-6 is produced commercially by the hydrolytic polymerization of ε-caprolactam with a small amount of dissolved water. The kinetic scheme for this polymerization (see Figure 3) comprises five reversible reactions, namely, ring opening, polycondensation, polyaddition, ring opening of cyclic dimer, and polyaddition of cyclic dimer. The rate and thermodynamic parameters are available in ref 20, as are the other details of the reactor and its operations. A schematic diagram of the industrial semibatch reactor studied is shown in Figure 4. The reactor is fitted with a low speed anchor agitator and is surrounded by a heating jacket with a fluid at a constant temperature, TJ. Initially, an inert atmosphere of pure nitrogen is maintained above the reaction mass. The high temperature in the reactor leads to the vaporization of water and ε-caprolactam, and the pressure, p, inside the reactor builds up with time, t, since the output valve is kept closed at the beginning. A prescribed pressure history, p(t), is implemented in the reactor after some time, using the release of vapor at an appropriate rate, VT(t) (mol vapor mixture)/h. This operation allows high concentrations of water to be present in the liquid reaction mass at the beginning of polymerization (driving the ring-opening step in the forward direction), while low concentrations of water are maintained in the liquid phase at higher values of t (driving the polycondensation reaction forward and leading to high molecular weight product).

The objective functions are reformulated using penalty functions as:

[

max I1(u) ) XPA + w1 1 -

max I2(u) )

subject to

(

XPA 1.2

] [ 2

+ w2 1 -

) [

XPA 1 + w1 1 1 + Lcat 1.2

(

L9 ) 3.5 -

] [ 2

+ w2 1 -

]

2

+ w3

Lcat 3.6

(11a)

]

2

+ w3 (11b)

8

8

∑L -∑S i

i)1

Lcat 3.6

)

i

i

g0

(11c)

eqs 10e-10l (11d) The weighting functions in the penalty terms are given by If XPA e 1.1, w1 ) -500; else w1 ) 0

(12a)

If L9 e 0 m, w2 ) -3000; else w2 ) 0

(12b)

If Tmax e 510 °C in bed i; i ) 1, 2, ..., 9;w3 ) 0; else w3 ) -3000 + 250(i - 1) (12c)

(13)

Figure 3. Kinetic scheme of nylon-6 polymerization.

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

8059

Original. min I1[p(t), TJ] ≡

( ) (

) ( ) ( ) ( ) ( ) ( ) ( ) tf

tf,ref

+ w1 1 -

µn,f µn,ref

2

+ w2 1 w3 1 -

min I2[p(t), TJ] ≡

µn,f [C2]f + w1 1 [C2]f,ref µn,ref

2

xm,f 2 + xm,ref Tf 2 (14a)

Tdegrad

xm,f xm,ref

+ w2 1 -

w3 1 -

Tf

2

+

2

Tdegrad

(14b)

subject to Constraints. model equations (mass balance and other equations given in refs 20 and 21; except the energy balance equation, which is modified as given below; AMULT ) 1)

[

AMULT × UA(Tj - T) +

Figure 4. Schematic diagram of the industrial semibatch reactor.

The model of this industrial nylon-6 reactor has been developed by Wajge et al.21 and is used for offline MO optimization. The mass and energy balance and the moment equations are in the form of a set of coupled, nonlinear ordinary differential equations (ODEs) of the initial value kind (IVPs). The state of the reactor is described by fifteen variables, namely, the liquid-phase concentrations of the monomer, amino caproic acid, cyclic dimer, and water, the first three moments of the linear polymer chains, the total mass of the liquid-phase in the reactor, the temperature of liquid (and vapor) phase, the vaporphase concentrations of monomer, water and inert gas, and the rate of change of the total moles of monomer, water, and inert, respectively, in the liquid reaction mass. The 15 first-order ordinary differential equations (ODEs) for each of these state variables are given in refs 20 and 21. The set of ODEs has been integrated along with several correlations20,21 for the viscosity, heat and mass transfer coefficients, activity coefficients, etc. This is done using the NAG library program, D02EJF, based on Gear’s technique.22 A relatively high value of the error-tolerance of 10-3 is used in the program since the ODEs are quite stiff. Integration of the set of ODEs gives the values of the degree of polymerization (µn) and the concentrations of the various species in both the liquid and vapor phases as functions of time. For the optimization of this semibatch reactor, the first objective function is the minimization of the batch or reaction time, tf (where subscript, f, denotes the final value), since this leads to an increase in the annual production of the polymer. The cyclic dimer, C2, present in the polymer product (at tf) affects the quality of the polymer significantly and has to be removed by hot water extraction. Minimization of [C2]f in the product is, therefore, another meaningful objective function. To prevent the conversion, xm,f, from becoming too low, it is constrained20 to lie above a fixed (large) value, xm,ref. Similarly, the value, µn,f, of the number average chain length of the final product is constrained to lie at a design (or reference, ref) value, µn,ref, so as to ensure that the product has the desired mechanical properties. In all cases, the temperature, T(t) (or its maximum value, Tf), of the reaction mass must not go above the degradation temperature, Tdegrad, of the polymer. The above (original) problem is first solved using NSGAII-aJG (offline optimization). This problem is written mathematically as

dT ) dt

[

5



]

F [r (-∆Hi)] 1000 i)1 i

[Rvmλm(Tr) + Rvwλw(Tr)]v (0.113RvmCvp,m + 0.018RvwCp,w )(T - Tr) +

]

×

Clp,mix(0.113Rvm + 0.018Rvw)(T - Tr) 1 (14c) [Clp,mix + 2.0925 × 10-3(T - Tr)]F

Bounds. pmin()p0) e p(t) e pmax

(14d)

TJ,min e TJ e TJ,max

(14e)

Reference values, tf,ref and [C2]f,ref, are used to nondimensionalize tf and [C2]f, respectively {the values of tf,ref and [C2]f,ref are taken, somewhat arbitrarily, as 16 h and 0.03222 mol/kg mixture, respectively, based on the ranges given in ref 20]. These are in the range of current industrial values.20 In eqs 14a and 14b, wi are the weighting factors for the penalties. These are given by If |1 - (µn,f/µn,ref)| g 0.01, w1 ) 1.0 × 108 ; else w1 ) 0 (15a) If xm,f e xm,ref, w2 ) 1.0 × 108 ; else w2 ) 0

(15b)

If T(t) g Tdegrad, w3 ) 1.0 × 10 ; else w3 ) 0

(15c)

8

This problem differs from the earlier two problems in that one of the decision variables is a function of time (a history) and is not an optimal “value”. This is taken care of by discretizing the history of the decision variable and describing it in terms of 39 sub-variables (at “nodes”) that are “numbers”. A relatively large number of nodes is used to give better online optimizing control, though this leads to higher computational effort. The set of binaries describing each of the subvariables are mapped within bounds fixed by trial and error. A continuous function is then curve-fitted to the discrete set of 39 subvariables generated at intervals of ∆t ) 25.26 min. This is done using Hermite interpolation, the continuous function being required for the integration of the model equations. The bounds for p(t) and TJ in eqs 14d and 14e are written differently than in the conventional approach and are given for each node as p1,min(t ) t1 ) 0) ) 105.37 kPa ) p0

(16a)

p1,max(t ) t1 ) 0) ) 105.37 kPa ) p0

(16b)

p2,min(t ) t2 ) t1 + ∆t) ) 105.37 kPa

(16c)

8060

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

p2,max(t ) t2 ) t1 + ∆t) ) 121.59 kPa

(16d)

p3,min(t ) t3 ) t2 + ∆t) ) 105.37 kPa

(16e)

p3,max(t ) t3 ) t2 + ∆t) ) 151.98 kPa

(16f)

p4,min(t ) t4 ) t3 + ∆t) ) 105.37 kPa

(16g)

p4,max(t ) t4 ) t3 + ∆t) ) 182.38 kPa

(16h)

p5,min(t ) t5 ) t4 + ∆t) ) 105.37 kPa

(16i)

p5,max(t ) t5 ) t4 + ∆t) ) 253.31 kPa

(16j)

p6,min(t ) t6 ) t5 + ∆t) ) 105.37 kPa

(16k)

p6,max(t ) t6 ) t5 + ∆t) ) 303.97 kPa

(16l)

p7,min(t ) t7 ) t6 + ∆t) ) 105.37 kPa

(16m)

p7,max(t ) t7 ) t6 + ∆t) ) 354.63 kPa

(16n)

p8,min(t ) t8 ) t7 + ∆t) ) 105.37 kPa

(16o)

p8,max(t ) t8 ) t7 + ∆t) ) 405.30 kPa

(16p)

p9,min(t ) t9 ) t8 + ∆t) ) 105.37 kPa

(16q)

p9,max(t ) t9 ) t8 + ∆t) ) 455.96 kPa

(16li)

p10,min(t ) t10 ) t9 + ∆t) ) 151.98 kPa

(16r)

p10,max(t ) t10 ) t9 + ∆t) ) 354.63 kPa

(16lii)

p11,min(t ) t11 ) t10 + ∆t) ) 151.98 kPa

(16s)

p11,max(t ) t11 ) t10 + ∆t) ) 253.31 kPa

(16t)

pi,min ) 105.37; i ) 12, 13, ..., 39

(16u)

pi,max ) 131.72; i ) 12, 13, ..., 39

(16v)

TJ,min ) 493.15 K

(16w)

TJ,max ) 543.15 K

(16x)

One of the Pareto set of optimal solutions of this MOO problem, the operating point or the preferred solution, is then assumed to be implemented on the reactor. Disturbances will always be present while the reactor is running at this offline computed preferred solution. The actual operation of the reactor will be optimal only if there are no “disturbances” present. In the presence of unplanned disturbances, however, online optimization must be carried out continuously, so that their effects are negated. There is a small problem here since the optimization (genetic algorithm) code works with a set of discrete values of the pressure at the nodes of t while the operation and simulation of the reactor is continuous. Because of this, any newly generated optimal solution can be implemented only at one of the nodes. Since there are 39 subvariables in 0 e t e tf,ref () 16 h), the time period between two consecutive nodes is ∆t (25.26 min ∼ 2.63% of the total batch time). Thus, the new optimal history can be implemented only at multiples of ∆t. A typical disturbance in the operation of the reactor is the failure of the cooling system from, say, t ) t1 to, say, t ) t2. It is assumed (without loss of generality) that t1 ) n∆t min. The reoptimized (modified) problem is solved starting from t ) n∆t min. The solution is assumed to become available at t ) n∆t + ∆tc min (∆tc may or may not be below ∆t). It is assumed that the disturbance continues until before t ) [(n + i)∆t] min, where i is an integer. At this time, the reoptimized (modified, in eq 17a subscript, mod, indicates the modified problem) solution, TJ,mod and pmod(t), is implemented over [(n + i)∆t] min e t e tf,mod (tf,mod will not be the same as tf). Since i is not known a-priori, we keep solving the modified problem over and over

Table 1. Computational Parameters Used in GA for Optimizationa problem 3 parameter

problem 1

problem 2

original

modified

problem 4

Np Pcross Pmut PJG lstr variables

100 0.9 0.02 0.6 50 1

100 0.9 0.0167 0.6 30 2

100 0.95 0.025 0.5 20 16

100 0.95 0.025 0.5 20 20

100 0.85 0.08 0.5 30 11

a

Parameters are the same for the modified problems 1, 2, and 4.

again from node to node until the disturbance is over and the latest reoptimized solution can be implemented. In the case of the disturbance being a cooling system failure, the condensing vapors will not flow into the jacket and we can assume that the heat transfer coefficient from the reaction mass to the jacket fluid becomes (near-)zero. AMULT then becomes zero for t1 () n∆t min) e t e [(n + i)∆t] min in the energy balance equation (eq 14c). The modified MOO problem is, therefore, described by Modified. min I1[p(t), TJ] ≡

( ) (

) ( ) ( ) ) ( ) ( ) ( )

µn,f tf,mod + w1 1 tf,ref µn,ref

2

w3 1 -

min I2[p(t), TJ] ≡

(

xm,f 2 + xm,ref Tf 2 (17a)

+ w2 1 -

Tdegrad

µn,f [C2]f,mod + w1 1 + [C2]f,ref µn,ref xm,f 2 Tf + w3 1 w2 1 xm,ref Tdegrad 2

2

(17b)

subject to mass balance equations refs 20 and 21 energy balance equation (eq 14c) with

(17c)

AMULT ) 0 for (n∆t min) e t e [(n + i)∆t] min (17d) AMULT ) 1 for [(n + i)∆t] min e t e tf,mod Bounds. pmin e p(t) e pmax

(17e)

TJ,min e TJ e TJ,max

(17f)

It may be emphasized that in the modified problem, the conditions in the reactor over 0 e t e t1 are the same as for the selected operating point in the original problem for all Np chromosomes. In addition, continuity of all the variables must be ensured at t ) t1. The procedure of applying the bounds remains the same as for the original problem. Failures of other kinds can be taken care of in a similar manner, by modifying the model equations appropriately. However, as mentioned earlier, the more the modifications made in the model equations, the less the advantage offered by B-NSGA-II-aJG. Use of the original NSGA-II-aJG11 for computing the reoptimized solutions of eq 15a requires excessively large amounts of real time. Because of this, premature and only approximately optimal solutions have been implemented in our earlier experimental studies11 on the online optimizing control of methyl methacrylate polymerizations. B-NSGA-II-aJG clearly provides a better alternative. Results and Discussion We now use NSGA-II-aJG and B-NSGA-II-aJG to solve the four sample problems. The same computational parameters of

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

8061

Figure 5. (a) Pareto optimal sets for the original and the modified versions of problem 1 (using NSGA-II-aJG with maxgen ) 100). (b) Corresponding optimal values of variable X. (c) σ2 for the original problem using NSGA-II-aJG. (d) σ2 for the modified problem using both B-NSGA-II-aJG and NSGAII-aJG. (e) Pareto optimal front for the modified problem using B-NSGA-II-aJG and using (f) NSGA-II-aJG for 10 generations each.

GA are used for both the techniques for any particular problem. The computational parameters used for the four problems are given in Table 1. The results for problem 1 are shown in Figure 5. The converged Pareto fronts for the original as well as the modified problems, using the conventional NSGA-II-aJG (without using the biogenetic law adaptation) for both cases are shown in Figure 5a at the 100th (maxgen) generation in each case. This diagram illustrates the change in the range of the Pareto solutions. The optimal value of the decision variable, X, is plotted against the

first objective function, I1, in Figure 5b. This diagram shows that as I1 increases, X decreases for the original problem, but the reverse is observed for the modified problem. This figure also confirms that the optimal search space is quite different for the two problems. Figure 5c shows how σ2 varies with the number of generations for the original problem (NSGA-II-aJG used). It is observed that near-converged solutions for the original problem are obtained in as few as 8 generations (σ2 e 0.1), but a better spread of the Pareto points is obtained in about 13 generations. Hence, maxgen1 is chosen as 13. A hundred

8062

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

Figure 6. Pareto optimal points vs chromosome number for the (a) original problem 2 and (b) modified problem 2 (using NSGA-II-aJG with maxgen ) 100). (c) Corresponding optimal decision variables (X1 and X2) for the maxgenth generation. (d) σ2 for the original problem using NSGA-II-aJG. (e) σ2 for the modified problem using both B-NSGA-II-aJG and NSGA-II-aJG.

members are selected randomly from these 13 generations of the original problem and used as the initial population in B-NSGA-II-aJG, and the modified problem is then solved. Figure 5d shows σ2 (deviation from the converged results shown in Figure 5a) for the modified problem using both NSGA-IIaJG (using a randomly generated initial population) as well as B-NSGA-II-aJG. The converged reference results for the modified problem, obtained using NSGA-II-aJG (shown in Figure 5a), are used to compute σ2 instead of the high-generation converged results from B-NSGA-II-aJG, since the two are the

same. Figure 5d shows that B-NSGA-II-aJG converges in about 5-8 generations whereas NSGA-II-aJG takes about 13 generations to give near-converged results for the modified problem. Figure 5e and f compares the Pareto optimal fronts obtained for the modified problem at the 10th generation, using both B-NSGA-II-aJG and NSGA-II-aJG, respectively. The Pareto front obtained in the latter case is not converged, even for this simple problem. The CPU time (Pentium 4, 3.4 GHz, 1.0 GB RAM) for this benchmark problem is less than 1 min for 100 generations.

Ind. Eng. Chem. Res., Vol. 48, No. 17, 2009

8063

Figure 7. (a) Pareto optimal sets for the original and the modified versions of problem 3 (using NSGA-II-aJG and B-NSGA-II-aJG with maxgen ) 71). (b) σ2 for the original problem using NSGA-II-aJG. (c) σ2 for the modified problem using both B-NSGA-II-aJG and NSGA-II-aJG. (d) Pareto optimal front for the modified problem using B-NSGA-II-aJG and using NSGA-II-aJG for 25 generations each.

The results for problem 2 are shown in Figure 6. Here, the modified problem has three objectives whereas the original problem has only two. A better method is used to plot17 the converged results (objective functions) than in problem 1. The converged results (objective functions) are plotted against the chromosome number, such that (any) one of the objective functions is in ascending or descending order. The results are shown in Figure 6a and b at the 100th (maxgen) generation for the two problems (using NSGA-II-aJG in both cases). The corresponding optimal variables, X1 and X2, are plotted against each other in Figure 6c. A clear Pareto set is observed for the original problem. In contrast, the points for the decision variables show some scatter for the modified problem. Figure 6d shows how σ2 varies with the number of generations for the original problem (NSGA-II-aJG used). It is observed that near-converged solutions for the original problem are obtained in 21 generations (σ2 e 0.1), but a better spread of the Pareto points is obtained in about 25 generations. Hence, maxgen1 is chosen as 25. A hundred members are selected randomly from these generations of the original problem and used as the initial population in B-NSGA-II-aJG. The modified problem is then solved. Figure 6e shows σ2 for the modified problem using B-NSGA-II-aJG. σ2 is also plotted here for the case when NSGA-II-aJG (randomly generated initial population) is used. The converged reference results for the modified problem, obtained using NSGA-II-aJG (shown in Figure 6b), are used to compute σ2. Figure 6e shows that B-NSGA-II-aJG converges in about ten

generations whereas NSGA-II-aJG takes about twenty generations to give near-converged results for the modified problem. The CPU time (Pentium 4, 3.4 GHz, 1.0 GB RAM) for this benchmark problem is less than 1 min for 100 generations. In problem 3 (phthalic anhydride reactor), the sum of the upper bounds taken for each bed (catalyst zone + spacing) is larger than Ltube, something that cannot be avoided if one wishes to explore a more extensive decision-variable space and generate better solutions. At the same time, the more expanded the decision-variable space, the lower the efficiency of the algorithm. In such cases, the algorithm requires a large number of generations to converge. This problem is overcome by using some kind of a “guided” strategy. A random generation of the decision variables by the MOO code frequently leads to low yields (