Multiobjective Batch Plant Design: A Two-Stage Methodology. 2

Ind. Eng. Chem. Res. 2002, 41, 5743-5758


Multiobjective Batch Plant Design: A Two-Stage Methodology. 2. Development of a Genetic Algorithm and Result Analysis

Leonardo Bernal-Haro, Catherine Azzaro-Pantel,* Luc Pibouleau, and Serge Domenech

Laboratoire de Génie Chimique, UMR CNRS 5503, ENSIACET INPT, 118, Route de Narbonne, 31077 Toulouse Cedex, France

This second part of the series deals with the second stage of the methodology developed for multiobjective batch plant design, that is, the implementation of a dedicated genetic algorithm. This procedure can be viewed as the search engine for workshop configurations and uses the simulation model as a subroutine for evaluating the feasibility of each generated structure. First, the basic principles of genetic algorithms (GAs) are briefly recalled; second, the GA developed specifically for batch plant design is presented in detail. A new concept, the so-called “gene fridge”, is introduced to prevent population degradation by allowing genes encountered earlier to be reintroduced during evolution. This procedure also allows a better screening of the search space and an evaluation of the problem combinatorics. Finally, the concept drives the search toward critical steps by introducing a mutation probability that varies with the gene locus. An illustrative example, already presented in part 1 of this series, is analyzed in detail and provides useful guidelines for treating similar problems. The parametric study is particularly useful for GA parameter setting, which constitutes a key problem.

1. Introduction

Part 2 of this series of articles deals with the development and application of a genetic algorithm (GA), which, coupled with the design-oriented discrete-event simulator, constitutes the core of our two-stage methodology for batch plant design. Stochastic algorithms are increasingly used for global optimization (see Pibouleau et al.1 for a detailed survey of optimization techniques in chemical engineering applications). 
This kind of algorithm has been applied successfully to many combinatorial optimization problems in such diverse areas as computer-aided design of integrated circuits, image processing, design of heat exchanger networks,2 and scheduling.3 To our knowledge, in the chemical engineering field, several batch plant scheduling and design problems have been solved using simulated annealing (SA) procedures.4-10 Investigators have applied SA5,11 for minimizing the total time to produce a set of batches in the serial flowshop with unlimited storage,8 under various storage policies,7 and for minimizing tardiness (the difference between the completion time of late products and their due dates) in a network of single-stage, unrelated parallel units.9 These experiments showed that SA is a good alternative to the heuristic methods most widely used for this purpose, above all for problems of real-world size. A hybrid global optimization technique combining SA with linear programming (LP), the SA/LP algorithm, is also proposed in Yuan and Chen12 for solving the nonconvex MINLP that arises in the optimal design of multipurpose batch processes with multiple routes. The formulation allows for intermediate storage, parallel units in and out of phase, and discrete equipment sizes. The SA/LP algorithm is implemented to solve the proposed MINLP model. Illustrative examples are solved and the results are compared with those of previously published papers. The SA/LP procedure finds the global optima and is shown to be far more efficient in computing time. That work shares a similar idea with the methodology architecture presented here.

In the chemical engineering literature, very few papers are concerned with genetic algorithms. In Cartwright and Long,13 a GA has been implemented as an optimization procedure for determining chemical feed order and/or topology in a chemical flowshop, with minimization of the makespan of the products as the objective function. In their study, the GA provided a reliable method for finding near-optimum feed order/topology combinations. A GA has also been implemented for the optimal design of multiproduct batch chemical processes.14,15 Other applications concern the optimization of reactor networks16 or heat exchanger networks.17 In a previous work, we investigated the potential of GAs for optimizing a product sequence so as to minimize a criterion based on product average residence time.18,19 The use of SA to solve the same problem was previously treated by Peyrol et al.20 The main advantage of a GA over SA is that it works with a set of potential solutions instead of a single one. This asset has been taken into account in the optimization loop of our two-stage methodology.

This second step of our work can be viewed as the search engine for workshop configurations and uses the simulation model as a subroutine for evaluating the feasibility of each generated structure. This article is divided into three sections: first, the basic principles of genetic algorithms are briefly recalled; second, the GA developed specifically for batch plant design is presented in detail; third, results obtained from the two-stage methodology are analyzed. The illustrative example concerns the didactic workshop which serves as a test bench in part 1 of these articles.

* To whom correspondence should be addressed. E-mail: [email protected].

10.1021/ie0106478 CCC: $22.00 © 2002 American Chemical Society Published on Web 10/12/2002


Ind. Eng. Chem. Res., Vol. 41, No. 23, 2002

Figure 1. Basic principles of a GA.

2. Basic Principles of Genetic Algorithms21,22

The GA uses a “guided random search” in which many different solutions to a problem are investigated and refined simultaneously to identify near-optimum solutions in a reasonable time. This procedure is based on the natural processes of heredity and selection:22 in the “population”, each individual represents a potential solution. In our case, each individual is a jobshop plant configuration, which will be tested by the GA. Although there are many possible variants of the basic genetic algorithm, the fundamental underlying mechanism operating on a population of individuals is relatively standard and involves three essential operations:

(i) A method for coding solutions to the problem in strings of digits (or “chromosomes”).
(ii) An evaluation function which takes a string as input and returns a fitness value that measures the quality of the solution that the chromosome represents.
(iii) An adaptive plan, which consists mainly of evolution and selection, based on the string crossover and mutation operators.

As illustrated in Figure 1, the process of a GA consists of the following steps. A population of encoded solutions is created randomly. The different strings are evaluated with a fitness function. Two strings are selected with a probability depending on their evaluation, the strings with the highest evaluation being more likely to be selected. The two strings are then manipulated with the genetic-like operators (i.e., crossover and mutation) to create two children that then replace two strings of the previous generation. The cycle {evaluation, selection, crossover, and replacement} is then repeated until a stop criterion is reached (maximum number of iterations associated with a maximum number of generations, nonevolution of the evaluation function, etc.).

Genetic algorithms differ from most optimization methods in that they make almost no assumptions about the problem space and yet produce a global search. They tend not to become trapped in local optima because they do not search along the contours of the function being optimized and do not rely on gradient information.

3. Implementation of the Design-Oriented GA

This section presents the specific procedures introduced in the development phase of the design-oriented GA, which is very similar to a classical GA. The main differences lie in the implementation of the so-called “hill-climbing” procedure and in the use of an innovative concept called the “string fridge”.

3.1. Encoding of Workshop Structures. In the GA developed in our research, a chromosome represents a specific jobshop configuration. Only the set of process equipment units is encoded (the discrete size of intermediate storage vessels is determined according to needs calculated by the simulator). The size of the chromosome is computed for the maximum number of operations in the whole recipe list. Each gene encodes the number of parallel equipment units of each available size as a string of decimal digits. The encoding procedure is illustrated for an example of a plant with three operations (see Figure 2), that is, reaction, settling, and filtration. The available size range is presented in Table 1. The chromosome 010-210-002 contains the genes [010], [210], and [002]. The first gene corresponds to the first unit operation, and the list 010 implies that there is no reactor of 2000 L, one reactor of 1000 L, and no reactor of 500 L. The encoding is the same for the other genes. In Figure 2, the unselected equipment units appear on a gray background. A coherence test verifies the presence of at least one equipment unit per unit operation.

Figure 2. Structure encoding.

Table 1. Available Size Range (Encoding Example)

              equipment type/operation
size range    (1) reactor    (2) settling tank    (3) filter
1             2000           2000                 1000
2             1000           1000                 500
3             500            500                  250
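The encoding of section 3.1 can be illustrated with a short decoding routine. This is only a minimal sketch: the dictionary layout, the function names `decode` and `is_coherent`, and the use of exactly one decimal digit per available size are assumptions made for illustration, not the authors' implementation.

```python
# Size ranges taken from Table 1 (largest size first).
# The dictionary layout and all names below are illustrative assumptions.
SIZE_RANGE = {
    "reactor": (2000, 1000, 500),
    "settling tank": (2000, 1000, 500),
    "filter": (1000, 500, 250),
}
OPERATIONS = tuple(SIZE_RANGE)  # gene order: reactor, settling tank, filter


def decode(chromosome: str) -> dict:
    """Map a chromosome such as '010-210-002' to {operation: {size: count}}."""
    genes = chromosome.split("-")
    if len(genes) != len(OPERATIONS):
        raise ValueError("expected one gene per unit operation")
    plant = {}
    for operation, gene in zip(OPERATIONS, genes):
        sizes = SIZE_RANGE[operation]
        if len(gene) != len(sizes) or not gene.isdigit():
            raise ValueError(f"gene {gene!r} must hold one digit per size")
        # Assumption: one decimal digit per size, i.e. at most 9 units per size.
        plant[operation] = {size: int(digit) for size, digit in zip(sizes, gene)}
    return plant


def is_coherent(chromosome: str) -> bool:
    """Coherence test: at least one equipment unit per unit operation."""
    return all(sum(counts.values()) >= 1 for counts in decode(chromosome).values())
```

For the chromosome of the text, `decode("010-210-002")["reactor"]` gives one reactor of 1000 and none of the other two sizes, and `is_coherent` rejects a chromosome such as `000-210-002`, which has no reactor.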


3.2. Initial Population Creation. Although several methods exist for creating the initial population,22 the chosen technique consists of a random string initialization. This strategy guarantees a population diverse enough to explore several zones of the search space. Note also that the position of the optimum in the search space is completely unknown. The configurations proposed by the population generation procedure are uniformly distributed with respect to the number of parallel equipment units per unit operation. This means that, for any unit operation, a configuration with 2 equipment units is as likely to be generated as a configuration with 4 units, even though the search space contains more configurations with 4 units than with 2 units: 15 possible configurations with 4 units against 6 with 2 units (see the results in the Appendix for the configuration enumeration). Note also that all the configurations of the initial population reach the steady-state regime and, consequently, the required production level. The feasibility of each configuration is tested before it is integrated in the initial population, and generation continues until the set of configurations reaches the predetermined population size.

3.3. Fitness Evaluation. The optimization criterion considered for fitness evaluation involves investment cost for equipment and storage vessels. The cost of each plant configuration (Costp) is calculated by

$$\mathrm{Cost}_p = \sum_{i=1}^{NOP} \sum_{j=1}^{NEQ_i} \left( A_i + B_i V_{ij}^{C_i} \right) + \sum_{k=1}^{NSV} \left( A_s + B_s V_{sk}^{C_s} \right)$$

where NOP is the number of operations, NEQi is the number of equipment items for operation i, NSV is the number of storage vessels, Ai, Bi, and Ci are the cost coefficients for operation i, As, Bs, and Cs are the cost coefficients for storage vessels, Vij is the volume of equipment item j of operation i, and Vsk is the volume of storage vessel k.

Traditionally, a GA uses a fitness function, which must be maximized. Since a minimization case is involved in this work, the problem has been transformed so that the strongest individuals represent the cheapest configurations, to favor the propagation and evolution of these structures. Two classical approaches have been used:

(i) Min[Cost(i)] equivalent to Max[1/F(Cost(i))]
(ii) Min[Cost(i)] equivalent to Max[“Maximum cost” - Cost(i)]

In both cases, an elitism factor has been introduced to adjust the discrepancy between “good” and “bad” solutions (either by increasing or decreasing its value). Since the results obtained are comparable, the second approach has been arbitrarily adopted. The individual fitness (Fitnessi) is calculated by

$$\mathrm{Fitness}_i = C_{max} - \mathrm{Cost}_i$$

where Cmax represents the adjustment factor:

$$C_{max} = [\text{elitism factor}] \times [\text{cost of the least favorable current configuration}]$$

It has been observed, on one hand, that values of the elitism factor near 1 favor the elimination of the worst solution, generation after generation, whereas values greater than 2 lead to less elitist evolutive paths. On the other hand, the configurations which do not allow the steady-state regime to be reached after some campaigns are considered unfeasible, and a fitness equal to 0 is attributed to them to avoid the propagation of undesirable genes.

3.4. Selection Procedure. The selection process is founded on the individual fitness; in this study, because of its simplicity of implementation, Goldberg’s biased roulette wheel22 has been adopted. To prevent a premature local convergence, a limit on the number of copies of the same individual is introduced. To control the number of copies, the selection of an individual occurs in two steps:

i. First, a stochastic selection by Goldberg’s wheel is carried out.
ii. Then, if the individual has been preselected, a second test is carried out: its genes are compared with those of the individuals already selected; if the maximum number of copies is reached, the individual cannot be selected anymore and a zero fitness is attributed to it, which excludes it from subsequent random draws.

Other methods exist, such as scaling,22 but have not been adopted here because of the relatively limited size of the chromosomes. The best individual is also preserved from one generation to another to avoid its premature loss during the application of the genetic operators. This procedure is commonly referred to as elitism. A privileged place is reserved for the best individual generated up to the current generation, which gives it the possibility to generate offspring. For instance, if the population is composed of 50 individuals, a 51st place is attributed to the best historical solution, which can be mated or mutated according to the classical stochastic rules of crossover and mutation. Of course, when a better individual is born, its chromosome replaces that of the 51st individual. 
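The fitness transformation of section 3.3 and the two-step selection of section 3.4 can be sketched as follows. This is a hedged illustration, not the authors' code: the function names, the use of `None` to mark configurations that do not reach the steady-state regime, and the reliance on Python's `random.choices` as the biased wheel are all assumptions.

```python
import random
from collections import Counter


def fitness_values(costs, elitism_factor=1.5):
    """Fitness_i = Cmax - Cost_i, with Cmax = elitism factor x worst cost.

    `costs` holds one investment cost per individual, or None when the
    configuration is unfeasible (assumed convention); those get fitness 0.
    """
    feasible = [c for c in costs if c is not None]
    if not feasible:
        return [0.0] * len(costs)
    c_max = elitism_factor * max(feasible)
    return [0.0 if c is None else c_max - c for c in costs]


def select(population, fitness, n_select, max_copies, rng=random):
    """Biased roulette wheel with a cap on copies of identical chromosomes."""
    weights = list(fitness)
    copies = Counter()
    selected = []
    while len(selected) < n_select and any(w > 0 for w in weights):
        # Step i: stochastic draw, probability proportional to fitness.
        index = rng.choices(range(len(population)), weights=weights, k=1)[0]
        chromosome = population[index]
        selected.append(chromosome)
        copies[chromosome] += 1
        # Step ii: once the copy limit is reached, give a zero weight to
        # every identical chromosome so that it cannot be drawn again.
        if copies[chromosome] >= max_copies:
            for j, other in enumerate(population):
                if other == chromosome:
                    weights[j] = 0.0
    return selected
```

With an elitism factor of 1, the worst feasible individual receives a fitness of 0 and can never be drawn, which is consistent with the observation that values near 1 favor the elimination of the worst solution.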
When using a GA, a compromise must be found between elitist trends (and even conservative policies) and solutions favoring the diversity of the explored solutions (and even revolutionary policies). Thus, an efficient GA must be able to escape from evolutionary traps and must also be able to explore the whole search space. For this purpose, mutation has been combined with the concept of the string fridge.

3.5. Implementation of the “Gene String Fridge”. The performance of the GA is enhanced if the diversity of the solutions is increased. The usual way to do this is to increase the size of the population, but this leads to increased computing effort since more solutions need to be evaluated, and the danger still remains that the population could be overrun by one superfit individual. To prevent this, a method has been suggested by Costello23 and used by Tuson.24 This method, known as the string fridge, can achieve an increase in diversity without drastically increasing computational effort, hence avoiding premature convergence. In addition to the population that the GA operates on, the original method consists of generating a subset of the population where the solutions are left untouched during the application of genetic operators, thus composing the so-called string fridge. This procedure has the effect of increasing diversity, as more schemata are available to the GA, without increasing the computation required, as the solutions of the string fridge are not recalculated every generation. Besides, as solutions are constantly being swapped between the population and the fridge, old solutions are reintroduced later on in the run, hence releasing potentially useful schemata that have been lost from the main population.

A variant of this method is proposed in this work: the string fridge is not composed of chromosomes but of genes. The genes stored in this way can then be introduced again during mutation and the hill-climbing procedure. Let us recall that the proposed coding considers one gene per unit operation. The proposed gene fridge uses one fridge per unit operation, in which all the genes belonging to the feasible configurations evaluated along the evolutive process are stored. Let us illustrate the gene fridge concept by an example involving four workshop configurations, each ensuring the establishment of the steady-state regime. The workshop is composed of five different unit operations, and the chromosomes of the individuals are thus composed of five genes. The corresponding fridges are represented in Figure 3. The fridges are updated at the end of every generation by introducing the new genes that may have been generated along the evolution. It must be emphasized that only two events (apart from the creation of the initial population) can generate a new gene: mutation of an existing gene and equipment nonuse (for instance, gene 211 can be replaced by 201 if the unit of medium capacity is not used).

Figure 3. Gene fridge.

Note also that there is no direct analogy between the gene string fridge and the principles of classical genetics. Yet the consequence of this procedure can be compared with the presence of recessive genes in a diploid chromosome. In the approach presented here, chromosome coding only represents the dominant genes, which characterize the immediate performances of the individuals. The gene fridge allows modeling of recessive genes, which can be reintroduced at any time during the evolutive process.

A comparative study to show how the algorithm performs with and without the fridge was initially performed. The main conclusion was that the performance of the algorithm without the string fridge was prohibitive in terms of the number of explored solutions, which was 10 times greater (see Bernal et al.25). This is why the remainder of this paper only considers the string fridge implementation.

In what follows, three interesting applications of the gene fridge implemented in this study are presented:

(i) String Fridge as an Antidegenerating Policy. In the approach selected here, the set of tested genes (ensuring feasibility) is, first, available at any time; second, since genes are stored instead of chromosomes as in the original method, the memory space required by the fridges is considerably reduced. The procedure nevertheless has a disadvantage, since the fitness of the individuals generated from the combination of genes of the main population and those from the fridge has to be recalculated.

(ii) Combinatorial Aspect of the Problem and Search Space Definition. It is noticeable in our approach that the combinatorial aspect does not depend on the number of products, but only on the number of different unit operations. The total number of possible configurations can be expressed as follows (see Appendix):

$$\text{Configuration number} = \prod_{i=1}^{N_{op}} \left[ C_{Neq_i + \gamma_i}^{\gamma_i} - 1 \right]$$

where Nop is the number of different unit operations, Neqi is the maximum number of parallel equipment units for operation i, γi is the number of available sizes for the equipment of operation i, and C denotes a binomial coefficient. Since for all unit operations i, Neqi has been taken equal to 10 and γi to 3, the total number of configurations is equal to 285^Nop. This solution set contains both feasible and unfeasible configurations. Moreover, as mentioned above, the chromosomes are also modified according to the equipment actually used, which markedly reduces the search space and, consequently, the “real” combinatorics of the problem, which is difficult to assess and can only be determined a posteriori. In this context, the use of the fridge aims at defining the search space more precisely and at determining the number of feasible configurations. Considering once more the example of Figure 3, the first gene has three different values, the second one two values, and so on. By calculating the product of the fridge sizes, it is thus possible to estimate the number of configurations which are potentially explored by the GA. In the example, this search space represents 3 × 2 × 4 × 4 × 3, that is, 288 configurations. Of course, this number is far higher for real problems, which justifies the interest of a GA. Every time a new gene is stored in memory, the search space size increases.

(iii) String Fridge as an Indicator of Critical Steps. Let us recall that in our approach every gene represents the parallel equipment that carries out one type of unit operation. Generally, workload allocation is not uniform. Unit operations that are shared by several recipes, or by different tasks in the same recipe, therefore often constitute critical steps because several products compete for the same equipment. This leads to a more marked combinatorial aspect for the genes corresponding to these steps, since they require a larger number of equipment units. This explains why critical steps imply greater gene fridges. 
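The counts quoted in (ii) can be checked numerically. The sketch below assumes that C in the configuration-number formula is the binomial coefficient and that a configuration for one operation is a nonempty multiset of equipment sizes; the function names are illustrative only.

```python
from math import comb, prod


def configurations_with_exactly(n_units: int, n_sizes: int) -> int:
    """Ways to choose n_units parallel units among n_sizes sizes (multisets)."""
    return comb(n_units + n_sizes - 1, n_sizes - 1)


def configurations_per_operation(n_eq_max: int, n_sizes: int) -> int:
    """Bracketed term of the formula: C(Neq + gamma, gamma) - 1."""
    return comb(n_eq_max + n_sizes, n_sizes) - 1


def total_configurations(n_eq_max, n_sizes) -> int:
    """Product of the bracketed term over all unit operations."""
    return prod(
        configurations_per_operation(n, g) for n, g in zip(n_eq_max, n_sizes)
    )


def explored_space(fridge_sizes) -> int:
    """Estimate of the explored search space: product of the fridge sizes."""
    return prod(fridge_sizes)
```

Under these assumptions, `configurations_per_operation(10, 3)` equals the sum of `configurations_with_exactly(n, 3)` for n from 1 to 10, which ties the formula to the enumeration by number of units used in section 3.2 (6 configurations with 2 units, 15 with 4 units).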
The “fridge size” can hence be regarded as a variable representing the degree of combinatorial complexity at specific points of the chromosome. This indicator is consequently used when applying the mutation operators and when improving individuals by the hill-climbing procedure.

3.6. Crossover Operator. Crossover is not applied to all pairs of individuals selected for mating. A random choice is made, where the likelihood of crossover being applied is typically between 0.6 and 1. If crossover is not applied, offspring are produced simply by duplicating the parents. Three forms of crossover operator are available: (i) simple crossover; (ii) two-point crossover; (iii) uniform crossover. They all conserve the position or locus of the genes. This choice is justified since every locus represents a different unit operation and the gene values (i.e., equipment needs) may markedly differ among the different operations. Despite the great number of crossover operators reported in the literature, the procedures adopted here belong to the category of conservative ones, which have proven their efficiency.26 Note that clones are detected to avoid repeated evaluation of the same individual (evaluation being the limiting step in computation time). To be concise, only the uniform crossover procedure is presented in what follows.

Uniform Crossover.26 In this procedure, each gene in the offspring is created by copying the corresponding gene from one or the other parent, chosen according to a randomly generated crossover mask. When there is a 1 (respectively a 2) in the crossover mask, the gene is copied from the first (respectively from the second) parent. The procedure is repeated with the parents exchanged to produce the second offspring. A new crossover mask is randomly generated for each pair of parents. Offspring therefore contain a mixture of genes from each parent. It is recognized that the increased disruption of uniform crossover is beneficial if the population size is small in comparison to the problem complexity and so gives a more robust performance.

3.7. Mutation Operator. Mutation is applied to each child individually after crossover. 
It randomly alters each gene with a smaller probability than the crossover one. Mutation provides a small amount of random search and helps ensure that no point in the search space has a zero probability of being examined. A variety of mutation operators is available in the literature. In this work, three mutation types have been implemented. As for crossover, they respect the gene position in the chromosome: (i) gene modification by equipment elimination; (ii) gene modification by equipment size change (equipment elimination or addition are included); (iii) change of a gene value, randomly chosen from the string fridge. Since the first two mutation operators may turn out to be conservative, mutation is only applied over one or two genes according to a stochastic decision. A coherence test verifies the choice of two different loci.

A Critical Step-Oriented Search. As mentioned above, the introduction of the string fridge concept allows detection of critical steps. In our approach, this notion does not refer directly to equipment use, which can be computed from the variables used by the simulator. A critical step means here that the step is likely to use several parallel units, thus contributing to an increase in the problem combinatorics. Whereas in classical GAs mutation is uniformly applied to all genes, a greater mutation probability is allocated in this work to the genes for which the combinatorial aspect is more marked, thus preventing useless searching on steps generating less conflict (an equipment addition to a nonlimiting step is unnecessary since it will be later eliminated if not used). This assumption rewards innovation in solution search, since the locus mutation probability increases when a new gene appears. The choice of the mutating gene is made with an equivalent of Goldberg’s biased roulette wheel.22 Let us illustrate this procedure by the example of Figure 3. The mutation probability of each locus is calculated in Table 2 from the corresponding data on gene fridge size, and Figure 4 presents the roulette wheel for the selection of the gene to mutate. It can be observed that genes 3 and 4 are twice as likely to be mutated as gene 2. For the sake of illustration, only the third mutation policy is presented.

Table 2. Mutation Probability Computation per Locus

locus (gene)    fridge size    mutation probability per locus
1               3              3/16 = 0.1875
2               2              2/16 = 0.1250
3               4              4/16 = 0.2500
4               4              4/16 = 0.2500
5               3              3/16 = 0.1875
total           16             16/16 = 1.0000

Figure 4. Goldberg’s biased wheel.

Change of a Gene Value, Randomly Chosen from the String Fridge. This procedure consists of replacing a gene value by a gene from the string fridge. The choice of the gene to introduce is carried out randomly. The procedure is illustrated in Figure 5.

Figure 5. Mutation procedure by change of a gene value, randomly chosen from the string fridge.

Some Comments on Mutation Operator Use. The first mutation operator (equipment suppression) turns out to be very efficient in the early stages of the GA by favoring the elimination of over-equipped structures. Yet equipment suppression is not invariably the best solution, since in some cases the storage needs can be excessively increased and the search can moreover be led to unfeasible solution zones. The mutation procedure based on equipment size change can thus improve the situation. But a decrease in equipment size may have the same consequences as elimination, whereas an increase in equipment size has the opposite effect. The main interest of the third mutation operator is to move within the solution space. This procedure can perform very well for genetic inheritance regeneration

where a very good solution is surrounded in its immediate vicinity by very bad solutions (and/or unfeasible ones).

3.8. The “Hill-Climbing” Procedure. The basic principle of the hill-climbing method consists of searching the neighborhood of the current solution and evaluating the fitness of the neighboring individuals. If a better individual exists, it replaces the current solution. The application of this technique in a GA can be very efficient, although greedy in computational time. Consequently, it is used only for the best individuals (their number is a parameter fixed by the user). For each of these individuals, a neighboring solution is generated (by applying the aforementioned mutation operators) and its fitness is calculated. If an improvement is observed, this solution replaces the previous one. Unlike mutation, this procedure allows only chromosome changes leading to an improvement in individual performance.

3.9. Stop Criterion. The stop criterion considered in this study is a maximum number of generations. Like the other parameters of the model (crossover/mutation probability, population size), it has first been determined experimentally. Other criteria, such as nonevolution of the fitness, are attractive but risky, since it is very difficult to tell whether a nonevolution is due to a local optimum or to the global one. A preliminary study has shown that the number of generations has to be greater than the population size to have a good compromise between the computational effort and the quality of the results.

3.10. Clone and Survival Detection. By nature, GAs create clones by application of selection, crossover, and mutation procedures. Besides, some individuals can survive along several generations. To reduce the call number to the simulator that constitutes the limiting step, it is thus necessary to keep information concerning the configurations that have been tested in the previous stages of the algorithm: first, a variable is introduced to detect immortal individuals; second, a chromosome comparison test is achieved in which immortal individuals are compared with the new individuals of the actual generation to detect the clones; this test is all the more important in the late stages of the evolution where clone probability is much more significant. The interest of both procedures is illustrated in Figure 6 for the example that will be largely mentioned in this paper. This figure displays the call number to the simulator vs generation number, both effective (curve 1) and theoretical (curve 2), and clearly shows the interest of taking into account clone and survival detection. 3.11. Algorithm Flowchart. Finally, the general flowchart of the algorithm implemented in this study can be summarized as follows: STEP 1. Data initialization (manufacturing recipes, required production level, equipment size range, cost coefficient, simulator stop variables, GA parameters) STEP 2. Creation of the initial population

Ind. Eng. Chem. Res., Vol. 41, No. 23, 2002 5749

Figure 6. Interest of clone and survival detection.

STEP 3. Initialization of gene fridges and computation of mutation probability per gene GA Implementation. The following steps are repeated for all generations: STEP 4. Clone and survival detection STEP 5. Gene fridges updating STEP 6. Evaluation of individual performances (for those who have not been tested in the previous generation) STEP 6.1. Call to the simulator AD-HOC (i) If the configuration does not lead to the establishment of steady-state regime, then the individual will not participate in steps 6.2-6.4 and a null fitness is attributed at step 6.5. (ii) If the configuration leads to the establishment of a steady-state regime, go to step 6.2. STEP 6.2. Gene (and storage vessel) change as a function of the effective used equipment STEP 6.3. Computation of investment cost STEP 6.4. Search for best and worst solutions (as a function of investment cost) STEP 6.5. Computation of individual fitness STEP 6.6. Insertion of the best individual at the location “ population size +1” STEP 7. Selection procedure STEP 8. Hill climbing for the best individuals STEP 9. Crossover procedure STEP 10. Mutation procedure STEP 11. If the maximum number of generations is reached, then result report; otherwise, go to step 4. 4. Results Presentation and Analysis The algorithm is applied to the example presented in part 1 of these articles (see section 5 of part 1). Five test series have been carried out for the GA by varying population size, generation number, crossover and mutation probabilities, maximum number of clones, and number of individuals on which the hill-climbing procedure is applied. For each series, the GA has been run

Table 3. GA Parameters
parameter                                    series 1      series 2      series 3      series 4      series 5
population size                              50            30            30            50            100
generation number                            50            100           250           200           100
crossover operator                           the three types (all series)
crossover probability                        0.90          0.90          0.70          0.75          0.70
mutation probability                         0.10          0.10          0.10          0.05          0.02
maximum number of copies (% of population)   5 (10%)       5 (16.7%)     5 (16.7%)     10 (20%)      20 (20%)
hill climbing for [n] individuals            no; yes [5]   no; yes [5]   no; yes [3]   no; yes [5]   no; yes [5]
elitism factor                               1.5           1.5           1.5           1.5           1.5

five times. The data set is presented in Table 3. In this phase of the work, the GA parameters have been fixed arbitrarily. 4.1. Typical Results. Let us recall that a GA does not guarantee that the global optimum will be obtained but can lead to a set of good solutions. This can be interesting from a design viewpoint since the cheapest configuration is not necessarily, in practice, the most interesting one. The three best configurations found in one run are presented in Table 4, where (EQ.I_J) represents equipment number J of type I. This set of best solutions comes from the last generation. Other interesting results concern fitness evolution. More precisely, the investment cost of the best and worst historical configurations, as well as the average investment cost of the configurations of every generation (only those leading to a steady-state regime are considered), are reported for each run. These evolutions are indicative of GA performance and can help with parameter setting. Let us examine two cases: (i) A Conservative Case. Figure 7 displays the evolution curves for a run of series 5, with relatively low crossover and mutation rates (0.7 and 0.02, respectively) for a large population (100 individuals), with the number of copies of the same individual limited to 20. The best solution is found rapidly (15th generation), after which the objective function no longer evolves. It is difficult a priori to know if the


Figure 7. Cost evolution with GA conservative parameters.

Table 4. Example of the Three Best Configurations
item              first configuration   second configuration   third configuration
Equipment
EQ.1_1            1000                  1000                   1000
EQ.1_2            1000                  1000                   1000
EQ.2_1            2000                  2000                   2000
EQ.2_2            1000                  1000                   1000
EQ.3_1            2000                  4000                   2000
EQ.3_2            2000                  2000                   2000
EQ.4_1            1000                  1000                   2000
EQ.5_1            1000                  1000                   1000
Storage tanks
ST 1              2000                  2000                   2000
ST 2              2000                  1500                   2000
ST 3              4000                  4000                   4000
ST 4              4000                  1900                   4000
Investment cost
storage tanks     197 742               189 656                197 742
equipment         1 379 418             1 406 544              1 398 941
total             1 577 160             1 596 200              1 596 683

best solution found up to now is good enough or if a better one can be found (this is why several runs are necessary). It can also be observed that selection is the predominant phenomenon since the discrepancy between the average cost and the best solution cost is very low (above all from the 40th generation). This evolution may be dangerous, since it can lead to a local optimum trap that is very difficult to escape because of the low mutation rate. The worst configurations have a rather limited influence on the average cost because of the large size of the population; for instance, a bad solution that appeared at the 77th generation survived until the 85th one without markedly increasing the average cost. This can be attributed to the limitation in the number of copies and to a low value of the crossover rate: a set of similar individuals rapidly reaches its maximum number of copies and allows the selection of an individual whose performance is not as interesting. In conclusion, Figure 7 typically illustrates the choice of conservative parameters, that is, the risk of being blocked on a local optimum despite a good trend for the evolution curve. (ii) An Innovative Case. Figure 8 displays the evolution curves for a run of series 2, with higher crossover and mutation rates (0.9 and 0.1, respectively) for a small population (30 individuals). The best solution has been obtained at the 92nd generation and is far better than the one found in the previous case. Besides, at the 13th generation, a solution equivalent in cost to the best solution found in the previous case has been obtained, and the higher values of mutation and crossover rates have allowed the evolution toward better solutions. It can also be observed that the discrepancy between the average cost and the cost of the best solution is larger. This is due, on one hand, to higher values of mutation and crossover rates and, on the other hand, to a relatively small population size (for instance, bad solutions have a big influence on average cost around the 70th generation). 4.2. Combinatorial Aspect and Gene Fridge Size. Table 5 presents the gene fridges obtained with the data of series 3 (by applying crossover 3 and the hill-climbing procedure on the 3 best configurations). Note that the genes belonging to feasible configurations are stored in the gene fridges in the same order as they have been tested during the evolutive process. For the unit operation of type 4, the gene locus has only taken two different values, [010] and [100], which correspond to an equipment unit of either average or large size; thus, unit operation 4 does not constitute a critical step.
In contrast, the gene locus corresponding to the unit operation of type 1 has taken 39 different values (involving workshop structures containing from 2 to 9 parallel equipment items). This result suggests that such an operation constitutes a critical step from a design viewpoint. This is not surprising since this operation (i.e., reaction in a double-jacketed reactor) is used for shared intermediate product 1 (SIP1) manufacturing in 4 of the 5 recipes of the final products (for product B, this operation is even used in 2 different steps). In contrast, the unit operation of type 4 is only considered in the recipe of final product B. In this example, the product of the gene fridge sizes is equal to 973 440 (39 × 26 × 20 × 2 × 24), which constitutes the search space potentially explored by the GA. The explored fraction of this space has an order of magnitude from 0.1 to 0.3%.

Figure 8. Cost evolution with GA innovative parameters.

Table 5. Gene Fridges for the Didactic Example
unit operation type 1 (fridge size 39): 700 500 006 300 070 004 111 150 400 220 005 240 040 009 042 321 021 031 060 120 043 200 121 014 510 032 011 221 051 211 600 101 311 020 110 012 030 130 104
unit operation type 2 (fridge size 26): 020 004 400 300 200 110 102 230 040 130 006 210 201 220 310 005 030 301 120 021 031 112 011 111 101 012
unit operation type 3 (fridge size 20): 400 004 030 300 040 301 006 020 022 130 120 220 500 021 201 200 210 031 110 005
unit operation type 4 (fridge size 2): 010 100
unit operation type 5 (fridge size 24): 300 013 200 130 103 040 030 020 101 100 006 010 003 121 400 012 002 102 120 011 111 005 110 004

Table 6. Parameters of Configuration 1
Configuration 1 (Total Cost: 1 577 160)
Equipment (Equipment Cost: 1 379 418)
EQ.1_1   EQ.1_2   EQ.2_1   EQ.2_2   EQ.3_1   EQ.3_2   EQ.4_1   EQ.5_1
1000     1000     2000     1000     2000     2000     1000     1000
Storage Tanks (Storage Cost: 197 742)
ST 1     ST 2     ST 3     ST 4
2000     2000     4000     4000

4.3. A Posteriori Analysis of the Best Configurations. From the best configurations obtained from the GA, various analyses can be carried out. Two studies are detailed in what follows. 4.3.1. Analysis of the Use Rate of Equipment and Storage Vessels. The use rate of equipment and storage vessels for any workshop configuration can be visualized by a Gantt chart. Let us consider, for example, the results corresponding to the best configuration obtained during the study from a total cost viewpoint (cf. Table 6). The Gantt chart relative to the equipment can be seen in Figure 9. It can be observed that equipment of type 3 (EQ.3_1 and EQ.3_2) has been occupied for almost all campaigns. Equipment of type 4 has only been partially used for all campaigns, hence confirming its nonlimiting feature, whereas equipment of type 5 nearly reaches its maximum use rate. A periodicity phenomenon for equipment use rate can also be observed: for instance, the use rate of equipment of type 1 presents a 3-campaign periodicity: the 2nd campaign (between 600 and 1200 min) has the same features as the 4th one and, identically, the 3rd and 5th campaigns present the same use rate. This behavior is typical of workshops under a steady-state regime. A similar analysis is valid for the use rate of storage vessels (see the corresponding Gantt


Figure 9. Equipment use rate for configuration 1.

Figure 10. Storage use rate for configuration 1.

chart in Figure 10). Note that the use rate of the 3rd vessel and even more of the 4th one are very low. Actually, the storage vessel use is very dependent on product scheduling and a more thorough analysis on product management can lead to the elimination of the 4th storage vessel. In fact, the configuration of Table 7 is cheaper when considering total investment cost. Another configuration, very similar to the 1st one, has also been examined by the GA (see Table 7). Only the size of EQ.1_2 is different in these two configurations. With the adopted batch release (all batches are manufactured at every campaign beginning), these configurations require different storage needs. The equipment and storage use rates are respectively shown in Figures

Table 7. Parameters of Configuration 2
Configuration 2 (Total Cost: 1 597 844)
Equipment (Equipment Cost: 1 365 463)
EQ.1_1   EQ.1_2   EQ.2_1   EQ.2_2   EQ.3_1   EQ.3_2   EQ.4_1   EQ.5_1
1000     500      2000     1000     2000     2000     1000     1000
Storage Tanks (Storage Cost: 232 381)
ST 1     ST 2     ST 3     ST 4     ST 5
2000     2000     2000     2000     2000

11 and 12. The difference in capacity of EQ.1_2 between the two configurations has an influence on the EQ.1_1 use rate (its workload is much higher in configuration 2) but also on other equipment such as EQ.4_1, thus confirming task interdependency in a multiobjective workshop. The impact of the size difference for EQ.1_2


Figure 11. Equipment use rate for configuration 2.

Figure 12. Storage use rate for configuration 2.

is more marked for storage needs. For configuration 2, an additional storage vessel is necessary and the quantities stored in vessels 3 and 4 are reduced. It can be suggested at this level that better task scheduling could lead to the elimination of vessel 5. 4.3.2. Flexibility Analysis for Production Increase. When the objective is to increase the production level of all products proportionally, it is judicious to reduce the campaign length while keeping the same number and size of batches. Of course, the setting of this parameter is essential since it greatly influences the behavior of the workshop, which may or may not reach steady state. A preliminary study was performed to determine acceptable values for this parameter, preventing quick bottlenecking. Here, a systematic study is carried out to examine its influence and is illustrated

by an example concerning 25 workshop configurations, selected according to their equipment investment cost (storage cost is not taken into account). This configuration list has been ranked among the feasible configurations obtained for a run with parameters of series (see Table 6) for three values of campaign length, that is, 600, 590, and 580 min. A cross indicates the configurations that do not reach the steady-state regime (for campaign lengths shorter than 580 min, the steady state is never reached). Chromosome 2, representing the cheapest configuration when considering total investment cost for a campaign of 600 min, does not lead to a steady-state regime for shorter campaign lengths, whereas a steady-state regime is obtained for configuration 1 for campaigns of 590 min.
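
As a rough order-of-magnitude check (an idealized sketch, not a result from the simulator), assume that the quantity produced per campaign, written here as $Q$ (a symbol introduced only for this illustration), is unchanged when the campaign length $L$ is shortened, so that the production rate is $Q/L$. The relative gain in production rate when the campaign length goes from $L_0$ to $L_1$ would then be:

```latex
\frac{Q/L_1 - Q/L_0}{Q/L_0} = \frac{L_0}{L_1} - 1,
\qquad
\frac{600}{590} - 1 \approx 0.017,
\qquad
\frac{600}{580} - 1 \approx 0.034
```

Under this assumption, shortening the campaign from 600 to 590 min can raise production by about 1.7% at most, and from 600 to 580 min by about 3.4%, provided the workshop still reaches a steady state.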


Table 8. Investigation on GA Parameters
criterion      series 1     series 2     series 3     series 4     series 5
Cost
best           1 577 160    1 577 160    1 577 160    1 577 160    1 577 160
average        1 702 236    1 687 031    1 622 030    1 645 331    1 706 906
least          1 794 357    1 800 713    1 771 404    1 771 462    1 803 761
Call Number
best           754          716          1376         1077         1103
average        1396         1266         2193         2144         2157
least          1952         1852         3339         3216         3487
frequency F    6/30         5/30         17/30        9/30         2/30

Let us also note that, for all configurations leading to steady state with campaigns of 590 min, the need for storage vessels is lower than with a campaign of 600 min, thus confirming the interest of scheduling for storage tanks. It must also be emphasized that the decrease in campaign length does not involve a significant production increase (an increase of 1.66% is observed when the campaign length is decreased from 600 to 590 min). The proposed configurations are actually very near maximum capacity (see the Gantt charts) and are, consequently, less flexible. 4.4. General Investigation on GA Performance. 4.4.1. GA Parameters Influence. In this section, general ideas about the order of magnitude of GA parameters for later use in similar problems are given. Table 8 presents the results obtained for each simulation run for three important criteria, that is, total investment cost, call number to the simulator, and the frequency with which the best solution from a total investment viewpoint, represented by the chromosome 020-011-020-010-010, is obtained. The results reported correspond to the best solution obtained in each simulation series. Let us recall that 30 runs have been carried out for a series (10 for each crossover type, 5 with hill climbing and 5 without). The best solution (with an investment cost of 1 577 160) has been found in all series (2 times for series 5 and 17 times for series 3). Concerning the average cost, the best results have been obtained with the parameters of series 3 and the least results correspond to series 1 and 5. The discrepancies between the least solutions of each series are less marked: the best results correspond to series 3 and 4, with relatively similar values for investment cost, whereas the least ones are attributed to series. When considering the call number to the simulator, it is not surprising that the best results are obtained with a significant number of explored configurations. Nevertheless, algorithm efficiency can be evaluated by its ability to reach good solutions with less effort. Thus, series 5 is noticeably less adapted than series 3 (since several parameters have been changed, it is still difficult to identify which one is responsible for this performance discrepancy). Yet preliminary studies have shown that the mutation rate has a significant impact: the higher the rate (up to a given limit), the better the solutions obtained. Furthermore, it has been observed that the computational effort associated with the use of a large population is not necessarily rewarded: a strategy based on the evolution of a small population (30-50 individuals) over a large number of generations performs much better than one based on the evolution of a large population (100 individuals) over a few generations. A small population is much more sensitive to sudden changes (thus to mutation), consequently favoring simultaneous elitist trends. This elitism is all the more marked when the call number to the simulator is compared between series 1 and 2. Theoretically, the number of explored configurations would be higher for runs of series 2, since 100 generations of 30 individuals have been necessary (3000 individuals and additional evaluations due to the hill-climbing procedure application), than for those of series 1 (50 generations of 50 individuals, that is, 2500 individuals). In conclusion, mutation rates between 0.05 and 0.2 seem appropriate for this kind of problem with relatively small populations (25-50 individuals). A more detailed sensitivity analysis is proposed in section 4.4.5.

Table 9. Impact of Crossover Type
                                                   simple       two-point    uniform
series 1   best configuration                      1 593 709    1 577 160    1 596 200
           average configuration                   1 699 461    1 721 330    1 685 916
           least configuration                     1 766 250    1 794 357    1 794 357
series 2   best configuration                      1 577 160    1 601 785    1 577 160
           average configuration                   1 709 514    1 677 435    1 674 144
           least configuration                     1 800 713    1 746 556    1 794 357
series 3   best configuration                      1 577 160    1 577 160    1 577 160
           average configuration                   1 633 063    1 617 739    1 615 289
           least configuration                     1 771 404    1 750 805    1 759 015
series 4   best configuration                      1 577 160    1 577 160    1 577 160
           average configuration                   1 624 596    1 681 859    1 629 540
           least configuration                     1 751 938    1 771 462    1 743 506
series 5   best configuration                      1 614 859    1 596 683    1 577 160
           average configuration                   1 728 993    1 726 799    1 664 927
           least configuration                     1 793 056    1 803 761    1 763 692
summary    best configuration                      1 577 160    1 577 160    1 577 160
           average of the average configurations   1 679 125    1 685 032    1 653 963
           least configuration                     1 800 713    1 803 761    1 794 357
           average configuration (best cases)      1 588 009    1 585 989    1 580 968
           average configuration (least cases)     1 776 672    1 773 388    1 770 985
           frequency of the best one               10/50        8/50         15/50

4.4.2. Importance of Crossover on GA Performance. The results concerning investment cost for the best solutions are presented in Table 9. A total number of 50 runs for each crossover type have been carried out: 10 for each series (5 with hill climbing and 5 without). It must be observed that the best solution (with a total cost of 1 577 160) has been found for the three crossover operators. When considering all criteria, that is, the average of the best cases, the average of the average cases and of the least cases, and the number of times the best solution is obtained, it can be observed that the performances are slightly better with uniform crossover. Nevertheless, the discrepancies with the other crossover procedures remain marginal (less than 2% between the average of the average cases for uniform crossover and two-point crossover, for instance). 4.4.3. Importance of Hill-Climbing Procedure. A similar analysis is proposed in Table 10 to study the

Table 10. Impact of Hill Climbing
                                                   without hill climbing   with hill climbing
series 1   best configuration                      1 593 709               1 577 160
           average configuration                   1 715 217               1 689 254
           least configuration                     1 794 357               1 784 324
series 2   best configuration                      1 577 160               1 577 160
           average configuration                   1 672 423               1 701 639
           least configuration                     1 771 462               1 800 713
series 3   best configuration                      1 577 160               1 577 160
           average configuration                   1 637 423               1 606 637
           least configuration                     1 759 015               1 771 404
series 4   best configuration                      1 577 160               1 577 160
           average configuration                   1 658 543               1 632 119
           least configuration                     1 771 462               1 742 541
series 5   best configuration                      1 577 160               1 577 160
           average configuration                   1 703 887               1 709 926
           least configuration                     1 793 056               1 803 761
summary    best configuration                      1 577 160               1 577 160
           average of the average configurations   1 677 499               1 667 915
           least configuration                     1 794 357               1 803 761
           average configuration (best cases)      1 580 469               1 577 160
           average configuration (least cases)     1 777 870               1 780 548
           frequency of the best one               15/75                   18/75

impact of the hill-climbing procedure on GA performance, for the 5 series and the 3 crossover operators (5 series × 5 runs per series × 3 crossover types = 75 runs with hill climbing and 75 without). The results show that the use of hill climbing has a positive impact on GA performance: the best results, except for the least case criterion, correspond to the hill-climbing application. But, once more, the differences are marginal from a statistical viewpoint. 4.4.4. Conclusions of the Comparative Study. In the previous studies, the influences of the crossover type and hill-climbing application have been examined separately, although a correlation between them may exist. Table 11 shows the statistical results obtained for the 5 series of runs for each combination of crossover type with or without hill climbing. On one hand, the discrepancies relative to the average costs are slightly more sensitive to crossover type than to the application of hill climbing. On the other hand, the differences concerning the least solutions are much smaller, whereas the best solution has been obtained for all combinations. The same conclusions are valid for the frequency to reach the best solution. Besides, it is clear and logical that the hill-climbing application increases the call number to the simulator for all crossover operators. Although the gain may be viewed as slight, the crossover operator

with application of hill climbing has been adopted in what follows. 4.4.5. Sensitivity Analysis. In this study, a stricter sensitivity analysis has been carried out by studying the influence of crossover and mutation rates and of population size, since they have a major impact on GA convergence. The GA parameter set is presented in Table 12. The analysis is based on classical design of experiments methodology.27,28 Let us recall that experimental designs are used to identify or screen important factors affecting a process and to develop empirical models of processes. A classical $2^3$ design has been implemented to investigate the effects of the factors population size (S), crossover probability (C), and mutation probability (M), each at 2 levels, together with the 3 associated two-way interactions and the single three-way interaction (see Tables 12 and 13). The symbols (+1) and (-1) correspond respectively to the upper and lower limits of each parameter. Each row represents an experimental run, that is, a set of conditions for the three factors. After the 8 runs had been completed and the measured response recorded for each run, an empirical model was built to predict process behavior based on the settings of these factors. Let us note that the maximum number of copies and the number of individuals on which the hill-climbing procedure has been applied are a function of population size. The generation number has been chosen as a stop criterion. A priori, 200 generations seems to be an acceptable value, giving a convenient evolution level for result comparisons. An average value for the elitism factor has been taken between conservative and revolutionary policies. For each point, 10 runs have been performed. The statistical results relative to investment cost (minimum, average, and maximum for the 10 runs) are presented in Table 14. Table 15 displays another statistical analysis using Student's law to define 95% confidence intervals for each parameter and criterion. Finally, the variables considered (cost, number of simulator calls, frequency to reach the best solution) can be expressed as follows:

$$\text{Variable} = I + \alpha S + \beta C + \gamma M + \delta SC + \varepsilon SM + \zeta CM + \eta SCM$$

where I is a constant term (specific to each variable), α, β, etc. are coefficients, and S, C, and M are the levels of population size, crossover probability, and mutation probability, normalized between -1 and +1. The different coefficients have been computed using the classical results on the design of experiments with a $2^3$ factorial design.28 The results have shown that neither S nor C nor M influences the cost of the best or least solution. From a statistical viewpoint,

Table 11. Influence of Crossover/with or without Hill Climbing
crossover:        simple       simple       two-point    two-point    uniform      uniform
hill climbing:    without      with         without      with         without      with
Cost
best              1 577 160    1 577 160    1 577 160    1 577 160    1 577 160    1 577 160
average           1 691 208    1 667 042    1 686 926    1 683 139    1 654 362    1 653 565
least             1 793 056    1 800 713    1 794 357    1 803 761    1 794 357    1 794 357
frequency         5/25         5/25         3/25         5/25         7/25         8/25
Simulator Calls
minimum           1065         1170         716          1068         1023         1103
average           1738         2117         1547         1844         1675         2066
maximum           2937         3216         3487         3094         2851         3339


Table 12. Parameter Values
variable               high level (+)   low level (-)   center (0)
S population size      60               20              40
C crossover rate       0.9              0.7             0.8
M mutation rate        0.15             0.05            0.10

Table 13. Hadamard Matrix
run   S   C   M   SC   SM   CM   SCM
1     -   -   -   +    +    +    -
2     +   -   -   -    -    +    +
3     -   +   -   -    +    -    +
4     +   +   -   +    -    -    -
5     -   -   +   +    -    -    +
6     +   -   +   -    +    -    -
7     -   +   +   -    -    +    -
8     +   +   +   +    +    +    +

Best cost = 1 579 228
Average cost = 1 639 208
Least cost = 1 743 535

Concerning the call number to the simulator, the sensitivity is more marked:

Minimum call number = 3357 + 798S + 472M
Average call number = 3796 + 856S + 273C + 622M
Maximum call number = 4160 + 982S + 345C + 631M

In both cases, the interaction parameters (SC, SM, CM, SCM) have no influence on the results. It must be emphasized that, in all cases, population size surprisingly does not have a considerable impact on the call number to the simulator. The 95% confidence interval relative to the frequency (F) to reach the best solution has been found to be equal to [-0.407; 0.407] from the results obtained at the center of the design of experiments:

F = 4.38 + 1.13S + 1.88M + 0.88SCM

It can be noted that variations in crossover probability have no influence on this variable whereas mutation

probability plays a major role. There is also an interaction between population size, crossover, and mutation probabilities. This methodology appears to be very useful in the treatment of new problems. 5. Conclusions This paper presents an application of a GA to solve complex batch plant design problems. A comprehensive analysis on a typical example that may be encountered has been proposed, thus providing helpful guidelines for treating similar problems. The implementation of the gene fridge turns out to be interesting for preventing population degradation by allowing the introduction of previous genes during evolution. Besides, this procedure allows a better screening of search space and the evaluation of the combinatorics of the problem. Finally, this concept drives the search toward critical steps, by introducing a varying mutation probability function of the gene locus. The parametric study is particularly interesting for GA applications, which constitutes a key problem in GA problems. The performance of the GA has been studied from 5 series of experiments. The results obtained on the treated example have shown that the best results have been obtained both with uniform crossover and application of the hill-climbing procedure. Of course, the discrepancy between the performances achieved with the other procedures is not very marked. It has also been noted that better results have been obtained for strategies using small-size populations (30-50 individuals) over a significant number of generations than with large-size populations over a reduced number of generations. A sensitivity analysis performed on three important parameters of a GA, that is, population size, crossover, and mutation probability has not induced significant differences in investment cost. This influence is more sensitive to computation time. These conclusions have helped to treat larger size batch plant design problems for which GAs are particularly efficient. 
This methodology has also been extended to the problem of batch processes retrofitting.29

Table 14. Statistical Results for Sensitivity Analysis
parameters     costs                                   F     call number
S   C   M      minimum      average      maximum             minimum   average   maximum
0   0   0      1 577 160    1 606 721    1 706 795     6     3355      3806      4140
-   -   -      1 593 709    1 683 925    1 755 328     0     2053      2300      2420
+   -   -      1 577 160    1 635 823    1 743 506     4     3320      3523      3990
-   +   -      1 577 160    1 679 419    1 779 655     3     2398      2722      3030
+   +   -      1 577 160    1 673 472    1 746 556     3     3770      4153      4674
-   -   +      1 577 160    1 612 205    1 751 938     6     2702      3111      3344
+   -   +      1 577 160    1 611 987    1 743 506     7     4646      5160      5506
-   +   +      1 577 160    1 626 374    1 742 234     4     3086      3628      3918
+   +   +      1 577 160    1 590 462    1 685 556     8     4883      5773      6397
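
The coefficients quoted in section 4.4.5 can be recomputed from the factorial rows of Table 14. The sketch below is only an illustration: it assumes the 8 runs are listed in standard (Yates) order, with S changing fastest, and the names `RUNS`, `TERMS`, and `fit_coefficients` are hypothetical, not taken from the authors' code. Because the columns of a two-level full factorial design are orthogonal, each coefficient is simply the sign-weighted mean of the responses.

```python
from itertools import product

# Coded factor levels for the 8 factorial runs, ASSUMED to be in standard
# (Yates) order: S changes fastest, then C, then M.
RUNS = [(s, c, m) for m, c, s in product((-1, 1), repeat=3)]

# Responses of the 8 factorial runs copied from Table 14 (center point excluded).
AVERAGE_CALLS = [2300, 3523, 2722, 4153, 3111, 5160, 3628, 5773]
BEST_FREQUENCY = [0, 4, 3, 3, 6, 7, 4, 8]

# Model terms: constant, main effects, and interactions.
TERMS = {
    "I": lambda s, c, m: 1,
    "S": lambda s, c, m: s,
    "C": lambda s, c, m: c,
    "M": lambda s, c, m: m,
    "SC": lambda s, c, m: s * c,
    "SM": lambda s, c, m: s * m,
    "CM": lambda s, c, m: c * m,
    "SCM": lambda s, c, m: s * c * m,
}


def fit_coefficients(responses):
    """Return the coefficients of the full 2^3 factorial model.

    Each coefficient is the sign-weighted mean of the responses, which is the
    least-squares solution because the design columns are orthogonal.
    """
    n = len(responses)
    return {
        name: sum(term(*run) * y for run, y in zip(RUNS, responses)) / n
        for name, term in TERMS.items()
    }


if __name__ == "__main__":
    for label, data in (("average calls", AVERAGE_CALLS), ("F", BEST_FREQUENCY)):
        coeffs = fit_coefficients(data)
        print(label, {name: round(value, 2) for name, value in coeffs.items()})
```

With this run order, the main-effect and constant terms agree with the published expressions after rounding, which supports (but does not prove) the assumed ordering. Coefficients whose absolute value falls inside the confidence interval of Table 15 are the ones the text treats as having no influence.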

Table 15. Statistical Analysis for Confidence Interval of Parameters
Point 1 (center)
run   cost         call number   F
1     1 650 713    4045          0
2     1 577 160    3893          1
3     1 577 160    3450          1
4     1 577 160    4114          1
5     1 706 795    3845          0
6     1 577 160    3798          1
7     1 577 160    3355          1
8     1 596 683    3451          0
9     1 650 062    3973          1
10    1 577 160    4140          0

Statistical analysis
                                      cost                  call number    F
minimum                               1 577 160             3355           0.000
maximum                               1 706 795             4140           1.000
average                               1 606 721             3806           0.600
variance (center)                     2.130 × 10^9          84 083         0.267
variance (parameters)                 2.663 × 10^8          10 510         0.033
standard deviation (parameters)       16 318                103            0.183
confidence interval of parameters     [-36 359, 36 359]     [-228, 228]    [-0.407, 0.407]
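
The chain from the center-point replicates to the confidence interval in Table 15 can be retraced with a few lines. This is a minimal sketch under stated assumptions: the variance of a coefficient is taken as the center-point sample variance divided by the 8 factorial runs (which matches the ratio of the two variance rows in the table), and the Student value 2.228 is back-calculated from the published intervals (it coincides with the tabulated two-sided 95% value for 10 degrees of freedom). The function name `coefficient_half_width` is hypothetical.

```python
import math
import statistics

# Center-point replicates copied from Table 15 (10 runs).
CALLS = [4045, 3893, 3450, 4114, 3845, 3798, 3355, 3451, 3973, 4140]
F_HITS = [0, 1, 1, 1, 0, 1, 1, 0, 1, 0]

N_FACTORIAL_RUNS = 8  # runs of the 2^3 design used to estimate each coefficient
T_95 = 2.228          # ASSUMED Student t value (two-sided 95%, 10 degrees of freedom)


def coefficient_half_width(replicates, n_runs=N_FACTORIAL_RUNS, t_value=T_95):
    """Half-width of the confidence interval on one model coefficient."""
    var_center = statistics.variance(replicates)  # sample variance (n - 1 denominator)
    var_coeff = var_center / n_runs               # variance of a coefficient
    return t_value * math.sqrt(var_coeff)


if __name__ == "__main__":
    print("call number:", round(coefficient_half_width(CALLS)))
    print("F:", round(coefficient_half_width(F_HITS), 3))
```

A coefficient is then regarded as significant only if its absolute value exceeds this half-width.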


Appendix

Computation of the Total Number of Configurations for a Batch Plant. Given a batch plant to design with NOP unit operations, n maximum parallel equipment items for each operation, and p available sizes for a given equipment unit, this study is limited to p = 3 ("large", "average", and "small" sizes). The analysis of problem combinatorics involves counting the integer solutions of the Diophantine equation for each unit operation:

$$x_1 + x_2 + x_3 = k$$

where

$$x_i \in \mathbb{N},\ i = 1, 2, 3; \qquad 0 \le x_i \le k; \qquad 0 \le k \le n$$

Index i refers to size (i = 1, 2, 3) and the integer variable $x_i$ refers to the number of times an equipment unit of size i can be selected. The count is achieved as follows:

$x_3 = 0$: $x_1 + x_2 = k$, (k + 1) solutions, $[x_1, x_2] = [(0, k), (1, k-1), \ldots, (k, 0)]$
$x_3 = 1$: $x_1 + x_2 = k - 1$, k solutions, $[x_1, x_2] = [(0, k-1), (1, k-2), \ldots, (k-1, 0)]$
...
$x_3 = k$: $x_1 + x_2 = 0$, 1 solution, $[x_1, x_2] = [(0, 0)]$

For a given value of k (0 ≤ k ≤ n), the solution number is equal to (k + 1) + k + (k - 1) + ... + 1, that is,

$$\frac{(k+1)(k+2)}{2} \tag{1}$$

For the sake of illustration, if n is equal to 4, the count of the solution set leads to (a numerical value different from zero in the left, middle, or right position represents an equipment unit of large, average, or small size, respectively):

equipment number   configurations                                                solution number
k = 1              100 010 001                                                   3
k = 2              200 020 002 110 101 011                                       6
k = 3              300 030 003 210 201 021 120 102 012 111                       10
k = 4              400 040 004 310 301 031 220 202 022 211 130 013 103 121 112   15

The total number of solutions of a given unit operation is given by

$$\sum_{k=1}^{n} \frac{(k+1)(k+2)}{2} \tag{2}$$

By recurrence, it can be demonstrated that

$$\sum_{k=1}^{n} \frac{(k+1)(k+2)}{2} = C_{n+3}^{3} - 1 \tag{3}$$

that is, A = B. Relation (3) is verified for n = 1:

$$A = 3 \quad \text{and} \quad B = C_{4}^{3} - 1 = 4 - 1 = 3$$

If the relation is assumed to be verified for order n, at order (n + 1), we have

$$
\begin{aligned}
A &= \sum_{k=1}^{n+1} \frac{(k+1)(k+2)}{2} = \sum_{k=1}^{n} \frac{(k+1)(k+2)}{2} + \frac{(n+2)(n+3)}{2} \\
  &= C_{n+3}^{3} - 1 + \frac{(n+2)(n+3)}{2} \\
  &= \frac{(n+3)!}{n!\,3!} + \frac{(n+2)(n+3)}{2} - 1 \\
  &= \frac{(n+3)(n+2)(n+1)}{6} + \frac{(n+2)(n+3)}{2} - 1 \\
  &= \frac{(n+3)(n+2)(n+1) + 3(n+2)(n+3)}{6} - 1 \\
  &= \frac{(n+3)(n+2)(n+1+3)}{6} - 1 \\
  &= \frac{(n+4)(n+3)(n+2)}{6} - 1 \\
  &= \frac{(n+4)!}{(n+1)!\,3!} - 1 = C_{n+4}^{3} - 1 = B
\end{aligned}
$$

Therefore, the relation is verified for every n and is general for every p. This result can easily be demonstrated by finite difference equations:

$$F(n) = C_{n+3}^{3} - 1$$

First-order finite difference gives

$$\Delta F = F(n) - F(n-1) = \frac{(n+1)(n+2)}{2}$$

The reciprocal theorem leads immediately to

$$\sum_{k=1}^{n} \frac{(k+1)(k+2)}{2} = F(n) - F(0) = C_{n+3}^{3} - 1$$

When NOP unit operations are considered with n = 10 and p = 3, the total number of possible configurations for the workshop is given by

$$T = \prod_{i=1}^{NOP} \left[ C_{13}^{3} - 1 \right] = \left[ C_{13}^{3} - 1 \right]^{NOP} = [285]^{NOP}$$

The problem combinatorics begins to explode as follows:

NOP   T
3     2.3 × 10^7
5     1.9 × 10^12
10    3.5 × 10^24
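
The counting argument above can be checked numerically. The sketch below is illustrative only (the function names are hypothetical): it enumerates the size assignments for one unit operation by brute force, compares the count with the closed form of relation (3), and evaluates T for the values of NOP listed above. The generalization of the closed form to p sizes, C(n + p, p) - 1, is the standard stars-and-bars result and is used here as an assumption consistent with the text.

```python
from itertools import product
from math import comb


def count_by_enumeration(n, p=3):
    """Count assignments (x_1, ..., x_p) with 1 to n equipment items in total."""
    return sum(
        1
        for x in product(range(n + 1), repeat=p)
        if sum(x) in range(1, n + 1)
    )


def count_closed_form(n, p=3):
    """Closed form C(n + p, p) - 1, i.e. relation (3) when p = 3."""
    return comb(n + p, p) - 1


def total_configurations(nop, n=10, p=3):
    """Total number of workshop configurations for nop unit operations."""
    return count_closed_form(n, p) ** nop


if __name__ == "__main__":
    print(count_by_enumeration(4), count_closed_form(4))
    for nop in (3, 5, 10):
        print(nop, f"{total_configurations(nop):.1e}")
```

Brute-force enumeration is only practical for a single unit operation; for the whole workshop the closed form is needed, which is precisely why an exhaustive search is ruled out.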


For illustration, let us recall that the number of stars in the universe is evaluated as 4 × 10^11 (ref 30).

Literature Cited
(1) Pibouleau, L.; Floquet, S.; Domenech, S.; Azzaro-Pantel, C. A Survey of Optimisation Tools through ESCAPE Symposia. In Proceedings of the European Symposium on Computer Aided Process Engineering, ESCAPE. Comput. Chem. Eng. 1999, (Supplement), S495-S498.
(2) Tayal, M. C.; Fu, Y.; Diwekar, U. M. Optimal Design of Heat Exchangers: A Genetic Algorithm framework. Ind. Eng. Chem. Res. 1999, 38, 456-467.
(3) Caux, C.; Pierreval, H.; Portmann, M. C. Les algorithmes génétiques et leur application aux problèmes d'ordonnancement. Proceedings Journées d'études "ordonnancement et entreprise"; LAAS, Toulouse, France, 1994.
(4) Kirkpatrick, S.; Gelatt, C. D.; Vecchi, M. P. Optimization by Simulated Annealing. Science 1983, 220, 671-680.
(5) Van Laarhoven, P. J. M.; Aarts, E. H. L.; Lenstra, J. K. Jobshop Scheduling by simulated annealing. Operations Res. 1992, 40 (1), 113-125.
(6) Patel, A. N.; Mah, R. S. H.; Karimi, I. A. Preliminary Design of Multiproduct Noncontinuous Plants Using Simulated Annealing. Comput. Chem. Eng. 1991, 15 (15), 451-469.
(7) Das, H.; Cummings, P. T.; LeVan, M. D. Scheduling of serial multiproduct batch processes via simulated annealing. Comput. Chem. Eng. 1990, 14, 1351-1362.
(8) Ku, H.; Karimi, I. An Evaluation of Simulated Annealing for Batch Process Scheduling. Ind. Eng. Chem. Res. 1991, 30, 163-169.
(9) Tandom, M.; Cummings, P. T.; Le Van, M. D. Scheduling of multiple products on parallel units with tardiness penalties using simulated annealing. Comput. Chem. Eng. 1995, 19 (10), 1069-1076.
(10) Athier, G. Optimisation des flux thermiques au sein de réseaux d'échangeurs de chaleur. Ph.D. Thesis, INPT, Toulouse, Jan 7, 1997.
(11) Metropolis, N.; Rosenbluth, A.; Rosenbluth, M.; Teller, A.; Teller, E. Equation of State Calculations by Fast Computing Machines. J. Chem. Phys. 1953, 21, 1087-1092.
(12) Yuan, X. Y.; Chen, Z. Z. A Hybrid Global Optimization Method for Design of Batch Chemical Processes. Comput. Chem. Eng. 1997, 21 (Supplement 1), S685-S690.
(13) Cartwright, H. M.; Long, R. A. Simultaneous Optimisation of Chemical Flowshop Sequencing and Topology Using Genetic Algorithms. Ind. Eng. Chem. Res. 1993, 32, 2706-2713.
(14) Wang, C.; Quang, H.; Xu, X. Optimal Design of Multiproduct Batch Chemical Process Using Genetics Algorithms. Ind. Eng. Chem. Res. 1996, 35, 3560-3566.
(15) Tan, S.; Mah, R. S. H. Evolutionary Design of Noncontinuous Plants. Comput. Chem. Eng. 1998, 22 (1-2), 69-85.
(16) Cordero Cruz, J. C. Conception optimale de réseaux de réacteurs. Ph.D. Thesis, INPT, Toulouse, March 6, 1997.
(17) Androulakis, I. P.; Venkatasubramanian, V. A genetic algorithm framework for process design and optimization. Comput. Chem. Eng. 1991, 15 (4), 217-228.
(18) Azzaro-Pantel, C.; Bernal Haro, L.; Baudet, P.; Domenech, S.; Pibouleau, L. A two-stage methodology for short-term batch plant scheduling: Discrete-event simulation and genetic algorithm. Comput. Chem. Eng. 1998, 22 (10), 1461-1482.
(19) Baudet, P. Ordonnancement à court terme d'un atelier de chimie fine-cas du fonctionnement jobshop. Ph.D. Thesis, INPT, Toulouse, Jan 7, 1997.
(20) (a) Peyrol, E. Gestion d'un atelier de fabrication de composants électroniques. Ph.D. Thesis, INPT, Toulouse, Dec 15, 1992. (b) Peyrol, E.; Floquet, P.; Pibouleau, L.; Domenech, S. Scheduling and simulated annealing, Application to a semiconductor circuit fabrication plant. Comput. Chem. Eng. 1993, 17 (Supplement), S39-S44.
(21) Holland, J. H. Adaptation in Natural and Artificial Systems; University of Michigan Press: Ann Arbor, MI, 1975.
(22) Goldberg, D. A. Algorithmes génétiques; Addison-Wesley: Reading, MA, 1994.
(23) Costello, R. Chemistry Part II Thesis, Oxford University, U.K., 1993.
(24) Tuson, A. L. The Implementation of a Genetic Algorithm for the Scheduling and Topology Optimisation of Chemical Flowshops; Technical Report TRGA94-01; Oxford University: Oxford, U.K., 1994.
(25) Bernal-Haro, L.; Azzaro-Pantel, C.; Pibouleau, L.; Domenech, S. Design of multipurpose batch chemical plants using a genetic algorithm. Comput. Chem. Eng. 1998, 22 (Supplement), S777-S783.
(26) Davis, L. Handbook of Genetic Algorithms; International Thomson Computer Press: Boston, 1991; copyright 1996.
(27) Goupy, J. La méthode des plans d'expériences, Dunod, 1988.
(28) Anderson, M.; Whitcomb, P.
DOE Simplified, Practical Tools for Experimentation; Productivity, Inc.: Portland, OR, 2000. (29) Dedieu, S.; Azzaro-Pantel, C.; Domenech, S.; Pibouleau, L. A Retrofit Design Strategy for Multipurpose Batch Plants. Comput. Chem. Eng. 1999, (Supplement), S15-S18. (30) Sagan, C. Cosmos; Dunod: Paris, 1981.

Received for review July 30, 2001
Revised manuscript received June 27, 2002
Accepted June 27, 2002
IE0106478