Selection of a Mixed-Integer Nonlinear Programming (MINLP) Model

Ind. Eng. Chem. Res. 2006, 45, 1935-1944


Selection of a Mixed-Integer Nonlinear Programming (MINLP) Model of Distillation Column Synthesis by Case-Based Reasoning

Tivadar Farkas,† Yuri Avramenko,‡ Andrzej Kraslawski,*,‡ Zoltan Lelkes,† and Lars Nyström‡

Department of Chemical Engineering, Budapest University of Technology and Economics, H-1521 Budapest, Hungary, and Department of Chemical Technology, Lappeenranta University of Technology, P.O. Box 20, FIN-53851 Lappeenranta, Finland

The paper presents a new application of the case-based reasoning method for finding a mixed-integer nonlinear programming (MINLP) model with superstructure and a solution of the corresponding distillation synthesis problem by suggesting an initial point for performing design and optimization of the system. A case library has been built from earlier published distillation problems with reproducible MINLP models. When solving a new problem, the case most similar to the target is found in the case library during the retrieval process in two steps: (i) first, a set of matching cases is retrieved, using inductive retrieval; (ii) the cases in the retrieved set then are ranked according to their similarity to the target case, using the nearest-neighborhood method.

1. Introduction

Distillation is one of the most widespread separation methods in the process industry. Unfortunately, the costs of distillation equipment and especially its operation are very high. This makes the synthesis of distillation sequences a very popular task, despite the difficulties related to the system complexity and equilibrium models. One of the most popular general synthesis methods is the hierarchical approach,1 in which synthesis and optimization are consecutive steps performed in an evolutionary manner. First, the process units are selected, and then the equipment configuration and the design variables are determined. Pibouleau et al.2 used a branch-and-bound procedure, combined with fuzzy set theory, for the identification of optimal distillation sequences. For the same purpose, Leboreiro and Acevedo3 suggested a modified genetic algorithm technique, coupled with a sequential simulator. Fraga and Žilinskas4 used a natural hybrid evolutionary/local search optimization method for the optimal design of heat-integrated distillation sequences. Another common method of synthesis is mixed-integer nonlinear programming (MINLP).
MINLP makes it possible to execute the synthesis and the system optimization simultaneously.5 The method has three steps: (a) build a superstructure; (b) generate the MINLP model of the superstructure; and (c) find the optimal structure and operation, using the proper tool. There are two main difficulties when using MINLP: (a) generating an accurate MINLP model is a complicated task, and (b) MINLP algorithms provide a global optimum only in the case of a convex search space. In regard to generating an accurate MINLP model, related papers usually report a new MINLP model and superstructure, according to the problem under consideration, but the development of all of these superstructures requires considerable engineering experience. Until now, only one automatic combinatorial method, that given by Friedler et al.,6,7 has been reported to generate the superstructure. However, this method is difficult to use for cascade systems. In regard to the MINLP algorithms, the distillation column design models include strongly nonconvex functions; therefore, finding a global optimum is not ensured. In such cases, the result is dependent on the initial point of calculations.

To overcome these difficulties, experience must be used in solving a new problem. Case-based reasoning (CBR) is an excellent tool for the reuse of previously acquired experience. In the CBR methodology, the case most similar to an actual problem is retrieved from a case library, and the solution of this case is used to solve the actual problem. Finally, the solution of the problem is stored in the case library for future use.8,9 The objective of this paper is to present a case-based reasoning method that, for a new distillation problem, provides a proper MINLP model with superstructure and gives an initial state for the design of a distillation column or distillation sequence. The creation of a case library of existing MINLP models and results is also considered.

* To whom correspondence should be addressed. Tel: +358 5 621 2139. Fax: +358 5 621 2199. E-mail address: [email protected]. † Budapest University of Technology and Economics. ‡ Lappeenranta University of Technology.

2. Case-Based Reasoning

CBR imitates human reasoning and tries to solve new problems through the reuse of solutions that were applied to similar past problems. CBR involves data from previous situations and reuses results and experience to fit a new problem situation. The central notion of CBR is a case. The main role of a case is to describe and remember a single event from the past where a problem was solved. A case is composed of two components: the problem and the solution. Typically, the problem description consists of a set of attributes and their values. Many cases are collected in a set to build a case library (the case base). The library of cases must roughly cover the set of problems that may arise in the considered domain of application. The main phases of the CBR activities can typically be described as a cyclic process (see Figure 1).
During the first step (retrieval), a new problem (target case) is matched against problems of the previous cases (source cases) by calculating the similarity function, and the most similar problem and its stored solution are determined. If the proposed solution does not meet the necessary requirements of the actual situation, the next step (adaptation) is necessary and a new solution is created. The obtained solution might be validated by external rules or deemed by a human to be appropriate (this step is validation). The approved solution and the new problem are combined to

10.1021/ie0500265 CCC: $33.50 © 2006 American Chemical Society Published on Web 02/17/2006


Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006

Table 1. Stored Cases in the Library

| case | mixture | sharp separation? | heat integration? | reference |
|---|---|---|---|---|
| 1 | propane; iso-butane; n-butane | no | no | Example 1, Aggarwal and Floudas10 |
| 2 | propane; iso-butane; n-butane | no | yes | Example 1, Aggarwal and Floudas10 |
| 3 | n-butane; n-pentane; n-hexane; n-heptane | no | yes | Example 2, Aggarwal and Floudas10 |
| 4 | benzene; toluene; o-xylene | yes | no | Example MF1, Viswanathan and Grossmann11 |
| 5 | n-hexane; n-heptane; n-nonane | yes | no | Example MF2, Viswanathan and Grossmann11 |
| 6 | acetone; acetonitrile; water | yes | no | Example MF3, Viswanathan and Grossmann11 |
| 7 | methanol; water | yes | no | Example MF5, Viswanathan and Grossmann11 |
| 8 | benzene; toluene; o-xylene | yes | no | Example Ternary 1, Viswanathan and Grossmann12 |
| 9 | benzene; toluene; o-xylene | yes | no | Example Ternary 2, Viswanathan and Grossmann12 |
| 10 | acetone; acetonitrile; water | yes | no | Example Unit, Viswanathan and Grossmann12 |
| 11 | benzene; toluene; o-xylene; diphenyl | yes | no | Example 1, Novak et al.13 |
| 12 | benzene; toluene; o-xylene; diphenyl | yes | yes | Example 2, Novak et al.13 |
| 13 | propane; butane; pentane; hexane | yes | no | Example 1, Yeomans and Grossmann14 |
| 14 | propane; butane; pentane; hexane | yes | yes | Example 1, Yeomans and Grossmann14 |
| 15 | propane; butane; pentane; hexane | yes | no | Example 1, Yeomans and Grossmann14 |
| 16 | propane; n-butane; n-pentane; n-hexane | yes | no | Example 2, Caballero and Grossmann15 |
| 17 | propane; n-butane; n-pentane; n-hexane | yes | yes | Example 3, Caballero and Grossmann15 |
| 18 | methylacetylene; propane; n-butane; n-pentane; n-hexane | yes | no | Example 4, Caballero and Grossmann15 |
| 19 | methylacetylene; propane; n-butane; n-pentane; n-hexane | yes | yes | Example 5, Caballero and Grossmann15 |
| 20 | benzene; toluene | yes | no | Example 1, Yeomans and Grossmann16 |
| 21 | benzene; toluene | yes | no | Example 3, Yeomans and Grossmann16 |
| 22 | n-butane; n-pentane; n-hexane | yes | no | Example 4, Yeomans and Grossmann16 |
| 23 | benzene; toluene; o-xylene | yes | no | Example 5, Yeomans and Grossmann16 |
| 24 | n-pentane; n-hexane; n-heptane | yes | thermally linked | Example 5.1, Yeomans and Grossmann17 |
| 25 | benzene; toluene; o-xylene | yes | thermally linked | Example 1, Caballero and Grossmann18 |
| 26 | n-pentane; n-hexane; n-heptane; n-octane; n-nonane | yes | thermally linked | Example 2, Caballero and Grossmann18 |

build a new case that is incorporated in the case library during the learning step. In this way, the CBR system evolves as the capability of the system is improved by extending the stored experience. One of the most important parts of the CBR cycle is the retrieval. During retrieval, the attributes of the target case are compared with those of the source cases to find the most similar case. There are two widely used retrieval techniques:9 nearest-neighbor retrieval and inductive retrieval. The nearest-neighbor retrieval technique simply calculates the differences of the attributes, multiplied by a weighting factor. In inductive retrieval, a decision tree is produced, which classifies the cases. The nodes of the tree hold classification questions about the main attributes; by answering these questions, the most similar case is determined.
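The inductive step described above can be pictured with a very small sketch. It uses a handful of rows copied from Table 1 and only the two yes/no style attributes listed there; a dictionary keyed on the attribute values stands in for the decision tree, and the function names `build_clusters` and `retrieve_cluster` are hypothetical, not taken from the paper.

```python
from collections import defaultdict

# A few rows of Table 1: (case number, mixture, sharp separation?, heat integration?)
CASES = [
    (1, "propane; iso-butane; n-butane", "no", "no"),
    (2, "propane; iso-butane; n-butane", "no", "yes"),
    (4, "benzene; toluene; o-xylene", "yes", "no"),
    (7, "methanol; water", "yes", "no"),
    (12, "benzene; toluene; o-xylene; diphenyl", "yes", "yes"),
    (24, "n-pentane; n-hexane; n-heptane", "yes", "thermally linked"),
]


def build_clusters(cases):
    """Group case numbers by the answers to the two classification questions."""
    clusters = defaultdict(list)
    for number, _mixture, sharp, heat_integration in cases:
        clusters[(sharp, heat_integration)].append(number)
    return dict(clusters)


def retrieve_cluster(clusters, sharp, heat_integration):
    """Return the cases that share the target's classification attributes."""
    return clusters.get((sharp, heat_integration), [])


clusters = build_clusters(CASES)
matching = retrieve_cluster(clusters, sharp="yes", heat_integration="no")
print(matching)  # [4, 7]
```

The cases returned by such a filter would then be ranked by a nearest-neighbor similarity measure, which is the second technique named above.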

3. Problem Statement

The problem addressed in this paper is as follows. An ideal mixture of components is given that is to be separated, by distillation, into several products of specified composition. The goal is to propose a proper MINLP model with a superstructure to synthesize a sequence of distillation columns. The superstructure must include an initial structure for the design optimization. As a realization of the database of previously solved problems, a case library is created, and an efficient retrieval method, which identifies the case most similar to the actual problem, is developed. Using CBR for a distillation column synthesis problem is a relatively novel approach. To simplify the study of the applicability of CBR, only ideal mixture separation cases are considered. However, in our opinion, the method can be extended to cases of azeotropic mixtures as well, if appropriate data are available. The library of cases is built from detailed distillation examples with reproducible MINLP models that have been published in other papers.10-18 The case library contains 26 cases of separation of ideal mixtures of up to five components. The descriptions of the stored cases are given in Table 1. The retrieval of the cases most similar to a new problem consists of an analysis of the characteristics of the feed, the required products, and the operational parameters, and their comparison with the analogous properties of the problem under consideration. As a solution, the MINLP models with a superstructure of the most-similar cases are suggested and their optimal flowsheets are given, which can be used as an initial point for optimization.

4. Implementation of Case-Based Reasoning Method

Figure 1. Case-based reasoning (CBR) cyclic process.

The computer implementation of CBR is composed of three main parts: (1) case library, (2) retrieval method, and (3) adaptation.

4.1. Case Representation. According to the stated goal, a case must contain an applicable MINLP model with a superstructure,


Figure 2. Example of the flowsheet. (Figure 5 in Yeomans and Grossmann.14 Copyright 1999, Elsevier.)

Figure 3. Graphical representation of the flowsheet.

which can be used in design and optimization to determine the optimal structure and operational parameters. The case description is supplemented by a particular solution that can be used as an initial approximation in finding the optimal solution.

4.1.1. MINLP Model with Superstructure. The case library includes only cases with reproducible MINLP models. The representation of a model involves a superstructure, the set of variables and parameters, the mass and enthalpy balances, and other constraints. However, usually only the superstructure, the variables, and the main equations are detailed in the source articles; e.g., the equilibrium models and the basic mass balances are not represented. The articles contain hints and notes that can be helpful when using a model. To provide the instructions for using the MINLP model, the original articles have been included in the case library as PDF files.

4.1.2. Solution of Source Cases. Usually, the search space and the equations of MINLP models are strongly nonconvex, so the identified optimum is strongly dependent on the starting point of the calculations. The solution of a similar problem is given as an initial solution in optimization, to greatly increase the probability of finding the global optimum. The articles usually report a flowsheet supplemented with a dataset as the solution of a problem. The case library contains the flowsheets and their mathematical representations. A flowsheet is represented as a graph. An example of the graph representation of the flowsheet (shown in Figure 2, taken from Yeomans and Grossmann14) is shown in Figure 3. In this graph, the nodes are the feed (F1), the distillation columns (C1, C2, ...), the heat exchangers (condensers: Con1, ...; and reboilers: Reb1, ...), the mixers/splitters (MS1, MS2, ...) and the products (P1, P2, ...); the edges are the flows between the units. This graph can be represented in matrix form (using a node-node matrix; see Table 2). In this matrix, $a_{ij} = 1$ if there is a connection from node i to node j; otherwise, $a_{ij} = 0$. Many flows are supplemented with attributes such as temperature, flow rate, and composition. Such flows are noted by captions (e.g., S1, S6b) in the graph. These flows are represented in a separate edge-node matrix (Table 3), which contains the starting and ending nodes of the flows. In the graphical representation, only simple columns are used, with a maximum of three inputs and two outputs. In the case of thermally coupled flowsheets, a possible rearrangement of the complex columns is used (see details in Example 1). If two flows between two columns have opposite directions, the pair of flows is called “thermally coupled”. The thermally coupled complex columns are represented as being composed of two parts: upper and lower separate columns. The solution is represented by the graph (depicted in Figure 3), the node-node matrix (Table 2), and the edge-node matrix (Table 3), as well as the detailed data of units and flows, such as distillation columns (e.g., the number of trays, the diameter (given in meters), input/output trays, pressure (bar), and reflux ratio), heat exchangers (e.g., area (given in square meters), heat flow rate (given in megawatts), and utility), and flows (e.g., temperature (given in kelvin), flow rate (kmol/h), the set of components, and the mole fraction of components). In the case of heat-integrated columns, the flows pass through heat exchangers. The heat exchanger changes the temperature and physical conditions of the flow. However, the rate of temperature change is unknown. Therefore, these flows are marked with the same number and distinguished with small letters (e.g., S2a, S2b, ...); however, only the data of the flow before the heat exchanger are reported.

4.2. Case Retrieval. During retrieval, an actual problem is matched against previous ones from the case library, and the most-similar problem is retrieved. The solution of the retrieved problem is used next in optimization. The actual problem, which must be solved, is the target case; the solved problems, with their solutions, are the source cases. First, the case base is analyzed using the induction method to classify the cases. The class of cases corresponding to the actual problem forms the retrieved set of cases. Next, the cases in the set are ranked according to their similarities to the target case, using the nearest-neighbor method.

4.2.1. Inductive Retrieval. Using a set of classification attributes, the cases are grouped into clusters.
The clusters are characterized by the following values of attributes: (a) Separation: sharp and nonsharp. (b) Heat integration: structure without heat integration, structure with heat integration, and thermally coupled structure. (In the single-column configuration, only a non-heat-integrated structure is possible.) (c) Number of products: the number of products can be 2 or 3-5. (This attribute is considered because the model for the single-column configuration does not include the mass balances for the connection of distillation columns; thus, such a model


Table 2. Node-Node Matrix

F1 C1 C2 C3 Con1 Con2 Reb1 Reb2 Reb3 MS1 MS2 MS3 MS4 MS5 MS6 P1 P2 P3 P4

F1

C1

C2

C3

Con1

Con2

Reb1

Reb2

Reb3

MS1

MS2

MS3

MS4

MS5

MS6

P1

P2

P3

P4

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0

0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0

0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0

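Table 3. Edge-Node Matrix

| | F1 | S1 | S2a | S2b | S3 | S4 | S5 | S6a | S6b | S7 | S8 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| start | F1 | MS1 | MS2 | Reb1 | MS2 | MS3 | MS4 | C3 | Reb1 | MS5 | MS6 |
| end | C1 | P1 | Reb1 | C1 | C2 | P2 | C3 | Reb1 | MS5 | P3 | P4 |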
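As a rough illustration of the two matrix representations, the sketch below enters the labeled flows of Table 3 into a node-node matrix using the convention that an entry is 1 when there is a connection from the row node to the column node. Only the labeled flows are entered, so the result contains a subset of the ones in Table 2; the helper names `node_node_matrix` and `successors` are hypothetical.

```python
NODES = ["F1", "C1", "C2", "C3", "Con1", "Con2", "Reb1", "Reb2", "Reb3",
         "MS1", "MS2", "MS3", "MS4", "MS5", "MS6", "P1", "P2", "P3", "P4"]

# Edge-node data of Table 3: flow label -> (start node, end node)
FLOWS = {
    "F1": ("F1", "C1"), "S1": ("MS1", "P1"), "S2a": ("MS2", "Reb1"),
    "S2b": ("Reb1", "C1"), "S3": ("MS2", "C2"), "S4": ("MS3", "P2"),
    "S5": ("MS4", "C3"), "S6a": ("C3", "Reb1"), "S6b": ("Reb1", "MS5"),
    "S7": ("MS5", "P3"), "S8": ("MS6", "P4"),
}


def node_node_matrix(nodes, connections):
    """Build a square 0/1 matrix with a 1 at [start][end] for every connection."""
    index = {name: position for position, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for start, end in connections:
        matrix[index[start]][index[end]] = 1
    return matrix


def successors(matrix, nodes, start):
    """List the nodes reached directly from `start` (the ones in its row)."""
    row = matrix[nodes.index(start)]
    return [name for name, flag in zip(nodes, row) if flag]


A = node_node_matrix(NODES, FLOWS.values())
print(successors(A, NODES, "Reb1"))  # ['C1', 'MS5']
```

Keeping the flow labels in a separate mapping mirrors the split between the two tables: the matrix records only connectivity, while the labels carry the stream data.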

cannot be used for a problem involving the separation of three or more products.) (d) Feed type: single and multiple. (This attribute is considered because of the dissimilarity between the MINLP models with single and multiple feeds.) The cluster of cases corresponding to the actual problem (which has the same values of classification attributes) is retrieved for the next step, the nearest-neighbor method.

4.2.2. Nearest-Neighborhood Retrieval. The nearest-neighborhood method is used to calculate the similarity between the target case and the source cases from the set of similar cases retrieved using the inductive method. The evaluation of the global similarity between the target case and a source case is based on the computation of the local similarities. Each local similarity involves a single attribute and takes a value from the interval [0;1]. The global similarity can be derived from the local similarities as

$$\mathrm{SIM}(T,S) = \frac{\sum_{i=1}^{k} w_i \,\mathrm{sim}_i}{\sum_{i=1}^{k} w_i} \tag{1}$$

where $w_i$ is the weight of importance of attribute i; $\mathrm{sim}_i$ the local similarity between the values of attribute i obtained from the target case (T) and the source case (S), and $k$ the number of attributes. The weights of importance take integer values from 1 to 10, according to the actual requirements, where a weight value of 10 marks the most important attribute. The similarity between component sets is very important and must be applied first. It must be determined which component in the source case corresponds to a certain component in the target case. In the simplest case, the sets of components of the target case and the source case are identical. Otherwise, the most similar sequence of components must be determined, and identical components do not necessarily form the corresponding pairs. For instance, the components set of the target case (according to Yeomans and Grossmann16) is n-butane, n-pentane, and

n-hexane. The components set of the source case (according to Yeomans and Grossmann17) is n-pentane, n-hexane, and n-heptane. The n-pentane and n-hexane components are present in both cases, and it seems natural to assign them to each other in the target case and in the source case. The third pair of components then is n-butane (the target case) and n-heptane (the source case). However, there is a problem with this assignment: n-butane is the most volatile component in the target case, whereas n-heptane, its pair in the source case, is the least volatile component. Thus, the solution of the source case cannot be used for the solution of the target case. To overcome these difficulties, during the matching of the components, the primary criterion is the volatility order of the components, and the secondary criterion is the nature of the components. The component pairs in the previous example are n-butane-n-pentane, n-pentane-n-hexane, and n-hexane-n-heptane. In this case, the solution of the source case can be used to solve the target case. To calculate the similarity, five attributes are used: components, boiling points of components, molar masses of components, feed, and product composition (mole fraction).

4.2.2.1. Components. This is a non-numeric attribute. The similarity of components is based on their chemical structure. A similarity tree, which includes all components in the case library (Figure 4), has been built. In the similarity tree, the nodes represent the basic groups of chemical components. To each component group, a numeric similarity value was assigned. The similarity value of two components is the value of the nearest common node in the tree. For example, when comparing n-butane and methanol, the nearest common node is the “organic” node; therefore, the similarity value is 0.2. The more similar the components, the greater the similarity value between them. For identical components, the similarity value is 1.
It may happen that cases with different numbers of products are compared. In such cases, there are components in one set that have no corresponding components in the other set. For these matchless components, the nearest common node is the “components” node; therefore, the similarity value is 0 (see Example 2).

Figure 4. Similarity tree of components.
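The component matching and the similarity-tree lookup can be sketched as follows, using the benzene, toluene, o-xylene and methanol, water mixtures that Example 2 compares. Only the values of the “components” node (0) and the “organic” node (0.2) come from the text; the intermediate groups “aromatic” and “alcohol” and their value of 0.8 are hypothetical placeholders, because Figure 4 is not reproduced here. Pairing the sorted lists position by position and leaving the surplus component unpaired is likewise an assumption that happens to reproduce the pairs named in the text.

```python
from itertools import zip_longest

# Child -> parent links of a small similarity tree (intermediate groups are hypothetical).
PARENT = {
    "benzene": "aromatic", "toluene": "aromatic", "o-xylene": "aromatic",
    "methanol": "alcohol", "water": "components",
    "aromatic": "organic", "alcohol": "organic", "organic": "components",
}
# "components" and "organic" values are from the text; 0.8 is a placeholder.
NODE_VALUE = {"components": 0.0, "organic": 0.2, "aromatic": 0.8, "alcohol": 0.8}

# Boiling points in K (Table 9), used only to establish the volatility order.
BOILING_POINT_K = {"benzene": 353.2, "toluene": 383.8, "o-xylene": 417.6,
                   "methanol": 337.9, "water": 373.15}


def ancestors(node):
    """Return the chain of group nodes from the parent of `node` up to the root."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def tree_similarity(a, b):
    """Value of the nearest common node; 1 if identical, root value if matchless."""
    if a is None or b is None:
        return NODE_VALUE["components"]
    if a == b:
        return 1.0
    groups_of_b = set(ancestors(b))
    for group in ancestors(a):
        if group in groups_of_b:
            return NODE_VALUE[group]
    raise ValueError("components do not share a root node")


def pair_by_volatility(target, source):
    """Pair components rank by rank after sorting by boiling point."""
    by_boiling_point = BOILING_POINT_K.__getitem__
    return list(zip_longest(sorted(target, key=by_boiling_point),
                            sorted(source, key=by_boiling_point)))


pairs = pair_by_volatility(["toluene", "benzene", "o-xylene"], ["water", "methanol"])
sim_c = sum(tree_similarity(t, s) for t, s in pairs) / len(pairs)
print(pairs)
print(round(sim_c, 4))  # 0.0667
```

Dividing by the number of pairs equals dividing by the maximal number of components, because the shorter list is padded with unpaired entries.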

The local similarity of the components ($\mathrm{sim}_c$) is defined as the average of the similarity values between the components:

$$\mathrm{sim}_c = \frac{\sum_{i=1}^{n} x_{c,i}}{N_c} \tag{2}$$

where $x_{c,i}$ is the similarity value of the components from the similarity tree and $N_c$ is the maximal number of components in the compared mixtures. Because only problems that contain ideal mixtures are stored in the case library, the comparison of components based on their chemical structure is suitable. In a later phase of development, problems containing azeotropic mixtures would also be introduced to the case library, and the comparison of components can be further developed. The mixtures then will be grouped according to the type and number of azeotropes in the system, or the local similarity will be calculated based on a group contribution method.

4.2.2.2. Boiling Point and Molar Mass of Components. These attributes are numeric. In such cases, the similarity of the attributes is calculated utilizing a simple distance approach: the shorter the distance between the values of two attributes, the greater the similarity. For greater sensitivity, normalized values from the interval [0;1] are used instead of the original values. The normalized values are defined for boiling point ($t_b$) and molar mass ($m$) to be

$$t_b = \frac{T_b - T_{b,\min}}{T_{b,\max} - T_{b,\min}} \tag{3}$$

$$m = \frac{M - M_{\min}}{M_{\max} - M_{\min}} \tag{4}$$

where $T_{b,\min}$ is the smallest boiling point, $T_{b,\max}$ the highest boiling point, $M_{\min}$ the smallest molar mass, and $M_{\max}$ the greatest molar mass in the case library.

The local similarities for these attributes are defined as

$$\mathrm{sim}_t = \frac{\sum_{i=1}^{n} (1 - \Delta t_{b,i})}{N_c} \tag{5}$$

$$\mathrm{sim}_m = \frac{\sum_{i=1}^{n} (1 - \Delta m_i)}{N_c} \tag{6}$$

where $\Delta t_{b,i}$ is the difference of the normalized boiling points, $\Delta m_i$ the difference of the normalized molar masses, and $N_c$ the maximal number of components. When the compared cases have different numbers of components, the difference of boiling points ($\Delta t_{b,i}$) or molar masses ($\Delta m_i$) for a matchless component is that component’s own normalized boiling point or normalized molar mass (see Example 2, presented later in this work).

4.2.2.3. Feed and Product Compositions. These are also numeric attributes, but they are vectors. Comparison of the vector attributes determines the distance vector, $\vec{d}$:

$$\vec{T} = (t_1, t_2, \ldots, t_n); \quad \vec{S} = (s_1, s_2, \ldots, s_n); \quad t, s \in [0;1]$$

$$|\vec{d}| = \sqrt{(t_1 - s_1)^2 + (t_2 - s_2)^2 + \cdots + (t_n - s_n)^2} \tag{7a}$$

$$\vec{d} \in \mathbb{R}^n \tag{7b}$$

where $\vec{T}$ is the attribute vector of the target case and $\vec{S}$ is the attribute vector of the source case.

If the numbers of components of the compared cases are different, zero elements are added to the shorter vector, so that the compared vectors have the same number of elements (see Example 2, presented later in this work).

Because there are many product composition vectors, the difference vector and the distance are calculated for every product pair. The method is analogous to the problems with

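the multiple feeds. The local similarities of the feed compositions ($\mathrm{sim}_f$) and product compositions ($\mathrm{sim}_p$) are defined as follows:

$$\mathrm{sim}_f = 1 - \frac{1}{N_f} \sum_{j=1}^{N_f} \frac{|\vec{d}_{f,j}|}{\left|\sum_{i=1}^{n} \vec{e}_i\right|}; \quad \vec{d}_{f,j}, \vec{e}_i \in \mathbb{R}^n \tag{8}$$

$$\mathrm{sim}_p = 1 - \frac{1}{N_p} \sum_{j=1}^{N_p} \frac{|\vec{d}_{p,j}|}{\left|\sum_{i=1}^{n} \vec{e}_i\right|}; \quad \vec{d}_{p,j}, \vec{e}_i \in \mathbb{R}^n \tag{9}$$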

where $N_f$ is the maximal number of feeds, $N_p$ is the number of products, and $\vec{e}_i$ are the basis vectors in the $\mathbb{R}^n$ space (which are necessary for normalization). Other attributes can also be considered, according to the actual requirements. The calculation of similarity for other numeric or vector values is performed in the same way. Using the nearest-neighborhood method, the cases of the set retrieved by the inductive method are ranked, and the solution of the most-similar case is determined. The MINLP model with the superstructure and the optimal solution of the source case are suggested. Usually, the chosen solution must be adapted to meet the actual requirements.

4.3. Adaptation. The three most-similar cases are selected as potential solutions, and, according to the actual requirements and engineering experience, the most useful model is chosen. Because of the complexity of the distillation problems, there is no automatic adaptation of the found solution. The task of the designer is the modification of the MINLP model and the reuse of the solution of the chosen case as an initial point for design and optimization.

5. Examples

In this section, three examples are presented to illustrate the mathematical representation of a solution of a case, the retrieval method, and problem solving. Example 1 shows the mathematical representation of a thermally coupled structure; it shows how to rearrange the configuration of complex distillation columns. Example 2 illustrates the retrieval method and a comparison of a target case and a source case. The nearest-neighborhood method is applied to cases with different numbers of components and feeds. The solving of a new problem is presented in Example 3.

5.1. Example 1. As mentioned in section 4.1.2, a flowsheet is represented as a graph. The nodes of the graph are the units (columns, condensers, reboilers, and mixers/splitters), the feeds, and the products; the edges are the streams that connect the units.
The graph also is represented with a node-node matrix and an edge-node matrix. Only simple distillation columns with a maximum of three input and two output streams can be used in the graph. In the case of a thermally coupled solution, a possible rearrangement of the flowsheet can also be represented in the graph. An example of this rearrangement is shown below. The examined flowsheet (from Yeomans and Grossmann17) is presented in Figure 5a. There are two columns: a simple column (without heat exchanger) and a complex column. The simple column meets our requirements of the mathematical graph representation. However, the complex column has three outputs; therefore, it must be split into two simple columns (see Figure 5b). Between these two columns, there are interconnection streams in both directions; however, one of these streams is broken with a splitter. The node-node matrix and the edge-node matrix of the graph are given in Tables 4 and 5, respectively.

Figure 5. (a) Flowsheet of Example 1 (Figure 5e in Yeomans and Grossmann17). (b) Graphical representation of the flowsheet of Example 1.

Table 4. Node-Node Matrix of Example 1

| from \ to | F1 | C1 | C2 | C3 | Con2 | Reb3 | MS1 | MS2 | MS3 | P1 | P2 | P3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| C2 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| C3 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Con2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Reb3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| MS1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| MS2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| MS3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| P1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| P2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| P3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Table 5. Edge-Node Matrix of Example 1

| | F1 | S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|---|---|
| start | F1 | C1 | C1 | MS1 | MS2 | MS3 |
| end | C1 | C2 | C3 | P1 | P2 | P3 |

Between the first column (C1) and the second column (C2), there are two streams (the situation is the same between the

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006 1941

first column and the third column). These streams can be regarded as outputs/inputs from or to the column, but also as a reflux in the first column. Streams of this type, which have a neighboring stream in the reverse direction, are called "thermally coupled streams". In Figure 5b, the marked streams S1 and S2 are examples of such streams (see Table 6). The S1 stream is regarded as an output from the first column and an input to the second column (see Table 7). The complex column is split into two simple columns that are connected by "thermally coupled streams". One of these streams is broken with a splitter, from which a product stream (S4) starts. This stream can be regarded as a product of both the second column and the third column. In the graphical representation, such a stream is regarded as the bottom product of the column (see Table 7). In the case of thermally coupled flowsheets, the complex columns are separated into simple columns with a maximum of two outputs. The internal streams of the earlier complex column are always regarded as outputs of the columns above the actual stream. The pairs of streams with reverse directions from one column to another are marked as "thermally coupled streams". However, these streams are regarded as normal streams.

Table 6. Units of Example 1

| parameter | column 1 | column 2 | column 3 |
|---|---|---|---|
| number of trays | 31 | 9 | 9 |
| feed | F1 | S1 | S2 |
| feed tray (a) | 21 | 4 | 8 |
| top product | S1 | S3 | |
| bottom product | S2 | S4 | S5 |
| reflux ratio | | 9.72 | |
| condenser utility | | cold utility | |
| condenser heat flow rate | | 32.167 MW | |
| reboiler utility | | | hot utility |
| reboiler heat flow rate | | | 33.907 MW |

(a) Trays are counted from bottom to top.

Table 7. Streams of Example 1

| stream | main component(s) | flow rate (kmol/h) | composition | type |
|---|---|---|---|---|
| F1 | ABC | 1000 | (0.20; 0.50; 0.30) | normal |
| S1 | ABC | 732 | | thermally coupled |
| S2 | ABC | 549 | | thermally coupled |
| S3 | A | | (0.971; -; -) | normal |
| S4 | B | | (-; 0.899; -) | normal |
| S5 | C | | (-; -; 0.901) | normal |

5.2. Example 2. To illustrate the retrieval method, only the comparison of two cases is described. The target case (from Viswanathan and Grossmann11) is a benzene-toluene-o-xylene system, and the source case (from Viswanathan and Grossmann11) is a methanol-water system. The descriptions of the cases are presented in Table 8. The target case is a sharp separation without heat integration; there are two products and multiple feeds. The source case has the same values of the classification attributes; hence, it belongs to the cluster of the target case. Before the nearest-neighborhood formula can be used, the corresponding pairs of components must be identified. The primary assumption is the volatility order of the components. The data of the components are given in Table 9. According to the boiling points, the most-similar component to benzene is methanol in the source case, and the second component pair is toluene-water. The o-xylene has no pair, because the target case and the source case contain different numbers of components. Comparing the component similarity for the benzene-methanol pair, the nearest common node in the similarity tree (Figure 4) is the "organic" node ($x_{c,1} = 0.2$), and, for the toluene-water pair, the nearest common node is the "components" node ($x_{c,2} = 0$). The o-xylene has no pair; therefore, for this component, the nearest node is the "components" node, and the similarity value is $x_{c,3} = 0$. The local similarity value of components ($sim_c$) is the average of those similarity assessments (see Table 10).

Table 9. Boiling Points and Molar Masses of the Components of Example 2

| component | boiling point, $T_b$ (K) | normalized boiling point, $t_b$ | molar mass, $M$ (g/mol) | normalized molar mass, $m$ |
|---|---|---|---|---|
| benzene | 353.2 | 0.371 | 78 | 0.472 |
| toluene | 383.8 | 0.481 | 92 | 0.583 |
| o-xylene | 417.6 | 0.603 | 106 | 0.693 |
| methanol | 337.9 | 0.316 | 32 | 0.110 |
| water | 373.15 | 0.443 | 18 | 0.000 |
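The component matching and the three component-related local values of this example can be traced with a short Python sketch. It is an illustration only, not the authors' implementation: pairing the components position by position in volatility order is a simplifying assumption, the similarity-tree values are copied from the text rather than derived from Figure 4, and all function and variable names are hypothetical.

```python
from itertools import zip_longest
from statistics import mean

# Normalized boiling point and normalized molar mass, copied from Table 9.
TARGET = {"benzene": (0.371, 0.472), "toluene": (0.481, 0.583), "o-xylene": (0.603, 0.693)}
SOURCE = {"methanol": (0.316, 0.110), "water": (0.443, 0.000)}

# Similarity-tree values of the matched pairs, copied from the text;
# the tree itself (Figure 4) is not modeled here.
TREE_SIMILARITY = {("benzene", "methanol"): 0.2, ("toluene", "water"): 0.0}


def match_by_volatility(target, source):
    """Pair components position by position after sorting by boiling point (assumption)."""
    def ordered(components):
        return sorted(components, key=lambda name: components[name][0])
    return list(zip_longest(ordered(target), ordered(source)))


def local_values(target, source, tree_similarity):
    """Return the averages over all component pairs: (sim_c, sim_t, sim_m)."""
    x_c, d_tb, d_m = [], [], []
    for t_name, s_name in match_by_volatility(target, source):
        # A component without a partner is compared with zeros, so its
        # differences equal its own normalized values.
        tb_t, m_t = target.get(t_name, (0.0, 0.0))
        tb_s, m_s = source.get(s_name, (0.0, 0.0))
        x_c.append(tree_similarity.get((t_name, s_name), 0.0))
        d_tb.append(abs(tb_t - tb_s))
        d_m.append(abs(m_t - m_s))
    return mean(x_c), mean(d_tb), mean(d_m)


sim_c, sim_t, sim_m = local_values(TARGET, SOURCE, TREE_SIMILARITY)
print(round(sim_c, 3), round(sim_t, 3), round(sim_m, 3))  # 0.067 0.232 0.546
```

With the data of Table 9, the sketch pairs benzene with methanol and toluene with water, and leaves o-xylene unmatched, as described in the text.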

Table 8. Descriptions of Target and Source Cases of Example 2

| | target case | source case |
|---|---|---|
| system | benzene-toluene-o-xylene | methanol-water |
| condenser type | total | total |
| reboiler type | kettle type | kettle type |
| estimated maximum number of trays | 40 | 60 |
| feed 1 | $F^t_1 = 50$, $x^t_{F,1} = (0.15; 0.25; 0.60)$, $p^t_{F,1} = 1.2$ bar, $T^t_{F,1} = 411.459$ K, $q^t_{F,1} = 0.1$ | $F^s_1 = 43.5$, $x^s_{F,1} = (0.15; 0.85)$, $p^s_{F,1} = 1.42$ bar, $T^s_{F,1} = 365.0$ K, $q^s_{F,1} = 0.0$ |
| feed 2 | $F^t_2 = 50$, $x^t_{F,2} = (0.55; 0.25; 0.20)$, $p^t_{F,2} = 1.2$ bar, $T^t_{F,2} = 390.387$ K, $q^t_{F,2} = 0.0$ | $F^s_2 = 29.5$, $x^s_{F,2} = (0.50; 0.50)$, $p^s_{F,2} = 4.8$ bar, $T^s_{F,2} = 392.697$ K, $q^s_{F,2} = 0.0$ |
| feed 3 | | $F^s_3 = 27.0$, $x^s_{F,3} = (0.89; 0.11)$, $p^s_{F,3} = 1.38$ bar, $T^s_{F,3} = 347.797$ K, $q^s_{F,3} = 0.0$ |
| pressure, $p_{reb}$ | 1.2 bar | 1.4475 bar |
| pressure, $p_{bot}$ | 1.20 bar | 1.44064 bar |
| pressure, $p_{top}$ | 1.10 bar | 1.0408 bar |
| pressure, $p_{con}$ | 1.05 bar | 1.0340 bar |
| purity constraint on top product | $x^t_{45,1} \geq 0.999$ | $x^s_{60,1} \geq 0.999$ |
| purity constraint on bottom product | $x^t_{1,2} + x^t_{1,3} \geq 0.999$ | $x^s_{1,2} \geq 0.999$ |
| upper bound of reflux ratio | 2 | 20 |


Table 10. Comparison of Boiling Point and Molar Mass of Components of Example 2

| component pair | $x_{c,i}$ | normalized boiling point difference, $\Delta t_{b,i}$ | normalized molar mass difference, $\Delta m_i$ |
|---|---|---|---|
| benzene-methanol | 0.2 | 0.055 | 0.362 |
| toluene-water | 0 | 0.038 | 0.583 |
| o-xylene | 0 | 0.603 | 0.693 |
| local similarity values | $sim_c = 0.067$ | $sim_t = 0.232$ | $sim_m = 0.546$ |

Table 11. Weights of Importance for Example 2

| attribute | weight |
|---|---|
| quality of components | 10 |
| composition of required products | 7 |
| boiling point of components | 4 |
| composition of feeds | 3 |
| molar mass of components | 1 |

To determine the local similarity values of boiling points and molar masses, first, the differences of the component pairs are determined. In the case of o-xylene, which has no pair in the source case, the differences are its own normalized values ($\Delta t_{b,i} = t_{b,i}$; $\Delta m_i = m_i$). The local similarity values are the averages of the differences of the component pairs (Table 10).

The purity requirements are not considered when comparing the products; instead, the concentrations of the components in the required products are taken into consideration. This is needed because, in nonsharp separations, there are no purity requirements, so the exact compositions of the products are used instead. Therefore, the required product compositions also must be given for the sharp separations. For example, in the target case, only the minimum mole fraction of benzene in the top product is known ($x^t_{45,1} \geq 0.999$); therefore, the concentration of the components in the required product is $\mathbf{x}^t_{p,1} = (0.999; 0; 0)$. This is not really the product composition, because it does not fulfill the requirement that the sum of the mole fractions is 1; instead, it is the minimal mole fraction of each component in the product. Therefore, the compared products are

$$\mathbf{x}^t_{p,1} = (0.999; 0; 0) \qquad \mathbf{x}^s_{p,1} = (0.999; 0; 0)$$
$$\mathbf{x}^t_{p,2} = (0; 0; 0) \qquad \mathbf{x}^s_{p,2} = (0; 0.999; 0) \tag{10}$$

In the second product of the target case, all the mole fractions are zero, because the constraint $x^t_{1,2} + x^t_{1,3} \geq 0.999$ leaves the individual mole fractions undetermined. A third, zero mole fraction is added to the product vectors of the source case, so that the compared vectors have the same number of elements. The products are matched by minimizing the difference between the product composition vectors. Therefore, in this case, the product pairs are $\mathbf{x}^t_{p,1}$-$\mathbf{x}^s_{p,1}$ and $\mathbf{x}^t_{p,2}$-$\mathbf{x}^s_{p,2}$. For every product pair, the difference between the product composition vectors is calculated:

$$\mathbf{x}^t_{p,1} = (0.999; 0; 0) \qquad \mathbf{x}^s_{p,1} = (0.999; 0; 0)$$
$$|\mathbf{d}_{p,1}| = \sqrt{0^2 + 0^2 + 0^2} = 0 \tag{11a}$$

$$\mathbf{x}^t_{p,2} = (0; 0; 0) \qquad \mathbf{x}^s_{p,2} = (0; 0.999; 0)$$
$$|\mathbf{d}_{p,2}| = \sqrt{0^2 + 0.999^2 + 0^2} = 0.999 \tag{11b}$$

The local similarity value of product compositions is

$$sim_p = 1 - \frac{\sum_{j=1}^{2} |\mathbf{d}_{p,j}|}{2\left|\sum_{i=1}^{3} \mathbf{e}_i\right|} = 1 - \frac{0 + 0.999}{2\sqrt{3}} \approx 0.712 \tag{12}$$

The similarity of the feed compositions is calculated in a manner analogous to that for the product compositions. To have the same number of elements in the vectors, a third, zero element is added to the feed compositions of the source case. The number of feeds in the source case is greater than that in the target case; therefore, a third feed with zero elements is considered in the target case. When matching the feeds, the primary objective also is the minimal difference; thus, the feed pairs, and the distances between the feed composition vectors, are given as

$$\mathbf{x}^t_{f,1} = (0.15; 0.25; 0.60) \qquad \mathbf{x}^s_{f,1} = (0.15; 0.85; 0)$$
$$|\mathbf{d}_{f,1}| = \sqrt{0^2 + 0.6^2 + 0.6^2} \approx 0.849 \tag{13a}$$

$$\mathbf{x}^t_{f,2} = (0.55; 0.25; 0.20) \qquad \mathbf{x}^s_{f,3} = (0.89; 0.11; 0)$$
$$|\mathbf{d}_{f,2}| = \sqrt{0.44^2 + 0.14^2 + 0.2^2} \approx 0.503 \tag{13b}$$

$$\mathbf{x}^t_{f,3} = (0; 0; 0) \qquad \mathbf{x}^s_{f,2} = (0.50; 0.50; 0)$$
$$|\mathbf{d}_{f,3}| = \sqrt{0.5^2 + 0.5^2 + 0^2} \approx 0.707 \tag{13c}$$

The local similarity value of feed compositions is

$$sim_f = 1 - \frac{\sum_{j=1}^{3} |\mathbf{d}_{f,j}|}{3\left|\sum_{i=1}^{3} \mathbf{e}_i\right|} \approx 1 - \frac{0.849 + 0.503 + 0.707}{3\sqrt{3}} \approx 0.604 \tag{14}$$
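The composition similarity of eqs 12 and 14 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the text only states that the vectors are matched by minimizing the difference, so the exhaustive search over all pairings used here is an assumption (practical only for a handful of streams), and the function name `composition_similarity` is hypothetical. Shorter vectors and missing streams are padded with zeros, as described in the text.

```python
from itertools import permutations
from math import dist, sqrt


def composition_similarity(target_vectors, source_vectors):
    """Return 1 - (sum of matched distances) / (n_streams * sqrt(n_components))."""
    all_vectors = list(target_vectors) + list(source_vectors)
    n_components = max(len(v) for v in all_vectors)
    n_streams = max(len(target_vectors), len(source_vectors))
    zero = (0.0,) * n_components

    def padded(vectors):
        # Pad each vector with zero elements, then pad the list with zero vectors.
        rows = [tuple(v) + (0.0,) * (n_components - len(v)) for v in vectors]
        return rows + [zero] * (n_streams - len(rows))

    targets, sources = padded(target_vectors), padded(source_vectors)
    # Assumption: choose the pairing with the smallest total Euclidean distance.
    best_total = min(
        sum(dist(t, s) for t, s in zip(targets, candidate))
        for candidate in permutations(sources)
    )
    # |sum of unit vectors e_i| equals sqrt(n_components).
    return 1.0 - best_total / (n_streams * sqrt(n_components))


# Required product compositions of Example 2 (eq 10).
target_products = [(0.999, 0.0, 0.0), (0.0, 0.0, 0.0)]
source_products = [(0.999, 0.0), (0.0, 0.999)]
print(round(composition_similarity(target_products, source_products), 3))  # 0.712
```

The same function can be applied to the feed composition vectors, where the zero-vector padding supplies the missing third feed of the target case.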

The weights of importance for local similarities must be assigned before eq 1 can be used. The weights have values from 1 to 10. The more important an attribute, the greater the value of the weight. A possible set of weights is shown in Table 11. The global similarity is calculated as follows:

$$SIM(T,S) = \frac{w_c\,sim_c + w_t\,sim_t + w_m\,sim_m + w_p\,sim_p + w_f\,sim_f}{\sum_{i=1}^{5} w_i} = \frac{(10 \times 0.067) + (4 \times 0.232) + (1 \times 0.546) + (7 \times 0.712) + (3 \times 0.604)}{25} = 0.467 \tag{15}$$

In the above example, the comparison of two cases has been presented. First, the components were matched according to the volatility order and the boiling points. The local similarity of the components was calculated by applying the values obtained from the similarity tree. The differences in the boiling points and the molar masses of the paired components were calculated; for the unmatched component, the differences were its normalized values. During the determination of the local similarities of products and feeds, a third, zero element was added to the source case compositions, to have the same number of elements in the vectors; a third feed, the zero vector $\mathbf{0}$, was added to the target case, to have the same number of feeds. A set of importance weights was assessed, and the global similarity was calculated.

5.3. Example 3. The third example presents the application of the method to a new distillation problem. Consider a heptane-toluene mixture. The flow rate of the equimolar [0.5, 0.5] feed is 100 kmol/h. The target is to separate the mixture into pure components, with a 95% purity requirement at the top and at the bottom. It is a sharp separation problem, and a single-column configuration should be used, which means that the searched structure is not heat-integrated. There are one feed and two products. Applying the inductive retrieval, a set composed of four source cases has been determined in the case library. Next, the global similarity is calculated between the target case and each of the source cases, using the nearest-neighborhood method. For this calculation, the required product compositions of the target case are [0.95, 0] at the top and [0, 0.95] at the bottom. When required, a zero element is added to the composition vector. According to the nearest-neighborhood method (see Table 12), the most-similar case is a benzene-toluene problem (from Yeomans and Grossmann16). However, to choose the most suitable superstructure and MINLP model, the three most-similar cases are considered (cases 4, 2, and 1 in Table 12). In the given example, source case 1 and source case 2 have the same MINLP model (see Viswanathan and Grossmann12). They differ in regard to the initial point during optimization. Therefore, the adaptation of source case 3 is not studied here.

The models must be adapted according to the actual requirements of the target case. The adaptation has two main steps: (1) adaptation of the model and (2) adaptation of the solutions of the source cases as an initial point. The adaptation of the MINLP model is based on the assumptions of the optimization procedure. The column pressure is assumed to be constant; therefore, the equations of the pressure profile in the model of Viswanathan and Grossmann12 are omitted. A constant molar overflow is assumed; therefore, the enthalpy balances and enthalpy calculations are omitted, and other equations are used instead, which force the total vapor and liquid flows to be constant in each column section.
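Returning briefly to the retrieval step of this example, the ranking reported in Table 12 can be checked with a few lines of Python. The sketch assumes that the weights of Table 11 were also used in Example 3, which the text does not state explicitly; the dictionary keys and the function name `global_similarity` are hypothetical. The local similarity values are copied from Table 12.

```python
# Weights of Table 11 (assumed to apply to Example 3 as well).
WEIGHTS = {"c": 10, "t": 4, "m": 1, "p": 7, "f": 3}

# Local similarity values of the four source cases, copied from Table 12.
CASES = {
    1: {"c": 0.400, "t": 0.777, "m": 0.711, "p": 0.329, "f": 0.822},
    2: {"c": 0.400, "t": 0.777, "m": 0.711, "p": 0.713, "f": 0.822},
    3: {"c": 0.133, "t": 0.767, "m": 0.756, "p": 0.714, "f": 0.714},
    4: {"c": 0.600, "t": 0.967, "m": 0.913, "p": 0.650, "f": 0.833},
}


def global_similarity(local, weights):
    """Weighted average of the local similarities, in the form of eq 15."""
    return sum(weights[key] * local[key] for key in weights) / sum(weights.values())


# Source cases ordered from most similar to least similar.
ranking = sorted(CASES, key=lambda case: global_similarity(CASES[case], WEIGHTS), reverse=True)
for case in ranking:
    print(case, round(global_similarity(CASES[case], WEIGHTS), 3))
```

Under this assumption, the computed values agree with the SIM row of Table 12 to within the rounding of the tabulated inputs, and the resulting order of the source cases is 4, 2, 1, 3.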
The fugacities are calculated according to the modified Raoult-Dalton equation:

$$f_i^V = P y_i \tag{16}$$

$$f_i^L = \gamma_i x_i p_i^0(T) \tag{17}$$

where, for component $i$, $f_i^V$ is the vapor fugacity (given in pascals), $P$ the column pressure (given in pascals), $y_i$ the vapor mole fraction, $f_i^L$ the liquid fugacity (given in pascals), $\gamma_i$ the activity coefficient, $x_i$ the liquid mole fraction, and $p_i^0(T)$ the vapor pressure (given in pascals). The vapor pressure ($p_i^0(T)$) is calculated using the Antoine equation, the applied model parameters of which are collected in Table 13.19 The heptane-toluene mixture is assumed to be ideal; therefore, the activity coefficients $\gamma_i$ are equal to 1. The heptane-toluene mixture has a lower relative volatility than the mixtures of the source cases; therefore, the maximum number of trays in the column is increased to 80.
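As an illustration of eqs 16 and 17, the following Python sketch evaluates the Antoine equation with the constants of Table 13 and derives the ideal-mixture relative volatility. Two assumptions are made that the text does not spell out: the constants are taken to be in the common form $\log_{10} p^0 = A - B/(C + T)$ with $p^0$ in mmHg and $T$ in °C, and phase equilibrium is expressed as $f_i^V = f_i^L$. The function names are hypothetical.

```python
# Antoine constants (A, B, C) copied from Table 13.
# Assumption: log10(p0 / mmHg) = A - B / (C + T / degC).
ANTOINE = {
    "heptane": (6.89386, 1264.370, 216.640),
    "toluene": (6.95087, 1342.31, 219.187),
}
MMHG_TO_PA = 133.322  # conversion factor, since eqs 16 and 17 use pascals


def vapor_pressure_pa(component, temp_c):
    """Pure-component vapor pressure p_i0(T) in pascals."""
    a, b, c = ANTOINE[component]
    return MMHG_TO_PA * 10.0 ** (a - b / (c + temp_c))


def vapor_mole_fraction(component, x_liquid, temp_c, pressure_pa, gamma=1.0):
    """Solve P * y_i = gamma_i * x_i * p_i0(T) for y_i (assumes f_i^V = f_i^L)."""
    return gamma * x_liquid * vapor_pressure_pa(component, temp_c) / pressure_pa


def relative_volatility(temp_c):
    """Heptane/toluene relative volatility of the ideal mixture (gamma_i = 1)."""
    return vapor_pressure_pa("heptane", temp_c) / vapor_pressure_pa("toluene", temp_c)


print(relative_volatility(100.0))
```

A quick plausibility check of the unit assumption is that the heptane vapor pressure evaluated near its normal boiling point (about 98.4 °C) comes out close to 760 mmHg.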

According to our earlier experience, the numerical characteristics of this type of model can be improved by adding monotonicity constraints to the model. Therefore, concentration and temperature monotonicity constraints are added to the MINLP models; these constraints do not restrict the generality of the models. The cost function is also modified according to the actual requirements. The following cost function is applied:20

$$cost = \beta_{tax}\left(C_{LPS}\,\Delta H_{vap} + C_{CW}\,\Delta H_{cond}\right)F_V + \frac{f(N,D)}{t_{pb}} \tag{18}$$

$$f(N,D) = 12.3\left[615 + 324D^2 + 486(6 + 0.76N)D\right] + 245N\left(0.7 + 1.5D^2\right) \tag{19}$$

where $\beta_{tax}$ is the tax factor ($\beta_{tax} = 1.18$), $C_{LPS}$ the cost of the low-pressure steam ($C_{LPS} = 3.54 \times 10^{-4}$ \$/kJ), $C_{CW}$ the cost of the cooling water ($C_{CW} = 2 \times 10^{-7}$ \$/kJ), $\Delta H_{vap}$ the latent heat of vaporization ($\Delta H_{vap} = 33\,773$ kJ/kmol), $\Delta H_{cond}$ the latent heat of condensation ($\Delta H_{cond} = 31\,828$ kJ/kmol), $F_V$ the vapor flow rate at the bottom (given in kmol/yr), $t_{pb}$ the payback period ($t_{pb} = 15$ yr), $N$ the number of trays in the column, and $D$ the column diameter (given in meters). The column diameter is calculated from the cross section of the column, which is determined by the F-factor method21 (eq 20).
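A direct transcription of eqs 18 and 19 into Python may help when experimenting with the trade-off between the number of trays, the column diameter, and the vapor flow. The sketch below is illustrative only: the constants are copied from the text as printed, the function and argument names are hypothetical, and the diameter is taken as an input rather than computed from the F-factor relation.

```python
# Constants copied from the text as printed.
BETA_TAX = 1.18      # tax factor
C_LPS = 3.54e-4      # cost of low-pressure steam, $/kJ
C_CW = 2.0e-7        # cost of cooling water, $/kJ
DH_VAP = 33773.0     # latent heat of vaporization, kJ/kmol
DH_COND = 31828.0    # latent heat of condensation, kJ/kmol
T_PB = 15.0          # payback period, yr


def capital_cost(n_trays, diameter_m):
    """Investment term f(N, D) of eq 19."""
    shell = 12.3 * (615 + 324 * diameter_m**2 + 486 * (6 + 0.76 * n_trays) * diameter_m)
    trays = 245 * n_trays * (0.7 + 1.5 * diameter_m**2)
    return shell + trays


def annual_cost(vapor_flow_kmol_per_yr, n_trays, diameter_m):
    """Objective of eq 18: utility cost plus annualized investment."""
    utilities = BETA_TAX * (C_LPS * DH_VAP + C_CW * DH_COND) * vapor_flow_kmol_per_yr
    return utilities + capital_cost(n_trays, diameter_m) / T_PB


# Example call with arbitrary illustrative inputs (not values from the paper).
print(annual_cost(vapor_flow_kmol_per_yr=1.0e5, n_trays=50, diameter_m=1.0))
```

Keeping the investment term in its own function makes it easy to inspect how it grows with $N$ and $D$ independently of the utility term, which depends only on the vapor flow.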

$$A = \frac{q_{Vm}}{F_{max}\sqrt{\rho_V}} \tag{20}$$

where $A$ is the cross section of the column (given in square meters), $q_{Vm}$ the mass flow rate of the vapor (given in units of kg/s), $F_{max}$ the F-factor ($F_{max} = 2.2$ Pa$^{1/2}$), and $\rho_V$ the density of the vapor (given in units of kg/m$^3$).

The solution of a source case is used to give the initial state in design and optimization. The number of trays in the solution of the most-similar source case16 is 55, the reflux ratio is 1.77, and the column diameter is 0.56 m (equivalent to 1.84 ft). Because the feed of the target case differs from the feed of the source case (100 kmol/h instead of 150 kmol/h), the values of these quantities must be modified in the initial state. Because of the lower relative volatility of the mixture in the target case, the reflux ratio and the column diameter are increased (to 3.54 and 1.12 m, respectively), using the same number of trays (55) in the initial state. An initial column profile is calculated by dividing the mole fraction interval between the compositions of the distillate and the bottom product into the same number of intervals as the number of trays. The initial temperature profile of the column is calculated similarly. The initial values of all other variables are calculated from these initial values using the model equations.

The solution of the second-most-similar source case12 contains the following values: the number of trays is 25, the reflux ratio is 9.01, the flow rate of the distillate is 15 kmol/h, and the flow rate of the bottom product is 85 kmol/h. In this solution, a low number of trays is used with a very high reflux ratio; therefore, in the initial state of the new problem, the number of trays is doubled (to 50), and the reflux ratio is reduced to 4.50, compared with the solution of the source case. The purity requirements on the main component are the same for the distillate and for the bottom product; therefore, the initial flow rates of the distillate and the bottom product are the same: 50 kmol/h. An initial column profile for concentration and temperature is calculated using the same method as in the first case.

The adapted MINLP model is solved by GAMS DICOPT++ (from Brooke et al.22) on a Sun Sparc Station. The results of both MINLP models are presented in Table 14. The results are checked with a ChemCAD 5.2 simulator, fixing the parameters given in Table 14. For the solution of the model of Yeomans and Grossmann,16 the purity of the distillate is 0.9408 and that of the bottom product is 0.9444. For the solution of the model of Viswanathan and Grossmann,12 the purity of both products is 0.9407. It can be seen that, in this case, the MINLP model of the most-similar source case found a better solution than the model of the second-most-similar source case.

Table 12. Nearest-Neighbor Retrieval (Example 3)

| | source case 1 | source case 2 | source case 3 | source case 4 |
|---|---|---|---|---|
| problem published originally | Example Ternary 1 from ref 12 | Example Ternary 2 from ref 12 | Example Unit from ref 12 | Example 1 from ref 16 |
| system | benzene-toluene-o-xylene | benzene-toluene-o-xylene | acetone-acetonitrile-water | benzene-toluene |
| $sim_c$ | 0.400 | 0.400 | 0.133 | 0.600 |
| $sim_t$ | 0.777 | 0.777 | 0.767 | 0.967 |
| $sim_m$ | 0.711 | 0.711 | 0.756 | 0.913 |
| $sim_p$ | 0.329 | 0.713 | 0.714 | 0.650 |
| $sim_f$ | 0.822 | 0.822 | 0.714 | 0.833 |
| SIM | 0.503 | 0.611 | 0.492 | 0.713 |

Table 13. Constants in the Antoine Equation

| component | $A_i$ | $B_i$ | $C_i$ |
|---|---|---|---|
| heptane | 6.89386 | 1264.370 | 216.640 |
| toluene | 6.95087 | 1342.31 | 219.187 |

Table 14. Results of Example 3

| | Yeomans and Grossmann16 model | Viswanathan and Grossmann12 model |
|---|---|---|
| objective | $3 066 973/yr | $3 121 234/yr |
| number of trays | 63 | 73 |
| feed location (from bottom) | 30 | 38 |
| column diameter | 1.51 m | 1.50 m |
| reflux ratio | 4.22 | 4.18 |

6. Summary

The paper presents an application of the case-based reasoning (CBR) method for finding a mixed-integer nonlinear programming (MINLP) model with a superstructure and a solution of the corresponding distillation synthesis problem by suggesting an initial point for performing design and optimization of the system. The cases in the case library are earlier published distillation problems with reproducible MINLP models. Each case contains a problem description and the mathematical representation of its solution. When solving a new problem (the target case), the case most similar to the target case is found in the case library during the retrieval process.

First, a set of matching cases is retrieved using inductive retrieval. The cases are classified according to operational attributes such as sharp/nonsharp separation, heat integration, and the number of products and feeds. The cases in the retrieved set then are ranked according to their similarity to the target case, using the nearest-neighborhood method. In the nearest-neighborhood retrieval, the similarity is calculated from the local similarities of the components, their molar masses and boiling points, and the compositions of the feeds and required products. The set of weights is chosen for the specific problem. After the retrieval, the three most-similar cases are presented to the system user. Based on the original articles describing the selected most-similar cases, the corresponding MINLP model can be generated, and the optimal solution of these cases can be used as a starting point for design and optimization.

Acknowledgment

We gratefully acknowledge the support received for this project from OTKA F046282 and the PRCH Student Science Foundation.

Literature Cited

(1) Douglas, J. M. Conceptual Design of Chemical Processes; McGraw-Hill Chemical Engineering Series; McGraw-Hill: New York, 1988.
(2) Pibouleau, L.; Floquet, P.; Domenech, S. Fuzziness and branch and bound procedures: Applications to separation sequencing. Fuzzy Sets Syst. 2000, 109, 111-127.
(3) Leboreiro, J.; Acevedo, J. Processes synthesis and design of distillation sequences using modular simulators: a genetic algorithm framework. Comput. Chem. Eng. 2004, 28, 1223-1236.
(4) Fraga, E. S.; Žilinskas, A. Evaluation of hybrid optimization methods for the optimal design of heat integrated distillation sequences. Adv. Eng. Software 2003, 34, 73-86.
(5) Duran, M. A.; Grossmann, I. E. A mixed-integer nonlinear programming approach for process systems synthesis. AIChE J. 1986, 32 (4), 592-606.
(6) Friedler, F.; Tarjan, K.; Huang, Y. W.; Fan, L. T. Graph-theoretic approach to process synthesis: axioms and theorems. Chem. Eng. Sci. 1992, 47 (8), 1973-1988.
(7) Friedler, F.; Tarjan, K.; Huang, Y. W.; Fan, L. T. Graph-theoretic approach to process synthesis: polynomial algorithm for maximal structure generation. Comput. Chem. Eng. 1993, 17 (9), 929-942.
(8) Aamodt, A.; Plaza, E. Case-Based Reasoning, Foundational Issues, Methodological Variations, and System Approaches. AI Commun. 1994, 7 (1), 39-59.
(9) Watson, I. Applying Case-Based Reasoning: Techniques for Enterprise Systems; Morgan Kaufmann Publishers: San Francisco, CA, 1997.
(10) Aggarwal, A.; Floudas, C. A. Synthesis of heat integrated nonsharp distillation sequences. Comput. Chem. Eng. 1992, 16 (2), 89-108.
(11) Viswanathan, J.; Grossmann, I. E. Optimal Feed Locations and Number of Trays for Distillation Columns with Multiple Feeds. Ind. Eng. Chem. Res. 1993, 32, 2942-2949.
(12) Viswanathan, J.; Grossmann, I. E. An alternate MINLP model for finding the number of trays required for a specified separation objective. Comput. Chem. Eng. 1993, 17 (9), 949-955.
(13) Novak, Z.; Kravanja, Z.; Grossmann, I. E. Simultaneous synthesis of distillation sequences in overall process schemes using an improved MINLP approach. Comput. Chem. Eng. 1996, 20 (12), 1425-1440.
(14) Yeomans, H.; Grossmann, I. E. Nonlinear disjunctive programming models for the synthesis of heat integrated distillation sequences. Comput. Chem. Eng. 1999, 23 (9), 1135-1151.
(15) Caballero, J. A.; Grossmann, I. E. Aggregated Model for Integrated Distillation Systems. Ind. Eng. Chem. Res. 1999, 38, 2330-2344.
(16) Yeomans, H.; Grossmann, I. E. Disjunctive Programming Models for the Optimal Design of Distillation Columns and Separation Sequences. Ind. Eng. Chem. Res. 2000, 39 (6), 1637-1648.
(17) Yeomans, H.; Grossmann, I. E. Optimal Design of Complex Distillation Columns Using Rigorous Tray-by-Tray Disjunctive Programming Models. Ind. Eng. Chem. Res. 2000, 39 (6), 4326-4335.
(18) Caballero, J. A.; Grossmann, I. E. Generalized Disjunctive Programming Model for the Optimal Synthesis of Thermally Linked Distillation Columns. Ind. Eng. Chem. Res. 2001, 40, 2260-2274.
(19) Gmehling, J.; Onken, U.; Arlt, W. VLE Data Collection; DECHEMA: Frankfurt, Germany, 1977.
(20) Luyben, M. L.; Floudas, C. A. Analyzing the interaction of design and control, Part 1: A multiobjective framework and application to binary distillation synthesis. Comput. Chem. Eng. 1994, 18 (10), 933-969.
(21) Kister, H. Z. Distillation Design; McGraw-Hill: New York, 1992.
(22) Brooke, A.; Kendrick, D.; Meeraus, A. GAMS: A User's Guide; Scientific Press: Palo Alto, CA, 1988.

Received for review January 10, 2005. Revised manuscript received October 14, 2005. Accepted November 2, 2005. IE0500265