
Ind. Eng. Chem. Res. 2003, 42, 5204-5214

Integration of Data Analysis and Design Optimization for the Systematic Generation of Equipment Portfolio

Vishal Goyal and Marianthi G. Ierapetritou*

Department of Chemical and Biochemical Engineering, Rutgers University, Piscataway, New Jersey 08854

Current techniques for process synthesis aim at creating customized process plants in which production capacity is specified to meet certain market demand. However, it is known that, in most cases, significant economic savings can be achieved if standardized module-based designs can be developed instead of customized designs. In this paper, a novel framework is presented to generate a set of modular designs that are optimal for different ranges of market data based on customer requirements through integration of data-analysis and design/synthesis-optimization stages that have been traditionally performed separately. The basic idea is to apply a clustering methodology and design/synthesis optimization iteratively, allowing repartitioning of data based on design feasibility and a new optimization search based on the current clustering of data. The proposed approach expands the boundaries of design optimization to incorporate demand data analysis reflecting customer requirements. A detailed case study of a cryogenic air separation plant with current demand data is presented to illustrate the importance and practical relevance of the proposed approach.

1. Introduction

Process synthesis has received a great deal of attention in the chemical engineering literature in the past two decades, focusing on issues related to the systematic representation of all different process alternatives, mathematical modeling, and solution procedures of the resulting large-scale MINLP problems (Grossmann,6 Kallrath14). The problem of process synthesis basically involves the selection of the type and sizes of process units, along with the optimal operating conditions, to minimize an economic objective function, which is commonly the overall cost consisting of the capital and operating cost.
Reviews of the different frameworks and applications in the various areas of process synthesis are provided in Grossmann7 and Floudas.4

The area of process synthesis and design under uncertainty has also received considerable attention because of the variability in system characteristics and increasing market requirements and competition. Most of the existing approaches dealing with process synthesis under uncertainty are either deterministic (Grossmann and Floudas,8 Grossmann and Sargent9), characterized by discretization of the uncertain parameter space into discrete scenarios, or stochastic (Straub and Grossmann,23 Pistikopoulos and Ierapetritou,20 Ahmed et al.1), based on the statistical information of the uncertain parameter. These approaches assume that some information regarding the market data is provided, either in the form of the most expected nominal point or a specific range of values or in the form of a probability distribution function. Among the various approaches, multiperiod optimization is the most common technique for design and planning problems (Halemane and Grossmann,11 Iyer and Grossmann,12 Papalexandri and Pistikopoulos,19 Sahinidis et al.21). The main difficulty in solving a multiperiod model is that the computational complexity increases exponentially with the number of periods, and thus application to large-scale industrial problems becomes difficult and less robust.

* To whom correspondence should be addressed. E-mail: [email protected]. Tel.: 732-445-2971. Fax: 732-4452421.

The motivation for this paper comes from (a) the need to incorporate market data analysis directly within design optimization and (b) the fact that significant economic savings can be achieved if standardized designs can be developed by taking into account the customer demand space. Thus, the objectives of this paper are to address the following important issues: (a) the need to obtain a set of optimal modular designs that can cover the entire space of market demand as described by the customer requirements and (b) the need to expand the design-optimization boundaries to incorporate market data analysis and thereby improve the decision-making process. To achieve these targets, the following tools are required: (1) an efficient approach to describe the feasible space of a given process accurately, (2) data clustering techniques to cluster the data into closely packed groups, and (3) a unified data-analysis/process-optimization framework for determining equipment designs.

The paper is organized as follows. Following this introduction, in section 2, an example is presented to motivate the need for the proposed work. In section 3, the detailed proposed framework is presented, along with brief descriptions of the simplicial approximation algorithm (Goyal and Ierapetritou5) and the K-medoid clustering algorithm (Kaufman and Rousseeuw15), both of which are integrated within the main algorithm. In section 4, a case study of a cryogenic air separation plant (Sirdeshpande et al.22) is presented, in which the objective is the determination of an equipment portfolio for the different processing stages. The proposed approach is thus applied utilizing existing market data and resulting in modular designs for the entire demand space.
The obtained results suggest an increased decision-making flexibility due to better utilization of the available market data and the integration with the

10.1021/ie020755+ CCC: $25.00 © 2003 American Chemical Society Published on Web 09/06/2003

Ind. Eng. Chem. Res., Vol. 42, No. 21, 2003 5205

Figure 1. Process flowsheet for the motivation example.

Figure 2. Demand plot for products P1 and P2.

design/synthesis stage. Finally, section 5 summarizes the work and presents future work directions.

2. Motivation Example

The synthesis problem considered in this section consists of eight processes to convert raw materials A, B, and C into the two finished products P1 and P2, as shown in Figure 1. The complete MINLP formulation of the synthesis-optimization problem consists of 8 binary and 19 continuous variables and is described in Appendix A. I_i and O_i are the input and output flow rates in each stream i, respectively, whereas PC_i and k_i are yield constants, and MI_i is the maximum allowable input flow to process i. FC_i is the installation cost and OC_i is the power cost for each process i. The demands for the two products, based on customer requirements, are as shown in Figure 2. Note that, for this example, the values of the demands for the two products are randomly generated within a fixed range (5-25) defined by demand variability around the nominal point (9, 17).

The conventional approach for process design is to choose a fixed value or an uncertainty range for the product demand, depending on customer requirements, and to determine a customized design for this demand. For example, for nominal demands of P1 = 9 and P2 = 17, the optimal design configuration was determined to be (1, 0, 0, 1, 0, 0, 1, 1) upon solution of the corresponding MINLP optimization problem. To determine the flexibility of the design, the flexibility index (Swaney and Grossmann24) was evaluated with expected deviations of the products such that P1 ∈ [5, 20] and P2 ∈ [8, 25] and was found to be equal to 0.21. To determine the actual feasible region for the design, a grid search was performed on the uncertainty region, and the feasible region was approximated by the simplicial approximation approach (Goyal and Ierapetritou5). The final results are as shown in Figure 3.

Figure 3. Flexibility plot for design configuration (1, 0, 0, 1, 0, 0, 1, 1).

As new demand requirements appear, representing different clients, new designs have to be constructed for points outside the feasible space. Two remarks are in order here. First, if the flexibility index approach is used, even for demand values that can actually be covered by this design, for example, demands of 6 and 16 for products 1 and 2, respectively, a new design has to be developed mainly because of underestimation of the feasible region. Second, if the new demand is actually outside the feasible space, for example, the demand point of (13, 20) in Figure 3, a new customized design has to be developed, which will eventually result in a large number of customized designs, increasing the cost for both the customer and manufacturer. If, on the other hand, one can use the supply chain information regarding all different customers at the design-optimization stage and develop a portfolio of different modular designs, as shown in Figure 4, that optimally covers the entire demand space, no additional customized designs are needed for essentially any possible realization of the demand. The work presented in this paper develops a systematic methodology to facilitate this target. The approach


Figure 4. Proposed design configurations (1, 0, 0, 1, 0, 0, 1, 1), (1, 1, 0, 1, 1, 0, 1, 1).

advances the current state of the art in process synthesis by integrating data analysis with design optimization, thus improving the decision-making process for both the manufacturer and the customer. In particular, the following benefits are achieved by this approach:

(1) The plant manufacturer can achieve great savings, as module-based designs are obtained that cover the entire range of demand. For example, for the synthesis problem considered in this section, a large number of customized designs would have to be developed to satisfy the 25 customers, whereas by applying the proposed approach, the entire demand space can be satisfied by two standardized designs.

(2) Customers are presented with richer information about different design alternatives and, thus, can make more intelligent decisions based on their needs and economic flexibility. For example, for the synthesis problem considered in this section, a customer with demand point P1 = 12 and P2 = 17 can choose between two standardized designs depending on the expected demand growth and the economic flexibility.

3. Unified Data-Analysis/Process-Optimization Framework

The main target of the proposed algorithm is to expand the boundary of the design/synthesis-optimization problem by integrating the data-analysis stage, thus allowing more flexibility in the decision-making process. The basic idea is to determine a set of optimal designs that jointly cover the entire demand space determined by customer requirements. The proposed algorithm, shown schematically in Figure 5, consists of two basic stages, the "feasibility" stage and the "optimality" stage, that are presented in detail in the next subsections.

3.1. Feasibility Stage. The scope of this stage is to determine the minimum number of designs that can cover the demand space.

Step 1. Cluster the demand data into a small number of clusters using the clustering algorithm described in subsection 3.4.

Step 2. Choose the cluster medoids as representative points of the clusters, and solve the design-optimization problem for each medoid. The medoid is defined as the point from which the average distance to all points within the cluster is minimum. Thus, the medoid is the central point of each cluster and, hence, is selected as a representative of the particular cluster for design optimization.

Step 3. Evaluate the feasible region for each of the designs obtained at step 2 using the simplicial approximation technique, described in subsection 3.3.

Step 4. Determine the union of the feasible regions of the designs. If all demands are covered by at least one design, stop the feasibility-stage iterations; otherwise, continue the iterations from step 1 by reclustering the data using one more cluster.

The following points should be noted:

(1) It is not possible to know a priori the minimal number of designs that can cover the entire demand space; thus, an iterative procedure has to be utilized that starts from a small number of designs.

(2) Reclustering of the data, based on the results of the feasibility evaluation at the previous iteration, allows for the determination of the optimal number of clusters at the end of the iterative procedure, as the clusters are redefined and the points that are not covered by the feasible regions of the designs obtained at the previous iteration are reassigned. The same clustering algorithm is used for the reclustering stage, as all data points are considered.

(3) The conventional multiperiod approach is not used at this stage because the clustering procedure cannot guarantee that the optimal number of clusters is obtained.

(4) Because the data set can involve some outliers, the proposed approach terminates when the inclusion of an additional point increases the number of clusters by one. In this case, it is more profitable to leave these points for customized designs.

3.2. Optimality Stage. At the end of the feasibility stage, the designs obtained are "locally optimal", i.e., optimal for the medoids of the clusters, and jointly cover the entire demand space with their operability limits. However, there is no guarantee that these designs are optimal with respect to capital and operating costs.
Thus, the target at this stage is to determine the optimal designs that can cover the demand space using the same number of clusters. The following steps are involved in this stage:

Step 1. Determine the center of each simplicial by solving a linear program, as described in detail in Goyal and Ierapetritou,5 and solve the design-optimization problem for each center.

Step 2. Evaluate the feasible regions for the new set of designs using the simplicial approximation technique.

Step 3. Check each design change against the following optimality criteria: (a) the overall cost, including installation and operating costs (for a given design, the average operating cost consists of the cost of operation at all demand points covered by its feasible region, averaged over the number of points), and (b) the design flexibility, based on the number of demand points covered by the design. If the overall cost of the new design has decreased and the flexibility has increased or remained the same, accept the change, and select the new design.

Step 4. Repeat steps 1-3 for all of the designs that were changed and satisfied the optimality criteria. Continue until the flexibility starts to decrease or the overall cost begins to increase or remains unchanged.

The following points in the optimality-stage iterations should be further discussed. At step 1, the centers of


Figure 5. Proposed algorithm.

the simplicials are selected as redesign centers to reduce the cost of the design because the center of the simplicial most often represents a point of lower capacity. The center of each simplicial can be calculated by solving a simple linear program to determine the center of a polytope, as described in the simplicial approximation algorithm (Goyal and Ierapetritou5). At step 3, to determine whether the new design is better than the

previous design, the installation cost and the average operating cost at all points covered by each design are calculated and compared. This quantifies the value for each design with respect to its feasible space (simplicial). After the interior points of a simplicial are obtained using the “point in a polytope” algorithm (O’Rourke18), the operating costs are calculated at all interior points and averaged over the number of interior points to


obtain the average operating cost of the design. This total simplicial cost of the new design is then compared with the total simplicial cost of the previous design. Also, the flexibility of the new design is compared with the flexibility of the previous design in terms of the number of demand points covered. Note that, if the new design does not cover some of the demand points and those points are also not covered by any of the present simplicials, the new design is rejected, because this means that the uncovered points are not covered by any of the present set of designs. If the new design obtained has an overall cost lower than that of the previous design and the same or better flexibility, the change is accepted. This check is performed for all designs, and at the end of each iteration, a modified set of optimal designs is obtained. The designs that are changed in the previous iteration are used as the set of designs for another iteration, and the iterations continue until the designs obtained are either more expensive or less flexible. A case study of an air separation plant is presented in the next section to illustrate the applicability of the proposed approach.

3.3. Feasibility Analysis: Simplicial Approximation of the Feasible Region. The determination of the range of feasible operability for a given design is achieved utilizing the simplicial approximation approach proposed by Goyal and Ierapetritou.5 The approach is based on explicitly approximating the boundary ∂R of the feasible region R of an n-parameter design space by a polyhedron made up of n-dimensional simplices. Briefly, the procedure consists of the following steps:

Simplicial Convex Hull.

Step 1. Determine any m points p_1, p_2, ..., p_m on the boundary ∂R, where m ≥ n + 1 and n is the dimension of the parameter space.

Step 2. Construct the convex hull of the set of m points using the quickhull algorithm (Barber et al.2).

Step 3. Given the first approximation of ∂R, obtain the largest hypersphere that can be inscribed in the convex hull.

Step 4. Determine which of the m_H faces of the polyhedron that are tangent to the inscribed hypersphere is the largest.

Step 5. Determine a new boundary point by making a one-dimensional search in the outward normal direction, starting from the center of the hyperplane found at step 4.

Step 6. Add the new point to the set of boundary points and return to step 2.

Outer Convex Polytope.

Step 1. Generate the tangent hyperplanes at the boundary points obtained by the simplicial approximation process.

Step 2. Obtain the points of intersection of the tangent hyperplanes.

Step 3. Generate the convex hull using the points of intersection, which serves as an envelope of the simplicial convex hull.

The procedure iterates between the inner and outer approximations until convergence is achieved according to a tolerance specified for the volume of the feasible region, and the inner region is used as an approximation of the feasible region.

3.4. Data Analysis: K-Medoid Clustering. At the first step of the proposed approach, the demand data are grouped together using a data clustering algorithm.

Data clustering is one of the basic tools for exploring the underlying structure of a given data set and has been applied in a wide variety of engineering and scientific disciplines such as medicine, biology, and pattern recognition. The primary objective of cluster analysis is to partition a given data set of multidimensional vectors into homogeneous clusters such that patterns within a cluster are more similar to each other than patterns belonging to different clusters. Many different clustering techniques have been proposed over the years, and they can be distinguished as either hierarchical or partitional clustering methods (Jain and Shim13). Hierarchical techniques produce a nested sequence of partitions, with a single, all-inclusive cluster at the top and singleton clusters of individual points at the bottom, whereas partitional techniques create a one-level (unnested) partitioning of the data points. If K is the desired number of clusters, then partitional approaches typically find all K clusters at once.

In recent years, a number of clustering algorithms for large data sets have been proposed.10,17 In Ng et al.,17 a partition clustering algorithm, CLARANS, for large databases is proposed that is based on randomized search. Each cluster is represented by its medoid, the most centrally located point in the cluster, and the objective is to find the k best medoids. The authors reduce this problem to that of a graph search by representing each set of k medoids as a node in the graph, with two nodes being adjacent if they have k - 1 medoids in common. In Guha et al.,10 the authors proposed a state-of-the-art clustering algorithm, CURE, that tackles not only time complexity but also efficiency for nonuniform clusters. CURE employs a novel hierarchical clustering algorithm that adopts a middle ground between the centroid-based approach and the nearest-neighbor approach. However, data clustering is still an active field of research, and a large number of algorithms are available depending on the nature and size of the data set.

The clustering technique used in this paper is the K-medoid clustering algorithm proposed by Kaufman and Rousseeuw,15 which is based on the idea of finding k representative objects, medoids, among the objects of the data set. The idea is to select objects that are in the center of each cluster. Among the different K-medoid clustering algorithms, the PAM (partitioning around medoids) approach of Kaufman and Rousseeuw15 is used in this work as the main algorithm to cluster the demand data. PAM starts from an initial set of medoids and iteratively replaces one of the medoids by one of the nonmedoids if the move improves the total distance of the resulting clustering. The basic steps of the algorithm are as follows:

Step 1. Select K representative objects arbitrarily.

Step 2. For each pair of nonselected object h and selected object i, calculate the total swapping cost, TC_ih.

Step 3. For each pair of objects i and h, if TC_ih < 0, replace i by h; otherwise, make no swap. Then, assign each nonselected object to the most similar representative object.

Step 4. Repeat steps 2-3 until there is no change in the total swapping cost.

The details of the algorithm can be found in Kaufman and Rousseeuw.15 Because PAM compares an object with the entire data set to find a medoid, it has a computational time complexity of O(k(n - k)^2), where k is the number of clusters and n is the number of points in the data set. Different data clustering algorithms can


Figure 6. Schematic of the equipment in an air separation plant.

be utilized, especially if the size of the data set is increased, without affecting the proposed approach because the basic idea of the clustering step is to group the demand data into clusters and then choose a specific representative point as a design-optimization point. For example, for a data set containing more than 100 points, CLARA (Kaufman and Rousseeuw15) could be utilized, whereas for a data set containing as many as 100 000 points, CURE (Guha et al.10) can be integrated.

4. Case Study: Air Separation Plant

4.1. Process Description. The case study presented in this section is a cryogenic air separation system. The main product of the plant is gaseous oxygen, with gaseous nitrogen leaving as waste and with no liquid coproducts. The process for the separation of air can be modularized into four basic operations: heat exchange, refrigeration, distillation, and compression. A schematic representation of the process is given in Figure 6. Each processing stage can utilize units of different capacities depending on the product demand. Depending on the size and type of equipment used for each process, a number of different options and suboptions are available, representing various unit operation decisions.

Two main heat-exchanger systems are available, with the major difference between them being the costs of installation and the warm-end temperature differences. The warm-end temperature difference leads to a warm-end enthalpy loss and thus impacts the refrigeration requirement for the process.

The distillation system consists of two distillation columns, a high-pressure column and a low-pressure column. There are four primary distillation systems based on the low-pressure column diameter, and each of the distillation systems has two suboptions for the high-pressure column diameter.

Four basic options are available for the refrigeration system. They mainly differ in the feed pressure of the turbine air stream. In options A and C, the turbine air is taken directly from the main air compressor, whereas in options B and D, the turbine air is taken from the auxiliary compressor. Furthermore, in options C and D, the turbine air drawn is additionally compressed in the turbine air compressor.

The main compression consists of two parts. The first part compresses the main air to the primary plant pressure, and the second part compresses a portion of the main air to an elevated pressure. There are four options for main air compression depending on the flow rate of air, and each option has two suboptions controlled by the flow rate of high-pressure air.

The numbers of options and suboptions for each stage are summarized in Table 1. The model constraints represent the mass and energy balance equations, along with variable bounds. Details of the process equations are available in Sirdeshpande et al.22 The design-optimization problem is formulated as an MINLP problem, whereas the simplicial approximation approach requires the solution of nonlinear and linear

Table 1. Summary of Options for the Unit Operations in the Air Separation Plant
unit operation

number of main options

number of suboptions

heat exchanger distillation column refrigerator compressor

2 4 4 4

4 2 2

Table 2. Main Heat Exchanger System Options

exchanger set                  A        B
exit stream temperature (K)    292.5    296.8
installed cost ($ × 1000)      1,430    2,600

Table 3. Refrigeration System Options
(flow: max turbine air flow, kg·mol/h; cost: $ × 1000)

                option 1        option 2        option 3        option 4
refrigeration   flow    cost    flow    cost    flow    cost    flow    cost
A               1000    910     1325    975     1825    1,073   2450    1,203
B               1000    910     1325    975     1825    1,073   2450    1,203
C               1000    845     1325    845     1825    936     2450    1,040
D               1000    845     1325    845     1825    936     2450    1,040

Figure 7. Demand plot for the case study.

problems, as described in section 3.3. All of the optimization problems discussed here are modeled and solved using GAMS (Brooke et al.3). Brief descriptions of the equipment options/suboptions are presented in Tables 2-5.

The data for the demand of oxygen were generated by evaluating the demand for products that use oxygen and the plant capacities for these products, as shown in Tables 6 and 7 (Sweeney25). Using this information, a demand plot for oxygen was generated with flow rate and pressure as demand variables, as shown in Figure 7. The objective of the problem is to determine the optimal set of designs to cover the demand space. The optimal set is defined as the minimum number of designs with minimum cost that can be used to cover the entire demand space. The designs should have the following characteristics: (1) The feasible regions of the designs, defined using the simplicial approximation scheme, should cover the entire demand space. (2) Each design should have the lowest total cost for the given feasible region determined by its simplicial. The proposed approach is thus applied to the air separation plant case study, and the results are presented in the next subsection.

4.2. Results.

4.2.1. Feasibility-Stage Iterations. The application of the proposed approach was initialized by assuming three clusters. After application of PAM, the medoids for the three clusters were as shown in Table 8 and also illustrated by the asterisks in Figure 8. The design-optimization problem was solved at each

Figure 8. Design feasibility plot assuming three clusters.

medoid, and the flexibility regions of the designs were evaluated using the simplicial approximation algorithm. To compare the efficiency of the simplicial approximation algorithm, a grid-search simulation was performed for each of the designs, and the feasible regions were found to be accurately approximated by the simplicial approximation technique as shown in Figure 8, where the dots represent the feasible points and the simplicial of each design is given by the closed polytopes. Because a significant number of demand points were not covered by the proposed designs, another iteration of the feasibility stage was carried out, and the number of clusters was increased to four. The demand medoids were then as shown in Table 8. Steps 2-5 of the

Table 4. Distillation Column System Options

main column   heat inleak   max air flow   HP column   total cost   max main air
system        (kcal/hr)     (kg·mol/h)     subsystem   ($ × 1000)   (kg·mol/h)
A             45 650        3100           1           2,113        1450
A             45 650        3100           2           2,158        1850
B             55 700        4100           1           2,665        1850
B             55 700        4100           2           2,756        2850
C             73 800        6000           1           3,445        2850
C             73 800        6000           2           3,543        3900
D             90 250        10 500         1           4,095        3900
D             90 250        10 500         2           4,154        5000

Table 5. Main Air Compression System Options

main air compression   max air flow   HPA comp    total cost   max HP air
system                 (kg·mol/h)     subsystem   ($ × 1000)   (kg·mol/h)
A                      3100           1           5,070        1,350
A                      3100           2           5,330        1,950
B                      4100           1           5,200        1,800
B                      4100           2           5,525        2,600
C                      6000           1           7,150        2,650
C                      6000           2           7,540        3,825
D                      10 500         1           7,800        3,375
D                      10 500         2           8,255        5,000

Figure 9. Design feasibility plot assuming four clusters.

Table 6. Sample Plant Capacities

                          plant capacity^a
product                   A       B       C
ethylene oxide            150     300     600
vinyl chloride monomer    550     1100    2200
propylene oxide           200     400     800
titanium dioxide          100     200     400
vinyl acetate             150     300     450

^a Capacities in millions of pounds per year.

Table 7. Factors for Developing the Consumption of Tonnage of Oxygen^a in the North American Chemical Industry

product                   oxygen tonnage
ethylene oxide            1.01
vinyl chloride monomer    0.13
propylene oxide           1.26
titanium dioxide          0.50
vinyl acetate             0.33

^a Tons per ton of product.
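As a worked illustration of how Tables 6 and 7 combine, the sketch below multiplies each sample plant capacity by the oxygen factor of its product. The dictionary and function names are hypothetical, and the result is left in millions of pounds of oxygen per year because the conversion to the kg·mol/h flow rates used in the demand plot is not stated in the text.

```python
# Plant capacities as read from Table 6 (millions of pounds of product per year).
CAPACITY = {
    "ethylene oxide": {"A": 150, "B": 300, "C": 600},
    "vinyl chloride monomer": {"A": 550, "B": 1100, "C": 2200},
    "propylene oxide": {"A": 200, "B": 400, "C": 800},
    "titanium dioxide": {"A": 100, "B": 200, "C": 400},
    "vinyl acetate": {"A": 150, "B": 300, "C": 450},
}

# Oxygen factors as read from Table 7 (tons of oxygen per ton of product).
OXYGEN_FACTOR = {
    "ethylene oxide": 1.01,
    "vinyl chloride monomer": 0.13,
    "propylene oxide": 1.26,
    "titanium dioxide": 0.50,
    "vinyl acetate": 0.33,
}


def oxygen_requirement(product, size):
    """Oxygen needed by one plant, in millions of pounds per year."""
    return CAPACITY[product][size] * OXYGEN_FACTOR[product]


# One oxygen requirement per (product, plant size) pair: 15 values in total.
demands = {
    (product, size): oxygen_requirement(product, size)
    for product, sizes in CAPACITY.items()
    for size in sizes
}

if __name__ == "__main__":
    for key, value in sorted(demands.items()):
        print(key, round(value, 2))
```

Each entry corresponds to one potential customer; the demand pressure, the second demand variable in Figure 7, is not derivable from these two tables.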

Table 8. Cluster Medoids in the Feasibility Stage Iterations

no. of clusters   medoids (kg·mol/h, atm)
3                 (411.2, 50), (514, 10), (1696.2, 41)
4                 (411.2, 50), (514, 10), (1130.8, 41), (1696.2, 41)
5                 (411.2, 50), (514, 10), (1233.6, 41), (1696.2, 35), (2056, 41.5)
6                 (411.2, 50), (215.9, 10), (771, 15), (1233.6, 41), (1696.2, 35), (2056, 41.5)
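The medoids in Table 8 are obtained with PAM as outlined in section 3.4. The following is a minimal sketch of that swap-based search, not the authors' implementation: it assumes Euclidean distance, takes the first k points as the arbitrary initial medoids, and accepts a swap as soon as it lowers the total distance (the TC_ih < 0 test). The function names are hypothetical.

```python
import math


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid (by index)."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Return (medoid points, cluster label of each point) for k clusters."""
    medoids = list(range(k))            # step 1: arbitrary initial medoids
    best = total_cost(points, medoids)
    improved = True
    while improved:                     # step 4: repeat until no swap helps
        improved = False
        for i in range(k):              # selected object i (position in list)
            for h in range(len(points)):    # nonselected object h
                if h in medoids:
                    continue
                trial = medoids[:i] + [h] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:     # step 3: swap when TC_ih < 0
                    medoids, best = trial, cost
                    improved = True
    # Assign each point to its most similar (nearest) medoid.
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return [points[m] for m in medoids], labels


if __name__ == "__main__":
    toy = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    print(pam(toy, 2))
```

Because flow rate and pressure have very different magnitudes in the case study, some scaling of the two demand variables before computing distances may be needed; the text does not say whether this was done.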

Table 9. Selected Equipment Configurations at the End of the Feasibility Stage

design medoid                 heat exchanger   refrigeration   distillation   compression
215.9 kg·mol/h, 10 atm        A                C1              A1             A1
771 kg·mol/h, 15 atm          A                A1              B1             B1
411.2 kg·mol/h, 50 atm        A                A1              A1             A1
1233.6 kg·mol/h, 41 atm       A                A2              C1             C1
1696.2 kg·mol/h, 35 atm       A                A3              D1             D1
2056 kg·mol/h, 41.5 atm       A                A3              D2             D1
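The configurations in Table 9 are the outcome of the feasibility-stage loop of section 3.1. A minimal sketch of that loop is given below. The three callables are placeholders (assumptions for illustration) standing in for PAM clustering, the MINLP design optimization, and the simplicial feasibility test; the outlier-based termination rule mentioned in the text is not implemented.

```python
def feasibility_stage(demands, cluster, design_at, covers, k_start=1):
    """Add clusters one at a time until every demand point is covered.

    cluster(demands, k) -> list of k representative points (medoids)
    design_at(point)    -> design optimized for that point
    covers(design, p)   -> True if p lies in the design's feasible region
    Returns the last set of designs and any demand points left uncovered.
    """
    designs, uncovered = [], list(demands)
    for k in range(k_start, len(demands) + 1):
        medoids = cluster(demands, k)                  # step 1: (re)cluster
        designs = [design_at(m) for m in medoids]      # step 2: design per medoid
        uncovered = [                                  # steps 3-4: union of regions
            p for p in demands
            if not any(covers(d, p) for d in designs)
        ]
        if not uncovered:
            break
    return designs, uncovered
```

In the case study the loop starts from three clusters (k_start=3) and stops at six, with the few remaining points left for customized designs.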

Table 10. Equipment Configurations Considered at the Optimality Stage

design medoid                 heat exchanger   refrigeration   distillation   compression
343.7 kg·mol/h, 17.4 atm      A                C1              A1             A1
691.5 kg·mol/h, 16.9 atm      A                C1              B1             B1
384.4 kg·mol/h, 45 atm        A                A1              A1             A1
918.8 kg·mol/h, 38.4 atm      A                C2              C1             C1
1455.5 kg·mol/h, 37.25 atm    A                C3              D1             D1
1865 kg·mol/h, 40 atm         A                C2              D2             D1
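Whether a redesigned configuration from Table 10 replaces its predecessor follows the acceptance rule of section 3.2. The sketch below encodes one reading of that rule and checks it against cost figures reported in Table 11. Two points are assumptions rather than statements from the text: the overall cost is taken as the sum of installation and average operating cost, and "same or better flexibility" is read as gaining at least as many demand points as are lost. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DesignEval:
    install_cost: float
    avg_operating_cost: float

    @property
    def total_cost(self):
        # Assumed definition of the overall ("total simplicial") cost.
        return self.install_cost + self.avg_operating_cost


def accept_redesign(old, new, gained, lost, lost_covered_elsewhere=True):
    """Accept the new design only if it is cheaper and no less flexible."""
    cheaper = new.total_cost < old.total_cost
    no_less_flexible = gained >= lost
    # Lost points must still be covered by some other design in the set.
    still_covered = lost == 0 or lost_covered_elsewhere
    return cheaper and no_less_flexible and still_covered


if __name__ == "__main__":
    # Move from (1696.2, 35) to (1455.5, 37.25), values from Table 11.
    old = DesignEval(1.4398e7, 1.337e7)
    new = DesignEval(1.4261e7, 1.2078e7)
    print(accept_redesign(old, new, gained=3, lost=2))
```

Under these assumptions the rule reproduces the reported outcomes for the three moves with complete data in Table 11: one accepted and two rejected.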

feasibility stage were then applied, which resulted in the feasibility plot shown in Figure 9. The iterative procedure was then continued, and Figures 10 and 11 represent the feasibility regions after the design optimization at the third and fourth iterations. The medoids for each of the iterations are as listed in Table 8. After the fourth iteration, most of the demand points were covered by the feasible regions of the designs obtained, and the iterations were terminated. Note that the remaining few uncovered demand points were left to be covered by customized designs. This decision was based on the fact that an attempt to cover these points would result in an increase in the number of clusters anyway and, hence, an increase in the number of designs. After the termination of the feasibility-stage iterations, six designs were accepted as

Figure 10. Design feasibility plot assuming five clusters.

the final feasible designs that could successfully cover the demand space. The equipment configurations obtained for the six designs at the end of the feasibility-stage iterations are shown in Table 9.

Figure 11. Final plot for the feasibility stage iterations for six clusters.

4.2.2. Optimality-Stage Iterations. At the beginning of the optimality-stage iterations, the centers of the six simplicials were calculated (shown in Table 10) and were considered as the new design points. The design-optimization problem was solved at each of these points, and the feasible regions of the new designs were determined using the simplicial approximation algorithm. The results of the design optimization are listed in Table 10, and the feasible regions of these designs are shown in Figure 12. The next step of the optimality stage was to check whether the new designs were better than the designs at the end of the feasibility stage. The detailed comparisons of the two design sets are summarized in Table 11 and shown in Figure 13.

Table 11. Design Validation for the Optimality Stage Iterations
(entries list the value before → after the design point was moved)

design medoid (kg·mol/h, atm)    inst. cost                       av. opt. cost                   designs gained   designs lost   result
215.9, 10 → 343.7, 17.4          9.458 × 10^6 → -                 -                               -                -              same designs
771, 15 → 691.5, 16.9            1.0205 × 10^7 → 1.014 × 10^7     4.598 × 10^6                    0                3              rejected
411.2, 50 → 384.4, 45            9.523 × 10^6 → -                 -                               -                -              same designs
1233.6, 41 → 918.8, 38.4         1.3 × 10^7 → 1.287 × 10^7        9.055 × 10^6 → 9.986 × 10^6     1                2              rejected
1696.2, 35 → 1455.5, 37.25       1.4398 × 10^7 → 1.4261 × 10^7    1.337 × 10^7 → 1.2078 × 10^7    3                2              accepted
2056, 41.5 → 1865, 40            1.4457 × 10^7 → 1.4229 × 10^7    1.7156 × 10^7 → 1.589 × 10^7    0                4              rejected

Figure 12. Final plot after optimality stage iterations for six clusters.

In particular, for the first change, which corresponds to the design center moving from (215.9 kg·mol/h, 10 atm) to (343.7 kg·mol/h, 17.4 atm), the new design obtained was the same as the previous design; hence, no design validation was needed, and the design was finalized. For the design determined after the demand point was moved from (771 kg·mol/h, 15 atm) to (691.5 kg·mol/h, 16.9 atm), smaller installation and average operating costs were achieved, but the new design had a smaller feasible region, which translates into a reduced number of demand points that can be covered, as shown in Figure 13. Because those points cannot be covered by any other existing design, the new design was rejected, and the previous design was accepted. For the move of the design point from (411.2 kg·mol/h, 50 atm) to (384.4 kg·mol/h, 45 atm), the design obtained was the same and, hence, accepted without change, whereas, for the move of the design point from (1233.6 kg·mol/h, 41 atm) to (918.8 kg·mol/h, 38.4 atm), the new design obtained had a cheaper installation cost but a higher average operating cost, which resulted in an increased total overall cost (the sum of the two Table 11 entries rises from 2.2055 × 10^7 to 2.2856 × 10^7). Also, the new design had smaller flexibility because two demand points were lost and one was gained. Thus, the new design was rejected, and the previous design accepted. For the move from (1696.2 kg·mol/h, 35 atm) to (1455.5 kg·mol/h, 37.25 atm), the new design obtained was cheaper in terms of installation and average operating cost. Also, the flexibility of the design slightly increased, as three demand points were gained but two demand points lost. Consequently, the new design was accepted for this region. For the final move from (2056 kg·mol/h, 41.5 atm) to (1865 kg·mol/h, 40 atm), the design obtained had a cheaper installation and average operating cost but a smaller feasible region and thus resulted in a loss of coverage of demand points that could not be covered by any existing design, as shown
in Figure 13. Thus, because of the lower flexibility, the new design was rejected, and the previous design was accepted. To summarize the results, at the end of the optimality stage, two of the designs did not change, and only one of the four new designs was accepted. Because there was only one new design, the optimality-stage iterations were terminated at this stage, and the set of designs was accepted as the final designs. The final design configurations and their feasible regions are shown in Figure 14. Note that, in the optimality stage, the alternative designs obtained offered some tradeoffs between cost and flexibility that should be weighed appropriately by the decision maker to reach the optimal decision. For example, as illustrated in the detailed discussion above, some moves result in designs with a lower cost but smaller flexibility, so that it is not obvious which design to choose. Although some rules were applied for this case study, all of the design alternatives could be kept and the tradeoffs analyzed by the design engineers according to their preferences. Moreover, for the points that are covered by more than one design, there are also some interesting tradeoffs that have to be considered because the more expensive designs have higher flexibility and thus higher profits can be anticipated if higher demand is realized. In terms of the computational complexity of the proposed approach for this case study, the design-optimization problem takes, on average, 20 CPU s using GAMS/DICOPT (Viswanathan and Grossmann, ref 26), whereas the NLP problem solution involved in the simplicial approximation requires less than 3 CPU s, on average, using GAMS/MINOS (Murtagh and Saunders, ref 16).

Figure 13. Overlap plot of the two equipment configuration sets.

Figure 14. Final design configurations and feasible regions.

5. Summary and Future Directions

A novel framework is presented in this paper for the integration of data-analysis and design/synthesis-optimization stages that have been traditionally performed separately. The basic idea is to apply a clustering methodology and design/synthesis optimization iteratively, thereby allowing for the repartitioning of data based on design feasibility and a new optimization search based on the current clustering of data. The final step involves repartitioning of the demand space based on the results of the feasibility analysis so as to obtain cheaper designs to cover the entire demand space. The main significance of the proposed approach is that it expands the boundaries of design-optimization decision making to incorporate demand data analysis reflecting customer information. Future research involves the incorporation of design decisions together with the synthesis-optimization problem. This problem would assume that discrete equipment choices have not been made and that the optimal designs have to be determined. The incorporation of the probability of existence for the data point is also under investigation.

Acknowledgment

M.G.I. gratefully acknowledges financial support from the National Science Foundation under NSF CAREER Program Grant CTS-9983406 and partial financial support by BOC Gases.

Appendix A

Process Equations for the Motivation Example.
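As a reading aid, the constraints listed in this appendix can be transcribed into a short feasibility check. The sketch below is written in Python, which the paper does not use (the models were solved in GAMS). The function names `violated_constraints` and `example_point`, the tolerance, and the sample point are hypothetical and chosen only for illustration. It assumes the relations read in the usual way (O_i = PC_i I_i for i = 4, 7, 8, O_i ≤ PC_i ln(1 + I_i/k_i) for the other processes, and so on) and it leaves out the objective function.

```python
import math

# Parameter sets from Appendix A, indexed by process i = 1..8.
PC = [18, 20, 25, 0.9, 22, 28, 0.65, 0.9]
K = [20, 21, 26, None, 21, 25, None, None]  # no k_i for the linear processes
MI = [20, 25, 40, 80, 55, 75, 60, 60]
LINEAR = {4, 7, 8}                            # processes with O_i = PC_i * I_i
LOWER = {3: 10.0, 4: 25.1, 5: 14.0, 6: 30.0}  # I_i - bound * y_i >= 0


def violated_constraints(I, O, y, Bf, tol=1e-9):
    """Return labels of the Appendix A constraints that a point violates.

    I, O and y are dicts keyed by process number 1..8; Bf is a scalar.
    The tolerance is an assumption of this sketch, not part of the model.
    """
    bad = []
    for i in range(1, 9):
        pc = PC[i - 1]
        if i in LINEAR:
            if abs(O[i] - pc * I[i]) > tol:
                bad.append(f"O{i} = PC{i}*I{i}")
        elif O[i] > pc * math.log(1 + I[i] / K[i - 1]) + tol:
            bad.append(f"O{i} <= PC{i}*ln(1 + I{i}/k{i})")
        # Upper bound that switches the process off when y_i = 0.
        if I[i] - MI[i - 1] * y[i] > tol:
            bad.append(f"I{i} <= MI{i}*y{i}")
    for i, lb in LOWER.items():
        if I[i] - lb * y[i] < -tol:
            bad.append(f"I{i} >= {lb}*y{i}")
    if abs(I[4] - (O[1] + O[2] + O[3] + Bf)) > tol:
        bad.append("I4 balance")
    if abs(I[7] + I[8] - (O[4] + O[5] + O[6])) > tol:
        bad.append("I7 + I8 balance")
    if Bf > 30 + tol:
        bad.append("Bf <= 30")
    return bad


def example_point():
    """Hypothetical point (not from the paper): processes 1, 4 and 7 active."""
    I = {i: 0.0 for i in range(1, 9)}
    O = {i: 0.0 for i in range(1, 9)}
    y = {i: 0 for i in range(1, 9)}
    Bf = 20.0
    y[1], I[1], O[1] = 1, 10.0, 7.0      # 7.0 <= 18 * ln(1.5), about 7.30
    y[4], I[4] = 1, O[1] + Bf            # balance around process 4
    O[4] = PC[3] * I[4]
    y[7], I[7] = 1, O[4]                 # all of O4 is sent to process 7
    O[7] = PC[6] * I[7]
    return I, O, y, Bf


print(violated_constraints(*example_point()))  # prints []
```

Returning the list of violated constraints, rather than a single Boolean, makes it easier to see which part of a candidate flowsheet breaks the model, for example when a process receives flow while its binary variable is zero.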

\begin{align*}
\max \quad & 1775 P_1 + 1500 P_2 - (200 A + 550 B_f + 300 C) - \sum_{i=1}^{8} OC_i I_i - \sum_{i=1}^{8} FC_i y_i \\
\text{subject to} \quad & O_i = PC_i I_i, && i = 4, 7, 8 \\
& O_i \le PC_i \ln(1 + I_i / k_i), && i = 1, 2, 3, 5, 6 \\
& I_4 = O_1 + O_2 + O_3 + B_f \\
& I_7 + I_8 = O_4 + O_5 + O_6 \\
& I_i - MI_i y_i \le 0, && \forall i \\
& I_3 - 10.0 y_3 \ge 0 \\
& I_4 - 25.1 y_4 \ge 0 \\
& I_5 - 14.0 y_5 \ge 0 \\
& I_6 - 30.0 y_6 \ge 0 \\
& B_f \le 30
\end{align*}

\begin{align*}
PC_i &= \{18, 20, 25, 0.9, 22, 28, 0.65, 0.9\} \\
k_i &= \{20, 21, 26, -, 21, 25, -, -\} \\
MI_i &= \{20, 25, 40, 80, 55, 75, 60, 60\} \\
OC_i &= \{5, 18, 7, 15, 8, 18, 17, 10\} \\
FC_i &= \{800, 1500, 3000, 1000, 1000, 1650, 1100, 1000\}
\end{align*}

Literature Cited

(1) Ahmed, S.; Sahinidis, N. V.; Pistikopoulos, E. N. An Improved Decomposition Algorithm for Optimization under Uncertainty. Comput. Chem. Eng. 2000, 23, 1589.
(2) Barber, C. B.; Dobkin, D. P.; Huhdanpaa, H. The Quickhull Algorithm for Convex Hulls. ACM Trans. Math. Soft. 1996, 22, 469.
(3) Brooke, A.; Kendrick, D.; Meeraus, A. GAMS: A User’s Guide; The Scientific Press: Redwood City, CA, 1988.
(4) Floudas, C. A. Nonlinear and Mixed Integer Optimization: Fundamentals and Applications; Oxford University Press: New York, 1995.
(5) Goyal, V.; Ierapetritou, M. G. Determination of Operability Limits Using Simplicial Approximation. AIChE J. 2002, 48, 2902.
(6) Grossmann, I. E., Ed. Global Optimization in Engineering Design. Nonconvex Optimization and Its Application; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1996.
(7) Grossmann, I. E. Mixed Integer Optimization Techniques for Algorithmic Process Synthesis. In Advances in Chemical Engineering, Process Synthesis; Anderson, J. L., Ed.; Academic Press: New York, 1996; Vol. 23, pp 171-246.
(8) Grossmann, I. E.; Floudas, C. A. Active Constraint Strategy for Flexibility Analysis in Chemical Processes. Comput. Chem. Eng. 1987, 11, 675.
(9) Grossmann, I. E.; Sargent, R. W. H. Optimal Design of Chemical Plants Design with Uncertain Parameters. AIChE J. 1978, 24, 1021.
(10) Guha, S.; Rastogi, R.; Shim, K. CURE: An Efficient Clustering Algorithm for Large Databases. In Proceedings of ACM SIGMOD, International Conference on Management of Data (SIGMOD98); Haas, L., Tiwary, A., Eds.; Association of Computing Machinery (ACM): New York, 1998; pp 73-84.

(11) Halemane, K. P.; Grossmann, I. E. Optimal Plant Design under Uncertainty. AIChE J. 1983, 29, 425.
(12) Iyer, R. R.; Grossmann, I. E. Optimal multiperiod operational planning for utility systems. Comput. Chem. Eng. 1997, 21, 787.
(13) Jain, K. A.; Shim, R. K. Algorithms for Clustering Data; Prentice Hall: Upper Saddle River, NJ, 1988.
(14) Kallrath, J. Mixed Integer Optimization in Chemical Process Industry: Experience, Potential and Future. Trans. Inst. Chem. Eng. A 2000, 78, 809.
(15) Kaufman, L.; Rousseeuw, P. J. Finding Groups in Data: An Introduction to Cluster Analysis; John Wiley & Sons: New York, 1990.
(16) Murtagh, B. A.; Saunders, M. A. MINOS 5.1 User’s Guide; Technical Report SOL 83-20R; Systems Optimization Laboratory, Department of Operations Research, Stanford University: Stanford, CA, 1987.
(17) Ng, R. T.; Han, J. Efficient and Effective Clustering Methods for Spatial Data Mining. In Proceedings of the 20th International Conference on Very Large Databases; Santiago, Chile, 1994.
(18) O’Rourke, J. Computational Geometry in C, 2nd ed.; Cambridge University Press: New York, 1998.
(19) Papalexandri, K. P.; Pistikopoulos, E. N. A multiperiod MINLP model for the synthesis of flexible heat and mass exchange networks. Comput. Chem. Eng. 1994, 18, 1125.
(20) Pistikopoulos, E. N.; Ierapetritou, M. G. A Novel Approach for Optimal Process Design under Uncertainty. Comput. Chem. Eng. 1995, 19, 1089.
(21) Sahinidis, N. V.; Grossmann, I. E.; Fornari, R. E.; Chathrathi, M. Optimization Models for Long Range Planning in the Chemical Industry. Comput. Chem. Eng. 1989, 13, 1049.
(22) Sirdeshpande, A.; Ierapetritou, M. G.; Andrecovich, M. J.; Naumovitz, J. P. Process Synthesis Optimization and Flexibility Evaluation of Air Separation Cycles. AIChE J. 2003, manuscript submitted.
(23) Straub, D. A.; Grossmann, I. E. Design Optimization of Stochastic Flexibility. Comput. Chem. Eng. 1993, 17, 339.
(24) Swaney, R. E.; Grossmann, I. E. An Index for Operational Flexibility in Chemical Process Design. Part I: Formulation and Theory. AIChE J. 1985, 26, 139.
(25) Sweeney, P. BOC Gases, Murray Hill, NJ. Personal communication, 2002.
(26) Viswanathan, J.; Grossmann, I. E. A Combined Penalty Function and Outer Approximation Method for MINLP Optimization. Comput. Chem. Eng. 1990, 14, 769.

Received for review September 25, 2002 Revised manuscript received July 10, 2003 Accepted July 11, 2003 IE020755+
