Ind. Eng. Chem. Res. 2008, 47, 6101–6125


A Kriging-Based Approach to MINLP Containing Black-Box Models and Noise

Eddie Davis and Marianthi Ierapetritou*

Department of Chemical and Biochemical Engineering, Rutgers, The State University of New Jersey, 98 Brett Road, Piscataway, New Jersey 08854

A new MINLP algorithm is presented for a class of problems whose formulation contains black-box models. Black-box models describe process behavior when closed-form equations are absent and can be functions of continuous and/or integer variables. To address the lack of explicit equations, kriging is used to build surrogate data-driven global models, because a robust process description can be obtained even when noise is present. The global models are used to identify promising solutions for local refinement, and the continuous variables are then optimized using a response surface method. The integer variables are optimized using Branch-and-Bound if a continuous relaxation exists and direct search otherwise. The four algorithms are unified into a comprehensive approach that can be used to obtain optimal process synthesis and design solutions when noise and black-box models are present. The performance of the proposed algorithm is evaluated based on its application to two industrial case studies.

1. Introduction

1.1. Problem Definition. Process synthesis problems are difficult to solve when model equations are unavailable and noisy input-output data are the only accessible information. Model equations may be either unknown, because first-principles models have not yet been developed, as in the case of emergent technologies, or inaccessible, because they are embedded within process train legacy codes. Because of the lack of closed-form equations, the process behavior is described as "black-box". Surrogate models can alternatively be generated in place of the black-box models, but because of the combinatorial nature of process synthesis problems, many substitute models may need to be constructed. Although model reliability can be improved using additional information, resource constraints can limit the number of additional experiments allowed.
Because it may not be possible to a priori estimate the problem cost in terms of the number of experiments required, there is a need for strategies targeted at the generation of sufficiently accurate surrogate models at low resource cost. There are two important cases motivating the need to address problems having noise and black-box characteristics: (1) resource allocation during the early product life cycle for funding of the most promising research projects and (2) process train optimization for systems that have been retrofitted or no longer exhibit the same process behavior for which current policies have been determined. The problem addressed in this paper is the attainment of optimal synthesis and design policies for systems in which integer variables participate at both the synthesis and design stages. The need for accurate model construction using a small number of sampling data is all the more pressing because of the combinatorial aspect at both the synthesis and design stages. There are several modeling and optimization challenges when black-box models are present and the only system information is in the form of noisy sampling data. The first question is whether a local or global surrogate model should be constructed. Although local models are cheaper to build, global optimality is not guaranteed. Conversely, because it may not be possible to a priori estimate global model building costs, the sampling expense associated with attaining a reliable model may become prohibitively high. The application of local techniques to even partially refined global models may be limited to a subset of the warm-start iterates, and the global solution may still be missed. Second, because of the noise, there will be some uncertainty associated with both the surrogate models and the point estimate obtained for the optimum. In the former case, an accurate description of the system behavior is not guaranteed, and in the latter case, global optimality cannot be guaranteed. If the surrogate models interpolate the noisy data, then derivative-based techniques can lead to the identification of poor search directions because the model has captured the noise instead of the true process behavior. However, the construction of noninterpolating models can require additional sampling information, thereby increasing the resource costs needed to solve the problem. Third, when integer variables are present, a combinatorial complexity is introduced because of the set of possible integer feasible solutions, and many nonlinear programming (NLP) subproblems may need to be solved in order to find the integer optimum. If too many experiments are performed during the early problem-solving stages, resource constraints may prevent the best NLP solutions for later subproblems from being found. Conversely, poor initial NLP solutions for early subproblems can lead to increased searching being required later on as well. The difficulty lies in not being able to estimate accurately how many NLPs need to be solved; with such estimates, resources could be better allocated to ensure that the information value obtained from solving each NLP subproblem is maximized. When the black-box models are functions of strictly integer variables, local gradient-based methods such as the response surface methodology (RSM) cannot be applied because fractional values are infeasible.

* To whom correspondence should be addressed. E-mail: marianth@soemail.rutgers.edu.
Direct search is thereby proposed for the optimization of this class of variables. In addition, for process synthesis problems, another class of integer variables exists outside the black-box functions for which a continuous relaxation is permitted. In order to obtain an integer optimal solution, Branch-and-Bound (B&B) is proposed for the optimization of this variable class. In our previous work, an algorithm incorporating B&B, kriging, and RSM was presented to deal with the class of problems in which the black-box functions depended on continuous variables alone.1 The methodology is now extended to address the case when the black-box functions depend on both continuous and strictly integer variables. The contribution of this paper is the unification of kriging, RSM, direct search, and B&B into a comprehensive algorithm that can be used to solve a more general class of mixed-integer nonlinear programs (MINLP) containing black-box models and noise.

10.1021/ie800028a CCC: $40.75  2008 American Chemical Society. Published on Web 07/17/2008.

1.2. Literature Review. In the field of stochastic programming, a variety of techniques have been developed to solve MINLPs containing variables with uncertainty. One class of techniques determines the solution based on information obtained from complementary deterministic problems created after obtaining sampling realizations in the uncertain space. More specifically, outer approximation is used to solve mixed-integer linear program and NLP subproblems formulated using the sample average approximation method.2 Confidence intervals on the lower and upper bounds (LB/UB) are refined by solving a higher number of replicated subproblems created from an additional number of realizations in the uncertain space. This methodology has been extended by using the simplicial approximation approach to describe the feasible region using a convex hull approximation.3 However, these methods assume model equation availability in order to obtain linearization information, so they cannot be directly applied to the solution of problems involving black-box models. In the complementary field of deterministic global optimization, Floudas et al.4 present a review of state-of-the-art techniques. Because the majority of the gradient-based techniques assume knowledge of closed-form models, they cannot be directly applied to the problem class addressed in this paper, whose formulation contains black-box models. However, an approach has been proposed by Meyer et al.5 for NLP problems containing nonanalytical differentiable constraints.
The solution is obtained using a global model whose form is that of a blending function from which valid over- and underestimators can be generated in order to provide an ε-guarantee of global optimality. Zero-order techniques have also been employed for solving noisy MINLPs lacking closed-form equations. Chaudhuri and Diwekar6 wrapped a stochastic annealing algorithm around ASPEN in order to obtain the solution of a hydrodealkylation synthesis problem under uncertainty. Because surrogate models are not constructed when using zero-order techniques such as Nelder-Mead,7 divided rectangles,8 multilevel coordinate search,9 and differential evolution,10 convergence to the optimum can be slow, motivating the use of gradient-based algorithms. However, the key problems associated with using first- and second-order techniques are asymptotic convergence and premature termination at artificial local optima, which are induced by noise. Modifications to gradient-based techniques overcome the latter problem using finite differences of large enough interval length to filter out the noise.11–13 However, the use of methods that do not rely on model building for optimization results in a limited understanding of the system behavior, motivating the application of methods in which optimization is applied once surrogate models have first been built. RSM belongs to a class of local methods in which gradient techniques are sequentially applied to inexpensively fitted local models known as response surfaces.14 Each response surface assumes the form of a low-order polynomial that is minimized to find the next candidate solution. A new response surface centered at the new iterate is built and minimized, and the procedure is repeated until a convergence criterion, such as a prespecified decrease in the objective function, has been attained. The sampling expense is the primary source of the optimization cost because the response surfaces can be built cheaply using least-squares fitting. In our recent work, an algorithm employing (1) adaptive experimental designs to retain feasibility and (2) projection of the n-dimensional response surface onto constraints has been successfully applied toward identifying NLP solutions in higher dimensions described by arbitrary convex feasible regions.15 In addition to local optimization of response surfaces, global methods have also been studied for box-constrained problems; these methods rely upon a larger set of basis functions in response surface construction.16–19 Local techniques such as RSM ensure a global optimal solution only under convexity conditions, which cannot be known in advance for problems containing black-box functions. Therefore, these approaches are inefficient for solving MINLPs whose relaxed NLP subproblems are nonconvex, because only local optima may be found. This problem can be addressed by first building global models in order to identify the best areas in which to apply the local methods. One popular global modeling technique comes from the geostatistics literature and is known as kriging. In kriging, the black-box model outputs are treated as realizations of normally distributed random function models.20 The methodology has seen increasing use because an uncertainty mapping can be obtained in addition to the global model. Kriging originated in 1951 and was targeted at determining the spatial distribution of mineral deposits in mining applications based on physical sampling data. More recently, kriging has been increasingly applied to develop models in the complementary field of computer experiments, when simulation is used to describe complex processes.21 The technique is interpolation-based, whereby a prediction and its variance at a test point are obtained as a weighted sum of the nearby sampling data.

Figure 1. B&B-kriging-RSM algorithm.
Because the basis functions used in kriging are random function models, noise is assumed to be a natural feature of the black-box system, and the weighting is automatically adjusted to account for the process stochasticity. Once both a kriging prediction and its uncertainty estimate have been obtained for a test point, the procedure is repeated for all remaining test points that cover the feasible region. An initial global prediction and variance mapping is then generated. If additional sampling is conducted in high-variance regions lacking sampling data, updated models can be generated in order to improve model accuracy. Despite its advantages, the main limitation of kriging is that the model building costs are higher than those of response surface methods because a more detailed description of the process behavior is obtained. Therefore, once a global model is determined to be accurate, additional sampling should instead be directed at locally optimizing the best kriging solutions. One way of assessing kriging model accuracy is by determining the corresponding average prediction value. If convergence is achieved, RSM and direct search are applied to refine the best local optima found. B&B is also employed to obtain integer optimal solutions when integer variables having a continuous relaxation are present. The details of each algorithm are presented more fully after the targeted MINLP problem class has been mathematically formulated in the next section.

Figure 2. Data smoothing applied to squared function differences Fi,j (a) in order to obtain a semivariogram fit (b) and a covariance function fit (c).

2. Solution Approach

The problem addressed in this work can be expressed in the following form:

  min  F(x, y1, y2, z1, z2)
  s.t. g(x, y1, y2, z1, z2) ≤ 0
       h(x, y1, y2, z1) = 0
       z2(x, y1) = [1 + N(µ, σ²)] Γ(x, y1)
       x ∈ R^n,   y1 ∈ {0, 1}^q1,   y2 ∈ {0, 1}^q2        (1)

On the basis of this formulation, the vectors of continuous and strict integer variables are given by x and y1, respectively. The vector of integer variables having a continuous relaxation is given by y2. The objective function is represented by F, and the deterministic variables z1 describe outputs whose modeling equations h(x, y1, y2, z1) are known. The vector of stochastic output variables z2 exists when the input-output functionality Γ(x, y1) is a black box, and each z2 variable is modeled as having a mean deterministic output value perturbed by an additive noise component. One sample model for the noise is that it behaves according to a normal distribution. For field experiments, estimates of the parameters µ and σ² can be obtained by conducting replicate experiments for a given sampling vector. It should be noted that the modeled values of µ and σ² may need to assume a range of values if it is known from prior field data that the noise is spatially variant with respect to the input specification (x, y1). Synthesis equations are given by g(x, y1, y2, z1, z2), which include design constraints, operating specifications, and logical relations. In the problem formulation, it is assumed that the synthesis variables appear only as part of the feasible region constraints and/or objective function. A unified B&B-kriging-RSM-direct search algorithm (Figure 1) is proposed for the solution of the class of problems described by eq 1. In subsection 2.1, the basic idea of the method is presented. Subsection 2.2 describes how kriging is used to build global models, subsection 2.3 explains how the best kriging solutions are refined by optimizing response surfaces, and subsection 2.4 provides details for optimizing the integer design variables using two direct search methods. Finally, subsection 2.5 summarizes the steps of the unified algorithm.

2.1. Method Outline. In the proposed algorithm, y2 is optimized using B&B. At each node of the B&B tree, partially relaxed NLP subproblems are formulated whose solutions correspond to optimal x and y1. The designation "partially relaxed NLP" is used because, while a continuous relaxation is permitted for y2, strict integrality must always be enforced for y1. At the root node, the first partially relaxed NLP is generated by relaxing all y2. An initial kriging model of the corresponding objective is constructed from a nominal sampling set Ω containing sampling points S0 = {x0, y10, y20}. The kriging model is subsequently updated using additional sampling information located at regions of high variance and minimal prediction and where model behavior changes significantly over consecutive iterations. For each model, the average predictor value is compared to the corresponding one obtained at the previous iteration, and once convergence has been attained, further model refinement is terminated. The local kriging solutions are then identified as warm-start iterates for further local optimization using RSM and direct search. Because no convexity assumptions are required when applying kriging, the set of potential local optima can be identified once an accurate global model has been obtained. For each kriging solution, the set of {x, y2} variables is locally optimized using RSM for fixed values of y1. Given a nominal vector SK representing one of the best kriging solutions, additional sampling data are obtained from a collocation set localized around SK. A quadratic model is fitted using least squares and then locally optimized. A new model is constructed at the new optimum, and the sequential optimization of response surfaces continues until convergence in the objective has been achieved. The vector SR = {xR, y1K, y2R} is then defined as the RSM optimal solution, and the search for optimal y1 is initiated using one of two direct search methods. In the first method, sampling is performed at interval end points centered around SR.
The interval is defined according to a stencil whose bracket length is adaptively decreased once the neighborhood containing the optimal y1 has been found. If the initial interval length is greater than unity, smaller intervals are constructed around the current best solution. The procedure terminates when convergence in the objective is achieved or when the newest interval length would require y1 to assume fractional values. The vector SD = {xR, y1D, y2R} is then established as the solution of the current NLP subproblem. In the second direct search method, a global search strategy is employed whereby the objective attained for SR is compared to the corresponding objectives obtained from sampling at the lowest and highest feasible values of y1. Smaller search intervals centered around the new y1 optimal solution are generated, and sampling is conducted at the new end points. The best solution is again found, and the procedure is repeated until the stopping criterion has been satisfied. The solution SD, while optimal in x, y1, and relaxed y2, may not be integer feasible in y2. Additional subproblems are therefore created when 0-1 integrality in y2 is not satisfied.
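The RSM refinement step sketched in this outline (local sampling around the incumbent, least-squares quadratic fit, minimization of the fitted model) can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `objective` is a hypothetical stand-in for sampling the process at fixed y1, all function names are illustrative, and the radius-shrinking rule is one simple realization of the adaptive model-radius update.

```python
import numpy as np
from itertools import product

def ccd_points(center, b):
    """Center, 2n axial, and 2**n factorial points of a CCD with radius b."""
    n = len(center)
    pts = [center.copy()]
    for i in range(n):                          # axial points along each coordinate
        for s in (-b, b):
            p = center.copy()
            p[i] += s
            pts.append(p)
    for corner in product((-b, b), repeat=n):   # factorial (corner) points
        pts.append(center + np.array(corner))
    return np.array(pts)

def quad_features(X):
    """Design matrix for a full quadratic model: 1, x_i, and x_i*x_j (i <= j)."""
    n = X.shape[1]
    cols = [np.ones(len(X))] + [X[:, i] for i in range(n)]
    cols += [X[:, i] * X[:, j] for i in range(n) for j in range(i, n)]
    return np.column_stack(cols)

def rsm(objective, x0, b=1.0, tol=1e-6, shrink=0.5, max_iter=30):
    """Sequentially fit and minimize quadratic response surfaces."""
    x = np.array(x0, dtype=float)
    f = objective(x)
    n = len(x)
    for _ in range(max_iter):
        X = ccd_points(x, b)
        y = np.array([objective(p) for p in X])
        beta, *_ = np.linalg.lstsq(quad_features(X), y, rcond=None)
        g = beta[1:n + 1]                       # linear coefficients of the fit
        Q = np.zeros((n, n))                    # quadratic coefficient matrix
        k = n + 1
        for i in range(n):
            for j in range(i, n):
                Q[i, j] = beta[k] if i == j else beta[k] / 2.0
                Q[j, i] = Q[i, j]
                k += 1
        z_star = np.linalg.solve(2.0 * Q, -g)   # minimizer of the fitted model
        step = z_star - x
        dist = np.linalg.norm(step)
        if dist < tol:                          # iterates have converged
            break
        if dist > b:                            # keep the step inside the local radius
            step *= b / dist
        x_new = x + step
        f_new = objective(x_new)
        if f_new < f:                           # accept only improving iterates
            x, f = x_new, f_new
        if dist < b:                            # basin located: localize the model
            b *= shrink
    return x, f
```

On a smooth, noise-free quadratic such as (x0 − 1)² + (x1 + 2)², the fitted surface is exact and the iterates move to the minimizer within a few radius-limited steps; with noisy data, the least-squares fit smooths the samples rather than interpolating them, which is the property the text emphasizes.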


For each new subproblem, the kriging predictor is refined over the corresponding feasible subregion in order to determine the best "warm-start" locations for further local optimization. The algorithm terminates when the list of candidate NLP subproblems is exhausted, and the MINLP solution is established as the integer optimal solution in terms of both y1 and y2. The proposed algorithm relies on the sequential optimization of the continuous and integer variables x and y1, respectively. Because of this, the discovery of a local optimum, if not the global optimum, may prove more difficult if x-y1 separability does not exist. This problem can be addressed by (1) locally optimizing multiple kriging solutions or (2) creating multiple kriging models using different nominal sampling sets. The second strategy reduces the risk of attaining suboptimal local solutions from a kriging model whose initial sampling set has been poorly selected. However, the recourse costs associated with building replicate global models or conducting extensive local sampling can become prohibitively high, so an area of ongoing research is focused on determining an optimal sampling arrangement. In the following subsections, each of the individual algorithms is presented in more detail, starting with the kriging methodology.

2.2. Kriging Methodology. In kriging, sampled output data are treated as the realizations of a random function in order to improve the modeling of a black-box system assumed to be inherently stochastic. The kriging model is based on a random function, and both a point value and a variance estimate are obtained for each test point Sk. The global model is built by mapping the kriging predictions with respect to the input variables. The corresponding variance mapping is primarily used to determine sampling locations for model refinement in high-variance regions. Each kriging estimate at Sk is generated as a weighted sum of nearby sampled function values.
The weights are generated as a function of the Euclidean distance between the sampling vectors close to Sk, in a manner similar to inverse-distance weighting methods, in which higher weighting is generally given to sampled function values whose vectors reside near Sk. Although kriging models can be generated with respect to the process output variables z2 for the black-box units, the kriging model that is directly used in the unified algorithm is the one built with respect to the relaxed NLP objective F. This is done because it is F that is being optimized and not z2. The weights are determined using a covariance function, which measures the correlation strength between two objective values based on the Euclidean distance given by eq 2:

  di,j = ‖Si(x, y2) − Sj(x, y2)‖        (2)

where it should be noted that the complete representation of any sampling or test vector is given by the triplet (x, y1, y2), even though it is only x and relaxed y2, the continuous components, that participate in the di,j computation. The two vectors employed in eq 2 can be either two sampling points {Si, Sj} or a sampling point and a test point {Si, Sk}. Although coefficient values must first be determined for the covariance function, a reliable covariance function can usually be built using limited sampling data, enabling a set of predictions and, in turn, a global model to be generated at low cost. The steps for obtaining a kriging estimate at a test point are as follows: (1) determination of the covariance function coefficients; (2) calculation of the covariance Cov(di,k) between a test point and neighboring sampling points; (3) generation of the weight vector λ from solving the linear system Cλ = D, where the elements of C and D are {Si, Sj} and {Si, Sk} covariance values, respectively; (4) estimation of the kriging predictor as given by eq 3:

  F(Sk) = Σ_{i=1}^{kcluster} λ(Si) F(Si)        (3)

In eq 3, F(Sk) represents the prediction value at Sk, kcluster denotes the number of nearby sampling points, and F(Si) denotes the sampled output at Si. The estimation of F(Sk) is generally improved when the Euclidean distances between the test point and its nearby sampling points are all different. The methodology for obtaining the covariance function coefficients is now presented. From the set of STot sampling data, squared function differences Fi,j are obtained for each sampling pair Si-Sj as given by eq 4:

  Fi,j = [F(Si) − F(Sj)]²,   i, j = 1, ..., STot,   i ≠ j        (4)

A semivariance function is then fitted from a scatter plot of Fi,j built with respect to di,j. Several standard semivariance models from the literature are tested in order to ascertain which one provides the best least-squares fit.20 However, because of the plot complexity shown in Figure 2a, it may not be immediately apparent which model provides the best fit. Data smoothing is applied to alleviate this problem, and the semivariance function is then fitted to a reduced set of scatter points known as semivariances γ, as shown in Figure 2b. A set of P equidistant intervals is defined between zero and the maximum di,j distance. The pth interval midpoint is denoted by hp, and the semivariance corresponding to the pth interval, γ(hp), is obtained by averaging the set of squared function differences falling inside this interval, as given by eq 5:

  γ(hp) = [1 / (2N(hp))] Σ_{r=1}^{N(hp)} Fi,j,r,   p = 1, ..., P,   i ≠ j        (5)

where N(hp) is the number of sampling pairs {Si, Sj} whose separation distance di,j lies inside the pth interval. The semivariance function rises from zero to an asymptotic maximum known as the sill, σ²VARIO, and is reflected between zero and the sill in order to generate the corresponding covariance function, as displayed in Figure 2c. The covariance between any two sampling vectors can then be determined by substituting di,j or di,k into the covariance model. The kriging weights are then obtained as the solution of a linear system of equations in which the left-hand side consists of a matrix of {Si, Sj} covariances and the right-hand side consists of a vector of {Si, Sk} covariances. If the weights are forced to sum to unity, the linear system can be recast in a Lagrangian formulation as given by eq 6:

  [ λ(Sk)  ]   [ Cov(di,j)  1 ]⁻¹ [ Cov(di,k) ]
  [ λ′(Sk) ] = [     1      0 ]   [     1     ],   i, j = 1, ..., kcluster,   i ≠ j        (6)

where λ(Sk) and λ′(Sk) represent the weight vector and the Lagrange multiplier, respectively. Once the weights are obtained, the kriging prediction F(Sk) and variance σk²(Sk) are computed according to eqs 3 and 7, respectively:

  σk²(Sk) = σ²VARIO − Σ_{i=1}^{kcluster} λ(Si) Cov(di,k) − λ′(Sk)        (7)
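One prediction step (eqs 2, 3, 6, and 7) can be sketched as follows. This is a minimal sketch under stated assumptions: the covariance model is taken as already fitted, with a Gaussian covariance of sill `sigma2` and range `a` standing in for the semivariogram-derived function of eqs 4 and 5, and `kriging_predict` and its argument names are illustrative rather than part of the original method.

```python
import numpy as np

def kriging_predict(S, F, s_k, sigma2=1.0, a=1.0):
    """Ordinary-kriging prediction and variance at test point s_k (eqs 2, 3, 6, 7)."""
    cov = lambda d: sigma2 * np.exp(-(d / a) ** 2)    # assumed Gaussian covariance model
    n = len(S)
    d_ij = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)  # pairwise distances (eq 2)
    d_ik = np.linalg.norm(S - s_k, axis=1)                        # distances to s_k (eq 2)
    # Augmented Lagrangian system of eq 6: weights constrained to sum to unity.
    C = np.ones((n + 1, n + 1))
    C[:n, :n] = cov(d_ij)
    C[n, n] = 0.0
    D = np.append(cov(d_ik), 1.0)
    sol = np.linalg.solve(C, D)
    lam, mult = sol[:n], sol[n]                       # weights λ and multiplier λ′
    f_hat = lam @ F                                   # kriging predictor (eq 3)
    var = sigma2 - lam @ cov(d_ik) - mult             # prediction variance (eq 7)
    return f_hat, var
```

Because ordinary kriging interpolates, predicting at a sampled location returns the sampled value with near-zero variance, which is a convenient sanity check on any implementation of the weight system.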

Figure 3. Kriging model at initial (a), intermediate (b), and final (c) stages.

Kriging is again applied to generate a function estimate at a new test point Sk, and once all ktest predictions have been obtained, the global mapping is constructed. If additional sampling is performed, a new covariance function is generated. On the basis of the updated covariance function, new kriging estimates can be generated for all ktest sampling points, and a refined global model can be created. For each global model, its corresponding average predictor value µ is compared against its counterpart based on the previous model. Further refinement is terminated once the value of µ has converged. Let the iteration index m refer to any property based on the mth kriging model, and let TolKrig be a percentage stopping tolerance; a typical value of TolKrig lies between 1 and 10%. The mth average prediction value µm is defined as the average of the set of kriging predictions and sampled function values as given by eq 8:

  µm = [1 / (STot,m + ktest,m)] ( Σ_{i=1}^{STot,m} F(Si) + Σ_{r=1}^{ktest,m} Fr(Sk) )        (8)

where STot,m and ktest,m refer to the number of sampled function values and test points employed in constructing the mth global model, and Fr(Sk) denotes the kriging prediction at the rth test point. On the basis of this notation, STot,1 denotes the nominal sampling set used to create the first global model. The nominal value µ0 is obtained by averaging the sampled function values from STot,1. Once the mth global model has been constructed, the ratio µm/µm−1 is evaluated. If this ratio falls inside the interval 1 ± 0.01 TolKrig, the mth global model is considered accurate, and no additional updating occurs. Conversely, if µm/µm−1 falls outside 1 ± 0.01 TolKrig, another model is built using additional sampling data. To increase model reliability, sampling is performed in regions of high uncertainty characterized by high prediction variances and at points whose kriging prediction value has changed significantly between iterations.15 In order to emphasize global model improvement, an additional criterion is enforced whereby all new sampling points must reside some minimum distance apart from one another. This practice ensures that the new sampling set will not consist of clustered points located at a single high-variance region, which would overemphasize local model refinement. In order to achieve improved warm-start iterates for subsequent local optimization, the new sampling set also includes vectors whose kriging predictors yield the current best objective value estimates. At regions where locally optimal kriging predictions are obtained, refined grids are generated, and the corresponding set of new vectors is added to the test point set ktest. It should again be noted that if z2 is substituted in place of F in eqs 3–5 and 8, a kriging model with respect to z2 can be built in the same way. A set of global models is presented in Figure 3, which shows the kriging predictor stabilizing after it has been updated using the sampling rules.

The procedure for obtaining a prediction at Sk, referred to as the kriging algorithm, is summarized as follows. First, an iteration index m is initialized at unity. If the black-box model represents an intermediate process within a process train, output sampling data z2 obtained for some upstream processes may be required in order to fully define the feasibility constraints g(x, y1, y2, z1, z2) and h(x, y1, y2, z1). A nominal sampling set Ω is specified that can contain as few as 10 points even when the MINLP is described by as many as 40 input variables. As long as the starting size of Ω is not too small, the number of iterations required to achieve convergence in µm will be relatively insensitive to the number of sampling vectors in the nominal sampling set. However, the initial number of sampling points in Ω should be kept low in order to place emphasis on further sampling as needed during the iterative stages of predictor refinement. The location Sk is specified, and the kcluster nearest-neighbor sampling points, measured by the distance of eq 2, are chosen from Ω. The value of kcluster usually varies between 5 and 10 regardless of the problem dimension, although the estimate of F(Sk) may be skewed if sampling information is too sparse. Semivariances are then generated using all sampling data within Ω. The best semivariance model is fitted using least squares, and the complementary covariance function is then obtained. The matrices on the right-hand side of eq 6 are then constructed from the submatrices hcluster, h0, C, and D, as given by eq 9. The matrices C and D are augmented in order to remain consistent with the Lagrangian formulation of eq 6, in which the weights are required to sum to unity.

  hcluster = [ 0            d1,2         ···  d1,kcluster ]
             [ d2,1         0            ···  d2,kcluster ]  = [di,j],   i, j = 1, ..., kcluster
             [ ⋮                         ⋱    ⋮           ]
             [ dkcluster,1  dkcluster,2  ···  0           ]

  h0 = [ d1,k  d2,k  ···  dkcluster,k ]ᵀ = [di,k],   i = 1, ..., kcluster

  C = [ Cov(hcluster)  1 ]        D = [ Cov(h0) ]
      [       1        0 ]            [    1    ]        (9)
The kriging weights λ are then obtained by solving the system of equations presented in eq 6, and prediction of F(Sk) and its variance σk2(Sk) is determined. The weights are then recalculated for each of the remaining ktest sampling vectors in order to generate corresponding estimates F(Sk) at each new test point Sk. Once the global mapping has been constructed, µm is obtained from eq 8 and compared against µm-1. If convergence is not

6106 Ind. Eng. Chem. Res., Vol. 47, No. 16, 2008

Figure 4. Flowchart of the kriging algorithm, a data-driven global modeling method.

achieved, m is advanced by unity and additional sampling is performed based on the application of the sampling rules. A new covariance function is built, new kriging estimates F(Sk) are generated, and an updated mapping is built. The procedure is terminated once the value of µm has converged. The best local solutions are then identified for sequential local optimization of x and relaxed y2 using RSM and of y1 using direct search. The details of RSM are presented in the next subsection, and a flowchart of the kriging algorithm is shown in Figure 4.

2.3. Local Optimization Using RSM. Response surfaces describe local behavior in the form of low-dimensional polynomials fitted to a sparse data set, although accuracy cannot be guaranteed, especially in the case of noisy data. These models can then be optimized using standard gradient techniques. If additional sampling at the response surface optimum yields a new best solution, a new model is built that is centered around this iterate. Because integrality in the optimum cannot be guaranteed, response surfaces are built over the x - y2 space only, because fractional y1 values are infeasible. For noisy systems, least-squares fitting enables models to be built that more closely describe the true functionality instead of the noise, so that more reliable search directions can be identified. The set of input points for building the response surface typically conforms to a stencil arrangement known as an experimental design that is centered about an iterate, which can be any of the best kriging solutions.14,22 The central composite design (CCD) shown in Figure 5 for a problem in two dimensions requires 1 + 2n + 2^n sampling points for a problem of dimension n and is the template used at the RSM refinement stage of the proposed algorithm. This design is associated with a lower sampling expense relative to a factorial design because data are not obtained at every factor-level combination. If the problem dimensionality in the x and relaxed y2 variables is high, sequential optimization of a subset of the variables, while holding the remaining ones fixed, may be necessary in order to avoid a prohibitive sampling cost.

At the start of the algorithm, the iteration index w is initialized at a value of unity. A response surface is built around a starting iterate S0 by fitting sampling data obtained from a collocation set Scoll,w determined by the CCD and the local model radius bw. If RSM is applied to warm-start iterates found using kriging, S0 is initialized at the kriging optimal solution SK. The vector S0 and its corresponding objective value F0 comprise the nominal solution set {Sopt,w, Fopt,w}. Once the response surface has been created, the optimum Sopt,w+1, having corresponding value Fopt,w+1, is determined using gradient methods and then confirmed as a new best solution by sampling. If the difference between the current and previous optimum, |Fopt,w+1 - Fopt,w|, falls below a


Figure 5. CCD for response surface generation in two dimensions.

prespecified criterion TolRSM, the algorithm terminates with {Sopt,w+1, Fopt,w+1} established as the RSM solution. Otherwise, the iteration index is advanced by unity and another response surface with radius bw+1 is constructed at the new iterate Sopt,w+1. At any iteration w, the value of bw+1 differs from bw only if the Euclidean distance between the current and previous solution vectors is lower than the current radius bw. During the later stages of the algorithm, Sopt,w+1 will be near Sopt,w, signifying that the basin of the RSM optimum has been found. At this point, a more accurate description of the system behavior near the optimum can be attained using more localized response surfaces. Whenever iterates are close to the boundaries, lower-dimensional response surfaces are created by projecting the model onto the constraints, so as to prevent model generation based on an asymmetrical arrangement of the feasible sampling data.15 A flowchart of the algorithm is presented in Figure 6. The RSM optimal solution is denoted as F(SR) = F(xR, y1K, y2R). Once its value has been attained, the next step is to optimize the y1 variables using direct search, as presented in the next subsection.

2.4. Local Optimization Using Direct Search. Because direct search methods do not rely on derivative information in the search for an optimum, convergence can be slow. To address this, a global search algorithm is presented in addition to a local one. The local and global techniques are denoted as DS-L and DS-G, respectively. Both methods share two features: (1) search is performed based on simple heuristics, and (2) the y1 variables are optimized sequentially, one at a time. These methods are intended as starting points from which more sophisticated variations can be developed in future work. The local direct search method, DS-L, is presented first. One of the y1 variables, with lower and upper limits YLL and YUL, is selected for optimization.
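Before moving to direct search, the CCD construction and least-squares surface fit used in the RSM step above can be sketched as follows. This is a minimal illustration, not the authors' code; the helper names `ccd_points` and `fit_quadratic` are hypothetical, and a full quadratic model is assumed:

```python
import itertools
import numpy as np

def ccd_points(center, radius, alpha=1.0):
    """Central composite design: 1 center + 2^n factorial + 2n axial points."""
    center = np.asarray(center, dtype=float)
    n = center.size
    pts = [center]
    for corner in itertools.product([-1.0, 1.0], repeat=n):   # 2^n factorial points
        pts.append(center + radius * np.array(corner))
    for i in range(n):                                        # 2n axial points
        for s in (-alpha, alpha):
            p = center.copy()
            p[i] += radius * s
            pts.append(p)
    return np.array(pts)

def fit_quadratic(X, y):
    """Least-squares fit of a full quadratic model y ~ b0 + b'x + x'Qx."""
    n = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, i] for i in range(n)]
    cols += [X[:, i] * X[:, j] for i in range(n) for j in range(i, n)]
    A = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta
```

A gradient method (or any NLP solver) would then minimize the fitted quadratic within the current trust radius bw.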
The optimization iteration index m is initialized at unity. The current best solution Ym is initialized at a nominal sampling vector having corresponding objective F(Ym). If a warm-start iterate has been attained using kriging and RSM, F(Ym) is set to the RSM optimum F(SR). When y1 is multidimensional, F(Ym) is instead set to the objective value corresponding to the sampling vector containing the subset of previously optimized y1 variables. For ease of presentation, F is shown as a function of only one y1 variable because all other components of a multidimensional sampling vector are held constant. The DS-L algorithm consists of two steps: (1) sampling at points YmL and YmU, whose values are lower and higher than Ym, respectively, and (2) finding the best solution of these three values as given by the following:

F(Ym+1) = min{F(YmL), F(Ym), F(YmU)}    (10)

where Ym+1 is the optimal sampling point based on the set of objective function values attained at the bracket midpoint and end points. The iteration index m is advanced by unity, and the procedure is repeated until (1) YmL and YmU are found to be inferior solutions to Ym and (2) no unsampled points lie between

Figure 6. Flowchart of the RSM algorithm, a gradient-based technique for the optimization of continuous (x) and relaxed integer (y2) variables using surrogate models.

YmL, Ym, and YmU. At this point, the value of Ym is fixed at its optimal solution YD. The DS-L method is then applied to the next y1 variable. After the set of y1 variables has been optimized, the procedure is terminated. The method by which YmL and YmU are generated at each iteration m is based on the application of a simple heuristic targeted at accelerating convergence. There are three cases to be considered, according to whether Ym is (1) equal to YLL, (2) between YLL and YUL, or (3) equal to YUL. The equations used to generate the triplet {YmL, Ym, YmU} are presented for each case as follows:

Case L1: Ym is equal to YLL.

YmL = Ym;  Ym = Ym + Nm;  YmU = Ym + 2Nm    (11)

Case L2: Ym is between YLL and YUL.

YmL = Ym - Nm;  Ym = Ym;  YmU = Ym + Nm    (12)

Case L3: Ym is equal to YUL.

YmL = Ym - 2Nm;  Ym = Ym - Nm;  YmU = Ym    (13)

where Nm is an integer-valued step-length parameter that must be initialized. When the range of permissible values for a given y1 variable is high, initializing Nm at a value greater than unity enables larger steps to be taken toward the optimum. In the DS-L algorithm, Nm is set to be approximately 1/10 of the feasible region range for the respective y1 variable. As an example, in the second case study, the first y1 variable can assume values between 1 and 9, and so Nm is initialized at unity. In contrast, the second, third, and fourth y1 variables can assume integer values between 1 and 72, so Nm is initialized at 10. As the optimization progresses, the mth iterate Ym may be located near a boundary. On the basis of the current value of Nm, it is possible for the corresponding bracket end points YmL and YmU to be defined


at infeasible sampling points. To address this, a new value Nm+1 is prescribed at each iteration m in order to ensure a symmetrical, feasible sampling arrangement, as given by eq 14:

Nm+1 = min{Nm, Ym - YmL, YmU - Ym}    (14)

When the objective value corresponding to the bracket midpoint is found to be lower than the end-point solutions, the vicinity of the optimum has been located. The value of Nm+1 is then halved, as given by eq 15, in order to refine the optimum using a sequence of smaller brackets:

Nm+1 = ⌈0.5Nm⌉    (15)
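The DS-L bracketing rules (cases L1-L3 and the step-length updates of eqs 14 and 15) amount to a few lines of branching logic. The sketch below is illustrative rather than the authors' code; the function names are hypothetical, and rounding the halved step upward is an assumption made here to keep the step a positive integer:

```python
import math

def dsl_triplet(Ym, N, YLL, YUL):
    """Sampling triplet {YmL, Ym, YmU} per cases L1-L3 (eqs 11-13)."""
    if Ym == YLL:                        # case L1: step up from the lower limit
        return Ym, Ym + N, Ym + 2 * N
    if Ym == YUL:                        # case L3: step down from the upper limit
        return Ym - 2 * N, Ym - N, Ym
    return Ym - N, Ym, Ym + N            # case L2: symmetric bracket

def next_step(N, YmL, Ym, YmU, midpoint_is_best):
    """Step-length update for DS-L.

    eq 14 clamps the step so the next bracket stays symmetric and feasible;
    eq 15 halves it once the bracket midpoint is the best of the three points.
    """
    if midpoint_is_best:
        return max(1, math.ceil(0.5 * N))    # eq 15
    return min(N, Ym - YmL, YmU - Ym)        # eq 14
```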

Once the midpoint-end-point bracket length has fallen to unity and no unsampled points remain between YmL, Ym, and YmU, the DS-L algorithm is terminated at the optimal solution. The next y1 variable is selected for optimization, and the DS-L procedure is repeated until optimal values have been found for all y1 variables. The complete y1 optimal solution set is referred to as y1D. A flowchart of the methodology is presented in Figure 7. The main limitation of the DS-L algorithm is that the final solution may not be globally optimal. In order to avoid termination at a local solution, the DS-G algorithm can instead be applied. The DS-G method relies on global search and differs from its DS-L counterpart in the mechanism by which YmL, Ym, and YmU are defined. First, the iteration index m and the current solution Ym are initialized in the same way as for the DS-L method. Nominal bracket end points YmL and YmU are initialized at the feasible region boundaries YLL and YUL. The bracket midpoint YmC is defined between YLL and YUL, with the ceiling function applied to ensure integrality. After sampling is performed at the bracket end points and midpoint, the current solution F(Ym) is updated as given by eq 16:

F(Ym+1) = min{F(YmL), F(YmC), F(YmU), F(Ym)}    (16)

The next step is to define a new bracket over which the global optimum is expected to be found. If the nominal solution at Ym continues to be the best solution, the new bracket is defined by applying case G1 or G2, depending on whether Ym lies below or above the initial bracket midpoint YmC. Conversely, if the best solution is attained at the left end point, the midpoint, or the right end point, the new bracket is defined by applying case G3, G4, or G5, respectively.

Case G1: F(Ym+1) = F(Ym), YmL < Ym < YmC.

Ym+1L = Ym - min(YmC - Ym, Ym - YmL);  Ym+1C = Ym;  Ym+1U = Ym + min(YmC - Ym, Ym - YmL)    (17)

Case G2: F(Ym+1) = F(Ym), YmC < Ym < YmU.

Ym+1L = Ym - min(YmU - Ym, Ym - YmC);  Ym+1C = Ym;  Ym+1U = Ym + min(YmU - Ym, Ym - YmC)    (18)

Case G3: F(Ym+1) = F(YmL).

Ym+1L = YmL;  Ym+1C = ⌈0.5(YmL + YmC)⌉;  Ym+1U = YmC    (19)

Case G4: F(Ym+1) = F(YmC).

Ym+1L = ⌈0.5(YmL + YmC)⌉;  Ym+1C = YmC;  Ym+1U = ⌈0.5(YmC + YmU)⌉    (20)

Case G5: F(Ym+1) = F(YmU).

Ym+1L = YmC;  Ym+1C = ⌈0.5(YmC + YmU)⌉;  Ym+1U = YmU    (21)
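One DS-G bracket update (eqs 16-21) can be sketched as below. This is a simplified illustration, not the authors' code: the function name is hypothetical, and rounding midpoints upward with the ceiling function follows the integrality convention stated for YmC:

```python
import math

def dsg_bracket(YmL, YmC, YmU, Ym, F):
    """One DS-G bracket refinement (cases G1-G5).

    F maps an integer point to its sampled objective value.
    """
    vals = {Ym: F(Ym), YmL: F(YmL), YmC: F(YmC), YmU: F(YmU)}
    best = min(vals, key=vals.get)                       # eq 16
    if best == Ym and Ym not in (YmL, YmC, YmU):
        # cases G1/G2: recentre a symmetric bracket on the incumbent
        half = min(abs(YmC - Ym), abs(Ym - YmL)) if Ym < YmC \
               else min(abs(YmU - Ym), abs(Ym - YmC))
        return Ym - half, Ym, Ym + half
    if best == YmL:                                      # case G3: shrink left
        return YmL, math.ceil(0.5 * (YmL + YmC)), YmC
    if best == YmU:                                      # case G5: shrink right
        return YmC, math.ceil(0.5 * (YmC + YmU)), YmU
    # case G4: contract around the midpoint
    return math.ceil(0.5 * (YmL + YmC)), YmC, math.ceil(0.5 * (YmC + YmU))
```

Iterating this update until the bracket can shrink no further, and keeping the best sampled point along the way, mimics the DS-G loop of Figure 8.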

Once the new bracket has been generated, the iteration index m is advanced by unity and additional sampling is performed as necessary. The best solution F(Ym+1) is again determined based on the bracket end-point and midpoint solutions, and the procedure is repeated until convergence in the objective has been achieved. In order to increase the probability of finding the global optimum YD, a set of brackets can be defined at each new iteration, which collectively enclose both the optimal and near-optimal solutions. The additional sampling expense is offset by the increased chance of finding a solution that, at worst, is equivalent to the one obtained using the local method and, at best, is the global optimum. The procedure is then repeated for the remaining y1 variables, and the complete y1 optimal solution set is again given as y1D. A flowchart of the DS-G method is presented in Figure 8. The sequential application of kriging, RSM, and direct search enables optimal process design and operations policies to be determined for problems containing the variable classes x and y1. When y2 variables are present, the application of B&B enables integer optimal y2 values to be obtained because a continuous relaxation exists for this class of variables. In the next section, the B&B methodology is presented first. Following this, the four methods (B&B, kriging, RSM, and direct search) are unified into a comprehensive algorithm.

2.5. B&B-Kriging-RSM-Direct Search Algorithm. The B&B algorithm is used to bracket the integer optimal y2 solution objective between a converging sequence of lower and upper bounds (LB/UB). Each LB and UB corresponds to the solution attained for a partially relaxed NLP subproblem. At the start of the procedure, the initial LB and UB are set to -∞ and +∞, and the first partially relaxed NLP subproblem is formulated by relaxing all y2 variables.
The optimal solution to the NLP is classified as a LB if it is integer-infeasible in the y2 variables and as an UB otherwise. If integer feasibility is not met, two new NLP subproblems can be formulated that require integer feasibility for any or all of the fractional y2 variables. By applying the floor and ceiling functions to an integer-infeasible vector, two disjoint subregions are generated that define the feasible space for each new NLP. If the global solution has been attained for a partially relaxed NLP subproblem, a solution obtained over a reduced feasible region cannot be better than the solution attained for the parent NLP. Therefore, when the NLP solution has been designated as an UB, no additional subproblems are formulated. Conversely, when the solution is a LB, additional subproblems are created only if the LB is lower than the best UB. As the optimization progresses, a sequence of monotonically increasing LBs and monotonically decreasing UBs is generated that brackets the objective corresponding to the y2 integer optimal solution. The procedure is terminated once the list of candidate NLP subproblems is empty or the LB/UB integrality gap has fallen below a stopping tolerance TolBB. The integer optimal solution


Figure 7. Flowchart of a local direct search (DS-L) algorithm, a method for optimizing a single strictly integer (y1) variable.

Figure 8. Flowchart of a global direct search (DS-G) algorithm, a method for optimizing a single strictly integer (y1) variable.

corresponds to the best UB and is designated as the solution to the original MINLP. Each of the B&B, kriging, RSM, and direct search algorithms targets the optimization of a different variable class: y2, x - y1 - y2, x, and y1, respectively. The four algorithms are now combined into a comprehensive B&B-kriging-RSM-direct search algorithm addressing the MINLP formulated in eq 1. The first step is to formulate a partially relaxed NLP subproblem that has been relaxed in the y2 variables. A

nominal sampling set is specified, and the initial kriging model is built. The model is then iteratively refined by sampling at (1) regions of high uncertainty, (2) regions where the minimum value of the objective lies, and (3) regions where significant model change is observed over consecutive iterations, until a stopping criterion has been satisfied. The stopping criterion can be resource-based, such as an upper limit on the number of computer or field experiments performed, or model-based, such as convergence in the average prediction value. After the stopping criterion has been satisfied, the model predictor values are ranked and the locally optimal kriging solutions are identified. A kriging solution SK = (x, y1, y2)K is then selected to be a "warm-start" iterate for local optimization using RSM. The RSM iteration index w is initialized at unity, and the nominal solution SK is denoted as Sopt,w. The y1 components of Sopt,w are held fixed, and a response surface centered at Sopt,w and having radius bw is built and minimized. The new solution is given as Sopt,w+1, updated bounds bw+1 for the next response surface are obtained, the iteration index is advanced by unity, and the procedure is repeated until convergence in the objective has been achieved. The x - y2 components of the RSM optimal solution SR = (xR, y1K, y2R) are then fixed, and the y1 variables are optimized using the local or global direct search methods presented in Figures 7 and 8, respectively. Once convergence in the objective has been attained, the solution is given as SD = (xR, y1D, y2R), and the next kriging solution SK is then locally optimized. Once all kriging solutions have been refined, the partially relaxed NLP solution S^opt_NLP is designated as the best SD solution attained, which has corresponding objective value F^opt_NLP.
Because of resource constraints, refinement of multiple local kriging solutions may need to be restricted to a subset of the warm-start iterates. For the two examples presented in section 3, the best kriging solution is the only warm-start iterate that is further refined using RSM/direct search. The final step is to optimize the y2 variables using B&B in order to obtain the integer optimal solution in y2. If integer


feasibility is not satisfied for the y2 variables, F^opt_NLP is classified as a LB and new NLP subproblems are formulated by enforcing integrality on a subset of the y2 variables. Kriging, RSM, and direct search are applied to each new NLP subproblem, and additional LBs/UBs are established depending on whether the corresponding solution is integer-feasible in y2. New partially relaxed NLP subproblems are created when a solution designated as a LB is superior to the current best UB. Once the list of candidate NLP subproblems is exhausted, or another stopping criterion has been achieved, such as the LB/UB integrality gap falling below the stopping tolerance TolBB, the procedure is terminated. The integer optimal solution vector S^opt_MINLP and objective F^opt_MINLP are then defined by the subproblem for which the lowest UB solution has been attained. The proposed methodology can be used to determine the problem's optimal solution when uncertainty is present in both process input and output variables. If a noisy output variable of the mth unit Um is also an input variable of the next downstream unit Um+1, the range of possible values for this output variable also describes the parameter uncertainty for the corresponding input variable of the (m + 1)th unit. As an example, the noisy distillate and bottoms flow rates in the first case study are considered to be uncertain input parameters in the form of feeds to subsequent columns in the separation system. It is recognized that the local (DS-L) and global (DS-G) direct search methods for optimizing the y1 variables rely on a simple midpoint evaluation to drive search toward an optimum.

Figure 9. Flowchart of the detailed B&B-kriging-RSM-direct search algorithm.
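The B&B loop over the relaxed y2 variables can be sketched structurally as below. This is a skeleton rather than the authors' implementation: `solve_relaxed_nlp` stands in for the kriging/RSM/direct-search solution of one partially relaxed NLP subproblem, the nominal bounds are placeholders, and branching on the first fractional component is one simple choice among many:

```python
import math

def branch_and_bound(solve_relaxed_nlp, y2_dim, tol=1e-3):
    """Skeleton B&B over the relaxed y2 variables.

    solve_relaxed_nlp(bounds) returns (F, y2) for one subproblem, where
    bounds is a list of (lower, upper) pairs restricting each y2 component.
    """
    root = [(0.0, math.inf)] * y2_dim            # nominal y2 bounds (placeholder)
    stack, best_ub, best_sol = [root], math.inf, None
    while stack:
        bounds = stack.pop()
        F, y2 = solve_relaxed_nlp(bounds)
        frac = [i for i, v in enumerate(y2) if abs(v - round(v)) > 1e-6]
        if not frac:                              # integer-feasible: an upper bound
            if F < best_ub:
                best_ub, best_sol = F, y2
        elif F < best_ub - tol:                   # lower bound worth branching on
            i, v = frac[0], y2[frac[0]]
            lo = [list(b) for b in bounds]
            hi = [list(b) for b in bounds]
            lo[i][1] = math.floor(v)              # child with y2_i <= floor(v)
            hi[i][0] = math.ceil(v)               # child with y2_i >= ceil(v)
            stack += [lo, hi]
    return best_ub, best_sol
```

The fathoming rule (`F < best_ub - tol`) discards any lower bound that cannot improve on the incumbent, mirroring the LB/UB bracketing described above.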

Sampling at the respective midpoints of a set of the best subintervals, rather than of the single best subinterval, is still a very simple optimization strategy, and convergence to a global optimum cannot be guaranteed. If the optimization of the continuous and/or integer variables x and y1 is performed by sequentially optimizing a subset of the variables at a time, the discovery of a local solution in a subset that is optimized early prevents the global solution from being attained in the remaining set of variables. However, the DS-L and DS-G strategies for the optimization of integer variables are intended to serve as simple starting methods in place of more effective techniques to be applied within the framework of the unified algorithm. Other pattern search and evolutionary techniques have been tested for the solution of mixed-integer problems, but their promise is also rather limited, by either premature convergence at a local optimum or a prohibitive resource cost to find the global solution.23-25 The proposed methodology of the unified algorithm is shown in the flowchart given in Figure 9. In the next section, this algorithm is applied to two industrial case studies to demonstrate proof of concept.

3. Examples

In this section, the performance of the proposed approach is evaluated based on its application to two industrial case studies. In the first example, presented in subsection 3.1, the objective is to determine process synthesis and design specifications while minimizing the monthly operating costs for tert-butyl methacrylate (t-BMA) production.26 In the second example, presented in subsection 3.2, the objective is to determine optimal design specifications for alcohol dehydrogenase (ADH) manufacture.27 If kriging, RSM, and local direct search are employed to determine the optimal solution, the algorithmic designation K-R-L is used. Similarly, if kriging, RSM, and global direct search are used, the designation K-R-G is employed. For each example, a table of solution information is provided based on application of both the K-R-L and K-R-G algorithms. Because the first problem includes synthesis variables, B&B is also used to obtain the integer optimal y2 variables. All results are obtained using an HP dv8000 CTO Notebook PC containing a 1.8 GHz AMD Turion 64 processor.

Figure 10. Superstructure of the t-BMA separation train.

3.1. Example 1: tert-BMA Synthesis. t-BMA is a monomer used in the production of industrial and household coatings. In this example, the objective is to minimize the total cost required to satisfy a monthly demand of 457 874 kg of 97.5% t-BMA.

The total cost is the sum of in-house production costs and a third-party purchase at $1.10/kg of t-BMA. In-house production costs are defined as the sum of raw material (RM) and utility costs. The RMs used in t-BMA production are isobutylene (IB) and methacrylic acid (MA), having corresponding costs of $0.17/kg and $0.88/kg, respectively. The reaction product is a four-component mixture that enters a three-cut separation train. The utility cost components of the production costs are defined as the cooling water, chilled water, refrigerant, and steam costs required for the separation train. The problem is formulated as an MINLP that contains synthesis decisions describing the distillation tower sequencing for t-BMA purification and design decisions representing the amount of fresh RM feed, column reflux, reboiler temperature, and column feed tray location. The continuous process is initiated by feeding IB and MA to a 6.7 m3 reactor pressurized at 3.426 bar, in which the following reactions occur: IB + MA ⇌ t-BMA and IB → DIB. The first reaction describes the reversible acid-catalyzed production of t-BMA, and the second reaction represents IB

Table 1. Separation Train Task List for the t-BMA Case Study Presented in Example 1

train | task 1 | task 2 | task 3
1 | DIB distillation using an 18-tray column | IB flash from the task 1 distillate | t-BMA distillation from task 1 bottoms using a 40-tray column
2 | DIB distillation using a 40-tray column | IB flash from the task 1 distillate | t-BMA distillation from task 1 bottoms using an 18-tray column
3 | t-BMA distillation using an 18-tray column | IB flash from the task 1 distillate | DIB distillation from task 2 bottoms using a 40-tray column
4 | t-BMA distillation using an 18-tray column | DIB distillation from the task 1 distillate using a 40-tray column | IB flash from task 2 bottoms
5 | t-BMA distillation using a 40-tray column | IB flash from the task 1 distillate | DIB distillation from task 2 bottoms using an 18-tray column
6 | t-BMA distillation using a 40-tray column | DIB distillation from the task 1 distillate using an 18-tray column | IB flash from task 2 bottoms

Table 2. Synthesis and Design Decisions for the t-BMA Case Study Presented in Example 1

variable | type | description
x1 | continuous | % fresh feed from storage
{x2, ..., x11} | continuous, design | reboiler temperature (columns 1-10)
{x12, ..., x21} | continuous, design | reflux (columns 1-10)
{y1,1, ..., y1,10} | integer, design | feed tray location (columns 1-10)
y2,1 | integer, synthesis | presence/absence of separation train 1
y2,2 | integer, synthesis | presence/absence of separation train 2
y2,3 | integer, synthesis | presence/absence of either one of separation trains 3 or 4
y2,4 | integer, synthesis | presence/absence of separation train 3
y2,5 | integer, synthesis | presence/absence of separation train 4
y2,6 | integer, synthesis | presence/absence of either one of separation trains 5 or 6
y2,7 | integer, synthesis | presence/absence of separation train 5
y2,8 | integer, synthesis | presence/absence of separation train 6
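As an illustration only (the names and values below are not from the paper), the decision vector of Table 2 might be encoded as a simple structure separating the continuous (x), strictly integer (y1), and relaxable binary (y2) classes:

```python
# Illustrative encoding of the Table 2 decision vector; names and values are
# hypothetical placeholders, not the authors' data structures.
decision = {
    "x": {
        "fresh_feed_pct": 0.5,          # x1: % fresh feed from storage
        "reboiler_T": [320.0] * 10,     # x2..x11: reboiler temperatures, K
        "reflux": [1.5] * 10,           # x12..x21: column reflux ratios
    },
    "y1": {"feed_tray": [9] * 10},      # strictly integer feed tray locations
    "y2": [1, 0, 0, 0, 0, 0, 0, 0],     # synthesis binaries y2,1 .. y2,8
}
```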

Table 3. Cost and Thermophysical Data for the t-BMA Case Study Presented in Example 1

i | cost of i [$/unit] | unit | ∆Hi [kJ/kg] | Tb [K] | ρi [kg/m3] | Cp,i [kJ/(kg K)] | ∆Ti [K]
IB | 16.98 | 100 kg | - | - | - | - | -
MA | 88.19 | 100 kg | - | - | - | - | -
H2SO4 | 2.65 | 100 kg | - | - | - | - | -
NaOH | 3.3 | 100 kg | - | - | - | - | -
t-BMA | 110 | 100 kg | - | - | - | - | -
saturated steam | 2.2 | 1000 kg | 2000 | 457.16 | - | - | -
cooling water | 0.26 | 10 000 L | - | 293.15 | 1000 | 4.84 | 20
chilled water | 0.52 | 10 000 L | - | 277.6 | 1000 | 4.184 | 20
refrigerant | 5.28 | 10 000 L | - | 177.6 | 683 | 1.74 | 20
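As a sketch of how the utility terms of the cost model consume these data (V = ΣQ/(ρ·Cp·∆T) for the cooling utilities and M = ΣQ/∆H for steam), the snippet below hard-codes the Table 3 properties; note that the volumes computed here are in m³, whereas the prices apply per 10 000 L, so costing would need a unit conversion:

```python
# Property values copied from Table 3 (densities in kg/m^3, Cp in kJ/(kg K)).
RHO = {"CW": 1000.0, "ChW": 1000.0, "Ref": 683.0}
CP = {"CW": 4.84, "ChW": 4.184, "Ref": 1.74}
DT = {"CW": 20.0, "ChW": 20.0, "Ref": 20.0}
DH_STM = 2000.0  # latent heat of saturated steam, kJ/kg

def utility_volume(utility, duties_kj):
    """Volume of cooling utility (m^3) needed to absorb the listed duties (kJ)."""
    return sum(duties_kj) / (RHO[utility] * CP[utility] * DT[utility])

def steam_mass(duties_kj):
    """Mass of saturated steam (kg) needed to supply the listed duties (kJ)."""
    return sum(duties_kj) / DH_STM
```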

dimerization resulting in diisobutylene (DIB) formation, an unwanted side product. The conversion achieved is approximately 45%, after which a caustic 50% NaOH solution is fed to the reactor to neutralize the H2SO4 catalyst. The principal reaction product consists of IB, DIB, t-BMA, and MA and is flashed at 1.01 bar to remove the majority of the IB. The remaining process operations involve further t-BMA purification from IB, DIB, and MA using distillation or flash operations. At the end of the purification train, DIB exits as a waste product, t-BMA is either sold or used elsewhere in the plant, and IB and MA are recycled back to the reactor. The problem is solved using CHEMCAD simulation-based optimization with the following options: (1) a UNIFAC thermodynamic model, (2) a latent heat global enthalpy model, (3) TOWR distillation models, which simulate rigorous tray-by-tray calculations using the inside-out algorithm, (4) noisy column distillate and bottoms flow rates to simulate nonideal process operation, (5) reduction in the amount of IB/MA available for RM purchase, thereby limiting maximum production, (6) application of two towers and a flash drum for the separation cascade instead of three towers, and (7) optimization of the column feed tray location. The process behaviors of the reactor, flash drum, and rigorous distillation column simulation units are considered to be black-box legacy codes because the design models are not directly

accessible using CHEMCAD. A process superstructure that contains six t-BMA purification sequences is presented in Figure 10. The notation R/β is used to denote the task of generating a distillate R using a tower consisting of β trays. The synthesis and design input specifications are provided to the simulator. The synthesis decisions are given in terms of a vector of 0-1 variables, specifying the sequencing of the IB/ DIB, DIB/t-BMA, and t-BMA/MA separation tasks. Design decisions are provided in terms of continuous variables describing the amount of IB/MA fed to the reactor, column reflux, column reboiler temperatures, and integer variables denoting the column feed tray location. For clarity, Table 1 lists the tasks for each of the six separation trains corresponding to the superstructure given in Figure 10 and Table 2 lists the set of synthesis and design variables. Once the process utilities and product flow rates have been obtained, the operating cost is then determined. There is a tradeoff between t-BMA production and the cost of MA/IB purchase. Because the RMs are expensive, a reduced production of t-BMA may be more economical. However, the manufacturing deficiency must then be supplemented by the purchase of additional t-BMA in order to meet the required monthly demand. The objective function is therefore given as the sum of RMs (including t-BMA) and utility costs. The MINLP is formulated

Table 4. Performance of the B&B-Kriging-RSM-Direct Search Algorithms for Example 1: No Noise

algorithm | sampling points for the kriging model (initial, final) | optimum monthly cost F^opt_MINLP ($) | % trials attaining F^opt_MINLP | simulations required | simulation CPU time (s)
K-R-L | 10, 18 | 299 264 | 94 | 150 | 3658
K-R-L | 15, 23 | 299 530 | 84 | 154 | 3224
K-R-L | 20, 28 | 299 017 | 96 | 197 | 4580
K-R-L | 30, 38 | 299 428 | 81 | 224 | 5508
K-R-G | 10, 18 | 299 568 | 94 | 142 | 3503
K-R-G | 15, 23 | 299 314 | 88 | 171 | 3182
K-R-G | 20, 28 | 299 314 | 84 | 159 | 4081
K-R-G | 30, 38 | 299 455 | 100 | 188 | 5514

as shown in eq 22, and the accompanying cost and thermophysical data are presented in Table 3:

min F = 0.17FIB + 0.88FMA + 0.0265FH2SO4 + 0.033FNaOH + 1.10Ft-BMA + 0.26VCW + 0.52VChW + 5.28VRef + 2.2MStm

s.t.
FIB + FMA + FH2SO4 + FNaOH = F0
F1 - F2 - F3 - F4 - F5 = 0
F4 - F6 - F7 = 0
F5 - F8 - F9 = 0
Fi - F0yi ≤ 0, i = 2, ..., 9
Fi ≥ 0
y2,2 + y2,3 + y2,4 + y2,5 = 1
y2,4 - y2,6 = 0
y2,4 - y2,7 = 0
y2,5 - y2,8 = 0
y2,5 - y2,9 = 0
VCW = ∑QCW/(ρCW Cp,CW ∆TCW)
VChW = ∑QChW/(ρChW Cp,ChW ∆TChW)
VRef = ∑QRef/(ρRef Cp,Ref ∆TRef)
MStm = ∑QStm/∆HStm
Djk = Γ(RRk, TBot,k, QCW,k, QChW,k, QRef,k, QStm,k, Fjk)
Bjk = Γ(RRk, TBot,k, QCW,k, QChW,k, QRef,k, QStm,k, Fjk)
Djk^noisy = (1 - σU(0,1))Djk + σU(0,1)Bjk
Bjk^noisy = (1 - σU(0,1))Bjk + σU(0,1)Djk
j = {IB, DIB, t-BMA, MA}
k = {DIB/18, DIB/40, t-BMA/18, t-BMA/40}
y2,i ∈ {0, 1}, i = 2, ..., 9
σ = 0
0.3 ≤ ω ≤ 0.7
0.2 ≤ RRk ≤ 30
290 ≤ TBot,k ≤ 351    (22)

The first four terms in the objective function represent RM costs related to production. The fifth term represents the cost required for additional t-BMA purchase whenever the manufacturing scheme fails to satisfy the monthly demand over the 720-h operating period. The last four terms of the objective describe utility expenses, comprising the cooling water, chilled water, refrigerant, and steam costs, respectively. The binary variables y2,i describe column or flash unit existence with respect to Figure 10, and the variables Fi describe corresponding

Table 4 (continued). Modeling and optimization CPU time (s) for each row of Table 4, by algorithm stage (K = kriging, R = RSM, L/G = direct search):

algorithm (initial, final points) | K | R | L/G
K-R-L (10, 18) | 35 | 12 | 3
K-R-L (15, 23) | 38 | 14 | 4
K-R-L (20, 28) | 42 | 13 | 3
K-R-L (30, 38) | 46 | 13 | 3
K-R-G (10, 18) | 33 | 11 | 3
K-R-G (15, 23) | 40 | 12 | 3
K-R-G (20, 28) | 41 | 13 | 4
K-R-G (30, 38) | 48 | 13 | 3

feed flow rates. The first equation is a RM mass balance, the next 10 equations describe the feasible t-BMA separation syntheses, and the subsequent four equations describe the required utility consumption. The next two equations represent the distillate and bottoms flow rates from separation towers whose model equations are treated as a black box. The final two equations model the way in which the noisy distillate and bottoms flow rates Djk^noisy and Bjk^noisy are obtained. The operating ranges for column reflux and bottoms product temperatures are given with respect to the range of values allowed for all possible separations. The parameter ω is a scaling factor used to restrict the amount of MA/IB fed to the reactor with respect to the specified values in the original study. This parameter simulates the scenario in which RM availability is limited, and it complicates the problem by introducing the possibility that additional t-BMA may need to be expensively purchased if the reaction product throughput falls short of the demand. The kriging model is built to describe the objective function behavior in terms of the synthesis and design inputs. This global model is then used to guide search toward candidate vectors whose manufacturing schemes have lower utility requirements. B&B is used to determine the optimal t-BMA separation sequence from IB/DIB/MA, RSM is used to determine the optimal amount of fresh IB/MA reactor feed, column reflux, and reboiler temperatures, and direct search (DS) is used to obtain the optimal column feed tray locations. For a given set of process input conditions, the computational time required for a CHEMCAD simulation is long in comparison to the corresponding time required for modeling and optimization. Therefore, the goal is to attain the lowest value of the objective, given as F^opt_MINLP, using as few CHEMCAD simulations as possible.
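The noise model of eq 22, which blends each column's distillate and bottoms flows through a uniform draw scaled by σ, can be sketched as follows. Whether both streams share a single uniform draw is not specified in eq 22; independent draws are assumed here:

```python
import random

def add_noise(D, B, sigma):
    """Noisy distillate/bottoms flows per eq 22.

    Each stream is blended with the other using a U(0,1) draw scaled by
    sigma; sigma = 0 recovers the deterministic (noise-free) case.
    Independent draws for the two streams are an assumption.
    """
    u1, u2 = random.random(), random.random()
    D_noisy = (1.0 - sigma * u1) * D + sigma * u1 * B
    B_noisy = (1.0 - sigma * u2) * B + sigma * u2 * D
    return D_noisy, B_noisy
```

Because each perturbed flow is a convex combination of D and B, the noisy values always remain between the two deterministic flows.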
The optimization problem is first solved for the deterministic case, in which the column distillate and bottoms flow rates are not perturbed by noise. Four sets of experiments are performed for each of the K-R-L and K-R-G algorithms in order to determine whether an increase in the number of sampling points used to build the initial kriging models accelerates the discovery of the best local optimum. Results for the eight experiments are presented in Table 4. The first column contains the sequence of global and local algorithms used in combination with B&B to find the optimal monthly cost F^opt_MINLP. The second column contains the initial and total numbers of sampling points used to build the kriging predictor. The third and fourth columns contain the expected monthly cost and the rate of successful convergence to the optimum, respectively. The success rate is based on the application of the K-R-L or K-R-G algorithm, in combination with B&B, over 50 different trials. For each trial, a unique sampling set is employed to build a nominal kriging predictor of the relaxed NLP objective at the root node of the B&B tree. The fifth and sixth columns contain the average number of simulations and the corresponding simulation CPU time. The seventh, eighth, and ninth columns contain the CPU cost

6114 Ind. Eng. Chem. Res., Vol. 47, No. 16, 2008

Table 5. Performance of the B&B-Kriging-RSM-Direct Search Algorithms for Example 1: With Noise

                                                                                                              modeling and optimization
            no. of sampling points     optimum monthly      % trials for which                  simulation    CPU time (s)
            for the kriging model      cost F^opt_MINLP     F^opt_MINLP is      simulations    CPU time
algorithm   (initial, final)           ($)                  attained            required       (s)            K     R     L/G
K-R-L       10, 18                     304 318              80                  188            4737           34    2     8
K-R-L       15, 23                     304 026              84                  161            3016           39    2     9
K-R-L       20, 28                     304 479              84                  184            3373           33    1     7
K-R-L       30, 38                     303 968              84                  235            3884           50    2     9
K-R-G       10, 18                     304 453              72                  157            2732           33    2     9
K-R-G       15, 23                     304 030              88                  213            3799           36    2     8
K-R-G       20, 28                     303 883              88                  202            3593           37    2     8
K-R-G       30, 38                     303 334              72                  278            4943           44    2     9

associated with the application of the kriging, RSM, and local or global direct search algorithms. It is observed that an increase in the number of sampling points used to build the nominal kriging predictor does not always increase the chances of finding the optimum. When the K-R-L algorithm is applied, the optimum is found in 84% of the cases when 15 points are used to build the initial global model, compared with 96% of the cases when only 10 points are used. Similar results are observed for the performance of the K-R-G algorithm; the optimum is found in 94% and 88% of the cases when 10 and 15 sampling points are used, respectively. When the direct search algorithms are compared in terms of the number of sampling points used to build the initial global model, the global search algorithm, DS-G, slightly outperforms its local search counterpart, DS-L, for a subset of the experiments, in terms of the number of simulations required. This behavior is observed when 10, 20, or 30 sampling points are used to build the nominal kriging predictor, suggesting that the global search feature of the DS-G algorithm results in faster discovery of the optimum. Consider a global model built using 10 sampling points: the optimum is found after an average of 142 simulations when global search is applied, in contrast to 150 simulations required by local search. The monthly cost reported in Table 1 represents the total cost required to satisfy the monthly demand of 457 874 kg of t-BMA. Based on the corresponding optimal synthesis and operating conditions, a third-party purchase of t-BMA is not required. The in-house production cost of t-BMA is $0.65/kg, a 41% savings compared to the third-party cost of $1.10/kg.

The B&B, kriging, RSM, and direct search algorithms are again applied sequentially to the problem for varying nominal sampling set sizes, the only change being that process noise has now been introduced into the reactor, flash unit, and distillation columns. The value of σ is set at 0.05. The unified B&B-kriging-RSM-direct search algorithms are applied under the same eight conditions given in Table 2, in which the optimal y1 is determined by local or global search and the number of sampling points used in building the nominal kriging predictor varies between 10 and 30. The algorithm is applied to 25 different sampling sets for each of the eight conditions. Results are presented in Table 5. The value of the optimal objective F^opt_MINLP is slightly inferior to that obtained when no noise is present, although the resulting t-BMA production cost of $0.664/kg still represents a 39.7% savings in comparison to the third-party price. Successful convergence to the optimum is generally observed in fewer cases when noise is present, regardless of whether local or global direct search is used to determine the optimal y1. However, the comparison is not entirely fair because (1) the results for the noisy conditions are based on 25 trials instead of 50 and (2) for any given sampling size, a nominal sampling set employed under noisy conditions
might be completely different from the sampling sets used when noise was absent. The addition of noise for a given sampling set can lead to convergence problems for downstream units in the simulator, so new nominal sampling sets must be employed when this problem occurs. The number of simulations required to achieve convergence of the objective to F^opt_MINLP is also generally higher under noisy conditions. Because the number of sampling points employed for global modeling is the same in each of the eight conditions, the additional sampling costs are attributed to local optimization. This suggests, in turn, that the objective values corresponding to the deterministic kriging solutions SK are lower than those found under the stochastic conditions, indicating that the global geometry is modeled less accurately when noise is present. For the K-R-L algorithms, as the nominal sampling size increases, a modest 4% improvement in the attainment of F^opt_MINLP is observed. For the corresponding K-R-G algorithm, a 16% improvement is obtained for sampling sizes of 15 and 20 points, although the success rate obtained using 10 points is only 72%. Surprisingly, when the K-R-G method is applied using 30 points, the success rate decreases to 72%. Because the nominal sampling sets are randomly generated in all cases, this suggests the presence of sampling data that contribute redundant information to the kriging model, and that the success rate could be improved by a better spatial arrangement of the sampling points.
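One standard way to obtain a better spatial arrangement than purely random selection is Latin hypercube sampling, in which each dimension is stratified and every stratum is sampled exactly once. The sketch below is illustrative only and is not the sampling scheme used in this work:

```python
import random

def latin_hypercube(n_points, n_dims, rng=random):
    """Simple Latin hypercube sample on [0, 1]^d: each dimension is split
    into n_points equal strata, and each stratum is visited exactly once."""
    # one independent random permutation of the strata per dimension
    perms = []
    for _ in range(n_dims):
        p = list(range(n_points))
        rng.shuffle(p)
        perms.append(p)
    # place one point uniformly at random inside its assigned stratum
    return [[(perms[d][i] + rng.random()) / n_points for d in range(n_dims)]
            for i in range(n_points)]

random.seed(0)
pts = latin_hypercube(10, 3)
```

Unlike independent uniform sampling, no two points can fall into the same stratum of any coordinate, which spreads the design over the feasible region and reduces redundant information in the nominal model.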
In summary, the important results for this example are as follows: (1) the unified B&B-kriging-RSM-direct search method does not guarantee discovery of the best local optimum, because only a percentage of the trials in each of the experiments succeed in finding F^opt_MINLP; (2) an increase in the number of sampling points used to build the initial global model fails to consistently accelerate the search toward F^opt_MINLP, suggesting that the simple sampling scheme of randomly selecting dispersed feasible points is not very efficient; and (3) improved criteria for determining global model accuracy must be developed, because the success rate in attaining F^opt_MINLP is approximately 10% lower than that for the deterministic case. Regarding the second point, a future publication will focus on applying analytical kriging predictors within a reverse engineering framework in order to determine the optimal number and locations of the sampling vectors. The closed-form kriging predictors may also be helpful in developing model reliability criteria superior to the current practice of attaining convergence in the mean value of the global model.

3.2. Example 2: ADH Purification. The second example is taken from the biochemical engineering literature.27 ADH is an enzyme that is used to convert alcohols to aldehydes and ketones. Because ADH is a naturally occurring chemical found within Saccharomyces cerevisiae cells, it can be produced on a commercial scale via industrial continuous or fed-batch fermentation. However, the primary difficulty is ADH acquisition and purification, because ADH must first be released from the


Figure 11. Detailed homogenizer schematic.

cells and then separated from the cellular debris. ADH is temperature-sensitive and degrades above temperatures of 40 °F. There are four main process operations to be considered in ADH production, which are also typical steps for a standard enzyme process: fermentation, homogenization, centrifugation, and precipitation. In this case study, S. cerevisiae cells are grown in a fermenter using fed-batch operation. The cells are then disrupted by homogenization, an operation that consists of passing the cells through a blender in order to disrupt the cell walls, thereby releasing the intracellular components. In the centrifugation step, the ADH is separated from an intracellular protein contaminant. The ADH leaves in the centrifuge supernatant, and the residual protein is then precipitated using ammonium sulfate. In order to improve ADH purity and yield, the centrifugation and precipitation operations are repeated. Detailed homogenizer and centrifuge schematics are presented in Figures 11 and 12, followed by a flowsheet of the complete process in Figure 13. Seven unit operations are employed in the process; the variables F, H, Ci, and Pi designate the fermentation, homogenization, ith centrifugation, and ith precipitation operations, respectively. The process behavior of each unit operation is considered to be noisy, and the model equations are assumed to be inaccessible. A test problem is formulated whereby the objective is to maximize ADH purity and yield while minimizing process operating costs. There are seven continuous variables and four integer variables. The continuous variables, with the respective process operation denoted in parentheses, are (1) glucose concentration (F), (2) pressure (H), (3) disk angle (C1, C2, and C3), and (4) precipitant concentration (P1 and P2). The integer variables are (1) the number of passes (H) and (2) the number of disks (C1, C2, and C3).
The problem formulation is given in eq 23:

max F = 2z1 + 2z2 − y1,1 − y1,2 − y1,3 − y1,4
s.t. zF = Γ1[x1, N(0, σ²)]
     zH = Γ2[x2, y1,1, N(0, σ²)]
     zC1 = Γ3[x3, y1,2, N(0, σ²)]
     zP1 = Γ4[x4, N(0, σ²)]
     zC2 = Γ5[x5, y1,3, N(0, σ²)]
     zP2 = Γ6[x6, N(0, σ²)]
     zC3 = Γ7[x7, y1,4, N(0, σ²)]
     {z1, z2} = Γ7[x7, y1,4, N(0, σ²)]
     1 ≤ y1,1 ≤ 9
     1 ≤ {y1,2, y1,3, y1,4} ≤ 72
     σ = 0.03                                                (23)
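The nested, sequential structure of eq 23 can be mimicked with a toy simulator in which hypothetical stand-ins replace the black-box unit models Γ1–Γ7; only the noise level σ = 0.03 and the form of the objective are taken from eq 23, while the stage dynamics below are invented for illustration:

```python
import random

random.seed(0)
SIGMA = 0.03  # noise level from eq 23

def gamma_stage(x, y=0.0):
    """Hypothetical stand-in for a black-box unit model Gamma_i: a smooth
    response in the continuous input, a weak integer effect, and noise."""
    return x + 0.1 * y + random.gauss(0.0, SIGMA)

def objective(x, y):
    """Evaluate F = 2*z1 + 2*z2 - y11 - y12 - y13 - y14 by propagating
    the state through the seven stages (toy dynamics, not the real units)."""
    zF  = gamma_stage(x[0])                 # fermentation
    zH  = gamma_stage(x[1] + zF, y[0])      # homogenization
    zC1 = gamma_stage(x[2] + zH, y[1])      # first centrifugation
    zP1 = gamma_stage(x[3] + zC1)           # first precipitation
    zC2 = gamma_stage(x[4] + zP1, y[2])     # second centrifugation
    zP2 = gamma_stage(x[5] + zC2)           # second precipitation
    z1 = gamma_stage(x[6] + zP2, y[3])      # final purity
    z2 = gamma_stage(x[6] + zP2, y[3])      # final recovery
    return 2*z1 + 2*z2 - sum(y)

F = objective([1.0] * 7, [1, 2, 2, 2])
```

Because each stage feeds the next, the noise injected early in the train propagates downstream, which is exactly why the objective must be treated as a noisy black-box output.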

Figure 12. Detailed centrifuge schematic.

Because each unit operation is considered to be a black box, this problem was solved using simulation data, because field experimental data were unavailable. The optimization of F is performed by first generating kriging models of the objective with respect to the continuous and integer inputs. The objective is treated as an output variable because it is evaluated once the outputs zF, zH, zCi, and zPi have been obtained from a simulation of the process at a unique set of input conditions. Each simulation is performed using a computational code containing a set of closed-form equations corresponding to each stage of the ADH process. The models are presented in Appendix A simply to describe the process operation and are not directly optimized. When possible, the same models used by Groep et al.27 have been employed. In cases where the model equations require experimental parameter data, such as the amount of ADH present in the cells after fermentation, fitted equations matching the literature data27 have been employed. In Table 6, the optimal value of the objective function F, as given by eq 23, is reported based on the application of RSM and local direct search to the best kriging solution SK obtained from a partially refined model. A partially refined model is one for which further updating is terminated after an arbitrarily chosen limit of 30 objective function evaluations. The nominal kriging predictor is constructed from 15 sampling vectors, each consisting of the input points {x1, ..., x7, y1,1, ..., y1,4}. The nominal best solution F^opt_MINLP is 198 and is the optimal objective obtained based on the sampled data. The objective is improved by 11.9% after global model refinement. Once both RSM and the DS-L algorithm have been applied, the total improvement rises to 26.1%; the corresponding objective has a value of 244, attained after 83 additional function evaluations. A modest 7% improvement is observed after RSM optimization at a cost of 44 function evaluations.
This suggests that the "warm-start" iterate attained from the partially refined global model is still relatively far from the refined local solution and that further global refinement could lower the resource cost of the subsequent local optimization. A set of complementary results is presented in Table 7, in which global direct search is applied instead of its local counterpart. The optimal solution F^opt_MINLP has an objective value of 286; an additional 22.7% improvement is attained relative to the best solution using the K-R-L method, at the cost of only 11 additional function calls. Although this solution is superior to the corresponding objective found using local search, it cannot be confirmed as a global solution because of problem nonconvexity arising from the nonlinear and bilinear

Figure 13. Schematic of the process flow diagram for ADH production.

Table 6. Performance of the Kriging-RSM-Local Direct Search (K-R-L) Algorithm for Example 2, Based on Local Optimization of a Partially Refined Kriging Global Model

                                                  % improvement relative    simulations required
algorithm                          F^opt_MINLP    to nominal F^opt_MINLP    algorithm   total    CPU time (s)
none (nominal sampling set only)   198            —                         —           15       0.83
kriging                            217            11.9                      10          25       3.90
RSM                                230            18.7                      44          69       5.84
local direct search                244            26.1                      27          98       4.03
x - y1 variable terms that appear in the models found in Appendix A. In Tables 8 and 9, two complementary sets of results are presented in which local optimization is performed on the best kriging solution SK obtained from a fully refined kriging model. These results are obtained without applying any a priori resource restrictions on the amount of sampling directed at global model refinement. Instead, the stopping criterion applied is based on convergence in the average value of the kriging predictor, as described in Figure 4.
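The RSM step used in these local refinements can be illustrated in one dimension: sample the noisy objective around the incumbent point, fit a quadratic response surface, and step to its stationary point. This is only a minimal sketch; the RSM applied in this work is multivariate and more elaborate:

```python
import random

def rsm_step(f, x0, h=0.5):
    """One response-surface step: fit y = a*d^2 + b*d + c (d = x - x0)
    through three noisy samples and move to the vertex if it is a minimum."""
    xs = [x0 - h, x0, x0 + h]
    ys = [f(x) for x in xs]
    a = (ys[0] - 2*ys[1] + ys[2]) / (2*h*h)   # curvature from second difference
    b = (ys[2] - ys[0]) / (2*h)               # slope at x0 from central difference
    if a <= 0:                                # surface not convex here:
        return xs[ys.index(min(ys))]          # fall back to the best sample
    return x0 - b / (2*a)                     # vertex of the fitted quadratic

random.seed(0)

def noisy(x):
    # hypothetical noisy objective with true minimizer at x = 3
    return (x - 3.0)**2 + random.gauss(0.0, 0.01)

x1 = rsm_step(noisy, 0.0)
```

For a nearly quadratic objective one step lands close to the minimizer even under noise; iterating the step refines the solution further.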

Although the value of F^opt_MINLP rises to 269 when local search is employed, no increase is observed when global search is applied. These findings suggest that a better objective can be attained using local search based on a fully developed kriging model; conversely, the solution attained using global search does not require a completely refined global mapping. In summary, the important results for this example are the following: (1) iterative global model refinement leads to significantly improved warm-start iterates for local optimization, as shown in Table 10; (2) the percentage improvement attained in the objective differs when optimizing each of the continuous and integer variable sets, as shown in Tables 10 and 11, motivating the development of strategies for selecting the most efficient optimization protocol; and (3) the application of global direct search for the integer variables leads to significantly improved objective values in contrast to local direct search, as shown in Table 11. The required computational times for the results presented in Tables 6–9 are lower than the corresponding times shown in Tables 2 and 3. The reason is that the CHEMCAD simulation of the t-BMA process requires a long computational time in order for convergence to be attained in the rigorous TOWR distillation models. Conversely, for the ADH process, the corresponding code is written in Matlab and has a fast computational time that is of the same order of magnitude as the time required for the modeling and optimization stages of the unified algorithm.

Table 7. Performance of the Kriging-RSM-Global Direct Search (K-R-G) Algorithm for Example 2, Based on Global Optimization of a Partially Refined Kriging Global Model

                                                  % improvement relative    simulations required
algorithm                          F^opt_MINLP    to nominal F^opt_MINLP    algorithm   total    CPU time (s)
none (nominal sampling set only)   198            —                         —           15       1.04
kriging                            215            9.4                       9           24       3.61
RSM                                227            15.9                      44          68       6.12
global direct search               286            48.8                      41          109      6.33

Table 8. Performance of the Kriging-RSM-Local Direct Search (K-R-L) Algorithm for Example 2, Based on Local Optimization of a Completely Refined Kriging Global Model

                                                  % improvement relative    simulations required
algorithm                          F^opt_MINLP    to nominal F^opt_MINLP    algorithm   total    CPU time (s)
none (nominal sampling set only)   198            —                         —           15       0.98
kriging                            240            23.1                      49          64       12.62
RSM                                252            28.9                      42          106      5.62
local direct search                269            37.5                      27          133      3.96

Table 9. Performance of the Kriging-RSM-Global Direct Search (K-R-G) Algorithm for Example 2, Based on Global Optimization of a Completely Refined Kriging Global Model

                                                  % improvement relative    simulations required
algorithm                          F^opt_MINLP    to nominal F^opt_MINLP    algorithm   total    CPU time (s)
none (nominal sampling set only)   198            —                         —           15       1.07
kriging                            241            28.7                      51          66       13.23
RSM                                253            34.8                      43          109      5.76
global direct search               286            54.1                      38          147      5.72

Table 10. Percentage Improvement Attained in the Objective after Local Optimization of the Continuous Variables Using RSM

kriging model type                 kriging solution   RSM solution   % improvement
partially refined (Table 6)        217                230            5.65
partially refined (Table 7)        215                227            5.29
completely refined (Table 8)       240                252            4.76
completely refined (Table 9)       241                253            4.74

Table 11. Percentage Improvement Attained in the Objective after Optimization of the Integer Variables Using Direct Search

kriging model type                 direct search algorithm   RSM solution   integer solution   % improvement
partially refined (Table 6)        local                     230            244                6.09
partially refined (Table 7)        global                    227            286                25.99
completely refined (Table 8)       local                     252            269                6.75
completely refined (Table 9)       global                    252            286                13.49
4. Conclusions and Future Work

The novel contribution of this work is the integration of direct search with B&B, kriging, and RSM in order to address process synthesis and design problems containing black-box functions, which can depend on both continuous and integer variables. B&B is used to optimize integer variables having a continuous relaxation, while RSM is used to optimize the continuous variables. The integer variables appearing in the black-box functions are optimized using either local or global direct search, and it is found that global search leads to better solutions based on the algorithmic performance for the two presented case studies. The unified B&B-kriging-RSM-direct search algorithm proceeds as follows: at each node of the B&B tree, kriging is used to build global models of the partially relaxed NLP subproblem objectives. The surrogate models are used to identify subregions containing potential local optima, and the best kriging solutions serve as starting iterates for further optimization using RSM and direct search. The additional costs resulting from global model creation are offset by successful convergence to improved local solutions. Future work is targeted at optimizing (1) the number and spatial arrangement of the sampling points used in building the nominal kriging models, in order to reduce the sampling expense of model refinement, and (2) the sequence in which RSM and direct search are applied, in order to attain the maximum improvement in the objective at the lowest possible sampling expense.

Acknowledgment

The authors gratefully acknowledge financial support from NSF Grant CTS 0625515 and also the USEPA-funded Environmental Bioinformatics and Computational Toxicology Center under Grant GAD R 832721-010.

Appendix A

Mathematical Models for Example 2: Alcohol Dehydrogenase (ADH) Purification

The main goal of fermentation is to grow cells to a high density in order to increase the ADH yield, knowing that a


Figure A1. Schematic of the homogenizer process.

Figure A2. Schematic of the first centrifuge process (C1).

significant amount may be lost during downstream processing. At the end of the fermentation time period t, cellular growth is arrested, and the corresponding diameter indicates whether the cell was in the lag, exponential, or stationary phase. The cell diameter ranges from 2 to 12 µm, and a simulated size distribution can be generated using linear combinations of Weibull probability density functions. The total cell number is determined by summing the number of cells R(di) having a given diameter di, i = 1, ..., Nd, for Nd equispaced intervals. The median cell diameter dc,50 and the cumulative distribution function scatter points c(di) are obtained once the total cell number is known. The corresponding mass fraction and mass of cells having diameter di entering the homogenizer are given as mc,frac,i and mc,i, respectively, as shown in eqs A1 and A2:

mc,frac,i = R(di) / Σ_{j=1}^{Nd} R(dj),  i = 1, ..., Nd    (A1)

mc,i = X mc,frac,i / Σ_{j=1}^{Nd} mc,frac,j,  i = 1, ..., Nd    (A2)

The cumulative size distribution of the cell diameter can be modeled using a Boltzmann equation, as given in eq A3, by fitting a standard deviation parameter wc to experimental or simulated mass fraction data.28 In addition, a prespecified value of wc can be used to generate a sample distribution from which mc,frac,i can alternatively be obtained:

c(di) = 1 − 1/{1 + exp[(di − dc,50)/wc]},  i = 1, ..., Nd
mc,frac,i = exp[(di − dc,50)/wc] / {wc [1 + exp((di − dc,50)/wc)]²},  i = 1, ..., Nd
wc = argmin Σ_{i=1}^{Nd} [c(di) − mc,frac,i]²    (A3)

The remaining fermentation equations are presented in eq A4:

[X0, µ, t, Yxs] = [10, 0.21, 3600, 0.5]
100 ≤ sin ≤ 500
X = X0 exp(µt)
Sc = (X − X0)/Yxs
Vout = V0 + sin
me,ins = [0.95 + N(0, 0.1)][(−695.05 × 10^7)µ² + (131.97 × 10^7)µ + 7.9234 × 10^7]
mp,ins = 1.125 me,unr
eins,out = me,ins/Vout
pins,out = mp,ins/Vout
[Vout,f, eins,out,f, pins,out,f] = [Vout, eins,out, pins,out]
x1 = sin
zF = [X, Sc, Vout,f, mp,unr, me,unr, eins,out,f, pins,out,f, mc,frac, mc, c, wc]    (A4)

where the vector of fermenter output variables zF consists of the amount of biomass X created, the amount of glucose substrate consumed Sc, the final broth volume Vout,f, the mass of unreleased (and currently insoluble) protein mp,unr, and the corresponding mass of intracellular enzyme me,unr, respectively. The size-distributed cellular mass, mass fractions, and Boltzmann parameters are included for completeness. The glucose substrate concentration sin is also referred to as x1 in eq 23, in keeping with the designation of the continuous variables as x to be consistent with model (1). Equations A1–A4 comprise the fermenter model equations and are symbolized by Γ1[x1, N(0,σ²)].

At the end of the fermentation, the biomass broth is fed to the homogenizer, where ADH release is achieved via pressurized cellular disruption. The fermentation broth can be passed through the homogenizer up to nine times, with each additional pass resulting in a higher number of cell walls breaking apart. If too few passes are used, a high number of cell walls may remain intact, causing the ADH to remain inaccessible and consequently lowering the process yield. However, employing too many passes may lead to micronization of the cell wall debris, making the separation of ADH from both the debris and the intracellular protein contaminant more difficult. A process schematic is shown in Figure A1. Because the biomass is comprised of cells having different diameters, a model is proposed by Groep et al.27 describing the pressure required for cell breakage, because cells having a larger diameter require a higher disruption pressure.
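The Boltzmann fit of eq A3 can be sketched as follows, with a coarse grid search standing in for the argmin and synthetic target data generated from the model itself (the diameters, median, and width values below are illustrative):

```python
import math

def boltzmann_cdf(d, d50, w):
    """Cumulative Boltzmann size distribution,
    c(d) = 1 - 1/(1 + exp((d - d50)/w)), increasing in d."""
    return 1.0 - 1.0 / (1.0 + math.exp((d - d50) / w))

def fit_width(diams, targets, d50, grid):
    """Least-squares fit of the width parameter w over a candidate grid
    (a stand-in for the argmin in eq A3)."""
    def sse(w):
        return sum((boltzmann_cdf(d, d50, w) - t) ** 2
                   for d, t in zip(diams, targets))
    return min(grid, key=sse)

# synthetic cumulative data generated with w = 1.5, diameters 2..12 um
diams = [2 + i for i in range(11)]
targets = [boltzmann_cdf(d, 7.0, 1.5) for d in diams]
grid = [0.5 + 0.1 * k for k in range(30)]
w_fit = fit_width(diams, targets, 7.0, grid)
```

With noise-free targets the grid search recovers the generating width; with noisy mass-fraction data the same machinery returns the least-squares estimate.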
The model describing the threshold breakage pressure Pc,i is given in eq A5, where χ is defined as a cell strength parameter. The constants Pc0 and PcN denote the threshold pressures required to induce cell breakage for cells having the lowest and highest diameters d1 and dND, respectively. The pressure P is a manipulated parameter given as x2 whose range varies between 689 476 and 3.447 × 106 Pa. The homogenizer effluent consists of undisrupted cells, released protein and ADH, and cell wall debris. The equations representing the mass of undisrupted cells mc,out, released enzyme concentration esol,out, and protein concentration psol,out released are also included in eq A5:


[χ, Pc0, PcN, kc] = [1, 275790, 620528, 4.619 × 10^6]
[mc,in, Vin, eins,in, pins,in] = [mc, Vout,f, eins,out,f, pins,out,f]
1 ≤ N ≤ 9
689476 ≤ P ≤ 3.447 × 10^6
Pc,i = χ[Pc0 + (di − d1)(PcN − Pc0)/(dNd − d1)],  i = 1, ..., Nd
mc,out,i = mc,in,i exp[−kc N^βc (P − Pc,i)^αc],  i = 1, ..., Nd
eins,out = me,unr/Vin
pins,out = mp,unr/Vin
esol,out = eins,in {Σ_{i=1}^{Nd} mc,in,i [1 − exp(−kc N^βc (P − Pc,i)^αc)] / Σ_{i=1}^{Nd} mc,in,i}
psol,out = pins,in {Σ_{i=1}^{Nd} mc,in,i [1 − exp(−kc N^βc (P − Pc,i)^αc)] / Σ_{i=1}^{Nd} mc,in,i}    (A5)

Figure A3. Schematic of the first precipitation process (P1).
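The disruption law of eq A5 implies that the fraction of intact cells in a size class decays exponentially with the number of passes N and the excess pressure P − Pc,i. A sketch with assumed exponents αc = βc = 1 and an illustrative rate constant (these values are not taken from the equation and are chosen only to make the decay visible):

```python
import math

def undisrupted_fraction(P, N, Pc, kc=2.0e-7, alpha=1.0, beta=1.0):
    """Fraction of cells of one size class left intact after N passes at
    pressure P (Pa): m_out/m_in = exp(-kc * N**beta * (P - Pc)**alpha).
    kc, alpha, beta are illustrative, not the fitted values."""
    if P <= Pc:
        return 1.0  # below the threshold breakage pressure, nothing breaks
    return math.exp(-kc * N**beta * (P - Pc)**alpha)

f1 = undisrupted_fraction(3.0e6, 1, 3.0e5)  # one pass
f5 = undisrupted_fraction(3.0e6, 5, 3.0e5)  # five passes
```

The monotone decay with N captures the trade-off discussed above: more passes release more enzyme, at the risk of micronizing the debris.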

The cell size distribution of the disrupted cells can also be fitted to a Boltzmann equation in order to model the cumulative size distribution. The median diameter dq,50 is obtained in a manner similar to that used for dc,50. For the disrupted cells, the size-distributed mass mq,out,i and mass fraction mq,frac,i models are given in eq A6. After passing through the homogenizer, it is assumed that any undisrupted cells will remain intact through further processing. As a result, the undisrupted cells are considered, in addition to the cell debris, to be waste particles. The corresponding size-distributed waste particle mass is obtained by summing the respective distributions of the undisrupted cells and the cellular debris. Equations A5 and A6 comprise the homogenizer model equations and are symbolized by Γ2[x2, y1,1, N(0,σ²)]:

mq,frac,i = exp[(di − dq,50)/wq] / {wq [1 + exp((di − dq,50)/wq)]²},  i = 1, ..., Nd
mq,out,i = mq,frac,i (X − V psol,out − V esol,out − Σ_{j=1}^{Nd} mc,out,j) / Σ_{j=1}^{Nd} mq,frac,j,  i = 1, ..., Nd
mw,out,i = mc,out,i + mq,out,i,  i = 1, ..., Nd
[mc,out,h, esol,out,h, psol,out,h, eins,out,h] = [mc,out, esol,out, psol,out, eins,out]
[pins,out,h, mq,frac,h, mq,out,h, mw,out,h] = [pins,out, mq,frac, mq,out, mw,out]
[x2, y1,1] = [P, N]
zH = [mc,out,h, esol,out,h, psol,out,h, mq,frac,h, mq,out,h, mw,out,h]    (A6)

Figure A4. Schematic of the second centrifuge process (C2).

Once cell disruption has occurred, the next step is to purify the ADH from the cell debris, the remaining undisrupted biomass, and the protein contaminant. In the first step, the heavier cell debris and remaining whole cells are separated from the ADH and protein using centrifugation. The ADH and protein leave in a supernatant stream, while the cell debris and whole cells exit in a sediment stream. A simple schematic is shown in Figure A2. The centrifuge model used is that of a disk-type centrifuge.29 The incoming feed enters the top of the centrifuge and flows downward through a conical disk stack having narrow flow channels between the disks. The centrifuge is horizontally agitated such that the feed stream splits into smaller streams entering each of the flow channels. The heavier particles settle to the bottom of each channel, slide down to the bottom of the disk stack via mechanical agitation and gravity, and then exit the centrifuge at the base of the body. The lighter particles remaining in suspension comprise a clarified liquid phase and are pumped out of the centrifuge at the top of the unit. Because of the nonuniform cell size distribution, not all waste particles exit in the sediment phase. The grade efficiency T(di), given by eq A7, is a function fitted to experimental data describing the percentage of solids recovered in the sediment phase stream for particles having diameter di. If a cell diameter di falls below a critical diameter dc, the cell is assumed to be light enough that it exits with the supernatant. The parameters k and n are regression constants having randomly initialized values for simulation purposes. The remaining parameters consist of a hindered settling factor fs, the gravitational constant g, the volumetric throughput Q, the carrier fluid viscosity η, the density difference between the solid and liquid phases ∆ρ, the outer and inner disk radii Ro and Ri, the disk angle θ, the number of disks Z, and the angular bowl velocity ω:

[k, n, fs, η] = [0.9501, 0.4621, 1.6, 1.005 × 10^-3]
[∆ρ, Ro, Ri, ω] = [191.8, 0.076, 0.036, 960]
T(di) = 1 − exp[−k(di/dc)^n],  i = 1, ..., Nd
dc = fs [18Qη/(∆ρ Σ g)]^(1/2)
Σ = 2πZω²(Ro³ − Ri³)/(3g tan θ)
30° ≤ θ ≤ 75°
1 ≤ Z ≤ 72    (A7)
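The sigma-theory relations of eq A7 can be sketched numerically; the throughput Q and the chosen θ and Z below are illustrative, while the remaining parameter values are those listed in eq A7:

```python
import math

def sigma_factor(Z, omega, Ro, Ri, theta_deg, g=9.81):
    """Equivalent settling area of a disk-stack centrifuge,
    Sigma = 2*pi*Z*omega^2*(Ro^3 - Ri^3) / (3*g*tan(theta))."""
    return (2 * math.pi * Z * omega**2 * (Ro**3 - Ri**3)
            / (3 * g * math.tan(math.radians(theta_deg))))

def critical_diameter(Q, eta, drho, Sigma, fs=1.6, g=9.81):
    """Critical diameter d_c = fs * sqrt(18*Q*eta / (drho*Sigma*g));
    particles smaller than d_c leave with the supernatant."""
    return fs * math.sqrt(18.0 * Q * eta / (drho * Sigma * g))

def grade_efficiency(d, dc, k=0.9501, n=0.4621):
    """T(d) = 1 - exp(-k*(d/dc)^n): fraction of size-d solids recovered
    in the sediment stream."""
    return 1.0 - math.exp(-k * (d / dc)**n)

# illustrative operating point: Z = 50 disks, theta = 45 deg, Q = 1e-5 m^3/s
Sigma = sigma_factor(Z=50, omega=960, Ro=0.076, Ri=0.036, theta_deg=45)
dc = critical_diameter(Q=1.0e-5, eta=1.005e-3, drho=191.8, Sigma=Sigma)
```

Increasing the number of disks Z raises Σ and therefore shrinks dc, so more of the fine debris reports to the sediment; this is the mechanism behind optimizing Z and θ in the case study.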


The variables represented in the supernatant equations are given as follows: (1) size-distributed mass of waste solids mw,out,i, (2) waste solids concentration wout, (3) dissolved enzyme concentration esol,out, (4) dissolved protein concentration psol,out, (5) intracellular enzyme concentration eins,out, and (6) intracellular protein concentration pins,out. The constants λ1 and λ2 respectively designate the fraction of dissolved enzyme or protein that becomes denatured or exits in the sediment phase stream. Equations A7 and A8 comprise the centrifuge model equations and are symbolized by Γ3[x3, y1,2, N(0,σ2)].

[Vin, esol,in, psol,in, mw,in] = [V, esol,out,h, psol,out,h, mw,out,h]
[mc,in, eins,in, pins,in] = [mc,out,h, eins,out,h, pins,out,h]
λ1 = 0.05[1 + N(0, σ²)]
λ2 = 0.05[1 + N(0, σ²)]
fsup = 0.95[1 + N(0, σ²)]
Vsup = fsup Vin
esol,out = (esol,in Vin/Vsup)(1 − λ1)(1 − λ2)
psol,out = (psol,in Vin/Vsup)(1 − λ1)(1 − λ2)
mw,out,i = [1 − T(di)] mw,in,i,  i = 1, ..., Nd
mc,out,i = (mw,out,i/mw,in,i) mc,in,i,  i = 1, ..., Nd
eins,out = eins,in Σ_{i=1}^{Nd} mc,out,i / Σ_{i=1}^{Nd} mc,in,i
pins,out = pins,in Σ_{i=1}^{Nd} mc,out,i / Σ_{i=1}^{Nd} mc,in,i
wout = Σ_{i=1}^{Nd} mw,out,i / Vsup
[mw,out,C1, mc,out,C1, λ1,C1, λ2,C1] = [mw,out, mc,out, λ1, λ2]
[fsup,C1, Vsup,C1, esol,out,C1, psol,out,C1] = [fsup, Vsup, esol,out, psol,out]
[eins,out,C1, pins,out,C1, wout,C1] = [eins,out, pins,out, wout]
[x3, y1,2] = [θ, Z]
zC1 = [mw,out,C1, mc,out,C1, λ1,C1, λ2,C1, fsup,C1, Vsup,C1, esol,out,C1, psol,out,C1, eins,out,C1, pins,out,C1, wout,C1]    (A8)

The centrifuge supernatant is fed to a tank, where an ammonium sulfate precipitant is added in order to precipitate out the protein contaminant. A process schematic is presented in Figure A3. The amount of precipitant Vz needed is determined by the desired output precipitant concentration zout, as given by eq A9. The models for Vz and Vout are not treated as black-box models, because these output variables become explicitly known once zout has been specified. These models would be represented in model (1) by the equality constraints h(x,z1), where x and z1 correspond to the vectors [Vin, zin] and [Vz, Vout], respectively:

[Vin, zin] = [Vsup,C1, 0]
10 ≤ zout ≤ 40
Vz = Vin(zout − zin)/(100 − zout)
Vout = Vin + Vz    (A9)
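The precipitant requirement of eq A9 follows from a mixing balance on the saturation percentage, z_out = (z_in V_in + 100 V_z)/(V_in + V_z); solving for V_z gives the expression in eq A9. A minimal sketch (assuming, as in the equation, that the added precipitant stream is at 100% saturation):

```python
def precipitant_volume(V_in, z_in, z_out):
    """Volume of saturated (100%) precipitant solution needed to raise the
    saturation of V_in from z_in to z_out percent:
    Vz = V_in * (z_out - z_in) / (100 - z_out)."""
    if not 0 <= z_in < z_out < 100:
        raise ValueError("require 0 <= z_in < z_out < 100")
    return V_in * (z_out - z_in) / (100.0 - z_out)

# first precipitation: fresh supernatant (z_in = 0) brought to 40% saturation
Vz = precipitant_volume(100.0, 0.0, 40.0)
```

The second precipitation reuses the same balance with z_in = zout,P1, which is why eq A13 has the identical form.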

The parameters φe and φp describe the soluble enzyme and protein fractions after precipitation. Equation A10 comprises the remaining precipitation model equations, symbolized by Γ4[x4, N(0,σ2)].


[esol,in, eins,in, psol,in, pins,in, win] = [esol,out,C1, eins,out,C1, psol,out,C1, pins,out,C1, wout,C1]
φe = 0.9[1 + N(0, σ²)]
φp = 0.9[1 + N(0, σ²)]
esol,out = φe esol,in Vin/Vout
eins,out = [eins,in + (1 − φe)esol,in]Vin/Vout
psol,out = φp psol,in Vin/Vout
pins,out = [pins,in + (1 − φp)psol,in]Vin/Vout
wout = win Vin/Vout    (A10)
[zout,P1, Vz,P1, Vout,P1, φe,P1, φp,P1, esol,out,P1] = [zout, Vz, Vout, φe, φp, esol,out]
[eins,out,P1, psol,out,P1, pins,out,P1, wout,P1] = [eins,out, psol,out, pins,out, wout]
x4 = zout
zP1 = [φe,P1, φp,P1, esol,out,P1, eins,out,P1, psol,out,P1, pins,out,P1, wout,P1]

The remaining solution from the precipitation is passed through another centrifuge (C2) in order to separate the enzyme from the remaining waste solids, in the form of cell debris and undisrupted cells. A process schematic is shown in Figure A4. The process models used for C2 are the same as those employed for C1, as given by eqs A7 and A8, where θ and Z are now given as x5 and y1,3, respectively. The corresponding parameter values for the input variables in eq A8 are provided on the right-hand side of eq A11:

[Vin, esol,in, psol,in, mw,in] = [Vout,P1, esol,out,P1, psol,out,P1, mw,out,C1]
[mc,in, eins,in, pins,in] = [mc,out,C1, eins,out,P1, pins,out,P1]    (A11)

The corresponding output variables are designated as follows in eq A12. Equations A7, A8, A11, and A12 comprise the model equations for the second centrifugation and are symbolized by Γ5[x5, y1,3, N(0,σ2)].

[mw,out,C2, mc,out,C2, λ1,C2, λ2,C2] = [mw,out, mc,out, λ1, λ2]
[fsup,C2, Vsup,C2, esol,out,C2, psol,out,C2] = [fsup, Vsup, esol,out, psol,out]
[eins,out,C2, pins,out,C2, wout,C2] = [eins,out, pins,out, wout]    (A12)

[x5, y1,3] = [θ, Z]
zC2 = [mw,out,C2, mc,out,C2, λ1,C2, λ2,C2, fsup,C2, Vsup,C2, esol,out,C2, psol,out,C2, eins,out,C2, pins,out,C2, wout,C2]

The residual protein in the supernatant is now precipitated a second time, following the same process schematic given in Figure A3. The amount of precipitant needed for the second precipitation is given by eq A13, whose equations correspond to a second set of explicitly known equality constraints h.

[Vin, zin] = [Vsup,C2, zout,P1]
max(zin + 10, 40) ≤ zout ≤ 70
Vz = Vin(zout - zin)/(100 - zout)
Vout = Vin + Vz    (A13)
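The lower bound in eq A13 guarantees that the second cut is made at a concentration at least 10 kg/m³ above whatever was carried over from the first precipitation, while still lying in the 40-70 kg/m³ range. A small sketch (hypothetical function name and inputs) makes the window explicit:

```python
def second_cut_bounds(z_in):
    """Eq A13 feasibility window for the second precipitation target:
    z_out must exceed the carried-over concentration z_in by at least
    10 kg/m^3 and may not exceed 70 kg/m^3."""
    lo = max(z_in + 10.0, 40.0)
    return lo, 70.0

lo, hi = second_cut_bounds(20.0)   # carried-over z_in = 20 -> (40.0, 70.0)
lo2, _ = second_cut_bounds(45.0)   # z_in = 45 -> lower bound rises to 55.0
```

The precipitant volume itself is then obtained from the same expression as eq A9, evaluated at the chosen z_out.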

The input data required for eq A10 are now given by the right-hand side of the first equation in eq A14. The corresponding output variables in eq A10, in turn, are redefined by the left-hand sides of the second and third equations in eq A14. Together with eq A10, eqs A13 and A14 comprise the model equations for the second precipitation and are symbolized by Γ6[x6, N(0, σ²)].


[esol,in, eins,in, psol,in, pins,in, win] = [esol,out,C2, eins,out,C2, psol,out,C2, pins,out,C2, wout,C2]
[zout,P2, Vz,P2, Vout,P2, φe,P2, φp,P2, esol,out,P2] = [zout, Vz, Vout, φe, φp, esol,out]
[eins,out,P2, psol,out,P2, pins,out,P2, wout,P2] = [eins,out, psol,out, pins,out, wout]    (A14)

x6 = zout
zP2 = [φe,P2, φp,P2, esol,out,P2, eins,out,P2, psol,out,P2, pins,out,P2, wout,P2]

The solution from the second precipitation is fed to a centrifuge one last time to improve ADH purity. The process schematic differs from that of Figure A4 in that the purified enzyme is now component-rich in the sediment-phase stream. The final ADH yield is given by z1, and the corresponding purity, measured against the total solids and dissolved protein recovered in the sediment, is denoted by z2. The corresponding equations are given in eq A15, along with the remaining centrifugation model equations. Together with eq A7, eq A15 comprises the model equations symbolized by Γ7[x7, y1,4, N(0, σ²)].

[Vin, esol,in, psol,in, mw,in] = [Vout,P2, esol,out,P2, psol,out,P2, mw,out,C2]
[mc,in, eins,in, pins,in] = [mc,out,C2, eins,out,P2, pins,out,P2]
λ1 = 0.05[1 + N(0, σ²)]
fsed = 0.8[1 + N(0, σ²)]
Vsed = fsed Vout
mw,out,i = T(di) mw,in,i,  i = 1, ..., Nd
esol,out = (esol,in Vin/Vout) λ1
psol,out = (psol,in Vin/Vout) λ1
eins,out = (eins,in Vin/Vout) re(1 - λ2)
pins,out = (pins,in Vin/Vout) rp(1 - λ2)
z2 = [esol,out Vsed / (∑i=1..Nd mw,out,i + psol,out Vsed + esol,out Vsed)] × 100    (A15)
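The purity expression in eq A15 compares the dissolved ADH recovered in the sediment against the total of waste solids, dissolved protein, and dissolved ADH. A sketch of that final calculation follows; the function name and numerical values are hypothetical, and enzyme activity units and mass are treated on the common basis used in eq A15.

```python
def adh_purity(e_sol_out, p_sol_out, v_sed, m_w_out):
    """Eq A15: final ADH purity z2 (percent) in the sediment stream.

    m_w_out is the list of size-distributed waste-solid masses
    m_w_out,i for i = 1, ..., Nd."""
    product = e_sol_out * v_sed
    total = sum(m_w_out) + p_sol_out * v_sed + product
    return 100.0 * product / total

# Hypothetical sediment: 0.5 m^3 with e_sol = 80, p_sol = 10,
# plus 5 kg of size-distributed waste solids
z2 = adh_purity(80.0, 10.0, 0.5, [2.0, 2.0, 1.0])
```

Because every term in the denominator is nonnegative, z2 is bounded by 100 and decreases monotonically as waste solids or residual protein carry through to the sediment.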

[λ1,C3, λ2,C3, fsed,C3, Vsed,C3, mw,out,C3] = [λ1, λ2, fsed, Vout, mw,out]
[esol,out,C3, psol,out,C3, eins,out,C3, pins,out,C3] = [esol,out, psol,out, eins,out, pins,out]
[x7, y1,4] = [θ, Z]
zC3 = [λ1,C3, λ2,C3, fsed,C3, Vsed,C3, esol,out,C3, psol,out,C3, eins,out,C3, pins,out,C3]

Nomenclature for Example 2

Fermentation
i = dummy index designating a given cell diameter
di = ith cellular diameter [m]
α(di) = final number of cells having diameter di
dc,50 = median cell diameter [m]
Nd = number of size-differentiated cell diameters
mc,frac,i = mass fraction of cells having diameter equal to di after fermentation
mc,in,i = final mass of cells having diameter di [kg]
mc,frac = size-distributed cell mass fraction after fermentation
mc = size-distributed cell mass after fermentation [kg]
c(di) = fraction of cells having diameter equal to or less than di
wc = Boltzmann parameter for modeling the cdf of undisrupted cells
X0 = initial biomass concentration [kg/m³]
µ = cellular growth rate [s⁻¹]
t = fermentation time period [s]
X = final biomass concentration [kg/m³]
Sc = amount of substrate consumed [kg]
Yxs = yield coefficient of enzyme on the substrate [U/kg]
sin = initial glucose concentration [kg/m³]
V = final broth volume [m³]
V0 = initial broth volume [m³]
me,unr = mass of ADH inside undisrupted cells [kg]
mp,unr = mass of protein inside undisrupted cells [kg]

Homogenization
χ = cell strength parameter
Pc0 = threshold breakage pressure for the strongest cells having the lowest diameter [Pa]
PcN = threshold breakage pressure for the strongest cells having the highest diameter [Pa]
N = number of homogenizer passes
kc = cell disruption rate constant [N Pa]
mc,in = size-distributed cell mass in the feed broth [kg]
Vin = feed broth volume [m³]
P = homogenizer pressure [Pa]
Pc,i = threshold pressure required for disruption of cells having diameter di [Pa]
d1 = lowest cell diameter [m]
dND = highest cell diameter [m]
mc,out,i = mass of undisrupted cells having diameter di leaving the homogenizer [kg]
αc = cell disruption constant
βc = cell disruption constant
esol,out = concentration of dissolved ADH in homogenized broth [U/m³]
psol,out = concentration of dissolved protein in homogenized broth [kg/m³]
eins,out = concentration of intracellular ADH in homogenized broth [U/m³]
pins,out = concentration of intracellular protein in homogenized broth [kg/m³]
mq,frac,i = mass fraction of disrupted cells having diameter di [kg]
dq,50 = median diameter of the disrupted cells [m]
wq = Boltzmann parameter for modeling the cdf of disrupted cells
mq,out,i = mass of cellular debris having diameter di [kg]
mw,out,i = mass of waste solids having diameter di [kg]

Centrifugation
T(di) = grade efficiency measuring the recovery of cells having diameter di
dc = critical diameter below which any smaller cells leave in the supernatant [m]
k = grade efficiency parameter
n = grade efficiency parameter
fs = settling parameter
g = gravitational constant [m/s²]
Q = volumetric centrifuge throughput [m³/s]
η = carrier viscosity [N s/m²]
∆ρ = density difference between the liquid and solid phases [kg/m³]
Ro = outer disk radius [m]
Ri = inner disk radius [m]
ω = angular bowl velocity [rad/s]
Σ = equivalent centrifuge settling area [m²]
Z = number of disks in the centrifuge disk stack
θ = disk angle [rad]
Vin = feed broth volume [m³]
esol,in = concentration of dissolved ADH in the feed broth [U/m³]
psol,in = concentration of dissolved protein in the feed broth [kg/m³]
eins,in = concentration of intracellular ADH in the feed broth [U/m³]
pins,in = concentration of intracellular protein in the feed broth [kg/m³]
esol,out = concentration of dissolved ADH in the supernatant [U/m³]
psol,out = concentration of dissolved protein in the supernatant [kg/m³]
eins,out = concentration of intracellular ADH in the supernatant [U/m³]
pins,out = concentration of intracellular protein in the supernatant [kg/m³]
mc,in = size-distributed mass of undisrupted cells in the feed [kg]
mc,in,i = mass of undisrupted cells in the feed having diameter di [kg]
mw,in = size-distributed mass of feed waste solids [kg]
mw,in,i = mass of feed waste solids having diameter di [kg]

λ1 = fraction of dissolved enzyme or protein denatured from process operation
λ2 = fraction of dissolved enzyme or protein exiting in the sediment stream
fsup = fraction of the feed stream exiting as the supernatant
Vsup = supernatant volume [m³]
Vsed = sediment broth volume [m³]
mw,out,i = mass of supernatant waste solids having diameter di [kg]
mw,out = size-distributed mass of supernatant waste solids [kg]
mc,out,i = mass of undisrupted cells having diameter di in the supernatant [kg]
mc,out = size-distributed mass of undisrupted cells in the supernatant [kg]
wout = concentration of supernatant waste solids [kg/m³]

Precipitation
Vz = precipitant volume needed [m³]
zin = precipitant input concentration [kg/m³]
zout = precipitant output concentration [kg/m³]
Vout = broth volume after precipitant addition [m³]
φe = fraction of soluble enzyme precipitated
φp = fraction of soluble protein precipitated
esol,in = concentration of dissolved ADH in the feed broth [U/m³]
psol,in = concentration of dissolved protein in the feed broth [kg/m³]
eins,in = concentration of intracellular ADH in the feed broth [U/m³]
pins,in = concentration of intracellular protein in the feed broth [kg/m³]
esol,out = concentration of dissolved ADH remaining in the precipitated solution [U/m³]
psol,out = concentration of dissolved protein remaining in the precipitated solution [kg/m³]
eins,out = concentration of intracellular ADH remaining in the precipitated solution [U/m³]
pins,out = concentration of intracellular protein remaining in the precipitated solution [kg/m³]
win = waste solids concentration in the unprecipitated solution [kg/m³]
wout = waste solids concentration in the precipitated solution [kg/m³]

Appendix

General
x = vector of continuous variables
y = vector of integer-valued variables
y1 = vector of integer design variables
y2 = vector of synthesis variables
z1 = vector of output variables whose input-output models are known
z2 = vector of output variables whose input-output models are black boxes
Rn = subspace of continuous variables
q1 = subspace of integer design variables
q2 = subspace of synthesis variables
g = feasibility constraint
h = feasibility constraint or explicitly known closed-form process models
Γ = input-output models lacking closed-form equations
F = objective function
µ = mean value for a noisy, black-box model output variable
σ² = estimated or simulated variance when process noise is modeled as a normally distributed random function
x0 = set of continuous variables in any nominal sampling vector
y10 = set of integer design variables in any nominal sampling vector

y20 = set of synthesis variables in any nominal sampling vector
Ω = nominal sampling set
SK = kriging optimal solution
xK = set of kriging optimal continuous variables
y1K = set of kriging optimal integer design variables
y2K = set of kriging optimal synthesis variables
xR = set of RSM optimal continuous variables
y2R = set of RSM optimal synthesis variables
y1D = optimal sampling vector with respect to y1
SR = optimal sampling vector with respect to x and y2
SD = optimal sampling vector with respect to x, y1, and y2

Kriging Algorithm
i = index denoting a sampled process input vector
j = index denoting a sampled process input vector
k = index denoting the specific sampling vector at which to generate a kriging prediction
d = Euclidean distance between the continuous components of two vectors
Si = ith sampled process input vector
Sj = jth sampled process input vector
Sk = kth unsampled process input vector
Fi,j = squared function differences between two sampling vectors
STOT = total number of input vectors having known sampled output data
h = lag distance interval
σ²VARIO = covariance function sill
γ(hp) = semivariance based on distance interval hp
N(hp) = number of sampling pairs whose separation distance is within hp
p = index for classifying distance intervals by size
P = number of covariance function distance intervals
λ = vector of kriging weights
λ′ = Lagrange multiplier
Cov(d) = covariance between two vectors having separation distance d
kcluster = number of sampling vectors used to obtain a kriging prediction
Scluster,k = set of nearby sampling vectors used to obtain a kriging prediction at test point Sk
σk²(Sk) = kriging prediction variance at sampling point Sk
ktest = number of test points at which kriging predictions are generated
µm = mean value of the mth kriging predictor
m = kriging algorithm iteration index
TolKrig = stopping tolerance for the kriging algorithm
hcluster = matrix of sampling-pair separation distances
h0 = vector of separation distances between sampling points and a kriging test point
C = matrix of sampling-pair covariances
D = vector of covariances between sampling points and a kriging test point
FVARIO = fitted semivariogram model
Λ = Lagrangian-based vector of kriging weights

RSM Algorithm
RSM = response surface methodology
n = RSM problem dimension
w = RSM iteration index
Scoll,w = set of sampling points used to build the wth response surface
S0 = nominal locally optimal solution vector to be locally optimized with respect to x and relaxed y2
F0 = objective value corresponding to S0

FK = kriging optimal objective function value
Sopt,w = current best solution vector
Fopt,w = current best objective value
Sopt,w+1 = sampling vector obtained from optimizing the wth response surface
Fopt,w+1 = objective function value corresponding to the wth RSM optimal sampling vector
bw = wth response surface radius
TolRSM = stopping tolerance for the RSM algorithm

Direct Search Algorithm
DS = direct search
DS-L = local direct search algorithm
DS-G = global direct search algorithm
m = iteration index
YLL = lowest feasible value for a strict integer variable
YUL = highest feasible value for a strict integer variable
Ym = mth best solution for a strict integer variable
YmL = sampling point corresponding to the mth bracket low end point
YmC = sampling point corresponding to the mth bracket midpoint
YmU = sampling point corresponding to the mth bracket high end point
F(·) = objective function value corresponding to sampling point (·)
Nm = mth bracket midpoint-end point interval length
YD = integer optimal solution for a strict integer variable

Branch-and-Bound Algorithm/Unified Algorithm
B&B = branch-and-bound algorithm
LB = lower bound
UB = upper bound
SoptNLP = best locally optimized solution vector corresponding to a partially relaxed NLP
FoptNLP = objective value corresponding to the best local solution found for a partially relaxed NLP
SoptMINLP = MINLP solution vector that is integer optimal in y1 and y2
FoptMINLP = MINLP objective function value corresponding to integer optimal y1 and y2
TolBB = stopping tolerance for the B&B algorithm

Example 1
t-BMA = tert-butyl methacrylate
MA = methacrylic acid
IB = isobutylene
DIB = diisobutylene
H2SO4 = sulfuric acid catalyst
NaOH = catalyst neutralizer
RM = raw material
F0 = reactor fresh feed stream [kg/h]
FIB = fresh isobutylene feed stream [kg/h]
FMA = fresh methacrylic acid feed stream [kg/h]
FH2SO4 = fresh sulfuric acid catalyst feed stream [kg/h]
FNaOH = fresh catalyst neutralizer feed stream [kg/h]
Ft-BMA = t-BMA feed stream containing purchased t-BMA [kg/h]
VCW = volume of cooling water needed for the t-BMA process [m³/h]
VChW = volume of chilled water needed for the t-BMA process [m³/h]
Vref = volume of refrigerant needed for the t-BMA process [m³/h]
MStm = mass of steam needed for the t-BMA process [kg/h]
F1 = reactor exit stream [kg/h]
F2 = feed stream for separation train 1 [kg/h]
F3 = feed stream for separation train 2 [kg/h]
F4 = feed stream for separation trains 3 and 4 [kg/h]

F5 = feed stream for separation trains 5 and 6 [kg/h]
F6 = feed stream for separation train 3 [kg/h]
F7 = feed stream for separation train 4 [kg/h]
F8 = feed stream for separation train 5 [kg/h]
F9 = feed stream for separation train 6 [kg/h]
yi = synthesis variable indicating the existence of feed stream Fi
QCW = cooling water duty [J/h]
ρCW = cooling water density [kg/m³]
Cp,CW = cooling water heat capacity [kJ/kg K]
∆TCW = minimum approach temperature for cooling water [K]
QChW = chilled water duty [J/h]
ρChW = chilled water density [kg/m³]
Cp,ChW = chilled water heat capacity [kJ/kg K]
∆TChW = minimum approach temperature for chilled water [K]
Qref = refrigerant duty [J/h]
ρref = refrigerant density [kg/m³]
Cref = refrigerant heat capacity [kJ/kg K]
∆Tref = minimum approach temperature for refrigerant [K]
Djk = jth component-rich distillate obtained for the kth separation [kg/h]
Bjk = jth component-rich bottoms obtained for the kth separation [kg/h]
Djknoisy = jth component-rich distillate obtained for the kth separation when noise exists [kg/h]
Bjknoisy = jth component-rich bottoms obtained for the kth separation when noise exists [kg/h]
ω = scaling parameter limiting the amount of RM fed to the reactor
RRk = reflux ratio for the kth separation
Treb,k = reboiler temperature for the kth separation [K]

Example 2
i = dummy index for a repeated process
ADH = alcohol dehydrogenase
F = fermentation process
H = homogenization process
Ci = ith centrifugation process
Pi = ith precipitation process
z1 = ADH yield [kg/h]
z2 = ADH purity
y1,1 = number of homogenizer passes
y1,2 = number of centrifuge disks for the first centrifugation
y1,3 = number of centrifuge disks for the second centrifugation
y1,4 = number of centrifuge disks for the third centrifugation
x1 = glucose concentration [kg/m³]
x2 = homogenizer pressure [N/m²]
x3 = centrifuge disk stack angle for the first centrifugation [rad]
x4 = precipitant concentration for the first precipitation [kg/m³]
x5 = centrifuge disk stack angle for the second centrifugation [rad]
x6 = precipitant concentration for the second precipitation [kg/m³]
x7 = centrifuge disk stack angle for the third centrifugation [rad]
zF = any output variable for the fermentation process
zH = any output variable for the homogenization process
zC1 = any output variable for the first centrifugation process
zP1 = any output variable for the first precipitation process
zC2 = any output variable for the second centrifugation process
zP2 = any output variable for the second precipitation process
zC3 = any output variable for the third centrifugation process
α = standard deviation parameter designating the process noise intensity


Received for review January 6, 2008
Revised manuscript received March 14, 2008
Accepted April 15, 2008

IE800028A