Sequential Experimental Design Strategy for Optimal Batch Profiles

An underlying experimental design algorithm for obtaining optimal batch profiles in a sequential ... illustrated through two end-point optimization pr...
0 downloads 0 Views 300KB Size
5260

Ind. Eng. Chem. Res. 2004, 43, 5260-5274

Sequential Experimental Design Strategy for Optimal Batch Profiles Using Hybrid Function Approximations Junghui Chen* and Kuan-Po Wang R&D Center for Membrane Technology, Department of Chemical Engineering, Chung-Yuan Christian University, Chung-Li, Taiwan, 32023, Republic of China

An underlying experimental design algorithm for obtaining optimal batch profiles in a sequential fashion is addressed. The proposed design technique, composed of the hybrid function approximation and the Taguchi orthogonal array, is developed to determine the new design profiles in the next run. This hybrid type of function approximation allows the global and local approximations to construct the profiles in the whole design space and some local regions, respectively, for a wide range of the batch profiles. It can convert the batch profiles into a set of function coefficients. The optimal profile can be obtained if the location of the function coefficients is properly adjusted in the function space. The Taguchi approach is used to design experiments and analyze the outcomes of each experimental design before conducting the next new run. To reduce the number of experiments in each run, a search procedure and a forbidding strategy are proposed according to the effect of each coefficient. They adjust the design coefficients and freeze the undesigned coefficients. Without prior knowledge of the batch processes, the proposed method using information from the previous batches can update and modify the profiles that will be applied to the subsequent experiments. The performance of the proposed method is illustrated through two end-point optimization problems, including a nondifferential system and a fed-batch process. Comparisons with some other optimization methods are also made. 1. Introduction Generally, the design problem of batch reactors lies in optimization of the yield and control of the product quality at the end of the batch cycle by following a batch profile. Pontryagin’s principle is one of the widely used approaches to solving this optimal control problem,1-4 but this approach is limited to the number of control variables and the model complexity. Alterative approaches that use numerical methods have been applied for handling nonconvex nonlinear problems, such as dynamic optimization, dynamic programming, and genetic and evolution algorithms.5-7 In all of the above approaches, the best operating conditions can be obtained theoretically under the assumption that accurate batch models are available.8 However, for practical applications, the optimization results might not be reliable because of the mismatch between the process and the model. Iterative learning control approaches integrated with model predictive control (MPC) and batch-to-batch optimization have been developed to correct the persisting error calculated from the model using the measurements of each batch run.9,10 Even though these approaches compensate for the model mismatch, they are still based on a first-principles model. The modeling of batch processes can be difficult because batch processes often involve complex and nonlinear reaction behavior. Furthermore, the above approaches require on-line measurements. They cannot be used when the product quality can only be examined off-line in a laboratory using batch-to-batch optimization. The optimization of batch processes based on the neural network approach has also been developed11,12 as neural networks provide an attractive solution to the * To whom correspondence should be addressed. E-mail: [email protected]. Fax: 886-3-2654199.

modeling problem. The difficulty in applying neural networks is that sufficient experimental data are required to construct a good, healthy neural network model. In reality, however, too few experiments are performed before product launch because of limited time and budgets. With the time-to-market pressure of the dynamic marketplace, one cannot obtain sufficiently accurate and detailed mathematical models of the underlying complex process phenomena in a short time period. Therefore, it is not easy to optimize the batch process on the basis of theoretical assumptions alone. Instead, many experiments have to be performed. Because the operation of batch processes is expensive, it is necessary to design batch process experiments carefully to meet the desired goal. An experimental design based on experience and intuition is often used, but it does not necessarily work effectively. Sometimes, it is also difficult for people to determine the optimal experimental conditions from historical operating data records because of the overwhelming amount of data and the complex nature of chemical processes. Duchesne and MacGregor incorporated conventional experimental design into a multiblock partial least-squares (PLS) model for the optimization of batch profiles.13 Because of the restriction that only conventional experimental design be used for the fixed manipulated variables, the batch operating profiles had to be cut into several pieces before application to the experimental design table. The resulting design profiles were piecewise discrete instead of smoothly continuous. Then, the different values of the finite levels at each time interval were changed according to the multiblock PLS algorithm and factor design to analyze the time-specific effects of process variables on the final product quality and to identify profile features. However, it was not practical to add the design experiments

10.1021/ie030588t CCC: $27.50 © 2004 American Chemical Society Published on Web 07/16/2004

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5261

to the nominal operating conditions at each time interval for a wide variety of the batch profiles. Although this approach emphasized the idea that the intermediate quality measurements could provide more process information for the profile design, prior knowledge of the number of the time intervals and the changed scales at the corresponding time intervals were needed before an appropriate profile experiment could be built. Therefore, a measurement-based strategy for designing continuous optimal operating profiles is strongly needed to improve the process quality. Optimal profile design is considered as an optimal control technique. Such approaches are aided by using a parametric representation for the operating profile. This approximation is one of the fundamental issues exploited in a number of applications, such as data classification and data mining.14 Many different parametric representations for capturing the characteristics of the profile for control design have been used, such as Fourier series and Laguerre functions. Depending on the range of applicability, such representations can be classified as global and local approximations. Global approximations are valid in the whole region defined by side constraints. Although this approach allows for the construction of explicit approximations valid in the entire design space, it cannot easily match the occasionally rapid changes that occur in the design profile, such as in a fed-batch system whose feed flow rate is being changed in a ramp at some operating duration. On the other hand, local approximations are valid only in the vicinity of the points at which they are generated. They can help to reduce the complexity of a problem in some local region. Because of their local characteristics, they lack a global perspective and convergence to a smooth profile. Our previous work has been focused on modeling the operating profile with a global function approximation for optimal experiment design.15 In this research, a hybrid type of profile approximation containing both global and local functions is developed to further improve batch profile design. The local function is considered to enhance the flexibility of the optimal profile design. On the basis of the required design, the proposed method can properly patch the essential features in different time locations so that a systematic profile design procedure can be conducted. This paper is structured as follows: The second section defines the types of batch design problems addressed. The third section makes a comparison between two types of basis functions in batch design problems: global and local function approximations. The designed operating profile combining local and global functions is developed. The design method, which is discussed in detail in section 4, is a sequential technique that integrates the orthogonal experimental array, the search method, and the uncertainty analysis. The optimization search procedure for this integrated method is described in section 5. The effectiveness and potential applications of the proposed method are demonstrated through two examples in section 6. Finally, concluding remarks are made. 2. Problem Formulation The calculation of optimal input profiles used to optimize an objective function at the end of the batch

cycle can be formulated as the following control problem

max J(y(tf,u(t))) u(t)

(1)

subject to

x3 ) g(x,u) y ) h(x,u)

}

0 et e tf

(2)

where J is an objective function; x is the state vector; y is the quality variable; u represents the manipulated variables [L eu(t) eU]; tf is the duration of each batch run; U and L are the lower and upper bounds of the manipulated profile, respectively; and g and h represent descriptions of the dynamic models of the batch process. The optimal experimental setup can be found by searching for the optimal operating profile [uopt(t)] that leads to the maximization of J(y(tf,uopt(t)))

uopt(t) ) argmaxJ(y(tf,u(t))) u(t)

(3)

Although it is straightforward to optimize this equation, finding appropriate mathematical models is as difficult for batch processes as it is for continuous processes, because both types of processes are constructed in terms of conservation equations and complex reaction kinetics. In this research, nonmodel-based experimental design (i.e., without knowledge of g and h) is considered only to meet the needs of the practitioners in practical batch process applications. 3. Profile Approximation Based on Hybrid Basis Functions A complicated operating profile (which might be given explicitly or only implicitly) is approximated by a simple function, f(t). By the approximation theorem, a function f(t) that is assumed to be continuous and square integrable in the range Ω ) [a, b] can be spanned in terms of the linear combinations of a set of predeterN-1 mined orthnormal basis functions {φn}n)0 N-1

f(t) ) f(c,t) )

∑ cnφn

(4)

n)0

where the linearly independent functions φn(t) span the function space and constitute a set of basis functions for the function space. c ) [c0 c1 ‚‚‚ cN-1]T represents unknown coefficients or parameters determined from the data. If the basis functions are linearly independent, the elements of c can be computed by the projection of f(t) onto each basis function

cn )

∫Ωf(t) φn(t) dt

(5)

Therefore, using the approximated function, the original operating profile f(t) is mapped onto the new feature coefficients in function space, i.e., f(t) transforms the operating profile into key parameters related to the operating conditions in each profile. Figure 1a,b explains the general concept of the mapping procedure from the design profile to the feature coefficients. The point in the function space spanned by the set of independent basis functions represents one batch profile. Therefore,

5262 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004

That is, the whole profile would be changed even if only some of the coefficients in eq 4 were changed. On the other hand, the Mexican hat function, which is a typical wavelet function, is used to explain local functions. The discrete wavelet is generated by means of scale and translation operations16

φm,p(t) ) 2-m/2φ(2mt - p)

Figure 1. Operating profiles are mapped onto the coefficients. (a) Two batch profiles: the optimal profile (line with circles) and the design profile based on the global function approximation (line with squares). (b) Coefficient plot for the corresponding profiles: the optimal point (circles) and the design profile (squares).

Figure 2. Legendre polynomials.

if the locations of the coefficients are properly adjusted in the function space, the optimal profile can be obtained. Depending on the selection of the predetermined basis functions, the function approximation falls into two broad categories: global [φg(t)] and local [φl(t)] basis function approximations. Before detailing the proposed hybrid basis function approximation, we digress somewhat to provide some background information about global and local basis functions. 3.1. Global Basis Functions vs Local Basis Functions. One type of global function, Legendre polynomials, is taken as an example. Legendre polynomials are orthogonal on the interval [-1, 1]. Legendre polynomials of different degrees are shown in Figure 2. It is observed that each polynomial spans the entire space window.

(6)

where φ(t) is a time scaling function with finite energy and rapid decay. The dilation variable m and the translation variable p are nonnegative integers. In Figure 3, all such functions, no matter how they are scaled or translated, have the same shape. Each of the scaling functions is considered a window; both the width and height of the window change with each different scale m and p. As m changes, the function covers a nonzero interval of different size. Thus, a smaller window (smaller m) can be constructed for higher frequencies or a larger window (larger m) for low frequencies. Figure 3 shows that translation parameter p simply shifts the function. By the operations of dilation and translation, the local function can be narrowed, widened, and/or shifted. The lower part of Figure 3 represents the positions of the local function family. This representation is the same as the twodimension resolution defined by Mallat.16 As with a jigsaw puzzle, the selection of the proper windows with different dilations and translations (m, p) can be used to construct an approximate function without bruteforce computational speed. 3.2. Hybrid Function Approximation. Global approximation allows for the construction of an explicit approximation that is valid in the whole design space. Because variations in some local regions occur, many global basis functions should be constructed. An alterative approach based on variations of local approximations can be used. With local function approximation, the approximated function can be broken into several local neighborhoods, with a separate model used for each neighborhood.17 The global approximation can be applied to both the coarse and smooth parts of the whole system. The result of a local approximation applied to the fine and varying parts of a region containing a critical zone can provide enhanced accuracy and expanded applicability. K Assume that a set of data {(tk f(tk)) ∈ R × R}k)0 is first obtained from the sampling data of an operating profile. For brevity, only a single profile is considered here, but extending the application of this technique to a multiprofile operating system is straightforward. In our specific application, tk represents the sampling point, and f(tk) describes the corresponding operating condition at tk. Assume that this set of data can be represented by a function with the global basis functions {φgn(t)}, f: R f R N-1

f(t) ≈ f(cg,t) )

∑ cgn φgn(t)

(7)

n)0

With the fixed terms of the basis functions, the approximation model can be viewed as a linear regression model

d ) Φcg + e

(8)

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5263

Figure 3. Localizations of the Mexican wavelet at different dilations (m) and translations (p): (a) time domain and (b) dilation-translation plane.

where

d ) [f(t1) f(t2) ‚‚‚ f(tK)]T g cg ) [cg0 cg1 ‚‚‚ cN-1 ]T

[

e ) [e(t1) e(t2) ‚‚‚ e(tK)]T φg0(t1) φg(t ) g Φ ) [φg0 φg1 ‚‚‚ φN-1 ]) 0 2 l φg0(tK)

φg1(t1) φg1(t2) l φg1(t2)

‚‚‚ ‚‚‚ s¨ ‚‚‚

]

(9) g φN-1 (t1) g φN-1 (t2) l g φN-1 (tK)

Equation 8 above can be solved by the standard leastsquares method. To properly reduce the approximation error and cover more profile information, a new basis function (φgN) is added

d ) [A φgN ]

[]

cg +e cgN

(10)

N-1

cgn φgn(t) ∑ n)0

dates on the basis of the approximation error function. The time point of the maximum approximation error is

tekmax ) arg max |e(tk)| tk

There is always something to be corrected from the mismatch between the approximate profile and reality. For the function approximated with the global basis function, the approximation error is

e(t) ) f(t) -

Figure 4. Dilation-translation of the local approximation to reduce the approximation error with the adding the selected local function.

(11)

To reduce the prediction error, the appropriate local basis function should be constructed. However, many candidate local basis functions often do not contribute significantly to the corresponding locations for the purposes of profile approximation. A more systematic and economical approach is to properly select the basis functions to accommodate the area of large errors. Figure 4 illustrates the procedure for selecting candi-

k)1,2,...,K

(12)

Here, the prediction error at the time point tekmax is significant when compared with the errors at the other time points. This means that a local basis function in the function space is needed to obtain the desired accuracy. The location of the first selective basis function (h ) 0) at the cell (mh, ph) of the multiresoulton plane is

ph ) int[2mh(temax - 2-mh-1)]

(13)

where int means that the argument is rounded to the next largest integer and (mh, ph) represents the location l of the hth local function [φlh(t) ≡ φm (t)] to be added. h,ph The basis functions are selected from the lowest scale, mh ) 0, at the position ph. If the selected cell (mh, ph) has already been used, the so-called interval-halving search technique is employed. This means that the scale

5264 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004

is moved into the higher level, i.e., mh ) mh + 1, and the time period length is cut in half. The lower part of Figure 4 shows the horizontal position and length of each box corresponding to the position and length of the time period being investigated. On the vertical axis, the boxes are ordered according to the different dilations m. The upper part of Figure 4 demonstrates graphically that the approximation error will be reduced when the local basis function is added at the point of maximum error. l Based on the local wavelets selected, {φlh} ) {φm (t)}, h,ph with the support of the corresponding position located at (mh, ph) of the mutliresolution plane, the values {clh} ) {cmh,ph} can be obtained by solving the equation

|

min max e(tk) cmh,phk)1,2,...,K

l cm ,p φm ∑ ∑ ,p (tk) , | m p h

h

h

h

h

(mh, ph) ∈ S

h

(14)

where S denotes the set of positions of the current and previously selected basis functions. Note that the profile or shape is important in some local regions. If the above approximation is based on the L2 norm instead of the min-max operation, the L2 minimization procedure prevents the error of the approximation in these regions from being small. The min-max operation searches down to the closeness that guarantees some minimal proximity of

max k)1,2,...,K

|e(t ) - ∑∑c k

mh ph

l mh,phφmh,ph(tk)

|

to produce the estimate18 l Σmh,Σphcmh,phφm (t ) h,ph k

f(t) ) f(cg,t) + f(cl,t) N-1

H-1

∑ cgn φgn(t) + h)0 ∑ clh φlh(t) n)0

cg0 trial 1 2 3 4 avg effect

value -195.54 -198.20 -143.93 -88.90 -156.64 -

level 1

cg1 level 2

level 1

cg2 level 2

level 1

level 2

-15.00 -7.00 -3.00 0.00 0.00 2.00 -195.54 -195.54 -195.54 -198.20 -198.20 -198.20 -143.93 -143.93 -143.93 -88.90 -88.90 -88.90 -196.87 -116.42 -169.73 -143.55 -142.22 -171.06 80.45 26.18 -28.84 -

should be reduced in some local regions. The major steps for the function approximation based on the use of hybrid global and local functions can be outlined as follows: 1. Obtain the global function approximation of the predefined operating profile in terms of the global function coefficients. 2. Remove the global approximation function to obtain the global approximation error. Check whether the accuracy of approximation is acceptable; if not, proceed to local function approximation. 3. Calculate the time point where the maximum approximation error occurs and locate the corresponding multiresolution local basis function at this time point. 4. Compute the new approximation error for the whole measurement profile after adding the new local function to the current approximation profile. 5. Check whether the accuracy of the current approximation is acceptable; if not, continue by adding another new local function, and go to step 3. 4. Sequential Design Strategy

This is important for local approximation applications. The min-max problem in eq 14 can easily be solved by transforming into linear programming. After the new basis function is added, the prediction error is reduced. The same procedures for finding the next location of the local function (h ) h + 1) are repeated until a function approximation that satisfies the defined accuracy is found. This profile is approximated similarly to a wavelet network,19-21 not the conventional wavelet decomposition of the measurements. Based on the wavelet network approximator, the advantage is that the proposed method following the multiresolution formulation inherited by wavelets includes the step of inserting an additional basis function. Furthermore, only a smaller number of wavelet functions is used to reconstruct the approximated function. The number of wavelet coefficients (factors) to be adjusted is decreased, and the corresponding number of experiments is also reduced. In Figure 1, the approximation error can be rapidly reduced if the local function covers the region between 6 and 14. Finally, the approximated profile is given by

)

Table 1. Orthogonal Experimental Design of Example 1 for the First Run

(15)

With the patch local function strategy, adjusting or adding global coefficients that influence the entire region is not needed when the approximation error

An experimental design is a deliberate manipulation of a process. It intends to measure the effect of one or more experimental factors on some set of responses. There are two major benefits of experimental design. First, it helps determine the critical information faster. Also, it is more economical and reliable than design with haphazard or naive experiments. With incremental design factors, the additional complexity should be accommodated with experiments of a reasonable size. In this section, a sophisticated method is presented. It combines an experimental design and an optimal search strategy with uncertainty analysis to provide a framework for exploring a possible optimal operating profile and a systematic sequential strategy for convergence of the optimization. 4.1. Experimental Design. The coefficients from the approximated function can be assigned to the columns of an orthogonal array. Suppose that an experimental objective outcome J is a function of several coefficients [c0 c1 ‚‚‚ cN-1] whose values can be controlled and that J ) f(c0, c1, ‚‚‚ cN-1). The adjusted coefficients [c0 c1 ‚‚‚ cN-1] are factors. The goal is to find the factors that will have the most beneficial effect on J when the factors change. This can be done by varying each factor independently of the others and recording the corresponding change in J. Orthogonal array designs are disciplined ways of varying two or more factors simultaneously. In one of the simplest orthogonal array designs, each individual factor varies between two levels (Table 1). More explanation of this table is included in the illustration of example 1.

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5265

performance to the total effect of two different levels (Avgcji, j ) 1, 2)

Ravg ci )

min(Avgc1i , Avgc2i ) Avgc1i + Avgc2i

(17)

This ratio ensures that the new design levels will be close to the current adjusted coefficient with the best performance. Remark 2: Coefficient Bounds. For the operating profile in the feasible region, a heuristic rule of the bounds of each global and local coefficient should be set up before an experiment is run. The bounds on coefficients are discussed in the Appendix. Remark 3: Forbidding Strategy. By forbidding changes in these coefficients, the bad coefficients without significant contributions can be categorized and avoided. To prevent changes of the coefficients, it is necessary to check whether the current adjusted coefficient is readjusted in the next run. For the forbidding strategy of the search algorithm, two conditions are proposed here Figure 5. Search strategy for finding the possible optimal objective function. The gray background shows the possible optimal region. The solid lines represent two level locations of the current experimental design; the bold line indicates the optimal location of the current run, and the dashed line, the location of the previous experimental points.

4.2. Search Strategy. The region elimination method is used here to adjust factors and obtain the optimal coefficient combinations. The region elimination method is a linear search for deleting a calculated portion of the range of current coefficients in each successive stage of the search for an optimal quality J. Take one of the , for example, where ccurr ,j) current coefficients, ccurr i j 0, 1, ..., i - 1, i + 1, ..., N - 1, are fixed. In Figure 5a, must contain values of the region to the left of ccurr,2 i ) < J(ci). The next step evaluates the quality at J(ccurr,2 i (Figure 5b). The quality at this location the point cnew i ) has two possible outcomes: (i) If J(cnew ) g (cnew i i ) (Figure 5c1), the same procedure at Figure 5a J(ccurr,2 i is repeated until the new point in the search direction ) e J(ccurr,2 ) (Figure 5c2), the is obtained. (ii) If J(cnew i i possible maximum is bracketed by the two points ccurr,1 i and cnew . The size of the interval [ccurr,1 , cnew ] is reduced i i i (Figure 5d), and the same procedure as in Figure 5a is repeated to search the new interval location. The same update procedures are repeated until the desired quality undergoes no significant change. Remark 1: Coefficient Levels. The upper and lower levels of the adjustable coefficient (ci) in the new design are determined by the average effect of two levels (ccurr,1 and ccurr,2 ) at the current run i i curr,1 cnew,1 ) cbest + Ravg - ccurr,2 ) i i ci (ci i curr,1 cnew,2 ) cbest - Ravg - ccurr,2 ) i i ci (ci i

(16)

where cbest is one of two levels that has the best i is the ratio of the minimum average performance. Rcavg i effect of the current coefficient at one level with the poor

(i)

Ravg ci < s 1

N



| |

(18)

(ii) Effectci < w Effectci Ni)1

where s and w are ratio and effect factors, respectively, for the forbidding strategy. s and w values of 0.3 and 0.4, respectively, are used here. N is the number of adjustable coefficients. Effectci is the average effect computed by subtracting the average outcome of the level 1 values from the average outcome of the level 2 values of the factor cg0 (Table 1). The first rule is used to limit the adjusted factor because the small ratio Rcavg i does not provide significant search space. The second rule is used to freeze some adjusted factors when their effects from the orthogonal experimental array are below a weight of the average effect of all coefficients in this experiment. This prevents these bad coefficients from being revisited at the next run. 4.3. Uncertainty Analysis in Local Regions. It is important to identify the uncertainty in the possible new operating profile and characterize the time regions in which there are dominant contributors to the uncertainty. Therefore, the time regions of the manipulated profile should be identified and adjusted at the duration of the new operating profile. Here, sensitivity analysis is used to assist the uncertainty region analysis in achieving this goal. The analysis is undertaken to determine the possible time intervals that have to be changed to improve the operating performance. To explore the possible improvement conditions, the quality output uncertainty at a time point can be estimated as a function of the operating point uncertainty using a first-order approximation. The approximation of uncertainty associated with the model performance measure is derived by obtaining the variance of a first-order approximation of a Taylor series expansion of the model performance at the different time points. It can be expressed as

J(t) - Jopt )

∑ aj,t [fj(tk) - fopt(tk)]

k)1

k

(19)

5266 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004

Figure 6. Uncertainty analysis used to determine the possible regions for performance improvement. The three-dimensional plot represents the regression coefficients at different time points, and the two-dimensional plot is used to collect all coefficients to explain the possible design time regions.

where fopt(t) is the past operating profile that has the best performance and is regarded as a reference point. Jopt is the optimal value under the operating condition fopt(t). The regression coefficient, aj,tk ≡ [∂J/∂f(tk)], represents the amount of change in J(t) with respect to the change in f(t) at the time point tk. The coefficients (aj,tk) can account for the (predicting) variance. They can be derived from multiple regression based on a linear combination of the operating conditions in the interval or at dichotomous time points. To regress the reliable equations, the past operating profiles that are close to the best operating profile are selected. In our previous discussion, the implicit operating information can be transformed into key coefficients c that cover the necessary part of the operating conditions in each run. Thus, the selected operating profiles can be rewritten as the distance metric of the coefficients

||fopt(t) - fj(t)|| e Df w ||copt - cj|| e Dc (20) where Df and Dc are threshold values. The Euclidean distance is used as a similarity measure between the coefficients copt of the optimal operating profile and the other coefficients cj of the other profile. It is clear to see that two profiles are similar if ||copt - cj|| e Dc. In eq 19, the maximum number of regression coefficients is restricted to the number of operating profiles satisfying eq 20. If R operating profiles are selected, the time points at the corresponding coefficients aj,tk can be defined as

tk ) k∆th

(21)

where ∆th is the time interval. Furthermore, considering the variation at the initial and final time points, the regression at these two points would be included. The time interval is defined as

∆th )

tf (R + 1) - h

Figure 7. Selection of the local functions to be added from the scale-translation plane.

relative variance powers at the corresponding time points. Because of the different prediction errors in each regression equation, the accuracy of variations of each regression equation should be weighted. Here, the minimum error (emin ) min eapp j ) is used to scale the other error that explains the relative accuracy of the app regression coefficients, wj ) emin/eapp ) J(t) j , where ej opt opt - J - Σk)1aj,tk[fj(tk) - f (tk)]. Thus, the variation at the time point tk by multiplication of wj and aj,tk is defined as

S(tk) ) wjaj,tk

The middle part of Figure 7 shows that after all regression equations are projected onto the S-t plane, four proper basis functions in the corresponding coefficient regions that are most important to the operating profile can be added to improve the performance. For example, one of the regions is around [tu,i, tl,i]. The location (mi, pi) of the local function being added to this region on the multiresolution plane can be located by

[

(

mi ) int

)]

tu,i - tl,i tf log 2

-log

(22)

where h ) 0, 1, 2, ..., R - 2, and k ) l, l + 1, ..., R - h - l - 1. l ) 0 represents the regression without both the initial and final time points. l ) 1 and l ) 2 are the regressions with the initial time point and with the final time point, respectively. A three-dimensional plot in Figure 6 shows that the regression coefficients of the regression equations are computed at different points. On the vertical axis, the values of these coefficients are the magnitudes of the

(23)

and

[ (

pi ) int 2mi

)]

tu,i + tl,i - 21-mi 2

(24)

The bottom part of Figure 7 indicates that four local functions should be located. The top of Figure 7 shows the current design profile (solid line) and the new design profile (dashed line) after addition of these local functions. Using this analysis framework, one can see how

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5267

Figure 8. Optimal profile design architecture.

the variance of the product quality can be explained by changing the operating profile at one or a set of time points. 5. Design Procedures Based on the proposed strategy, the working logic of the proposed optimal profile architecture is sketched in Figure 8. Reference operating profiles can be used as initial design profiles if past operating experience or basic physical phenomena of this design system are available. If not, a straight-line profile can be used initially. The whole procedure is sequential and is described stepwise as follows. Phase I: Profile Design Based on the Global Function Approximation. (1) Select the optimization goal (Jt), the minimum improvement ratio of two successive runs (R), and the maximum number of runs to be conducted in this design (rmax). Set r ) 1.

(2) Construct the Taguchi experimental array and the levels of each coefficient. (3) Approximate the operating profile using the proper number of coefficients. Perform the profiles of the batch experiments. (4) Select the optimal experiment design profiles whose corresponding best quality is Jopt r in the current experimental design. is converged to a certain threshold value (5) If Jopt r t (J ) or the number of experimental design runs exceeds the maximum number (r > rmax), stop the experimental design. (6) If the quality is significantly improved in two opt opt successive runs, i.e., Jopt r - Jr-1 > R(Jr ), go to step 7. opt If the current best quality with N basis functions (J•,N ) is significantly improved in comparison with the past opt best quality with N - 1 basis functions (J•,N-1 ), add a

5268 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004

new basis function, set N ) N + 1, and go to step 7; otherwise, go to step 8. (7) Use the search strategy and the effect analysis of the Taguchi table to search for the optimal region of the quality response and to forbid some coefficients. Set r ) r + 1 and go to step 2 Phase II: Profile Design Based on the Local Function Approximation. (8) Conduct the uncertainty analysis around the current best design profiles to determine the possible locations where new local functions should be added. (9) Repeat steps 2-5 of the design procedure of the global function approximation. (10) If the quality is significantly improved in two opt opt successive runs, i.e., Jopt r - Jr-1 > R(Jr ), go to step 11; otherwise, stop the experimental design. (11) Use the search strategy and the effect analysis of the Taguchi table to search for the optimal region of the quality response and to forbid some coefficients. Set r ) r + 1 and go to step 8.

Figure 9. Initial design profiles in example 1.

6. Illustrations Two examples are employed to illustrate the operation and test the performance of the proposed scheme. For simplicity, a mathematical function with a single profile is used in the first example. This example helps explain why the performance of the objective function with the hybrid-type profile approximations is improved compared to that with either the global or local profile approximation alone. The second example is concerned with an optimization of a fed-batch bioreactor. It demonstrates how this design scheme performs on single-feed and double-feed problems. A performance comparison between the proposed method and the other optimization methods is also included. These examples are discussed separately in the following subsections. In these examples, a global function based on the Legendre polynomial and a local function based on the Mexican-hat function are employed. Example 1: Nondifferentiable System. As a primary test, the design procedures are used to detect the minimum of a benchmark problem22 that is described by

dx1 ) x2 dt dx2 ) - x1 - x2 + 100[H(t - 0.5) - H(t - 0.6)] (25) dt dx3 ) 5x12 + 2.5x22 + 0.5u2 dt where the initial state is x(0) ) [0 0 0]T and the operating time for each run is tf ) 2. Two unit step functions, H(t), are used to generate a rectangular pulse disturbance at 0.5 e t e 0.6. The performance index for this example is defined as

J ) max [-x3(tf)] u(t)

(26)

Initially, we have no idea which terms of the orthogonal functions should be included, so only the first threeterm global functions including a straight line are used. Four profiles are set as the initial reference profiles (Figure 9). The Taguchi table is reproduced at the first run (shown in Table 1) to evaluate the effect of the

Figure 10. Addition of the local functions using the uncertainty analysis in example 1.

different experimental coefficients. The columns labeled cg0, cg1, and cg2 represent three experimental global coefficients. Four combinations of experimental coefficients are needed for these coefficients. Each combination uses levels 1 and 2 in cg0, cg1, and cg2 columns (without dashes) of the row containing the trial number. The J value of each experimental coefficient is shown in the column labeled “value.” The row labeled “effect” in the cg0, cg1, and cg2 columns contains the average effect on the outcome of moving a factor from its level 1 value to its level 2 value. Table 1 indicates that cg0 and cg1 should increase and cg2 should decrease to increase the J averaged value. The same procedures are repeated until at run 4 when the quality performance cannot be significantly improved (R ) 0.2%). The design based on the local approximation is applied. From the past 36 experiments, the uncertainty analysis shown in Figure 10 indicates that three locations should be filled with three local functions. A new experimental design is then performed. The same sequential design procedures are repeated until after run 13 when the stopping criterion (J e 58.2) is satisfied. A total of 13 runs with 52 experiments are conducted to determine the optimum profile. The optimal approximated functions at some runs and the evolving history of each run are plotted in Figure 11. The graphical representation of the successive control performances in the first few runs are significantly improved because the skeleton profile based on the global functions can be quickly obtained. After run 4, the control performance is improved gradually with the addition of new local functions (Figure 11). The problem is also tested with the profile based on only the global function approximation and that based on only the local function approximation. The optimal

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5269

Figure 11. (a) Optimal profiles and (b) average and best objective functions for different runs performed using the hybrid function approximation in example 1.

Figure 13. (a) Optimal profiles and (b) average and best objective functions for different runs performed using the local function approximation in example 1.

Figure 14. Optimal profiles computed from the different design methods in example 1.

Figure 12. (a) Optimal profiles and (b) average and best objective functions for different runs performed using the global function approximation in example 1.

profiles for some runs are separately shown in Figures 12a and 13a. As with the hybrid approximation, the shapes of the optimal profiles are changed after the new

functions are added to increase the search space. The corresponding optimal performance for that run is also changed (Figures 12b and 13b). Table 2 presents the performances of these three types of profile approximations. Not surprisingly, all final optimal results are almost the same, but the hybrid approximation requires the fewest experiments for the design. Iterative dynamic programming (IDP) has been applied to this case to determine the optimal profile.22 The design profiles based on solving the IDP against the previous design profiles are replotted in Figure 14. The results of the proposed method without a system model (58.19) are acceptable compared to the solution of the IDP (58.18) for the given dynamic model. Example 2: Fed-Batch Bioreactor. Basically, the control task of fed-batch fermentation lies in the determination of the proper feed rate of a substrate that facilitates both the growth of the cell concentration and

5270 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 Table 2. Summary of the Different Design Methods in Example 1 method

Jopt

number of expts

global local hybrid IDP

58.94 58.39 58.19 58.18

85 78 52

the concentration of the desired product during the fermentation time. The analytical model of the fed-batch fermentation to be designed is23,24

(F1 + F2) dXv ) (µ - kd)Xv Xv dt V (F1 + F2) dGlc F ) Glcin Glc - qGlcXv dt V V (F1 + F2) dGln F2 ) Glcin Glc - qGlnXv dt V V (F1 + F2) dLac ) qLacXv Lac dt V

(27)

(F1 + F2) dAmm ) qAmmXv Amm dt V (F1 + F2) dMAb ) qMAbXv MAb dt V dV ) F1 + F2 dt

Figure 15. (a) Optimal profiles and (b) average and best objective functions for different runs for the single-feed design in example 2.

with the following kinetic expressions

µ ) µmax

(

)(

Table 3. The objective function is the total amount of monoclonal antibody obtained at the end of the fed-batch fermentation (tf ) 10 days)

)

Glc Gln kGlc + Glc kGln + Gln

J ) max[MAb(tf) V(tf)]

kd ) kd,max kd,Gln (µmax - kd,LacLac)(µmax - kd,AmmAmm)(kd,Gln + Gln)

(

)

qGlc )

µ Glc + Glc + mGlc YXv/Glc km,Glc

qGln )

µ YXv/Gln

(28)

qLac ) YLac/GlcqGlc

qAmm ) YAmm/GlnqGln

qMAb )

R0 µ+β kµ + µ

where Xv, Glc, Gln, Lac, Amm, and MAb are the concentrations of viable cells, glucose, glutamine, lactate, ammonia, and monoclonal antibodies, respectively. V is the fermentor volume. Glcin and Glnin are the concentrations of glucose and glutamine, respectively, in the feed stream. The parameter values are listed in

The constraints on the control variable and the culture volume are F(t) e 0.5 L/day and V(t) e 2 L. Two different operating cases are investigated: (1) a single-feed case that involves combining two feeds, replacing F1 + F2 by F, and (2) a double-feed case that involves two separate feed volumetric feed rates, F1 and F2. Figures 15a and 16a present the optimal feed profiles of these two cases for some runs. The design performance as a function of batch number is shown separately in Figures 15b and 16b. Figure 17 shows the design factors for the single-feed case and the corresponding number of experiments during each run, where the global and the local function approximations are applied from run 1 to run 9 and from run 10 to run 19, respectively. We can see that the forbidding strategy can freeze some factors without significant contributions even if the same factors are selected in some runs. This strategy provides a way to reduce the number of

Table 3. Parameter Values in Example 2 parameter µmax YXv/Glc mGlc kglc R0 β kd,amm YLac/Glc

value

parameter

value

1.09 1.09 × 108 cells mmol-1 0.17 mmol × 10-8 cells day-1 1.0 mM 2.57 × 10-8 mg of cells 0.35 × 10-8 mg of cells 0.06day-1 mM-1 1.8 mmol mmol-1

kd,max YXv/Gln km,glc kgln kµ kd,lac kd,gln YAmm/Gln

0.69 day-1 3.8 × 108 cell mmol-1 19.0 mM 0.3 mM 0.02 day-1 0.02 mM 0.02 mM 0.85 mmol mmol-1

day-1

(29)

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5271

Figure 17. (a) Number of experiments and (b) number of factors for different runs for the single-feed case in example 2. Table 4. Summary of the Different Design Methods in Example 2 single profile process structure

design method

Jopt

unknown hybrid (Mexican-hat) 310.3 hybrid (Haar) 311.0 known GS (Roubos et al.27) 312.9

Figure 16. (a) Optimal profiles of F1(t) and F2(t) and (b) average and the best objective functions for different runs for the doublefeed design in example 2.

experiments. In a comparison between these two cases, the performance of the double-feed case presents a significant improvement. In the double-feed case, initially, two coefficients for each profile are implemented simultaneously. The two profiles in the first four runs can be quickly skeletonized because the global functions dominate the change of the whole profile. After convergence, the sequential design of each profile to add the local basis function is applied. F1(t) is first updated by adding the new coefficient, and F2(t) is kept at the previous profile. When F1(t) is converged, F2(t) is updated by a new coefficient with the fixed previous profile F1(t). The alternating design of F1(t) and F2(t) can be continued until the convergence of all design profiles is achieved. As in the simplex design method, a sequential iteration for the individual profile is used to determine better-quality conditions. The local functions added to the skeleton profile that is built from the global basis functions only affects the local behavior. Thus, for each design stage, the system is designed on

double profile

number of expts

Jopt

number of expts

72 108 20000

387.9 388.5 397.9

153 145 30000

the basis of a single profile. More importantly, the number of experiments and the computational complexity of the optimal problem can be reduced for all coefficients of the two profiles. In this study, a piecewise-continuous local function, the Haar function,25,26 is also applied. The results show that the Haar function yields the same good performance (Table 4). In these two cases, the optimal feed profiles calculated using the genetic algorithm (GA) are also included.27 Table 4 documents the performances of these different optimization methods. The corresponding design profiles for the different methods are shown in Figure 18. Without prior knowledge of the process model, the performance achieved by the proposed sequential design procedures is 2% less than the best result obtained by GA, which requires complicated models involving the chemical reactions to be given. Furthermore, to converge to the optimal solution, the number of iterations Roubos et al.27 used for GA is more than 20000 times in the single-feed case and 30000 times in the double-feed case. However, the proposed method with only 19 runs (about 100 experiments) for the single-feed design and with only 34 runs (around 150 experiments) for the double-feed design can quickly reach the desired performance. Roubos’ GA method27 is good when prior knowledge of the model can be obtained. The proposed method, on the other hand, can

5272 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004

Figure 18. Optimal profiles computed from the different design methods in example 2: (a) single-feed case and (b) double-feed case.

perform well even without prior knowledge of the model, but only with off-line measurements. However, it would be interesting to apply the proposed method with prior knowledge of the model in our next research. 7. Conclusion In today’s world of intense financial competition, chemical processes must be optimized quickly if they are to become successful. These successful processes must continue to be operated optimally if they are to maintain their competitive edge. In this research, an optimal batch profile strategy based on sequential experimental design is developed. It can be used to achieve and maintain optimized processes based on an analysis of historical operating data. This new hybrid algorithm for profile approximation consists of two parts that can be applied together or

independently: a coarse and smooth approximation using global features of the functions and a fine and sharp approximation using local features of the functions for partitioning. On the basis of the projection of the profile onto a small set of global and local basis functions, the partitioning can compress the profile information into a few projection coefficients. The optimal profiles can be obtained as long as the locations of the coefficients are properly adjusted. Changing one of the global coefficients causes a slight difference in the overall shape of the profile. In contrast to the global function, the change of one local coefficient affects only a limited region of the profile. Consequently, the integration of the merits of global and local basis functions is more appropriate for general profile approximation. The adjustment algorithm is based on the effect analysis of the Taguchi experiment to search for possible optimal operating profiles. If the profile is designed according to conventional experimental design only, it is inevitable that the profile will be discretized into pieces.13 This means that the number of time intervals to be cut and the values of changing levels in the corresponding time intervals cannot be easily and accurately obtained. Without discretizing the number of time intervals, the proposed method can be used for a continuous profile. The values to be changed in each level of experimental design can be systematically determined on the basis of an uncertainty analysis. By a sequence of design experiments, the process performance can be gradually improved until it reaches the optimal batch profile. The proposed algorithm has the following advantages: (i) The traditional techniques of experimental design can be applied to the design of dynamic batch processes. (ii) Determining possible optimal operating profiles from past operating historical records is simple. It can significantly reduce the time required to identify complicated batch systems. (iii) By the sequence of the proposed experimental design, the optimal operating profile can be explored as long as the experimental data are properly designed. The potential of the proposed technique is demonstrated by means of simulation studies. The results of the two different cases show good performance of the proposed method in comparison with other methods that require the use of complicated models involving the chemical reactions, but the number of experiments seems too large for practice application. This problem is worthy of further study. In addition, our future work will be concentrated on the effect of noisy data that is usually obtained from experiments and also on performing a real laboratory-scale experiment. Acknowledgment This work is partly sponsored by National Science Council, R.O.C. and by the Ministry of Economic, R.O.C. Appendix (i) Upper and Lower Bounds on the Global Coefficients. The bounds can be obtained from the global approximated profile N-1

L e f(cgN,t) )

∑ cgn φgn(t) eU

n)0

0 e t e tf

(30)

Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 5273

Figure 19. Hierarchical regions of the global coefficients. The selection order of the coefficient is from top to bottom. The bold line in each horizontal line represents the maximum operating region, and the circles represent the two selected operating levels.

where L and U are the lower and upper bounds, respectively, of the manipulated profile. If the values of cgj , j ) 0, 1, ..., i - 1, i + 1, ..., N - 1, are fixed except cgi , the upper and lower bounds of cgi can be obtained as

lL possible upper (clU j ) and lower (cj ) bounds for this local region can be computed by

Ll(t) e clL j mj

N-1

cgL,1 ) max[L i

cgj φgj (t)/φgi (t)] ∑ j)0,j * i

where N-1

Ll(t) ) L -

Upper bound of cgi N-1

cgU,1 ) min[U i

(31)

l clU j Mj eU (t)

Lower bound of cgi

cgj φgj (t)/φgi (t)] ∑ j)0,j*i

Thus, if the operating range [cg,1 i , g,2 , c ] ⊂ (Figure 19), the condition [cg,1 i i

cg,2 i ] is selected [cgL,1 , cgU,1 ] must i i

be satisfied to obtain L e f(t) e U. Likewise, two regions , cgU,1 ] of the upper and lower bounds of cgj , j * i, [cgL,1 i i gL,2 gU,2 and [ci , ci ], can be obtained if the other coefficients are fixed. Therefore, the operating range [cg,1 j , g,1 cg,2 j ] should be selected according to the condition [cj , g,2 gL,1 gU,1 gL,2 gU,2 cj ] ⊂ ([cj , cj ] ∩ [ci , ci ] to satisfy L e f(t) eU. According to this method, the selected operating levels marked in circles (Figure 19) are shrunk from the upper selected coefficient to the lower one. As for the selection order of the coefficients, the effect analysis in the experimental array can be used. The effect value can separate the total variability of the data into the contributions from each of the coefficients. This can help determine which of the coefficients has a significant effect. The more significant the effect of the coefficient on the quality function, the earlier the coefficient should be selected. (ii) Upper and Lower Bounds on the Local Coefficients. Let Mj ) max|φlj(t)| and mj ) -Mj in the local range of [tL, tU]. Substituting these two extreme conditions into the constraint of the design profile, the

∑ n)0

J-1

cgn φgn(t) -

clj φlj(t) ∑ j)0

and N-1

Ul(t) ) U -

∑ n)0

J-1

cgn φgn(t) +

clj φlj(t) ∑ j)0

Literature Cited (1) Rani, K. Y.; Rao, V. S. R. Control of FermenterssA Review. Bioprocess Eng. 1999, 21, 77. (2) Soroush, M.; Kravaris, C. Optimal Design and Operation of Batch Reactors. 2. A Case Study. Ind. Eng. Chem. Res. 1993, 32, 882. (3) Chen, S. A.; Jang, W. F. Minimum End Time Polices for Batchwise Radical Chain Polymerization. Chem. Eng. Sci. 1978, 33, 735. (4) Bryson, A. E.; Ho, Y. C. Applied Optimal Control; John Wiley and Sons: New York, 1975. (5) Roubos, J. A.; van Straten, G.; van Boxtel, A. J. B. An Evolutionary Strategy for Fed-Batch Biorecator Optimization: Concepts and Performance. J. Biotechnol. 1999, 67, 173. (6) Luus, R. Optimal Control by Dynamic Programming Using Systematic Reduction of Grid Size. Int. J. Control 1990a, 51, 995. (7) Luus, R. Application of Dynamic Programming to High Dimensional Non-Linear Optimal Control Problems. Int. J. Control 1990b, 52, 239. (8) Bonvin, D.; Srinivasan, B.; Ruppen, D. Dynamic Optimization in the Chemical Industry. In Chemical Process ControlsCPC VI; Rawlings, J. B., Ogunnaike B. A., Eaton, J. B., Eds.; AIChE Symposium Series 98; American Institute of Chemical Engineers (AIChE): New York, 2002; No. 326, p 255.

5274 Ind. Eng. Chem. Res., Vol. 43, No. 17, 2004 (9) Lee, K. S.; Chin, I. S.; Lee, H. J.; Lee, J. H. Model Predictive Control Technique Combined with Iterative Learning for Batch Processes. AIChE J. 1999, 45, 2175. (10) Zafiriou, E.; Chion, H. W.; Adomaitis, R. A. Nonlinear Model-Based Run-to-Run Control for Rapid Thermal Process with Unmeasured Variable Estimation. Electrochem. Soc. Proc. 1995, 95, 18. (11) Dong, D.; McAvoy, T. J.; Zafiriou, E. Batch-to-Batch Optimization Using Neural Networks. Ind. Eng. Chem. Res. 1996, 35, 2269. (12) Tholudur, A.; Ramirez, W. F. Optimization of Fed-Batch Bioreactors Using Neural Network Parameter Function Models. Biotechnol. Prog. 1996, 12, 302. (13) Duchesne, C.; MacGregor, J. F. Multivariate Analysis and Optimization of Process Variable Trajectories for Batch Processes. Chemom. Intell. Lab. Syst. 2000, 51, 125. (14) Vecman, V. Learning and Soft Computing: Support Vector Machines, Neural Networks and Fuzzy Logic Models; MIT Press: Cambridge, MA, 2001. (15) Chen, J.; Sheui, R.-G. Using Taguchi’s Method and Orthogonal Function Approximation To Design Optimal Manipulated Trajectory in Batch Processes. Ind. Eng. Chem. Res. 2002, 41, 2226. (16) Mallat, S. G. A Theory for Multiresoultion Signal Decomposition: The Wavelet Representation. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 11, 7. (17) Farmer, D.; Sidorowich, J. Exploiting Chaos to Predict the Future and Reduce Noise. In Evolution, Learning and Cognition; Lee, W. C., Ed.; World Scientific: Singapore, 1988; p 277. (18) Koulouris, A.; Bakshi, B. R.; Stephanopoulos, G. Empirical Learning through Neural Networks: The Wave-Net Solution. In Intelligent Systems in Process Engineering, Paradigms from Design and Operations; Stephanopoulos, G., Han, C., Eds.; Academic Press: New York, 1996; p 437.

(19) Chen, J.; Bruns, D. D. WaveARX Neural Network Development for System Identification Using a Systematic Design Synthesis. Ind. Eng. Chem. Res. 1995, 34, 4420. (20) Baskshi, B.; Stephanopoulos, G. Wavelet-net: A MultiResolution, Hierarchical Neural Network with Localized Learning. AIChE J. 1993, 39, 57. (21) Zhang, Q.; Benveniste A. Wavelet Networks. IEEE Trans. Neural Networks 1992, 3, 889. (22) Luus, R. Piecewise Linear Continuous Optimal Control by Iterative Dynamic Programming. Ind. Eng. Chem. Res. 1993, 32, 859. (23) de Tremblay, M.; Perrier, M.; Chavarie, C.; Archambault, J. Optimization of Fed-Batch Culture of Hybridoma Cells Using Dynamic Programming: Single and Multi Feed Cases. Bioprocess Eng. 1992, 7, 229. (24) de Tremblay, M.; Perrier, M.; Chavarie, C.; Archambault, J. Fed-batch Culture of Hybridoma Cells: Comparison of Optimal Control Approach and Closed Loop Strategies. Bioprocess Eng. 1993, 9, 13. (25) Daubechies, I. The Wavelet Transform, Time-Frequency Localization and Signal Analysis. IEEE Trans. Inf. Theory 1990, 36, 961. (26) Strang, G. Wavelets and Dilation Equations: A Brief Introduction. SIAM Rev. 1989, 31, 614. (27) Roubos, J. A.; Van Straten, G. A.; Van Boxtel, J. B. An Evolutionary Strategy for Fed-Batch Bioreactor Optimization: Concepts and Performance. J. Biotechol. 1999, 67, 173.

Received for review July 11, 2003 Revised manuscript received April 27, 2004 Accepted June 2, 2004 IE030588T