Ind. Eng. Chem. Res. 2003, 42, 1363-1378


Optimal Batch Trajectory Design Based on an Intelligent Data-Driven Method

Junghui Chen* and Rong-Guey Sheui

R&D Center for Membrane Technology, Department of Chemical Engineering, Chung-Yuan Christian University, Chung-Li, Taiwan 32023, Republic of China

The goal of this paper is to extend recently developed work [Chen, J.; Sheui, R.-G. Ind. Eng. Chem. Res. 2002, 41 (9), 2226] to design the optimal trajectory for a batch process on the basis of an intelligent data-driven experimental design scheme. This method integrates orthonormal function approximation with soft-computing techniques. The continuous batch trajectories of process measurements represented by a set of orthonormal functions are mapped onto a finite number of coefficients in the function space. These coefficients capture the trajectory behavior of the process measurements. The optimal trajectory can be obtained as long as the locations of the coefficients are properly adjusted. An adjustment algorithm combining a neural network with a genetic algorithm is developed to search for optimal coefficients. The neural network is used to identify the relationship between the coefficients and the objective function and to predict the quality response. Once a suitable neural-network model is obtained, the genetic algorithm is used to explore the optimal conditions. There are two major advantages of the proposed method. First, it can deal with the dynamic set-point problem that cannot be handled well using traditional experimental design. Second, by a sequence of design experiments, the objective of the process performance can be gradually improved for the optimization of dynamic batch processes. Potential applications of the proposed method are shown through three detailed simulation studies.

Introduction

With the increasing need to reduce the time-to-market in the constantly changing marketplace, batch reactors are widely used in many industries. Batch processes are flexible in operation for a given set of equipment. They are ideally suited for value-added specialty chemicals, such as pharmaceuticals, polymers, and biochemicals.
Batch processes are characterized by the prescribed processing of raw materials as a discrete entity for a finite amount of time to produce a finished product. They are transient in nature, with their state changing with time. Given a specified recipe of a batch process, products are produced, and their characteristics are significantly affected by the operating trajectories of the manipulated variables. Therefore, the design of optimal trajectories plays a critical role in batch processing. Several approaches have been used to compute optimal reference trajectories by converting the optimal control problem into a dynamic programming problem. Pontryagin's maximum principle has often been used.2-4 However, such calculations demand a great deal of memory and are time-consuming. Thus, the iterative formulation of dynamic programming5,6 and a sequential quadratic programming approach7,8 were derived. Tsang and Clarke (1988) proposed that the differential equation problem be transformed into a nonlinear algebraic equation problem using a discretization method9 (e.g., orthogonal collocation, finite differences). However, the fatal weakness of the above optimization methods is that they rest on a reliable and accurate process model. Furthermore, the success or failure of the above approaches for real processes has not been conclusively established. For a more detailed discussion, see our past work.1

Statistical data analysis and decision-making tools have traditionally been used in experimental design.10-12 The Taguchi method has successfully been applied to automobile, microelectronic, and chemical process control problems.13,14 However, the Taguchi method is not a panacea for all parameter design problems. It has proven more effective in continuous processes than in batch processes, whose manipulated variables follow a prespecified trajectory within the course of a batch. Duchesne and MacGregor (2000) cut the profile into several pieces and then used a multiblock PLS algorithm and factor design to analyze the time-specific effects of process variables on the final product quality and to identify trajectory features.15 However, it is not practical to add design experiments to the nominal operating conditions at every time interval for a wide range of variations of the batch trajectories. Prior knowledge about the number of time intervals is also needed before an appropriate trajectory experiment can be built.

A well-developed design method should be inferred from a set of data collected from experiments with the system of interest. Even though such a method is a shortcut that avoids the derivation of mathematical models, the experiments are required to provide a maximum amount of information throughout the entire range of operation to effectively achieve the performance goal. In the past, soft-computing [neural-network (NN) and genetic-algorithm (GA)] applications related to process design were developed to mine the data information. Because of their learning capacity, fault tolerance, and model-free characteristics, such approaches were used as a competitive tool in processing quantitative/qualitative data. An integrated method using design of experiments in conjunction with a NN illustrated how optimal parameter design could be achieved.16,17 A statistical neural-network modeling technique was used to optimize fine-pitch stencil printing for solder paste deposition on pads of printed circuit boards.18 Ko et al. (1997) determined the optimal operating conditions for a crude tower using a neural-network model and the successive quadratic programming method.19 The use of GAs has been instrumental in achieving satisfactory solutions to optimization problems unsatisfactorily addressed by other methods.20 Basically, GAs mimic the principles of natural selection to constitute the search and optimization procedure. Past applications21,22 indicated that GAs were more reliable than the existing machining methods and the conventional methods and that they converged consistently to the optimal solution.

Although soft computing is a flexible design method, it neither uses the experimental data efficiently nor systematically explores for the new optimal condition. Because of the limited time and budgets for product launches, it is unlikely that sufficient experimental data will be generated. In addition, some experimental data lack appropriate information for quality design improvement. It is even difficult to design the proper trajectory of the batch process and to evaluate the effects of variations in different batch design trajectories. In this research, an innovative data-driven method integrating the soft-computing methods (neural networks and genetic algorithms) with orthogonal function approximation is developed to explore optimal trajectory design in batch processes. Initially, the continuous trajectories of the batches are mapped by the function approximation technique onto a finite number of coefficients.

* To whom all correspondence should be addressed. E-mail: [email protected].

10.1021/ie020480y CCC: $25.00 © 2003 American Chemical Society. Published on Web 03/06/2003.
These coefficients capture the trajectory behavior of the process measurements. The optimal trajectory can be obtained as long as the locations of the coefficients are properly adjusted in the function space. Because of the possible existence of multiple optima, a global search strategy based on neural networks and a genetic algorithm is used to predict the process behavior and to design the optimal coefficients, respectively. Once the coefficients of the approximate trajectory are found, the design conditions can be improved. The evolutionary operation provides information about its applicability. On the basis of this information, the possible new optimal set-point trajectory can be determined. This systematic search technique can diminish the size of the search region for each experimental design and reduce the experimental time. Combinations of neural networks and genetic algorithms remain popular. Past applications can be found in the literature,23,24 but they include little discussion about how to update the neural-network model and reduce the number of experiments. Our previous work constructed neural-network models with information analysis for process design.25 That approach, which focused only on the improvement of traditional experimental design, is further modified in this study to better construct a batch trajectory design method.

This paper is structured as follows. The second section classifies the types of batch design problems. Techniques for batch trajectory design, orthogonal function approximation, and soft-computing methods are discussed in sections 3 and 4, respectively. The optimization search for this integrated method is developed in section 5. The effectiveness of the proposed method is demonstrated through three case studies in section 6, showing its potential applications. Finally, concluding remarks are made.

2. End-Point Optimization

The general dynamic batch system can be described by the vector differential equations

ẋ(t) = g(x(t), f(t)),  0 ≤ t ≤ tf
y(t) = h(x(t), f(t))    (1)

where x(t) is the state vector, tf is the duration of each batch run, and g and h give the description of the dynamic model of the batch process. f(t) represents the designed manipulated trajectories. The performance index J is

max_{f(t)} J(y(tf))    (2)

because, in most cases, the quality variables, y(t), are available only at the end of the batch and cannot be measured during production. Therefore, the end-point optimization of quality in batch processes is considered here. The optimal trajectory design problem considered here is to choose the control policy during the batch run so that the performance can be maximized under the unknown process model. Note that the single trajectory f(t) is used in the following sections to simplify the explanation, but multitrajectory design is demonstrated in example 3.

3. Batch Trajectory Approximation Using Orthogonal Functions

In batch processes, the time evolution of a batch follows a prespecified trajectory. The batch trajectory can have different shapes depending on the recipe being used. To identify the shape of the batch trajectory, a well-known conventional orthogonal method for parametrization of the trajectory, such as a Fourier series or Laguerre functions, can be used. Then, the continuous trajectory is compressed into a finite number of key parameters. In the past, the coefficients of the function used to be estimated by dynamic optimization based on the process model. The desired trajectory could thus be obtained.26,27 Because this requires a reliable process model, possible applications are limited. In the following section, the coefficients are computed from experimental data.

3.1. Orthogonal Function Approximation. Let {φn(t)}n=0,1,...,N-1 be a set of orthogonal basis functions over Ω = [a, b]. By the approximation theorem, the batch trajectory f(t) is assumed to be continuous and square integrable in the range Ω = [a, b]. It can be expanded in terms of the orthogonal basis functions

f(t) ≈ f̂(c,t) = ∑_{n=0}^{N-1} c_n φ_n(t)    (3)

where the coefficients are c = [c0 c1 ... cN-1]. This combination is the so-called best approximation to f(t) with respect to the linear combinations of {φn}n=0,1,...,N-1. The coefficient cn is the projection of f(t) onto the corresponding basis function φn

c_n = ∫_Ω f(t) φ_n(t) dt    (4)

Suppose a set of K measurements, {f(tk), k = 1, 2, ..., K}, is obtained from a batch trajectory at times t1, t2, ..., tK. The coefficients can be calculated using the discrete form

f_{n+1}(t_k) = f_n(t_k) - c_n φ_n(t_k),  k = 1, 2, ..., K;  n = 0, 1, ..., N-1    (5)

where f0(tk), k = 1, 2, ..., K, is equal to f(tk) at the first iteration, and

c_n = [Φ_n^T Φ_n]^{-1} Φ_n^T f_n,  f_n = [f_n(t_1) f_n(t_2) ... f_n(t_K)]^T    (6a)

Φ_n = [φ_n(t_1) φ_n(t_2) ... φ_n(t_K)]^T    (6b)

The measurement trajectory ({f(tk), k = 1, 2, ..., K}) is projected onto a subspace of the function space spanned by the basis {φn, n = 0, 1, ..., N-1}. Equation 5 shows the projection of {fn(tk), k = 1, 2, ..., K} onto span{φn} at the nth iteration and the removal of the variance associated with the component in span{φn} before the next basis function φn+1 is fitted. The complete formulation is described in our previous work.28 By the orthogonal function approximation, the measurement variable {f(tk)} in each batch trajectory is mapped onto the coefficient variables c in the function space. Figure 1a and b shows the manipulated trajectories and the corresponding coefficients, respectively. This figure represents the general concept of the mapping procedure from the design trajectory to the feature coefficients; it does not represent any specific case. The point c in the function space spanned by a set of independent basis functions represents one batch trajectory. Therefore, if the coefficients are properly adjusted in the function space, the optimal trajectory can be obtained.

Figure 1. (a) Two batch trajectories: the optimal operating trajectory (bold line) and the non-optimal one (dashed line). (b) Coefficient plot for the two corresponding batch trajectories: the optimal point (circle) and the non-optimal one (square).

3.2. Upper and Lower Bounds of the Manipulated Trajectory. For operation and process safety, it is common to set upper and lower bounds on the manipulated trajectory in batch processes. To keep the approximate function within the feasible region during each batch run, the orthogonal coefficients must be checked to verify that the approximate trajectory still falls within the upper and lower bounds of the manipulated trajectory. Take the first two coefficients, for example. The approximate function must satisfy the boundary condition

L ≤ f̂(c0,c1,t) = c_0 φ_0(t) + c_1 φ_1(t) ≤ U,  0 ≤ t ≤ tf    (7)

where L and U are the lower and upper bounds, respectively, of the manipulated trajectory. Because of the finite duration of the batch operation, an orthogonal function with a finite interval approximation is needed; in this study, a Legendre polynomial is employed. Let M1 = max{φ1(t)} and m1 = min{φ1(t)} in the continuous range [0, tf]. Substituting these two extreme conditions into eq 7, the possible region is bounded by four inequalities, which are graphically depicted in Figure 2

L ≤ c_0 + M_1 c_1 ≤ U
L ≤ c_0 + m_1 c_1 ≤ U    (8)

where φ0(t) = 1 for the Legendre polynomial. To decrease the complexity, the representation [c0 c1]^T is conveniently transformed from the original basis to a new basis by setting z0 = c0 + M1c1 and z1 = c0 + m1c1, yielding

L ≤ z_0 ≤ U
L ≤ z_1 ≤ U    (9)

If any generated pattern [z0 z1]^T satisfies the above upper and lower bounds, the design trajectory must be within the feasible region (gray area in Figure 2). The original coefficients [c0 c1]^T can be obtained from a given pattern [z0 z1]^T by solving the linear program

min_{c0,c1} [ |c_0 + M_1 c_1 - z_0| + |c_0 + m_1 c_1 - z_1| ]    (10)


Figure 2. Geometric interpretation of the feasible region of the design trajectory.

subject to

L ≤ c_0 + M_1 c_1 ≤ U
L ≤ c_0 + m_1 c_1 ≤ U    (11)

Figure 3. Distribution of the radial basis function with the different levels of resolution, where the shaded regions represent the possible selected neuron candidates.
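As a concrete sketch of the machinery so far, the following Python fragment computes the coefficients of eqs 5 and 6 for the first two shifted-Legendre terms of a sampled trajectory and then applies the z-transformation and bound check of eqs 8 and 9. The ramp trajectory and the bounds are illustrative values, not taken from the paper's examples.

```python
import numpy as np

def shifted_legendre(n, t, tf):
    # Legendre polynomial P_n evaluated on [0, tf] mapped to [-1, 1]
    c = np.zeros(n + 1); c[n] = 1.0
    return np.polynomial.legendre.legval(2.0 * t / tf - 1.0, c)

def coefficients(f, t, tf, N):
    """Eqs 5 and 6: project the sampled trajectory onto each basis
    function by least squares, then deflate the residual."""
    residual = np.array(f, dtype=float)
    c = np.zeros(N)
    for n in range(N):
        phi = shifted_legendre(n, t, tf)        # Phi_n (eq 6b)
        c[n] = phi @ residual / (phi @ phi)     # c_n   (eq 6a)
        residual -= c[n] * phi                  # f_{n+1} (eq 5)
    return c

tf, L, U = 1.0, 0.0, 2.0
t = np.linspace(0.0, tf, 201)
c = coefficients(t, t, tf, 2)        # ramp f(t) = t gives c = [0.5, 0.5]

# eq 8/9 feasibility: M1 = max(phi1) = 1 and m1 = min(phi1) = -1 on [0, tf]
z0, z1 = c[0] + 1.0 * c[1], c[0] + (-1.0) * c[1]
feasible = L <= z0 <= U and L <= z1 <= U
```

For the ramp, z0 = 1 and z1 = 0, both inside [0, 2], so the approximate trajectory stays within the bounds.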

To generalize this discussion, the bounds for trajectories approximated using N coefficients

L ≤ f̂(c,t) = ∑_{n=0}^{N-1} c_n φ_n(t) ≤ U    (12)

can be transformed to 2^N constrained inequalities through

z_i = c_0 + ∑_{j=1}^{N-1} c_j H_i,  i = 0, 1, 2, ..., 2^{N-1} - 1    (13)

where H_i ∈ {M_j, m_j} is selected from M_j (M_j = max{φ_j(t)}) or m_j (m_j = min{φ_j(t)}) to form the new combinational coefficient z_i (eq 13), with L ≤ z_i ≤ U, i = 0, 1, ..., 2^{N-1} - 1. After the proper [z_0 z_1 ... z_{2^{N-1}-1}]^T values have been obtained, the given N coefficients with respect to the new basis {z_i}_{i=0,1,...,2^{N-1}-1} are represented in the old basis {c_i}_{i=0,1,...,N-1} by

min_c [ ∑_{i=0}^{2^{N-1}-1} | c_0 + ∑_{j=1}^{N-1} c_j H_i - z_i | ],  H_i ∈ {M_j, m_j}    (14)

subject to

L ≤ c_0 + ∑_{j=1}^{N-1} c_j H_i ≤ U,  i = 0, 1, 2, ..., 2^{N-1} - 1,  H_i ∈ {M_j, m_j}    (15)
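One way to sketch eqs 14 and 15 in Python is as a linear program: each absolute value becomes a slack variable s_i ≥ |(Ac − z)_i|, and eq 15 supplies the bound rows. The solver call uses SciPy's linprog; the pattern z, bounds, and extreme values below are illustrative, not from the paper.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def coeffs_from_z(z, M, m, L, U):
    """Solve eqs 14-15: find c = [c0 ... c_{N-1}] whose bound
    combinations best match the generated pattern z.  M and m hold
    M_j and m_j for j = 1, ..., N-1."""
    N = len(M) + 1
    combos = list(itertools.product(*[(Mj, mj) for Mj, mj in zip(M, m)]))
    A = np.array([[1.0, *Hi] for Hi in combos])  # row i: c0 + sum_j cj*Hij
    K = len(combos)                              # K = 2^(N-1) patterns z_i
    cost = np.r_[np.zeros(N), np.ones(K)]        # variables x = [c, s]
    A_ub = np.block([[ A, -np.eye(K)],           #  A c - s <=  z
                     [-A, -np.eye(K)],           # -A c - s <= -z
                     [ A,  np.zeros((K, K))],    #  A c <= U   (eq 15)
                     [-A,  np.zeros((K, K))]])   # -A c <= -L  (eq 15)
    b_ub = np.r_[z, -z, np.full(K, U), np.full(K, -L)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (N + K))
    return res.x[:N]

# N = 2 with the Legendre phi1 on [0, tf]: M1 = 1, m1 = -1
c = coeffs_from_z(np.array([2.0, 1.0]), M=[1.0], m=[-1.0], L=0.0, U=3.0)
```

For this two-coefficient case the system is exactly invertible, so the objective reaches zero with c0 = 1.5 and c1 = 0.5; for larger N the program trades off the 2^{N-1} patterns.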

Thus, the approximate trajectory f̂(c,t) with the orthogonal coefficients c (computed by eqs 14 and 15) is within the desired upper (U) and lower (L) bounds.

4. Design of Orthogonal Coefficients Using Soft Computing

Traditionally, Taguchi's factor design method uses an orthogonal array to arrange the experiments for the problem. The corresponding responses can be obtained for the specified parameter combinations. However, Taguchi's method suffers from the problems of interactions among parameters and local optima. In addition, it cannot be directly applied to the design of dynamic process problems. Conventional nonlinear dynamic programming can be used to solve for the optimal trajectory, but it has difficulty in building process models for highly nonlinear, time-varying batch processes seriously interconnected with uncertainties. In this section, a method is proposed that integrates a NN with a GA to provide a framework for exploring a new possible optimal operating trajectory and a systematic cycle design for converging to the optimal conditions.

4.1. Neural Networks. A neural-network black box is often used to represent either a product or a process. The input variables are the coefficients (c) computed from the orthogonal function, and the output variable is the performance index of the product objective function (J). A traditional back-propagation neural network (BPN) with a global function can lead to large extrapolation errors without warning; therefore, the construction of a good BPN model usually requires an abundance of data. However, from the experimental design viewpoint, the number of experiments should be kept to a minimum. Thus, radial-basis-function neural networks (RBFNs) with local functions are used to eliminate unnecessary extrapolation errors.29 Especially when the Gaussian function is employed as the radial basis function, the network learns the pattern probability density instead of dividing up the pattern space as a BPN would do. Consequently, the accuracy of nonlinear function approximation by an RBFN is very high within the trained data range. Whereas a BPN has a fixed number of neurons, an RBFN is free to add new neurons depending on the complexity of the underlying problem. However, there are two disadvantages to RBFNs. First, the local nature of RBFNs can be a curse, because the number of hidden neurons increases exponentially with the growing input space dimension.30 Second, the method of determining the neurons and the corresponding radial basis parameters is vague.

In the following paragraphs, the RBFN is modified to avoid these problems. An RBFN consists of three layers: an input layer, a hidden layer, and an output layer. It can be mathematically defined as

J(c) = ∑_{i=1}^{S} w_i φ_i(||c - h_i||) + a^T c + a_0    (16)

where S is the number of neurons and φ is called the basis function

φ_i = exp[ -||c - h_i||^2 / (2σ_i^2) ]
c ∈ R^N represents the input orthogonal coefficients, h_i serves as the center of basis function i, σ_i^2 is the variance of basis function i, J(c) is the quality objective variable, and w_i is the radial weighting coefficient for basis function i. Equation 16 is a modified version of the traditional RBFN. It consists of two parts: a traditional RBFN to identify the nonlinear characteristics of the process and a linear model to capture the linear characteristics of the batch process. Thus, the vector a contains the linear coefficients, and a_0 is a constant. The new model structure is called RBFN+Lin, as an acronym for RBFN with a linear structure.

Several learning algorithms for RBFNs have been discussed in the past few years. They are all similar in weight adaptation, employing the well-known gradient descent method, but they are quite different in finding the centers and the widths of the basis functions. The positioning of the centers is especially critical for convergence and approximation precision.29 In this paper, three steps for training the RBFN+Lin based on previously developed work are used.31 The training procedures are briefly outlined below.

(1) Selecting Neuron Candidates. Neuron candidates are selected on the basis of the distribution density of the input training data. The distribution of the input training data can be viewed at different levels of resolution. A neuron is assigned to the center location of each resolution cell that covers the training data. For a brief explanation, the resolution cells in two-dimensional space are illustrated in Figure 3. The "Level" axis represents different levels of resolution. Binary partitions are selected for efficient computation and easy explanation. This means that the cell length at a lower level is divided in half to obtain the cell length at the next level. As shown in Figure 3, level 1 spans the entire operating space; thus, it creates a coarse resolution to support all data; i.e., neuron 1 would cover cell (1,1). The resolution becomes progressively finer going down through the levels. Therefore, only a limited number of data are covered by the more finely resolved cells. The input data (gray points and black points) distributed at cell (1,2) of level 2 would be covered by neuron 2, and the input data (black points) at cell (2,3) of level 3, by neuron 3. Depending on the desired data density, active neurons are selected as candidates. As the candidates are found, the corresponding center (h_i) and width (σ_i) values are determined by the location of the active cell.

(2) Purifying Neurons. Not all of the neuron candidates have a strong impact on the desired output. The set of candidates should be refined using the output training data. The classical Gram-Schmidt (CGS) algorithm is used as a purifying scheme to retain the neurons that contribute the most and remove those that contribute the least. To identify the RBFN model (eq 16), it is assumed that P experimental patterns of the batch trajectories, {(c_i; J_i), i = 1, ..., P}, are given. The coefficient variables c are computed by approximating the operating trajectory f(t) in each batch trajectory (eqs 5 and 6). The experimental patterns represent the operating features (c) and the corresponding quality (J) of the system to be modeled. Substituting each data pair into eq 16 yields a set of P equations. These equations can be written in the concise matrix form

J = Rθ + e    (17)

where

J = [J_1 J_2 ... J_P]^T,  e = [e_1 e_2 ... e_P]^T
θ = [w_1 w_2 ... w_S a_1 ... a_N a_0]^T
R = [r_1 ... r_S r_{S+1} ... r_{S+N} r_{S+N+1}]    (18)

and

r_i = [φ_i(||c_1 - h_i||) φ_i(||c_2 - h_i||) ... φ_i(||c_P - h_i||)]^T,  1 ≤ i ≤ S
r_i = [c_{i-(S+1),1} c_{i-(S+1),2} ... c_{i-(S+1),P}]^T,  S+1 ≤ i ≤ S+N
r_i = 1,  i = S+N+1    (19)

In eq 19, c_{i,p} is the orthogonal coefficient i of pattern p, and 1 is a column vector consisting of P ones. The variable e represents the residuals or errors of the approximation problem defined by the training data. The goal of the approximation is to minimize the sum of squared errors, i.e.

∑_{p=1}^{P} e_p^2 = e^T e    (20)

Generally, the columns of the matrix [r_1 ... r_S r_{S+1} ... r_{S+N} r_{S+N+1}] are not mutually orthogonal. The regressor matrix R can be factored into a product using an unnormalized QR decomposition, R = QΓ, so that

J = Qg + e    (21)

and the vector g = Γθ can be determined by the least-squares method

g = (Q^T Q)^{-1} Q^T J    (22)

where Q = [q_1 q_2 ... q_{S+N+1}]. Because the columns of Q are orthogonal, eq 22 becomes

g_i = (q_i^T J) / (q_i^T q_i),  i = 1, 2, ..., S+N+1    (23)

and g = [g_1 g_2 ... g_{S+N+1}]. Under the assumption that e is white noise, eq 21 yields

J^T J = ∑_{j=1}^{S+N+1} g_j^2 q_j^T q_j + e^T e    (24)

Now define [SSR]_j = g_j^2 q_j^T q_j. This term, the sum-square regression, explains the variation of J expanded in the direction q_j, which corresponds to one column of the matrix R. The most important neuron or linear term, i.e., the one that maximizes [SSR]_j, should be selected, and the corresponding weight coefficient of each regressor (r_j) should also be computed (w_i or a_i). Therefore, by projecting the output training data onto the neurons, the projections can be dense on some neuron candidates and sparse on the others. Those neurons responding to few or no data from the output space are considered redundant and should be removed. The procedure for selecting neurons using the CGS scheme is included in the Appendix. A detailed description can be found in refs 32 and 33. Note that several other methods of selecting neurons could be used here,34,35 but a comparison of selection procedures is not the focus of this paper.

Figure 4. Structure of a chromosome from a batch trajectory.

Figure 5. Scheme of optimal batch trajectory design.

(3) Optimizing the Neural Parameters. The refined network can be brought even closer to the system model using a gradient search algorithm for optimization. Newton's method is used to minimize the objective function (eq 25)

Figure 6. Strategy of the global optimal search based on the neural network and genetic algorithm.

M(Ω_q) = ∑_{j=1}^{P} (J(c_j) - J_j)^2    (25)

and

Ω_{q+1} = Ω_q - λ [H(Ω_q)]^{-1} ∇M(Ω_q)    (26)

θ_{q+1} = (Φ_pur^+)_{q+1} J    (27)

where Ω_q = [{h_i}, {σ_i}], J(c_j) is the predicted output from the RBFN+Lin, and J_j is the corresponding actual measured quality. [H(Ω_q)]^{-1} is the inverse of the Hessian matrix of M(Ω_q), ∇M(Ω_q) contains the gradient vectors ∂M/∂h_i and ∂M/∂σ_i, and λ is the step size. Two stages, the gradient method (eq 26) and the least-squares estimation (eq 27), are used separately to adjust the inside parameters (Ω_q = [{h_i}, {σ_i}]) and the outside parameters (θ) and to enhance the efficiency of the computation. Note that, if the Hessian matrix is not positive definite, the Levenberg-Marquardt modification can be adopted here by adding a positive definite matrix λI to H, where I is the identity matrix and λ is some nonnegative value.
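Before moving to the genetic algorithm, the purification of step (2) can be sketched in Python as a forward selection loop: each pass orthogonalizes the remaining candidate regressors against those already chosen (classical Gram-Schmidt) and keeps the one with the largest [SSR]_j from eqs 23 and 24. The candidate matrix here is generic toy data, not the RBFN regressors of eq 19.

```python
import numpy as np

def select_regressors(R, J, n_select):
    """Forward selection by classical Gram-Schmidt: at each step,
    orthogonalize the remaining columns of R against the chosen ones
    and keep the one maximizing [SSR]_j = g_j^2 q_j^T q_j."""
    R = np.asarray(R, float); J = np.asarray(J, float)
    chosen, Q = [], []
    remaining = list(range(R.shape[1]))
    for _ in range(n_select):
        best, best_ssr, best_q = None, -1.0, None
        for j in remaining:
            q = R[:, j].copy()
            for qk in Q:                       # CGS orthogonalization
                q -= (qk @ q) / (qk @ qk) * qk
            qq = q @ q
            if qq < 1e-12:
                continue                       # numerically dependent column
            g = (q @ J) / qq                   # eq 23
            ssr = g * g * qq                   # [SSR]_j (from eq 24)
            if ssr > best_ssr:
                best, best_ssr, best_q = j, ssr, q
        chosen.append(best); Q.append(best_q); remaining.remove(best)
    return chosen

# toy example: column 1 generates J exactly, so it is selected first
rng = np.random.default_rng(0)
R = rng.normal(size=(50, 4))
J = 3.0 * R[:, 1]
order = select_regressors(R, J, 2)
```

Because J lies entirely along column 1, that column dominates [SSR] on the first pass; the remaining picks mop up whatever residual variation the noise columns explain.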

4.2. Genetic Algorithm. Once the neural-network model is obtained, the input coefficients (c) of the neural network must be determined via the genetic algorithm (GA) to yield optimal quality performance. An explanation of GA operations can be found in ref 20. Here, four components of the GA for the optimization of the batch trajectory are described as follows:

(1) Chromosome Encoding. When applying a GA to this search problem, one must know how to code the possible solutions to the problem as finite bit strings. To avoid computing the complicated bounds of the orthogonal coefficients (c), the coefficients with respect to the new basis {z_i}_{i=0,1,...,2^{N-1}-1} (eq 13) are encoded as a list of real numbers (Figure 4). A common coding method called concatenated, mapped, unsigned binary coding is used here. Bit-string encoding has advantages over other types of encoding, as it is simple to mate and manipulate the strings. A coefficient z_j is computed by mapping from a B-bit, unsigned binary integer using

z_j = L + [ ∑_{l=1}^{B} 2^{l-1} d_{lj} / (2^B - 1) ] (U - L)    (28)


Figure 7. (a) Objective function with the coefficients c1 and c2 and (b) the corresponding contour.

where d_{lj} ∈ {0, 1} is a binary bit in the string [d_{1j} d_{2j} ... d_{Bj}]. The B bits determine the resolution of the parameter z_j. Because the data of the given training patterns are distributed in a variety of regions, a suitable resolution can be established for the design profile using the reference value (σ_i^2)_min = min{σ_i^2}, the smallest variance obtained from the trained RBFN to spread out the current data. If much higher resolution is needed, we can set κ(σ_i^2)_min, 0 < κ ≤ 1, such that

B = INT[ log_2( (U - L) / (κ(σ_i^2)_min) ) + 1 ]    (29)

where INT means that the argument is rounded to the next largest integer. The encoding resolution determines how the coefficients are transformed into the bit string representation. This is accomplished by using binary mapping for each coefficient and concatenating the binary values. In this way, the first B bits represent the first coefficient, the next B bits represent the second coefficient, and so forth, until all coefficients are represented. Thus, each string represents a point in the search space and a certain possible batch trajectory solution.
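A minimal Python sketch of this decoding step follows eq 28: each consecutive block of B bits is read as an unsigned integer and mapped linearly onto [L, U]. Treating the first bit of each block as the most significant bit is an implementation choice, not prescribed by the paper.

```python
def decode_chromosome(bits, B, L, U):
    """Decode a concatenated, mapped, unsigned binary string (eq 28):
    each B-bit block becomes one real coefficient z_j in [L, U]."""
    assert len(bits) % B == 0
    z = []
    for start in range(0, len(bits), B):
        block = bits[start:start + B]
        value = int(block, 2)            # unsigned integer in [0, 2^B - 1]
        z.append(L + value / (2 ** B - 1) * (U - L))
    return z

# two coefficients, 4 bits each, mapped onto [0, 10]:
# "0000" decodes to L and "1111" decodes to U
z = decode_chromosome("0000" + "1111", B=4, L=0.0, U=10.0)
```

The all-zeros block lands exactly on L and the all-ones block on U, so the resolution between adjacent codes is (U − L)/(2^B − 1).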

(2) Evaluation Function. The merit of each string is evaluated. The string in the gene pool is decoded into the coefficients (z_i). Then, the coefficients z_i are transformed back into the orthogonal coefficients c_i using eqs 14 and 15. The coefficients c_i are then used as the inputs of the neural network to predict quality performance. This approach uses the neural network as a tool for representing the process model to reduce the time and cost of real experiments. Only the final best strings, representing the possible optimal trajectories, are run in the real process. Because not much experimental information is available in the first few runs, the representation of the relationship between the trajectory and the process quality might not be good enough. However, after several runs, the neural-network model becomes able to predict batch quality more accurately with the additional experimental data.

(3) Initialization Procedure. The initial members of the population are chosen randomly with a Gaussian probability distribution. The centers of the Gaussian distribution are based on the past historical experimental data. Because the existing RBFN is most trustworthy around the experimental points, it makes sense to use the gene pool as the starting point of the search. This search strategy is different from that of the traditional genetic algorithm, where the parameters are usually uniformly distributed throughout the whole search space.

(4) Operators. The goals of the experiments are to determine how different operators perform in different situations and to explore for the optimal conditions. The operators in the GA are grouped into three basic categories: parent reproduction, crossover, and mutation. A reproduction operator takes individual strings to form a new population using the GA principle that strings with good fitness are more likely to be selected than strings with poor fitness.
The most common roulette-wheel selection approach is used here. A crossover operator takes two parents with a crossover probability and creates one or two new strings (offspring) juxtaposing some genetic material of each parent. A mutation operator takes one parent and randomly changes some of the entries in its string with low mutation probability to create the allele of the string of the population. Mutations are performed to introduce new information into the population and to prevent the best strings from converging and stagnating at a local optimum. During the evolution process, the elitist strategy copies the highest fitness of each generation into the succeeding generation. This can increase the speed of domination of a population and improve the local search at the expense of a global perspective. The GA operation procedure is repeated until a stopping criterion is met. Then, the best strings representing the possible optimal trajectories can be used to conduct the actual experiments. Note that GAs cannot guarantee that a true optimal solution is found because they employ a stochastic method. However, if they are appropriately applied, they will at least find an acceptable solution close to the true optimal solution. 5. Search Procedures for Optimal Batch Trajectory Design The proposed strategy consists of three phases. The first phase involves identifying the relationship between

Ind. Eng. Chem. Res., Vol. 42, No. 7, 2003 1371

Figure 8. Optimal trajectory design at different runs for example 1: (a) the current (crosses) and past (solid circles) coefficient points against the contour of the objective function and (b, c) the corresponding RBFN model contour whose crosses represent the optimal points found by the GA (b) c0 vs c1 and (c) c1 vs c2.

the objective function J and the batch trajectory. Orthogonal function approximation is first applied to extract the finite number of feature coefficients from the batch trajectory. An RBFN+Lin is trained to derive the relationships between the coefficients of the orthogonal function and the corresponding value of the objective function. The trained network can accurately predict the quality of a possible operating trajectory. Thus, feeding the coefficient values to the trained network allows the corresponding quality to be obtained. The second phase involves the transformation of J into a fitness function based on the quality characteristics of the problem. In phase three, the GA is applied directly to solve the coefficient design problem. The GA can be used to obtain the optimal values of the coefficients from the possible solution space. Figure 5 schematically depicts the design procedure. The detailed procedure is summarized as follows:

Phase I. Identify the relationship function to predict the response.
(1) Set a reference operating trajectory based on prior knowledge. Select the maximum number of orthogonal function terms to approximate the operating trajectory (Nmax) and the maximum number of runs to be conducted in the batch experiments (rmax). Set r = 0.
(2) Perform the batch experiments along the designed trajectories. Collect the training patterns from the past batch trajectories of the experimental records.
(3) Let r = r + 1 and approximate the operating trajectory using the proper number of orthogonal functions, c = [c0 c1 ··· cN-1], via eqs 5 and 6.
(4) Develop an RBFN+Lin model (eq 16) to establish the relationship between the design parameters and the quality response. This trained network is represented as the relationship function.
Phase II. Formulate the fitness function.


(13) If the number of experimental design runs exceeds the maximum number, i.e., r > rmax, then stop; otherwise, go to step 14.
(14) If the optimal quality response in the current run is not significantly improved at a certain threshold ratio compared with the previous run, i.e., (Jopt)r < α(Jopt)r-1, and the number of orthogonal coefficients has not reached the maximum number of orthogonal functions, i.e., N ≤ Nmax, then increase the number of terms in the orthogonal functions, i.e., let N = N + 1, and go to step 2. Otherwise, go directly to step 3.

Remark 1. Whenever a new coefficient (cN+1) is included, extra experiments must be run. The trajectories f̂(c,t) = Σ_{n=0}^{N} cnφn(t) of the experiments are obtained from only the appropriate change of the newly added coefficient (cN+1), with the rest of the coefficients ({cn^opt}_{n=0,1,...,N-1}) remaining at their past optimal values. The new information will enhance the prediction of the neural-network model to prevent the best operating conditions from converging and stagnating at the same local optima as in the previous step.

Remark 2. A reference operating trajectory can be used as the initial designed trajectory if past operating experience or the basic physical phenomena in the design system are available. If not, then a straight-line trajectory can be used.

6. Numerical Illustrations

Figure 9. Different runs of (a) the optimal trajectory and (b) the performance of the average (upper subplot) and best (lower subplot) objective functions in example 1.

(5) Transform the objective function (J) into the fitness function.
Phase III. Determine the optimal conditions.
(6) Set the GA operating conditions (i.e., generation size, population size, crossover rate, and mutation rate). The gray region for the GA procedures in Figure 5 is expanded to describe the GA operations in Figure 6.
(7) Change the orthogonal coefficients, {ci}, into the new basis coefficients, {zi}, by eq 13 using the bounded constraints. Create an initial population of candidate strings from the past historical experimental data. Each string represents a possible batch trajectory.
(8) Create a mating pool of parent strings using the three GA operations. Select the parameter values according to the fitness values. Cross over the selected strings and replace the nonsurviving strings. Mutate the parameter values to yield the next generation.
(9) Retrieve the coefficients ({zi}) from the binary strings and compute the orthogonal coefficients using eqs 14 and 15.
(10) Apply the RBFN+Lin model and evaluate the fitness value of the predicted value of each string. Repeat steps 8-10 until convergence to the optimal value.
(11) Select the string having the best fitness value and conduct the trajectory design of the experiment. Select the optimal design trajectory whose corresponding quality function is (Jopt)r in the current experimental design.
(12) If (Jopt)r has converged to within a certain threshold value, stop the experimental design; otherwise, go to step 13.
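Steps 6-10 can be sketched as a plain binary GA with elitism. The operators and parameter values below are illustrative placeholders, not the authors' settings, and the fitness function stands in for the RBFN+Lin surrogate prediction.

```python
import random

def evolve(pop, fitness_fn, n_gen=60, p_cross=0.8, p_mut=0.02):
    # Binary GA: roulette-wheel reproduction, one-point crossover,
    # bit-flip mutation, and an elitist copy of the best string.
    def select(pop, fits):
        pick, acc = random.uniform(0.0, sum(fits)), 0.0
        for s, f in zip(pop, fits):
            acc += f
            if acc >= pick:
                return s
        return pop[-1]

    for _ in range(n_gen):
        fits = [fitness_fn(s) for s in pop]
        elite = max(zip(fits, pop))[1]          # elitist strategy
        children = [elite]
        while len(children) < len(pop):
            p1, p2 = select(pop, fits), select(pop, fits)
            cut = random.randrange(1, len(p1)) if random.random() < p_cross else 0
            child = p1[:cut] + p2[cut:]
            children.append([b ^ 1 if random.random() < p_mut else b
                             for b in child])
        pop = children
    fits = [fitness_fn(s) for s in pop]
    return max(zip(fits, pop))[1]
```

Because the elite string is copied unchanged into each generation, the best fitness found never decreases across generations, which matches the monotone improvement noted for example 1.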

Three examples are employed here to demonstrate the application and test the performance of the proposed scheme. To simplify the explanations, a mathematical function with a single trajectory consisting of two orthogonal functions is used in the first example. This helps illustrate how to improve the objective function in each experiment. The second example is concerned with an optimal batch trajectory problem in a nonlinear fermentation chemical process. The third example demonstrates how this design scheme performs on a complex, multitrajectory problem. Performance comparisons between the proposed method and the conventional methods are also included. Each of the examples is discussed in a separate subsection below. Example 1: Mathematical Problem with a Single Manipulated Function. Consider an arbitrary objective function represented by

max_{f̂(t)} { -∫0^1 [f1(t) - f̂(t)]² [f2(t) - f̂(t)]² dt - α[f̂(tf = 1)]² }   (29)

where α = 0.005, f1(t) = 2φ1(t) + 3φ2(t), and f2(t) = 8.5φ1(t) + 7φ2(t). These two functions, f1 and f2, contain the second and third terms of the Legendre polynomial function. Without the last term of eq 29, the objective function has two equal local optima with values of 0. After addition of the last term, a global optimum at (c1,c2) = (2,3) with a value of -0.2587 can be found. The surface and contour plots of the objective function (eq 29) with respect to these two coefficients c1 and c2, shown in Figure 7, indicate that there are two local optimal points, one of which is the global optimum. Usually, it is difficult for a traditional optimization method to find the global optimum. The aim of this simple problem is to show how the proposed method with the orthogonal function approximation quickly converges to the global


maximum of the objective function without wasting experimental runs. Initially, it is completely unclear which terms of the orthogonal function should be included, so the approximate orthogonal function is assumed to be

f̂(t) = f̂(c0,c1,t) = c0φ0(t) + c1φ1(t)   (30)

The search for the optimal points based on the proposed batch experimental design method is depicted in Figure 8. A total of 23 runs of experiments were performed, and the results of runs 1, 9, 14, and 23 are shown. Column a lays out the locations of the experiments conducted against the contour background of the actual objective function. Columns b and c show the locations of the candidates c0 vs c1 and c2 vs c1, respectively, found by the GA for the next run of the experiment, against the contour background of the current trained RBFN+Lin model. In each background contour, all coefficients are fixed at the current optimal values, except for the two coefficients shown in the subplot. Assuming that no prior information about the function is available, eight initial points are arbitrarily generated to get the corresponding values of the objective function. Using the initial coefficients and the corresponding values of the objective function, the initial RBFN+Lin is trained. Without time wasted in actual experiments, the GA based on the RBFN+Lin model is used to determine the possible optimal coefficients. The corresponding optimal trajectories are then applied in the next run to determine whether the performance criterion is satisfied. The same procedure is repeated until run 4, whose performance is not significantly improved. Then the new orthogonal term φ2(t) is introduced to increase the search space. In Figure 8, there is some deviation from the desired target in run 9 because of the poor representation of the initial RBFN+Lin model. However, even though the desired optimal point cannot be obtained in the first few runs, the model can be updated with the new information from the trajectories to find better locations for the optimal coefficients. With the increase of the new experimental data, the approximate function gradually approaches the original function. (See runs 14 and 23 of Figure 8.)
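For concreteness, the objective of eq 29 can be evaluated numerically for any coefficient vector. The sketch below assumes that φ1 and φ2 are the orthonormal Legendre polynomials on [-1, 1]; this is an assumption about the basis, chosen because it reproduces the reported optimum value of about -0.2587 at (c1, c2) = (2, 3).

```python
import numpy as np
from numpy.polynomial import legendre

def phi(n, x):
    # Orthonormal Legendre polynomial of degree n on [-1, 1].
    coef = np.zeros(n + 1)
    coef[n] = 1.0
    return np.sqrt((2.0 * n + 1.0) / 2.0) * legendre.legval(x, coef)

def objective(c, alpha=0.005, n_pts=2001):
    # J(c) = -int (f1 - fhat)^2 (f2 - fhat)^2 dx - alpha * fhat(end)^2
    x = np.linspace(-1.0, 1.0, n_pts)
    f1 = 2.0 * phi(1, x) + 3.0 * phi(2, x)
    f2 = 8.5 * phi(1, x) + 7.0 * phi(2, x)
    fhat = sum(cn * phi(n, x) for n, cn in enumerate(c))
    integrand = (f1 - fhat) ** 2 * (f2 - fhat) ** 2
    # Trapezoidal quadrature of the squared-product term.
    integral = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))
    end = sum(cn * float(phi(n, np.array([1.0]))[0]) for n, cn in enumerate(c))
    return -integral - alpha * end ** 2
```

Setting f̂ equal to f1 or f2 zeroes the integral, so the two candidate optima differ only through the endpoint penalty, which favors (c1, c2) = (2, 3).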
As a result, increasing numbers of coefficient locations are found around the optimal point. The RBFN+Lin model approaches the original system around the optimal regions. After 23 runs, the locations of the coefficients fall within the optimal region. The optimal trajectories for some runs are shown in Figure 9a. The shape of the optimal trajectory changes after the new orthogonal function is added to increase the search space. The corresponding optimal objective function for that run also changes (Figure 9b). Although no additional coefficient is introduced after run 4, the optimal objective function might still be improved when the RBFN model is updated with the new information. It can be seen that the best objective functions are significantly improved at run 8 and run 14 (Figure 9b). Moreover, because elitism is used, the best value of the objective function increases monotonically with the number of runs. Example 2: Optimal Temperature Trajectory for Penicillin Fermentation. A batch penicillin fermentation is presented here to design the temperature trajectory.36 The differential equations describing the state of the system in the batch penicillin fermentation process are

Cell mass production:

dy1/dt = b1y1 - (b1/b2)y1², y1(0) = 0.03   (31)

Penicillin synthesis:

dy2/dt = b3y1, y2(0) = 0.0   (32)

with

b1 = w1[1.0 - w2(θ - w3)²]/[1.0 - w2(25 - w3)²]
b2 = w4[1.0 - w2(θ - w3)²]/[1.0 - w2(25 - w3)²]   (33)
b3 = w5[1.0 - w2(θ - w6)²]/[1.0 - w2(25 - w6)²]

where y1 and y2 represent the dimensionless concentrations of cell mass and penicillin, respectively; t is the dimensionless time, 0 ≤ t ≤ 1; and θ is the temperature, in degrees Celsius. The parameters bi, i = 1-3, are functions of temperature, with bi ≥ 0. The parameter values are w1 = 13.1, w2 = 0.005, w3 = 30 °C, w4 = 0.94, w5 = 1.71, and w6 = 20 °C. The parameter-temperature functions have shapes typical of those encountered in microbial or enzyme-catalyzed reactions. The performance index is the maximum concentration of penicillin at the end of the batch fermentation, tf = 1

J = max_{T(t)} [y2(tf = 1)]   (34)
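A sketch of this fermentation model: the fixed-step RK4 integrator below simulates eqs 31-33 for a given temperature profile and returns the end-of-batch states, so candidate trajectories can be scored against eq 34. The constant-temperature profile in the usage note is only an illustration, not the optimal policy.

```python
import numpy as np

W1, W2, W3, W4, W5, W6 = 13.1, 0.005, 30.0, 0.94, 1.71, 20.0

def rate_constants(theta):
    # b1-b3 of eq 33 as functions of temperature theta (deg C).
    g = lambda w: (1.0 - W2 * (theta - w) ** 2) / (1.0 - W2 * (25.0 - w) ** 2)
    return W1 * g(W3), W4 * g(W3), W5 * g(W6)

def simulate(temp_profile, n_steps=2000):
    # RK4 integration of eqs 31 and 32 over dimensionless time [0, 1].
    def f(t, y):
        b1, b2, b3 = rate_constants(temp_profile(t))
        return np.array([b1 * y[0] - (b1 / b2) * y[0] ** 2, b3 * y[0]])
    h = 1.0 / n_steps
    t, y = 0.0, np.array([0.03, 0.0])
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y  # [cell mass y1, penicillin y2] at tf = 1
```

For example, `simulate(lambda t: 30.0)[1]` gives the penicillin yield of an isothermal 30 °C batch, which falls short of the roughly 1.48 reported for the optimized temperature trajectory.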

As in example 1, assume that the initial temperature trajectory for this test problem is a straight line that contains two design coefficients. When the performance is not significantly improved at the sixth batch run, a new coefficient corresponding to the third term of the Legendre function is added. The proposed design procedures at some runs are depicted in Figure 10. Each row of this figure represents one run containing three subplots. In each subplot, the background contour of the objective function is based on the current trained RBFN+Lin model with all coefficients fixed at the current optimal values, except for the two coefficients shown in each subplot. The locations of the candidates found by the GA for the next run of the experiment are also shown in these subplots. The fourth term of the coefficients is added at the ninth run. When the objective function is not significantly improved at run 22, the experiment is stopped, and a near optimum is found. Figure 11 shows the best optimal trajectories at certain runs. This problem was also tested in recent research based on Taguchi’s method and the region reduction method.1 In the previous work, only 50 experiments in 8 runs were required to reach the same optimum, whereas the currently proposed method requires 65 experiments in 22 runs. It seems that the previous method is better. However, the design region of the previous method should be predefined well before the search region is reduced, because it only gives the search direction, not the size of the move needed in this direction. On the


Figure 10. Optimal trajectory design of T(t) for example 2: (a) c0 vs c1, (b) c0 vs c2, and (c) c0 vs c3. The crosses represent the optimal coefficient points found by the GA at the current run.

contrary, the size of the step is not needed in the current approach because it is based on a stochastic search algorithm. In addition, the proposed method is a global optimizer that will find the global optimum given enough computation time. Furthermore, the performances obtained with the maximum principle and with the proposed scheme for this optimal temperature trajectory are 1.479 and 1.476, respectively. These results show only slight differences between the values of the objective functions and between the trajectories. However, the maximum principle requires a given, complicated model of the chemical reactions. Using the proposed method without any process model, the data-driven optimal policy can improve the yield of the desired product. Example 3: Dynamic System with Two Manipulated Trajectories. Consider the example used by Chen and Mills (1981).37 The dynamic system described by three state variables is

dx1/dt = x2 + f1
dx2/dt = -x1 + f2   (35)
dx3/dt = f1² + f2²

with the initial conditions x1(t=0) = -1, x2(t=0) = 1, and x3(t=0) = 0. The operation time for each batch run is tf = π/2. Two equality constraints apply: y1 = x1(tf) - 1 and y2 = x2(tf) + 1. The performance index for this case is defined as

I = min_{f1(t),f2(t)} x3(tf)   (36)

Luus and Storey (1997) included these two equality constraints and the performance index to construct the


Figure 11. Different runs of the optimal trajectory in example 2.

Figure 13. Optimal trajectory design at some runs for example 3.

Figure 12. Optimal coefficients of the manipulated function at different runs for example 3: (a) u1(t) and (b) u2(t).

augmented performance index38

J)

min {I + θ[(y1 - s1)2 + (y2 - s2)2]} (37) f1(t),f2(t)

where s1 and s2 represent the tolerances of the shifting terms and θ is the penalty function factor.
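The augmented index of eq 37 can be evaluated for any candidate pair of control trajectories by simulating eq 35 and penalizing the terminal-constraint violations. The sketch below uses a fixed-step RK4 integrator and is an illustration, not the authors' code.

```python
import numpy as np

def augmented_index(f1, f2, theta=100.0, s1=1e-7, s2=0.01273, n_steps=2000):
    # Integrate eq 35 over [0, pi/2], then evaluate eq 37.
    tf = np.pi / 2.0
    h = tf / n_steps
    x = np.array([-1.0, 1.0, 0.0])  # initial conditions x1, x2, x3
    def dx(t, x):
        u1, u2 = f1(t), f2(t)
        return np.array([x[1] + u1, -x[0] + u2, u1 ** 2 + u2 ** 2])
    t = 0.0
    for _ in range(n_steps):
        k1 = dx(t, x)
        k2 = dx(t + h / 2, x + h / 2 * k1)
        k3 = dx(t + h / 2, x + h / 2 * k2)
        k4 = dx(t + h, x + h * k3)
        x = x + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    y1, y2 = x[0] - 1.0, x[1] + 1.0  # equality-constraint violations
    return x[2] + theta * ((y1 - s1) ** 2 + (y2 - s2) ** 2)
```

With zero controls the states rotate freely, the terminal constraints are badly violated, and the penalty term dominates; the GA searches the trajectory coefficients to drive this value down toward the reported result.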

In this initial work, the constraints of the batch trajectory are not considered, but they will be studied further in the future. Therefore, the selections of the proper penalty function factor and the shifting terms are based on Luus and Storey's work. The optimal policy for each of the control variables, f1(t) and f2(t), is designed to optimize the augmented performance index given the predefined constants θ = 100, s1 = 1 × 10⁻⁷, and s2 = 0.012 73. Initially, f1(t) = c01φ0(t) + c11φ1(t) and f2(t) = c02φ0(t) + c12φ1(t), as the first two terms of the orthogonal functions are used. The optimal locations of the coefficients along the search path during these runs are shown pictorially in Figure 12. Initially, the coefficients with two terms are located in a plane. When the third term of the Legendre function is added to each manipulated function after run 3, the positions of the optimal coefficients move out of the plane. In addition, it is clear that the optimal locations of some runs are clustered together, but the mutation operation of the GA can lead to the exploration of a new region for possible new locations of the coefficients. After run 14, the locations of the final optimal coefficients are found, and two close-to-optimum trajectories are identified. The optimal control policy is given in Figure 13, and the state trajectories are given in Figure 14. Note that, in the currently developed method, the scheme for updating the coefficients is followed for all of the trajectories. It seems that some redundant coefficients would be involved in the optimization


Figure 14. State trajectory resulting from the optimal control policy of run 23 for example 3.

problem, but the scheme still does not waste experiments because the model-based optimization is run off-line. However, after a few runs, the values of the redundant coefficients remain close to 0, and they can be removed to reduce the computational complexity of the optimization problem. In Figure 12, we find that the coefficients do not change much beyond run 7. The corresponding trajectories are also very close (Figure 13). If the required performance level is relaxed, the number of experiments can be reduced accordingly. Iterative dynamic programming (IDP) was also applied to this case to determine the optimal control policy for the same penalty function factor and shifting terms. The results from the proposed method without a system model (I = 2.6170) are acceptable compared to the solution of IDP (I = 2.5465) for the given dynamic model.

7. Conclusion

Batch processes are traditionally regarded as a valuable and effective way of executing process operations. The purpose of this paper is to develop a framework that integrates orthogonal function approximation with soft-computing methods for the design of a batch trajectory policy. The orthonormal function approximation is used to eliminate the problem resulting from the dynamic set-point trajectory. It turns the implicit and continuous information available into a finite number of key coefficients that contain the necessary parts of the operating conditions for each variable in each batch. In most optimal control problems, multiple optima might exist. Therefore, a proper search strategy based on soft-computing methods is applied to tackle the resulting problem. More specifically, this approach provides the following advantages: (1) Without a model-based design, the proposed method can not only extract the potentially available knowledge of the batch system but also explore the optimal feasible region.
This reduces the time required when experimental studies are undertaken. (2) This method can deal with the dynamic set-point problem, which cannot be handled well via classical statistical experimental design. Furthermore, it can effectively cope with interactions among the coefficients. Extra experiments are not needed as long as the historical experimental data are sufficient. (3) Through its learning and adaptation strategies, the RBFN+Lin can use currently available experimental data to quickly identify the relationship between the coefficients and the objective function to predict the quality response. The genetic algorithm procedure based on the neural-network model explores the optimal conditions. (4) The proposed procedure is relatively simple and fairly easy for engineers to apply to diverse industrial batch applications. Performing the proposed procedure does not require a great deal of statistical background. Furthermore, engineers can directly use the computational software package to optimize the design of batch trajectory problems even without much theoretical knowledge. Although the proposed procedure yields encouraging results, all of the results presented in this paper are based on simulation models. Additional work using experimental data from a real batch plant is needed. Furthermore, the aim of the proposed method is to construct the basic framework for the batch process design system. Several research subjects are worthy of investigation to further extend applications, such as compensating for possible changes in the optimal trajectory caused by time-varying parameters, undesired effects of process disturbances, and batch-to-batch variations. Also interesting for future study are methods for determining the optimal time duration of the batch and for coping with dynamic path constraints.

Acknowledgment

This work is partly sponsored by the National Science Council, R.O.C., and by the Ministry of Economic Affairs, R.O.C.
Nomenclature
a = linear coefficient vector of the RBFN+Lin, [a1 a2 ··· aN]
ci = projection of the trajectory onto the basis function φi
c = orthogonal coefficient vector, [c0 c1 ··· cN-1]
ci = orthogonal coefficient vector of manipulated trajectory i
di = binary bit, di ∈ {0, 1}
f(t) = designed manipulated trajectory
hi = center of the ith neuron of the RBFN+Lin
Hi = either Mi = max{φi(t)} or mi = min{φi(t)}
J = quality objective function
J = quality vector with P measurements, [J1 J2 ··· JP]
K = number of measurement points in each trajectory
Ks = number of neurons selected
L = lower bound of the manipulated trajectory
N = number of orthogonal basis functions
Nmax = maximum number of orthogonal basis functions
P = number of training patterns
qj = orthogonal column j extracted by the factorization of regressor matrix R
rmax = maximum number of batch runs
ri = regressor vector i of regressor matrix R
α = threshold ratio of the quality improvement
R = regressor matrix of the RBFN+Lin
[SSR]j = variation of the measured quality in the direction qj
tf = duration of each batch run
U = upper bound of the manipulated trajectory
wi = weighting coefficient of the ith neuron of the RBFN+Lin
y(tf) = measured quality vector at the end of the batch run
zi = new basis variable i obtained from eq 13
Greek Symbols
Γ = matrix from QR decomposition

∇ = gradient operator
Ω = unknown parameter vector, which consists of the center and variance of each neuron of the RBFN+Lin model
φi = orthogonal function i
φi = radial basis function i
σi = variance of the ith neuron of the RBFN+Lin
θ = unknown parameter vector of the RBFN+Lin model, [w1 w2 ··· wS a1 ··· aN a0]T

Appendix

With the given P experimental patterns of the batch trajectories, {(ci; Ji), i = 1, ..., P}, the regressor matrix form can be represented as eqs 17 and 18. The regressor selection procedures using the CGS method are summarized as follows:

Step 1. k = 1, 1 ≤ i ≤ S + N

q1(i) = ri
g1(i) = [(q1(i))T J]/[(q1(i))T q1(i)]
[SSR]1(i) = (g1(i))² (q1(i))T q1(i)

Select the first important vector q1 = q1(i1) such that i1 = arg max_i [SSR]1(i).

Step 2. k = k + 1; 1 ≤ i ≤ S + N; and i ≠ i1, ..., i ≠ ik-1

αjk(i) = (qjT ri)/(qjT qj)
qk(i) = ri - Σ_{j=1}^{k-1} αjk(i) qj
gk(i) = [(qk(i))T J]/[(qk(i))T qk(i)]
[SSR]k(i) = (gk(i))² (qk(i))T qk(i)

Select the kth important vector qk = qk(ik) such that ik = arg max_i [SSR]k(i).

Step 3. Two criteria, AIC(4) (Akaike's information criterion) and BIC (Bayesian information criterion),39 are taken into account to evaluate the performance and complexity of the network. If [AIC(4)]k ≥ [AIC(4)]k-1 or [BIC]k ≥ [BIC]k-1, stop the procedure and go to step 4; otherwise, go to step 2 and proceed to find the next most important neuron.

Step 4. The coefficients θ can be directly computed after the desired neurons are obtained via

Φpur = [ri1 ri2 ··· riKs rS+N+1]

and

θ = Φpur+ J

where Ks is the number of neurons selected and Φpur+ is the pseudoinverse of matrix Φpur.

Literature Cited

(1) Chen, J.; Sheui, R.-G. Using Taguchi's Method and Orthogonal Function Approximation to Design Optimal Manipulated Trajectory in Batch Processes. Ind. Eng. Chem. Res. 2002, 41 (9), 2226.

(2) Soroush, M.; Kravaris, C. Optimal Design and Operation of Batch Reactors. 2. A Case Study. Ind. Eng. Chem. Res. 1993, 32, 882.
(3) Thomes, I. M.; Kiparissides, C. Computation of Near-Optimal Temperature and Initiator Policies for Batch Polymerization Reactors. Can. J. Chem. Eng. 1984, 62, 284.
(4) Sacks, M. E.; Lee, S.; Biesenberger, J. A. Optimal Policies for Batch, Chain Additional Polymerizations. Chem. Eng. Sci. 1972, 27, 2281.
(5) Luus, R.; Rosen, O. Application of Dynamic Programming to Final State Constrained Optimal Control Problems. Ind. Eng. Chem. Res. 1991, 30, 1525.
(6) Luus, R. Optimal Control by Dynamic Programming Using Systematic Reduction in Grid Size. Int. J. Control 1990, 51, 995.
(7) Eaton, J. W.; Rawlings, J. B. Feedback Control of Chemical Processes Using On-Line Optimization Techniques. Comput. Chem. Eng. 1990, 14, 469.
(8) Cuthrell, J. E.; Biegler, L. T. On the Optimization of Differential-Algebraic Process Systems. AIChE J. 1987, 33, 1257.
(9) Tsang, T. T. C.; Clarke, D. W. Generalized Predictive Control with Input Constraints. IEE Proc. D 1988, 135, 451.
(10) Lochner, R. H.; Matar, J. E. Design for Quality: An Introduction to the Best of Taguchi and Western Methods of Statistical Experimental Design; ASQC Quality Press: Milwaukee, WI, 1990.
(11) Box, G.; Draper, N. R. Empirical Model-Building and Response Surfaces; Wiley: New York, 1987.
(12) Taguchi, G. Introduction to Quality Engineering: Designing Quality into Products and Processes; Asian Productivity Organization: Tokyo, Japan, 1986.
(13) Fowlkes, W. Y.; Creveling, C. M. Engineering Methods for Robust Product Design; Addison-Wesley: Reading, MA, 1995.
(14) Phadke, M. S. Quality Engineering Using Robust Design; Prentice Hall: Englewood Cliffs, NJ, 1991.
(15) Duchesne, C.; MacGregor, J. F. Multivariate Analysis and Optimization of Process Variable Trajectories for Batch Processes. Chemom. Intell. Lab. Syst. 2000, 51, 125.
(16) Tay, K. M.; Butler, C. Modeling and Optimizing of MIG Welding Process: A Case Study Using Experimental Design and Neural Networks. Qual. Reliab. Eng. Int. 1997, 13, 61.
(17) Rowlands, H.; Packianather, M. S.; Oztemel, E. Using Artificial Neural Networks for Experimental Design in Off-line Quality. J. Syst. Eng. 1996, 6, 46.
(18) Mahajan, R. L.; Nikmanesh, N. Fine Pitch Stencil Printing Process Modeling and Optimization. J. Electron. Packag. 1996, 118, 1.
(19) Ko, J.-W.; I, Y.-P.; Wu, W.-T. Determination of an Optimal Operating Condition for a Crude Tower via a Neural Network Model. Process Control Qual. 1997, 10, 291.
(20) Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison-Wesley: Reading, MA, 1989.
(21) Omori, R.; Sakakibara, Y.; Suzuki, A. Applications of Genetic Algorithms to Optimization Problems in the Solvent Extraction Process for Spent Nuclear Fuel. Nucl. Technol. 1997, 118, 26.
(22) Lansberry, J. E.; Wozniak, L. Optimal Hydrogenerator Governor Tuning with a Genetic Algorithm. IEEE Trans. Energy Convers. 1992, 7, 623.
(23) Nandi, S.; Ghosh, S.; Tambe, S. S.; Kulkarni, B. D. Artificial Neural-Network-Assisted Stochastic Process Optimization Strategies. AIChE J. 2001, 47, 126.
(24) Hugget, A.; Sébastian, P.; Nadeau, J.-P. Global Optimization of a Dryer by Using Neural Networks and Genetic Algorithms. AIChE J. 1999, 45, 1227.
(25) Chen, J.; Wong, D. S. H.; Jang, S.-S.; Yang, S.-L. Product and Process Development Using Artificial Neural-Network Model and Information Analysis. AIChE J. 1998, 44, 876.
(26) Mujtaba, I. M.; Macchietto, S. Efficient Optimization of Batch Distillation with Chemical Reaction Using Polynomial Curve Fitting Techniques. Ind. Eng. Chem. Res. 1997, 36, 2287.
(27) Ray, W. H. Advanced Process Control; McGraw-Hill: New York, 1982.
(28) Chen, J.; Liu, J. Derivation of Function Space Analysis Based PCA Control Charts for Batch Process Monitoring. Chem. Eng. Sci. 2001, 56 (10), 3289.
(29) Poggio, T.; Girosi, F. A Theory of Networks for Approximation and Learning. Proc. IEEE 1990, 78, 1481.
(30) Haykin, S. Neural Networks: A Comprehensive Foundation; Prentice Hall: Englewood Cliffs, NJ, 1999.


(31) Chen, J.; Bruns, D. D. WaveARX Neural Network Development for System Identification Using a Systematic Design Synthesis. Ind. Eng. Chem. Res. 1995, 34, 4420.
(32) Chen, S.; Billings, S. A.; Luo, W. Orthogonal Least Squares Methods and Their Application to Nonlinear System Identification. Int. J. Control 1989, 50, 1873.
(33) Chen, S.; Cowan, C. F. N.; Grant, P. M. Orthogonal Least Squares Learning Algorithm for Radial Basis Function Networks. IEEE Trans. Neural Networks 1991, 2, 302.
(34) Gemperline, P. J.; Long, J. R.; Gregoriou, V. G. Nonlinear Multivariate Calibration Using Principal Components Regression and Artificial Neural Networks. Anal. Chem. 1991, 63, 2313.
(35) Qin, S. J.; McAvoy, T. J. Nonlinear PLS Modeling Using Neural Networks. Comput. Chem. Eng. 1992, 16, 379.
(36) Constantinides, A.; Spencer, J. L.; Gaden, E. L., Jr. Optimization of Batch Fermentation Processes. I. Development of Mathematical Models for Batch Penicillin Fermentations. Biotechnol. Bioeng. 1970, 12, 803.

(37) Chen, G.; Mills, W. H. Finite Element and Terminal Penalization for Quadratic Cost Optimal Control Problems Governed by Ordinary Differential Equations. SIAM J. Control Optim. 1981, 19, 744.
(38) Luus, R.; Storey, C. Optimal Control of Final State Constrained System. In Proceedings of the IASTED International Conference on Modelling, Simulation and Control; ACTA Press: Calgary, Alberta, Canada, 1997; p 245.
(39) Kashyap, R. Bayesian Comparison of Different Classes of Dynamic Models Using Empirical Data. IEEE Trans. Autom. Control 1977, 22, 715.

Received for review June 27, 2002 Revised manuscript received October 28, 2002 Accepted January 13, 2003 IE020480Y