Batch-to-Batch Optimization Using Neural Network Models | Industrial

Ind. Eng. Chem. Res. 1996, 35, 2269-2276


Batch-to-Batch Optimization Using Neural Network Models

Dong Dong, Thomas J. McAvoy,* and Evanghelos Zafiriou

Department of Chemical Engineering and Institute for Systems Research, University of Maryland, College Park, Maryland 20742

As chemical plants become more flexible, the importance of batch processing has increased in recent years. Batch processes are also used in emerging areas such as semiconductor manufacturing. In order to derive the maximum benefit from batch processes, it is important that their operation be optimized. However, such optimization can be difficult since batch processes often involve complex, nonlinear phenomena. In this paper an approach to batch-to-batch optimization is coupled with neural network modeling to improve the performance of batch processes. The neural models yield results that are comparable to those achieved with first principle models. This accuracy is achieved through the use of feedback from each batch, which effectively compensates for plant-model mismatch.

1. Introduction

The current trend toward the production of low volume/high cost materials has generated high interest in the use of batch and semibatch processes. Examples include reactors, crystallizers, distillation towers, injection-molding processes, biochemical processes, and other processes involved with the manufacture of polymers. Batch processes are also used in emerging areas such as semiconductor manufacturing. In order to derive the maximum benefit from batch processes, it is important that their operation be optimized.

Many papers have been published on batch process optimization (Luus, 1993; Modak, 1993; Pyun et al., 1989; Keeler et al., 1992; Semones and Lim, 1989). However, these approaches usually assume that accurate batch models are available, and optimal trajectories are calculated based on these models. In cases of practical interest, there is often plant-model mismatch, and the optimization results can be far from the true optima. Filippi-Bossy et al. (1989) proposed a new approach for batch process optimization which uses tendency models to identify and modify actual kinetic models.
In their approach, remodeling and reoptimization techniques are required. Such techniques involve a significant effort in identifying many unknown parameters in complex reaction systems, and this is often a very difficult task in industrial applications.

Zafiriou and Zhu (1989, 1990) proposed a very interesting approach to deal with plant-model mismatch for batch process optimization. In their approach, both plant and model information are used, and the data from every batch run are essentially used as feedback to correct gradients that are calculated from models. Usually after 10 or 15 batch runs, a real optimal point for a plant can be achieved even if there is a large plant-model mismatch. Their method provides an attractive new approach for batch process optimization. However, it is based on a first principle model, and such modeling can be difficult since batch processes often involve complex, nonlinear phenomena. In many applications, first principle models are unavailable. Another difficulty in applying their approach is that on-line measurements of state variables are required.

In this paper, neural network models are coupled with the approach proposed by Zafiriou and Zhu (1989, 1990), so the development of complicated first principle models is avoided. A novel technique called neural net multiway partial least squares (NNMPLS), which is based on the idea of multiway partial least squares (MPLS) (Wold et al., 1987) and neural net partial least squares (NNPLS) (Qin and McAvoy, 1992), is used to develop the batch process models. The NNMPLS model can directly relate input batch profiles to final batch state variables, so on-line measurements of state variables are not required. The development of such an NNMPLS model is straightforward. A pseudo random binary signal (PRBS) is added to normally used input profiles to generate data, and these data are used to develop NNMPLS models. To avoid disturbing the batch, the magnitude of the PRBS is very small, and thus the variation of the quality variables caused by the PRBS is small. Generating such a data set is acceptable for a real industrial process. Another advantage of the NNMPLS approach is that it can provide analytical gradient information, while for the first principle model approach it is usually necessary to calculate numerical gradients. Thus, the computation for the NNMPLS approach is typically much faster than that for an approach based on a first principle model.

The organization of this paper is as follows. In section 2, the batch-to-batch optimization approach proposed by Zafiriou and Zhu (1989, 1990) is reviewed. In section 3, a neural network model based batch-to-batch optimization approach is proposed, which includes two parts, neural network multiway PLS modeling and gradient calculation. In section 4, two case studies are presented. For both cases, some deliberate plant-model mismatch is introduced to test the effectiveness of the proposed approach, and comparisons are given between the results achieved from the first principle models and the NNMPLS models. In section 5, conclusions are given.

* To whom all correspondence should be addressed. Phone: (301) 405-1939. FAX: (301) 314-9920. E-mail: [email protected].

S0888-5885(95)00518-5 CCC: $12.00 © 1996 American Chemical Society

2. Batch-to-Batch Optimization

The batch-to-batch optimization approach proposed by Zafiriou and Zhu (1989, 1990) is based on an analogy between the iterations during numerical optimization of an objective function and successive batches during the operation of the plant. In numerical optimization, one uses a dynamic model which is integrated forward in time given an initial guess of the control trajectory. The adjoint equations are then developed using linearization, and these equations are integrated backward


Figure 1. Data structure for modeling a batch process.

in time. Next, changes in the control trajectory are calculated so as to improve the objective function being optimized. This procedure is repeated until there is no further change in the objective function. For batch-to-batch optimization, Zafiriou and Zhu also use a process model to develop the adjoint equations, but the linearization is made around the measurements obtained from the actual plant. Furthermore, the plant itself is used for the forward trajectory. The main advantage of their approach is that a very accurate model is not required. Rather, they only require that the model lead to a descent direction on the objective function under study. In the gradient and objective function calculations, the plant data are fed back to compensate for the plant-model mismatch. In a fed-batch fermentation example given in Zafiriou and Zhu (1990), even if a key coefficient has a 50% mismatch between the model and the plant, their approach can provide significant improvement in the objective function, and after 10 batch runs, the calculated optimal point is almost the same as the true optimal point.

A first principle model described by differential equations is used in their approach. However, such modeling can be difficult since batch processes often involve complex, nonlinear phenomena and, in many cases, first principle models are unavailable. Another difficulty in applying their approach is that on-line measurements of state variables are required. If on-line state measurements are unavailable, complicated state estimation techniques will be needed. In order to overcome these problems, a neural network model based batch-to-batch optimization approach is proposed. Since batch process data have a different structure from continuous process data, a novel technique called neural net multiway partial least squares (NNMPLS), which is very suitable for batch process modeling, is developed.
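The feedback idea reviewed above can be illustrated with a deliberately small sketch. It is not the procedure of Zafiriou and Zhu: the scalar "plant" and "model", their gains, the quadratic objective, and the fixed-step gradient update are all hypothetical stand-ins chosen so that the effect of a 50% gain mismatch is easy to follow. The only point carried over from the text is that the plant supplies the measured output while the model supplies only the sensitivity.

```python
def run_plant(u, gain=1.5):
    """Hypothetical plant: stands in for running one real batch."""
    return gain * u


def model_sensitivity(u, gain=1.0):
    """dy/du from a deliberately mismatched model (gain 1.0, plant is 1.5)."""
    return gain


def batch_to_batch(u0, n_batches=15, step=0.3, target=1.0):
    """Improve the input u from one batch to the next using plant feedback."""
    u, history = u0, []
    for _ in range(n_batches):
        y_meas = run_plant(u)                      # the plant does the forward pass
        history.append((u, (y_meas - target) ** 2))
        dJ_dy = 2.0 * (y_meas - target)            # evaluated at the measured output
        u -= step * dJ_dy * model_sensitivity(u)   # the model supplies only dy/du
    return u, history


# Start from the input that would be optimal if the model were exact (u = 1).
u_final, history = batch_to_batch(u0=1.0)
```

In this toy setting the iterates move from the model optimum toward the plant optimum u = 1/1.5, because the mismatched model still gives a descent direction once the measured output is fed back.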
In the next section, the method of developing NNMPLS models is discussed.

3. Neural Network Modeling

3.1. NNMPLS Approach. For batch process optimization, the objective usually involves finding the profiles of input variables to maximize an objective function which usually is a function of the batch state variables at final batch time. So a batch model should relate the input data profiles to final state variables. If there is more than one input variable, the input data profiles take the form of a three-way array. Figure 1 illustrates the data structure for batch process optimization. Similar to partial least squares, multiway partial least squares (MPLS) has been proposed (Wold et al., 1987) to handle the data illustrated in Figure 1. The difference between PLS and MPLS is that MPLS

decomposes the X array into score vectors ($t_l$; $l = 1, 2, ..., L$) and loading matrices $P_l$ according to the Kronecker product. This decomposition is equivalent to unfolding the three-dimensional array X to two dimensions and then carrying out a PLS calculation. MPLS can be written as follows:

$$ X = t_1 \otimes P_1 + E_1 \tag{1} $$

$$ Y = u_1 q_1^T + F_1 \tag{2} $$

where $t_1$ and $u_1$ are score vectors of the first factor, and $P_1$ and $q_1$ are the loading matrix and vector corresponding to this factor. $E_1$ and $F_1$ are the residuals. The above two equations formulate an MPLS outer model. The score vectors are related by a linear inner model:

$$ u_1 = b_1 t_1 + r_1 \tag{3} $$

where $b_1$ is a coefficient which is determined by minimizing the residual $r_1$. After going through the above calculation, the residual matrices are calculated as:

$$ E_1 = X - t_1 \otimes P_1 \tag{4} $$

$$ F_1 = Y - b_1 t_1 q_1^T \tag{5} $$
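As a rough illustration of eqs 1-5, the sketch below unfolds a small three-way array and extracts one linear PLS factor. It assumes a single output variable (so that $q_1 = 1$ and the weight vector needs no iteration), mean-centered data, and a variable-major unfolding order; the data values and all function names are invented for the example and do not come from the paper.

```python
def unfold(X3):
    """Unfold batches x variables x time into batches x (variables*time)."""
    return [[v for variable in batch for v in variable] for batch in X3]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def column(X, j):
    return [row[j] for row in X]


def first_factor(X, y):
    """One PLS factor for a single output: scores t, loadings p, inner slope b."""
    m = len(X[0])
    w = [dot(column(X, j), y) for j in range(m)]        # weight direction X^T y
    norm = dot(w, w) ** 0.5
    w = [wj / norm for wj in w]
    t = [dot(row, w) for row in X]                      # score vector t_1
    tt = dot(t, t)
    p = [dot(column(X, j), t) / tt for j in range(m)]   # unfolded loading P_1
    b = dot(y, t) / tt                                  # inner model, eq 3
    E = [[xij - ti * pj for xij, pj in zip(row, p)]     # residual, eq 4
         for row, ti in zip(X, t)]
    F = [yi - b * ti for yi, ti in zip(y, t)]           # residual, eq 5
    return t, p, b, E, F


# Hypothetical mean-centered data: 4 batches, 2 input variables, 3 time points.
X3 = [
    [[1.0, 2.0, 3.0], [0.5, 0.1, -0.2]],
    [[-1.0, 0.0, 1.0], [0.3, -0.4, 0.2]],
    [[0.5, -1.5, -2.0], [-0.6, 0.2, 0.1]],
    [[-0.5, -0.5, -2.0], [-0.2, 0.1, -0.1]],
]
y = [1.0, 0.2, -0.7, -0.5]

X = unfold(X3)
t, p, b, E, F = first_factor(X, y)
```

Reshaping `p` back to 2 x 3 gives the loading matrix of eq 1, and a second factor would be obtained by calling `first_factor(E, F)` on the residuals.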

Then the second factor is calculated based on the residuals $E_1$ and $F_1$ by going through the same procedure as for the first factor. The same procedure is repeated until the last factor $L$ is calculated, which leaves almost no information in the residuals $E_L$ and $F_L$.

The above MPLS approach is a linear approach. Since batch processes are usually nonlinear in nature, it is more desirable to have a nonlinear MPLS modeling approach. A neural network PLS approach (Qin and McAvoy, 1992) can be used directly here to form a neural network MPLS (NNMPLS) model which uses neural networks as the inner regressors:

$$ u_l = N(t_l) + r_l \tag{6} $$

where $N(\cdot)$ stands for the nonlinear relation represented by the neural network. For NNMPLS, the MPLS outer model is kept to generate score variables from the data. Then the scores ($u_l$ and $t_l$) are used to train the inner network models. Details of the neural network PLS algorithm are given by Qin and McAvoy (1992). The prediction of the NNMPLS model can be written as follows:

$$ \hat{Y} = \sum_{h=1}^{a} \hat{u}_h q_h^T \tag{7} $$

$$ \hat{u}_h = N(t_h) = \sigma(t_h \omega_{1h}^T + e \beta_{1h}^T)\,\omega_{2h} + e \beta_{2h} \tag{8} $$

$$ t_h = X \tilde{w}_h \tag{9} $$

where $\hat{Y}$ stands for the model prediction, $\omega_{1h}$ and $\omega_{2h}$ are the weight vectors for the input and output layer of the hth inner network model, $\beta_{1h}$ and $\beta_{2h}$ are the bias weights for the input and output layer of the hth inner network model, respectively, and

$$ \tilde{w}_h = \prod_{j=1}^{h-1} (I - w_j p_j^T)\, w_h \tag{10} $$
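The prediction in eqs 7-9 can be sketched for a single unfolded batch as follows. This is only an illustration: the logistic form of $\sigma$, the hidden-layer sizes, and every numerical value are assumptions, and `W_tilde` is taken as already computed from eq 10 rather than derived from trained weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def inner_net(t, w1, b1, w2, b2):
    """Eq 8: one-hidden-layer SISO network mapping a score t to u_hat."""
    hidden = [sigmoid(t * w + b) for w, b in zip(w1, b1)]
    return sum(h * w for h, w in zip(hidden, w2)) + b2


def predict(x, W_tilde, nets, Q):
    """Eqs 7 and 9 for one unfolded batch x.

    W_tilde[h] is the vector w~_h, nets[h] = (w1, b1, w2, b2) holds the
    weights of the h-th inner network, and Q[h] is the output loading q_h.
    """
    y_hat = [0.0] * len(Q[0])
    for w_t, net, q in zip(W_tilde, nets, Q):
        t = sum(xi * wi for xi, wi in zip(x, w_t))           # eq 9
        u_hat = inner_net(t, *net)                           # eq 8
        y_hat = [yk + u_hat * qk for yk, qk in zip(y_hat, q)]  # eq 7
    return y_hat


# Hypothetical one-factor model with two hidden nodes and two outputs.
W_tilde = [[0.1, 0.2, 0.3]]
nets = [([1.0, -1.0], [0.0, 0.0], [2.0, 4.0], 0.5)]
Q = [[1.0, 2.0]]
y_hat = predict([0.0, 0.0, 0.0], W_tilde, nets, Q)
```

With a zero input the score is zero, both hidden nodes output 0.5, and the factor contributes 0.5·2 + 0.5·4 + 0.5 = 3.5 scaled by each entry of the loading.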

One difference between NNMPLS and a direct neural network approach is that NNMPLS can deal with limited samples. Since every sample involves a complete batch run, the data set for a batch process often has very limited samples. After unfolding, the number of input variables is usually very large. Thus, for the direct neural network approach, the number of samples required for modeling will be large in order to avoid overfitting. For the NNMPLS approach, after the MPLS outer-layer transformation, the number of factors will be much less than the number of input variables because the input variables are highly correlated. Thus, the network size is much smaller than that of the direct neural network approach, and the number of samples required for modeling is much smaller. This feature will be illustrated in the case studies.

3.2. Gradient Calculation. The objective of batch process optimization can be written as follows:

$$ \min_{X} J = F(Y) \tag{11} $$

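The gradient information $g = \partial J/\partial X$ is needed in the optimization calculation. For a first principle model, typically a numerical gradient is calculated; if the model is very complicated, the computation time for gradient calculation can be high. For the NNMPLS model, an analytical gradient can be calculated. The calculation is as follows:

$$ \frac{\partial J}{\partial X} = \frac{\partial J}{\partial Y} \frac{\partial Y}{\partial X} \tag{12} $$

From eqs 7-9, $\partial Y/\partial X$ can be derived as follows:

$$ \frac{\partial Y}{\partial X} = \sum_{h=1}^{a} \frac{\partial Y}{\partial \hat{u}_h} \frac{\partial \hat{u}_h}{\partial t_h} \frac{\partial t_h}{\partial X} = \sum_{h=1}^{a} q_h N'(t_h)\, \tilde{w}_h^T \tag{13} $$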
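A short sketch of eqs 12 and 13 is given below, together with a finite-difference comparison that can be used to check an implementation. The logistic hidden nodes, the quadratic cost, and all numerical values are assumptions made for the example; `dJ_dY` is passed in separately so that it could equally be evaluated at measured plant outputs.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def net_and_slope(t, w1, b1, w2, b2):
    """Return N(t) and N'(t) for a one-hidden-layer logistic network."""
    s = [1.0 / (1.0 + math.exp(-(t * w + b))) for w, b in zip(w1, b1)]
    u = dot(s, w2) + b2
    slope = sum(w2k * sk * (1.0 - sk) * w1k for w2k, sk, w1k in zip(w2, s, w1))
    return u, slope


def predict(x, W, nets, Q):
    """Model prediction for one unfolded batch (eqs 7-9)."""
    y = [0.0] * len(Q[0])
    for w_t, net, q in zip(W, nets, Q):
        u, _ = net_and_slope(dot(x, w_t), *net)
        y = [yk + u * qk for yk, qk in zip(y, q)]
    return y


def gradient(x, W, nets, Q, dJ_dY):
    """Eqs 12-13: dJ/dX = sum_h (dJ/dY . q_h) N'(t_h) w~_h."""
    g = [0.0] * len(x)
    for w_t, net, q in zip(W, nets, Q):
        _, slope = net_and_slope(dot(x, w_t), *net)
        scale = dot(dJ_dY, q) * slope
        g = [gi + scale * wi for gi, wi in zip(g, w_t)]
    return g


def cost(y):
    """Hypothetical objective F(Y): squared distance of each output from 1."""
    return sum((yk - 1.0) ** 2 for yk in y)


# Hypothetical two-factor model with three unfolded inputs and two outputs.
W = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.2]]
nets = [([1.0, -2.0], [0.1, 0.0], [0.7, -0.4], 0.2),
        ([0.5], [-0.3], [1.5], 0.0)]
Q = [[1.0, 0.5], [-0.2, 0.3]]
x = [0.2, -0.1, 0.4]

y_hat = predict(x, W, nets, Q)
g = gradient(x, W, nets, Q, [2.0 * (yk - 1.0) for yk in y_hat])
```

Comparing `g` against central differences of `cost(predict(...))` is a cheap way to catch sign or indexing mistakes before the gradient is handed to an optimizer.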

Because the inner network is a multilayer feedforward network, the derivative $N'(t_h)$ can be derived using the generalized delta learning rule (Rumelhart and McClelland, 1986). Based on eqs 12 and 13, the analytical gradient can be easily calculated. For conventional optimization techniques, the values of Y in the gradient calculation are the model predictions if plant information is not used. If there is plant-model mismatch, an inaccurate gradient will be calculated. In order to deal with plant-model mismatch, the plant data for Y are used in the gradient calculation in the batch-to-batch optimization approach used here. Specifically, $\partial J/\partial Y$ in (12) is calculated at the values of Y measured in the last batch. The plant data work as feedback to correct the gradient calculated from the NNMPLS model. The cost is that the optimal point cannot be calculated in one step, and several iterations are needed. This idea is similar to feedback control, where several sampling intervals are needed for a controlled variable to reach setpoint. Here we work at the time scale of a batch instead of a sampling interval.

In this work, a nonlinear programming technique, FSQP (feasible sequential quadratic programming) (Zhou and Tits, 1993), is used for optimization. In the simulation study, the first principle model is used as the plant. The NNMPLS model is developed from plant data generated by the first principle model, but some mismatch is deliberately introduced. The procedure for the optimization is as follows. The initial input profile is fed to the plant, and plant outputs are recorded. Then the gradient information is obtained from both the NNMPLS model and the plant data, and the changes in the input profile are calculated to improve the objective function being optimized. This procedure is repeated until no further improvement in the objective function is obtained. Figure 2 shows the schematic illustration of this approach.

Figure 2. Schematic illustration of batch-to-batch optimization using neural network models.

4. Case Studies

4.1. Semibatch Polymerization Example. This example involves a simulation of a polymer reactor, and it is the same one treated by Zafiriou and Zhu (1990). The first principle model for this polymerization process was proposed by Kwon and Evans (1970, 1973, 1975) through reaction mechanism analysis and laboratory testing as:

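$$ \dot{x}_1(t) = \frac{(r_1 + r_2 T_c)^2}{M_m} (1 - x_1)^2 \exp(2x_1 + 2\chi x_1^2) \times \left( \frac{1 - x_1}{r_1 + r_2 T_c} + \frac{x_1}{r_3 + r_4 T_c} \right) A_m \exp\left( \frac{E_m}{T} \right) \tag{14} $$

$$ \dot{x}_2(t) = \frac{\dot{x}_1(t)\, x_2}{1 + x_1} \left( 1 - \frac{1400\, x_2}{A_w \exp(B/T)} \right) \tag{15} $$

$$ \dot{x}_3(t) = \frac{\dot{x}_1(t)}{1 + x_1} \left( \frac{A_w \exp(B/T)}{1500} - x_3 \right) \tag{16} $$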

where the state variables x1, x2, and x3 are conversion, dimensionless number-average chain length (NACL), and dimensionless weight-average chain length (WACL), respectively. Aw and B are coefficients in the relation between WACL and temperature, obtained from experiments; Am and Em are the frequency factor and activation energy of the overall monomer reaction; and the r1 to r4 constants are density-temperature corrections. Mm and χ are the monomer molecular weight and polymer-monomer interaction parameter. T or Tc is the control variable (absolute temperature and temperature in degrees Celsius, respectively).

For this process, the objective is to derive an optimal temperature profile which minimizes the batch time needed to reach the specified conversion and other polymer properties. The proposed method is difficult to apply directly to this process since the final time is not specified. However, a coordinate transformation method, proposed by Kwon (1970) and Kwon and Evans (1975), can convert the original fixed-end-point, free-end-time problem to a free-end-point, fixed-end-time problem. The transformation is accomplished by selecting a new set of state variables (y1, dimensionless reaction time; y2, NACL; y3, WACL) and using conversion, τ, as the independent variable. The new optimization problem statement for the reaction time minimization, with final NACL and WACL specified to equal 1 and final conversion (τ) equal to 0.8, can be mathematically described as follows:

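$$ \min_{T_c(\tau)} J = y_1(\tau_f) + \gamma \left[ (y_2(\tau_f) - 1)^2 + (y_3(\tau_f) - 1)^2 \right] \tag{17} $$

$$ \dot{y}_1(\tau) = M_m \Big/ \left[ 5.0\, (r_1 + r_2 T_c)^2 (1 - \tau)^2 \exp(2\tau + 2\chi\tau^2) \left( \frac{1 - \tau}{r_1 + r_2 T_c} + \frac{\tau}{r_3 + r_4 T_c} \right) A_m \exp\left( \frac{E_m}{T} \right) \right] \tag{18} $$

$$ \dot{y}_2(\tau) = \frac{y_2}{1 + \tau} \left( 1 - \frac{1400\, y_2}{A_w \exp(B/T)} \right) \tag{19} $$

$$ \dot{y}_3(\tau) = \frac{1}{1 + \tau} \left( \frac{A_w \exp(B/T)}{1500} - y_3 \right) \tag{20} $$

$$ y_1(0) = 0 \tag{21} $$

$$ y_2(0) = 1 \tag{22} $$

$$ y_3(0) = 1 \tag{23} $$

$$ 100 \le T_c \le 200 \tag{24} $$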

In our simulation study, a first principle model is used to generate the data for training the NNMPLS model. The manipulated variable to be optimized is the temperature trajectory during the batch. In order to develop training data for the NNMPLS model, a pseudo random binary signal (PRBS) with a magnitude of ±2 °C is added to the normal trajectory. Forty batch runs are used to develop the model, and one typical run is shown in Figure 3. Because the ±2 °C magnitude is quite small, the variation of the quality variables caused by the PRBS is very small. Using such forcing to generate a data set is acceptable for a real industrial process. The trajectory of the manipulated variable is divided into 101 samples. For the NNMPLS modeling, the dimension of the input array is 40 × 1 × 101. The output is the final batch quality variables y1(tf), y2(tf), and y3(tf). The dimension of the output matrix is 40 × 3. After unfolding, the number of input variables is 101, and the number of output variables is 3. Cross-validation indicates that three factors are best for the NNMPLS model. Because the NNMPLS model involves only three SISO feedforward networks for the factors, the size of the final network is quite small.
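The data layout described above can be sketched as follows. Several details are assumptions rather than facts from the paper: the PRBS is produced by a 7-bit shift register (polynomial x^7 + x^6 + 1), it is allowed to switch at every sample, each batch uses a different seed, and the nominal trajectory is a flat 150 °C placeholder. Only the ±2 °C magnitude, the 40 batches, and the 101 samples are taken from the text.

```python
def prbs7(n_bits, seed):
    """Bits from a 7-bit linear feedback shift register (x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must have at least one of its low 7 bits set")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def perturbed_profile(nominal, magnitude, seed):
    """Add +magnitude or -magnitude to each sample of the nominal profile."""
    bits = prbs7(len(nominal), seed)
    return [v + magnitude * (2 * b - 1) for v, b in zip(nominal, bits)]


N_BATCHES, N_SAMPLES = 40, 101
nominal = [150.0] * N_SAMPLES   # hypothetical nominal temperature profile, deg C

# Three-way input array: batches x variables x samples = 40 x 1 x 101.
X3 = [[perturbed_profile(nominal, 2.0, seed)]
      for seed in range(1, N_BATCHES + 1)]

# Unfolded input matrix for the PLS outer model: 40 x 101.
X = [[v for variable in batch for v in variable] for batch in X3]
```

Each row of `X` would then be paired with the three final quality variables recorded for that batch to form the 40 × 3 output matrix.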

Figure 3. Initial temperature profile with PRBS signal added and final profile.

There are two key coefficients in the first principle model, Aw and B. In order to simulate the plant-model mismatch, these two coefficients are deliberately changed in simulating the plant, while the NNMPLS model is kept unchanged. Eight case studies are conducted for different Aw and B changes, and the results are given in Table 1. In order to compare with the first principle model method, the results for this method are also included in Table 1. As can be seen, both approaches converge to almost the same optimum in terms of the objective function.

Consider case 1 in more detail. For this case, there is a 50% mismatch in parameter Aw. The initial profile is optimal for the nominal model, and as can be seen, it yields undesirable results because of the mismatch. For the first batch, the objective index is 792, the reaction time tf is 5.22 h, and NACL (y2(tf)) and WACL (y3(tf)) are 1.174 and 1.222, respectively. Note that the desired value for both NACL and WACL is 1. The true optimal point is calculated by using the same first principle model for both plant and model. As can be seen, both approaches converge to almost the same point. The NNMPLS approach takes approximately 15 iterations to converge compared with 10 for the first principle model. Thus, the NNMPLS approach is slower; this is because the PRBS added to the input profile to generate data for NNMPLS modeling has a very small magnitude. Therefore, together with the plant-model mismatch introduced from coefficients Aw and B, there is also mismatch from extrapolating the NNMPLS model. The final input profile for case 1 is given in Figure 3. The objective function versus batch number is shown in Figure 4. All results indicate that the NNMPLS method is very promising for use in batch-to-batch optimization of real processes where the necessary data are available, but a model does not exist.

4.2. A BioBatch Process Example. This example involves a semibatch fermentation process to produce foreign protein. A first principle model which describes cell growth and product formation for recombinant Escherichia coli for this fermentation process was proposed by Bentley and Kompala (1989). The equations describing the fed-batch fermentation are given by the following mass balances:

Biomass:

$$ \frac{dX}{dt} = \mu X - \frac{F}{V} X \tag{25} $$

where X is the bulk biomass concentration (g of DW/L), F is the feed flow rate (L/h), V is the fermentor volume (L), and µ is the instantaneous specific growth rate (h-1).

Substrate:

$$ \frac{dS}{dt} = -\frac{\mu X}{Y_{x/s}} + \frac{F}{V} (S_f - S) \tag{26} $$

where S is the bulk substrate concentration (g/L), Sf is the substrate concentration in the feed (g/L), and Yx/s is the substrate yield based on glucose (g of DW/g).

Foreign Protein:

$$ \frac{dP_f}{dt} = \mu_4 \frac{A}{K_{pfa} + A} R G_f - K_t P_f - \mu P_f \tag{27} $$

where Pf is the foreign protein mass fraction (g/g of DW), A is the amino acid mass fraction (g/g of DW), Gf is the DNA mass fraction (g/g of DW), R is the RNA mass fraction (g/g of DW), µ4 is the kinetic constant for foreign protein synthesis (g of DW/g/h), Kpfa is the saturating constant for amino acid (g/g of DW), and Kt is the kinetic constant for foreign protein turnover (h-1).

Volume:

$$ \frac{dV}{dt} = F \tag{28} $$

Table 1. Results for Batch-to-Batch Optimization Using the Neural Network Model

Each result cell lists tf / y2(tf) / y3(tf).

case no. | mismatch | initial batch | true opt. value | FP model after 10 batches | NN model after 15 batches
case 1 | Aw = 1.5Aw, B = B | 1.044 / 1.174 / 1.222 | 0.416 / 1.000 / 1.000 | 0.417 / 0.999 / 0.999 | 0.417 / 1.001 / 0.998
case 2 | Aw = 2Aw, B = B | 1.044 / 1.286 / 1.443 | 0.217 / 1.000 / 1.000 | 0.218 / 1.000 / 0.999 | 0.215 / 1.004 / 0.996
case 3 | Aw = 3Aw, B = B | 1.044 / 1.421 / 1.889 | 0.087 / 1.001 / 1.001 | 0.094 / 1.024 / 1.013 | 0.085 / 1.010 / 0.991
case 4 | Aw = 0.5Aw, B = B | 1.044 / 0.693 / 0.778 | 4.818 / 0.998 / 0.990 | 4.984 / 0.993 / 0.997 | 5.173 / 0.989 / 1.004
case 5 | Aw = Aw, B = 1.05B | 1.043 / 1.370 / 1.694 | 0.137 / 1.000 / 1.000 | 0.134 / 1.001 / 0.993 | 0.135 / 1.008 / 0.993
case 6 | Aw = Aw, B = 1.1B | 1.045 / 1.520 / 2.506 | 0.050 / 1.000 / 1.000 | 0.094 / 1.138 / 1.157 | 0.054 / 1.031 / 1.006
case 7 | Aw = Aw, B = 0.95B | 1.043 / 0.944 / 0.945 | 1.424 / 1.000 / 0.999 | 1.427 / 0.999 / 0.999 | 1.428 / 1.000 / 0.999
case 8 | Aw = Aw, B = 0.9B | 1.046 / 0.709 / 0.783 | 5.229 / 0.998 / 0.986 | 5.429 / 0.991 / 0.992 | 5.461 / 0.986 / 0.993

Figure 4. Objective function versus batch number.

Table 2. Stoichiometric, Maximum Rate, and Saturation Constants for Amino Acids, Nucleotides, and Protein

constant | value | source or basis
Amino Acids
k1 | 1.69 h-1 | maximum rate balance
ε1 | 0.5485 g of A/g of N | Stryer (1981)
ε2 | 0.1418 g of A/g of F | Stryer (1981)
γ1 | 1.167 g of A/g of P | Stryer (1981)
KA | 0.125 | 6TMFA; Ingraham et al. (1983)
KAS | 0.01 g/L | Monod-type constant
K2A | 0.001 | "1/25 rule", Domach (1983)
Nucleotides
k2 | 1.19 h-1 | Jensen (1983)
γ2 | 1.056 g of N/g of G | Stryer (1981)
KN | 0.125 | 5TMFN; Ingraham et al. (1983)
KNS | 0.01 g/L | Monod-type constant
KNA | 0.026 | sensitive to A, TMFN
Protein
µ1 | 154.4 (g of m)/(g of R·g of G·h) | Ingraham et al. (1983)
KPA | 0.002 | 1/10 TMFA
KTP | 0.08 h-1 | 5% turnover, Ingraham et al. (1983)

The numerical values of the model parameters are listed in Tables 2 and 3. The structured model is given in appendices I and II and is described in detail in the original reference (Bentley and Kompala, 1989). This metabolic model lumps the intracellular constituents of E. coli into eight different pools: protein (P), foreign protein (Pf), ribosome (R), chromosomal DNA (G), plasmid DNA (Gf), lipid (L), nucleotides (N), and amino acids (A). The relationship between these different pools and the method to obtain specific growth rates is described in detail by Bentley and Kompala (1989). The optimization problem is to find a time profile of the feed rate which maximizes the total foreign protein in the reactor at the end of the fed-batch culture. In terms of the system variables, the performance index to be maximized is given by:

$$ J = X(t_f)\, P_f(t_f)\, V(t_f) \tag{29} $$

where $t_f$ is fixed. The system is controlled by manipulating the substrate feed flow rate, which is bounded by:

$$ 0 \le u(t) \le u_{max} \tag{30} $$

and the system is subjected to constraints on the volume:

$$ V \le V_m \tag{31} $$

and maximum cell mass concentration:

$$ X \le X_m \tag{32} $$

The optimization problem for this process was studied by Chen et al. (1994), who assumed that the model is accurate. The time profile of the feed rate was divided into N sampling intervals. During each interval, the feed rate was constant, which was the same approach used in the last case study. It is clear that, if N = 1, there is a constant feed rate, and if N is chosen sufficiently large, the result will be sufficiently close to a continuous optimal control policy. Chen et al. (1994) found that, because the first principle model of this process is very stiff, the computation time for the optimization, which includes solving the differential equations and evaluating the gradient, increases very quickly with an increase in N. For example, if N = 3, the computation time for the optimization is 52.08 h on a Sun Sparcstation 2, but if N = 8, the time is 527.5 h. For the NNMPLS model method, the analytical gradient is available. The function evaluations for the gradient calculations are not needed, and the computation time is much less.

In our simulation study, N is chosen as 10, the total fed-batch time (tf) is 50 h, the inlet substrate concentration (Sf) is 160 g/L, and the initial and final reactor volumes are V0 = 1 L and Vmax = 5 L, respectively. For this process, a constant feed rate of 0.08 L/h is usually used, and we also use this constant feed rate as the initial profile. A PRBS with a magnitude of 0.005 is added to the initial profile to generate data for the NNMPLS modeling. Data for 40 batches are generated, and the same procedure used in the last case study is used to develop the NNMPLS model. The model input involves 10 variables which consist of the feed rate profile, and the model output involves the state variables at final batch time, X(tf), Pf(tf), and V(tf). Again, FSQP is used to solve this finite dimensional optimization problem. In order to simulate plant-model mismatch, a mismatch factor is introduced.
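As a small worked check of the numbers quoted above, the sketch below integrates the volume balance dV/dt = F for a piecewise-constant feed profile and tests the input bound and the volume constraint. The helper names and the value used for the upper feed bound are hypothetical; the 10 intervals, 50 h batch time, 1 L initial volume, 5 L maximum volume, and 0.08 L/h feed rate come from the text.

```python
def final_volume(feed_rates, t_final, v0):
    """Integrate dV/dt = F exactly for a piecewise-constant feed profile."""
    dt = t_final / len(feed_rates)
    return v0 + sum(f * dt for f in feed_rates)


def is_feasible(feed_rates, t_final, v0, v_max, u_max, tol=1e-9):
    """Check the input bounds and the volume constraint.

    Because the feed rate is never negative, the volume never decreases,
    so checking it at the final time covers the whole batch.
    """
    within_bounds = all(0.0 <= f <= u_max for f in feed_rates)
    return within_bounds and final_volume(feed_rates, t_final, v0) <= v_max + tol


# Constant initial profile from the case study: 10 intervals of 0.08 L/h.
constant_profile = [0.08] * 10
V_end = final_volume(constant_profile, t_final=50.0, v0=1.0)

# u_max = 1.0 L/h is a placeholder, since no numerical bound is stated.
ok = is_feasible(constant_profile, 50.0, 1.0, v_max=5.0, u_max=1.0)
```

The constant profile gives 1 L + 0.08 L/h × 50 h = 5 L, so the usual operating policy sits exactly on the volume constraint.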
The NNMPLS model obtained from the above procedure is used as a nominal model. Each neural network weight in the NNMPLS model actually used in the optimization calculation is equal to the nominal NNMPLS weight plus a random number from a zero-mean Gaussian distribution. The standard deviation of the Gaussian distribution is equal to the average value of the nominal model weights times the mismatch factor. Case 1 is the nominal case; the mismatch factor is 0. The objective function, which is the value of intracellular protein (TM × Pf × Vf), reaches 56.82 g after 15 batch runs. Note that for the initial profile, which is obtained for the constant feed rate of 0.08 L/h, the value of the objective function is 49.21 g. There is an increase of 15.5% in the intracellular protein due to the optimization. This result is also better than the result in Chen et al. (1993) where N = 8, and the value of the intracellular protein is 55.92 g. This is not surprising because we use N = 10. The reason that we can use a larger N is that the computation time for the NNMPLS method is much less than that for the first

Table 3. Maximum Rate and Saturation Constants for DNA, Fatty Acids and Lipids, Ribosomal RNA, Foreign Protein, and DNA

constant | value | source or basis
DNA
µ2 | 0.078 h-1 | Ingraham et al. (1983)
KGN | 0.01 | 1/2 TMFN (sens. to N)
Fatty Acids and Lipids
µ3 | 0.52 h-1 | Shuler et al. (1979)
KLS | 0.001 g/L | Monod-type constant
KLA | 0.026 | (sens. to A) TMFA
rRNA
µ6 | 19.64 g/(g of G·h) | Kjeldgaard et al. (1974)
KRN | 0.026 | TMFN
KTR | 0.147 h-1 | 10% turnover, high S
KTRS | 0.01 g/L | Monod-type constant
K′TR | 0.179 h-1 | 70% turnover, low S
KRA | 0.001 | 1/25 TMFN
Foreign Protein
µ4 | 800 h-1 | Bentley et al. (1989e)
KPfA | 0.002 | same as endogenous protein, P
Foreign DNA, Plasmids
µ5 | 0.0005 h-1 | Bentley et al. (1989e)
KGfN | 1 × 10-9 | Bentley et al. (1989e)

Table 4. Results for Case Study Two

case no. | mismatch factor | objective function if only the NNMPLS model is used | objective function for the NNMPLS method after 15 batches
case 1 | 0.0 | 52.58 | 56.82
case 2 | 0.02 | 49.03 | 56.82
case 3 | 0.1 | 39.42 | 56.56

Figure 5. Initial and final profiles of feed rate. Dotted line: initial profile. Solid line: final profile.

principle model method used by Chen et al. (1994). Only 20 min is required for the NNMPLS method on a Sun Sparcstation 10 for N ) 10. It should be noted that it is not necessarily true that an increase in N can always give a meaningful increase in the objective function; for N’s which are larger than 10, the objective function for each case changes very little. Figure 5 shows the initial profile of the feed rate and the final profile of the feed rate. More interesting results are obtained from plantmodel mismatch simulations. As can be seen from the last column in Table 4, when the mismatch factor is equal to 0.02 and 0.1, the NNMPLS method still achieves almost the same results as the nominal case. The results indicate that the proposed method can effectively deal with plant-model mismatch. Next, the results if one does not consider the plant-model mismatch are considered. The calculations involve two

Ind. Eng. Chem. Res., Vol. 35, No. 7, 1996 2275

steps. First, the optimal feed rate profile is calculated using only the model information. Second, the resulting optimal profile is implemented on the real reactor. The third column of Table 4 shows these results. For the nominal case, the result is 52.58 g, which is less than the 56.82 g obtained by the proposed method. The reason is that plant-model mismatch remains even in the nominal case. It is introduced from two sources: (1) the model does not exactly match the plant and (2) the model must be extrapolated. The second source raises a general problem in optimization using a neural network model. One usually needs to extrapolate the neural network model in the optimization calculation because the optimal point cannot be included in the training data set before it has been determined. Consequently, the optimization results are sometimes unsatisfactory even though the model is believed to be very accurate within the range of its training data. Because the proposed approach uses feedback from each batch, it can deal with this general problem. For the second and third cases, because the plant-model mismatch is very large, the results are only 49.03 and 39.42 g, respectively, which are even less than the result for the constant feed rate profile. These results show that accounting for plant-model mismatch in the optimization is very important.

5. Conclusions

Most batch optimization schemes assume that accurate first principle batch models are available. This is not true in many cases of practical interest, where there may be plant-model mismatch or no first principle model at all. In this paper, a batch optimization scheme that integrates a batch-to-batch optimization approach (Zafiriou and Zhu, 1989, 1990) with neural network modeling is proposed. Two case studies are conducted, and the results show several advantages of the proposed method. First, neither a first principle model nor on-line state measurements are needed. This feature makes the method very attractive for real processes where the necessary data are available but a first principle model does not exist. Second, the method can deal with plant-model mismatch. Even when the plant-model mismatch is large, the results of the case studies show that the method can converge to the real optimal points. Third, the computation time in the optimization is much less than that of a first principle model method because analytical gradients are available from the neural network model.
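The third advantage rests on the fact that the output of a feedforward network can be differentiated in closed form with respect to its inputs. The following is a minimal sketch of that idea, assuming a single-hidden-layer tanh network; the architecture, the weights, and the names `mlp_forward`, `analytic_gradient`, `numeric_gradient`, and `feed_profile` are hypothetical illustrations and are not taken from the paper. It compares the chain-rule gradient with a central-difference estimate, which needs two extra model evaluations per input.

```python
import math

def mlp_forward(x, W1, b1, w2, b2):
    """Return (y, hidden) for y = w2 . tanh(W1 x + b1) + b2."""
    hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    y = sum(w * h for w, h in zip(w2, hidden)) + b2
    return y, hidden

def analytic_gradient(x, W1, b1, w2, b2):
    """Chain rule: dy/dx_j = sum_i w2[i] * (1 - hidden[i]**2) * W1[i][j]."""
    _, hidden = mlp_forward(x, W1, b1, w2, b2)
    return [sum(w2[i] * (1.0 - hidden[i] ** 2) * W1[i][j]
                for i in range(len(w2)))
            for j in range(len(x))]

def numeric_gradient(x, W1, b1, w2, b2, eps=1e-6):
    """Central differences: two extra network evaluations per input."""
    grad = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        y_plus, _ = mlp_forward(x_plus, W1, b1, w2, b2)
        y_minus, _ = mlp_forward(x_minus, W1, b1, w2, b2)
        grad.append((y_plus - y_minus) / (2.0 * eps))
    return grad

# Hypothetical weights for a 3-input, 2-hidden-unit network (illustration only).
W1 = [[0.5, -0.3, 0.8], [-0.7, 0.2, 0.1]]
b1 = [0.1, -0.2]
w2 = [1.5, -0.6]
b2 = 0.05
feed_profile = [0.2, 0.4, 0.6]  # assumed discretized input profile

grad_exact = analytic_gradient(feed_profile, W1, b1, w2, b2)
grad_fd = numeric_gradient(feed_profile, W1, b1, w2, b2)
max_abs_diff = max(abs(a - n) for a, n in zip(grad_exact, grad_fd))
print(grad_exact, max_abs_diff)
```

The analytic gradient costs one forward pass plus a weighted sum, independent of how finely the input profile is discretized, whereas the finite-difference cost grows with the number of inputs.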

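Acknowledgment

This material is based upon work supported by the National Science Foundation under Grants NSF D CDR 8803012 and NSF EEC 94-02384.

Appendix 1. Dynamic Equations for Constituent Pools

$$\frac{dA}{dt} = \left[\frac{dA}{dt}\right]_s - \epsilon_1\left[\frac{dN}{dt}\right]_s - \epsilon_2\left[\frac{dL}{dt}\right]_s - \gamma_1\left[\frac{dP}{dt}\right]_s - \gamma_1\left[\frac{dP_f}{dt}\right]_s - \mu A \tag{A1}$$

$$\frac{dN}{dt} = \left[\frac{dN}{dt}\right]_s - \gamma_2\left[\frac{dG}{dt}\right]_s - \gamma_2\left[\frac{dG_f}{dt}\right]_s - \gamma_2\left[\frac{dR}{dt}\right]_s - \mu N \tag{A2}$$

$$\frac{dP}{dt} = \left[\frac{dP}{dt}\right]_s - \mu P \tag{A3}$$

$$\frac{dP_f}{dt} = \left[\frac{dP_f}{dt}\right]_s - \mu P_f \tag{A4}$$

$$\frac{dG}{dt} = \left[\frac{dG}{dt}\right]_s - \mu G \tag{A5}$$

$$\frac{dG_f}{dt} = \left[\frac{dG_f}{dt}\right]_s - \mu G_f \tag{A6}$$

$$\frac{dL}{dt} = \left[\frac{dL}{dt}\right]_s - \mu L \tag{A7}$$

$$\frac{dR}{dt} = \left[\frac{dR}{dt}\right]_s - \mu R \tag{A8}$$

Calculation of Growth Rate:

$$\mu = \left[\frac{dA}{dt}\right]_s + (1 - \epsilon_1)\left[\frac{dN}{dt}\right]_s + (1 - \epsilon_2)\left[\frac{dL}{dt}\right]_s + (1 - \gamma_1)\left[\left[\frac{dP}{dt}\right]_s + \left[\frac{dP_f}{dt}\right]_s\right] + (1 - \gamma_2)\left[\left[\frac{dG}{dt}\right]_s + \left[\frac{dG_f}{dt}\right]_s + \left[\frac{dR}{dt}\right]_s\right] \tag{A9}$$

Appendix II. Synthesis Rate Expressions for Constituent Pools

$$\left[\frac{dA}{dt}\right]_s = k_1\left[\frac{K_A}{K_A + A}\right]\left[\frac{S}{K_{AS} + S}\right]\left[\frac{A}{K_{2A} + A}\right] \tag{A10}$$

$$\left[\frac{dN}{dt}\right]_s = k_2\left[\frac{K_N}{K_N + N}\right]\left[\frac{A}{K_{NA} + A}\right]\left[\frac{S}{K_{NS} + S}\right] \tag{A11}$$

$$\left[\frac{dP}{dt}\right]_s = \mu_1\left[\frac{A}{K_{PA} + A}\right]RG - K_{TP}P \tag{A12}$$

$$\left[\frac{dP_f}{dt}\right]_s = \mu_4\left[\frac{A}{K_{P_fA} + A}\right]RG_f - K_{TP}P_f \tag{A13}$$

$$\left[\frac{dG}{dt}\right]_s = \mu_2\left[\frac{N}{K_{GN} + N}\right] \tag{A14}$$

$$\left[\frac{dG_f}{dt}\right]_s = \mu_5\left[\frac{N}{K_{G_fN} + N}\right] \tag{A15}$$

$$\left[\frac{dL}{dt}\right]_s = \mu_3\left[\frac{S}{K_{LS} + S}\right]\left[\frac{A}{K_{LA} + A}\right] \tag{A16}$$

$$\left[\frac{dR}{dt}\right]_s = \mu_6\left[\frac{N}{K_{RN} + N}\right]\left[\frac{A}{K_{RA} + A}\right]G - K_{TR}R - K'_{TR}R\left[\frac{K_{TRs}}{K_{TRs} + S}\right] \tag{A17}$$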
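The pool balances of eqs A1-A8 and the growth rate of eq A9 are linked: if the pools are mass fractions that sum to 1, summing the eight balances gives a total rate of change of zero exactly when the growth rate has the form of eq A9. The following is a minimal sketch of that bookkeeping. The coefficient names `eps1` and `eps2` (for the nucleotide and lipid terms), the pool fractions, and the synthesis rates are hypothetical values chosen for illustration; none of them is taken from the paper, and the synthesis rates are supplied directly rather than computed from the expressions of Appendix II.

```python
def growth_rate(s, eps1, eps2, gamma1, gamma2):
    """Specific growth rate in the form of eq A9; s maps pool -> synthesis rate."""
    return (s["A"]
            + (1.0 - eps1) * s["N"]
            + (1.0 - eps2) * s["L"]
            + (1.0 - gamma1) * (s["P"] + s["Pf"])
            + (1.0 - gamma2) * (s["G"] + s["Gf"] + s["R"]))

def pool_derivatives(x, s, eps1, eps2, gamma1, gamma2):
    """Right-hand sides in the form of eqs A1-A8; x maps pool -> mass fraction."""
    mu = growth_rate(s, eps1, eps2, gamma1, gamma2)
    d = {
        # Amino acids are consumed by nucleotide, lipid, and protein synthesis.
        "A": (s["A"] - eps1 * s["N"] - eps2 * s["L"]
              - gamma1 * s["P"] - gamma1 * s["Pf"] - mu * x["A"]),
        # Nucleotides are consumed by DNA, plasmid DNA, and RNA synthesis.
        "N": (s["N"] - gamma2 * (s["G"] + s["Gf"] + s["R"]) - mu * x["N"]),
    }
    # Remaining pools: synthesis minus dilution by growth.
    for name in ("P", "Pf", "G", "Gf", "L", "R"):
        d[name] = s[name] - mu * x[name]
    return mu, d

# Hypothetical mass fractions (sum to 1) and synthesis rates, illustration only.
pools = {"A": 0.10, "N": 0.05, "L": 0.10, "P": 0.50,
         "Pf": 0.02, "G": 0.03, "Gf": 0.01, "R": 0.19}
synthesis = {"A": 0.9, "N": 0.3, "L": 0.1, "P": 0.4,
             "Pf": 0.02, "G": 0.03, "Gf": 0.01, "R": 0.2}

mu, derivs = pool_derivatives(pools, synthesis,
                              eps1=0.5, eps2=0.4, gamma1=0.9, gamma2=0.8)
total = sum(derivs.values())  # should vanish when the fractions sum to 1
print(mu, total)
```

A check of this kind is a quick way to catch a sign or coefficient error when the balances are transcribed into simulation code.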

Literature Cited

Bentley, W. E.; Kompala, D. S. A Novel Structured Kinetic Modeling Approach for the Analysis of Plasmid Instability in Recombinant Bacterial Cultures. Biotechnol. Bioeng. 1989, 33, 49-61.

Chen, Q.; Bentley, W.; Weigand, W. Optimization for a Recombinant E. coli Fed-Batch Fermentation. Appl. Biochem. Biotechnol. 1995, 51/52, 449-461.


Domach, M. M. Refinement and Use of Structured Model of a Single Cell of Escherichia coli for the Description of Ammonia-Limited Growth and Asynchronous Population Dynamics. Ph.D. Thesis, Cornell University, Ithaca, NY, 1983.

Filippi-Bossy, C.; Bordet, J.; Villermaux, J.; Marchal-Brassely, S.; Georgakis, C. Batch Reactor Optimization by Use of Tendency Models. Comput. Chem. Eng. 1989, 13, 35-47.

Ingraham, J. L.; Maaloe, O.; Neidhardt, F. C. Growth of the Bacterial Cell; Sinauer Assoc.: Sunderland, MA, 1983.

Jensen, K. F. Metabolism of 5-Phosphoribosyl 1-Pyrophosphate (PRPP) in Escherichia coli and Salmonella typhimurium. In Metabolism of Nucleotides, Nucleosides, and Nucleobases in Microorganisms; Munch-Petersen, A., Ed.; Academic: New York, 1983.

Keeler, S. E.; Hull, J. W.; Agin, G. L. The Dynamic Modeling and Optimization of an Industrial Batch Reactor. Proceedings of DYCORD+92, College Park, MD, 1992.

Kjeldgaard, N. O.; Gausing, K. Control of Ribosome Synthesis. In Ribosomes; Nomura, M., Tissieres, A., Lengyel, P., Eds.; Cold Spring Harbor Laboratory: New York, 1974.

Kwon, Y. D. Optimal Design and Control of Bulk Polymerization Processes. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, 1970.

Kwon, Y. D.; Evans, L. B. Continuous Blending Models for Free Radical Polymerization Systems. Can. J. Chem. Eng. 1973, 51, 71.

Kwon, Y. D.; Evans, L. B. A Coordinate Transformation Method for the Numerical Solution of Non-linear Minimum-time Control Problems. AIChE J. 1975, 21, 1158.

Luus, R. Optimization of Fed-batch Fermentors by Iterative Dynamic Programming. Biotechnol. Bioeng. 1993, 41, 599-602.

Modak, J. M. Choice of Control Variable for Optimization of Fed-batch Fermentation. Chem. Eng. J. 1993, 52, b59-b69.

Pyun, Y. R.; Modak, J. M.; Chang, Y. K. Optimization of Biphasic Growth of Saccharomyces carlsbergensis in Fed-Batch Culture. Biotechnol. Bioeng. 1989, 33, 1-10.

Qin, S. J.; McAvoy, T. J. Nonlinear PLS Modeling Using Neural Networks. Comput. Chem. Eng. 1992, 16 (4), 379-391.

Rumelhart, D.; McClelland, J. Parallel Distributed Processing: Explorations in the Microstructure of Cognition; MIT Press: Cambridge, MA, 1986.

Semones, G. B.; Lim, H. C. Experimental Multivariable Adaptive Optimization of the Steady-state Cellular Productivity of a Continuous Baker’s Yeast Culture. Biotechnol. Bioeng. 1989, 33, 16-25.

Shuler, M. L.; Leung, S.; Dick, C. C. Ann. N.Y. Acad. Sci. 1979, 326, 35.

Stryer, L. Biochemistry; Freeman: New York, 1981.

Wold, S.; Geladi, P.; Esbensen, K.; Ohman, J. Multi-way Principal Components and PLS Analysis. J. Chemom. 1987, 1, 41-56.

Zafiriou, E.; Zhu, J. M. Optimal Feed-rate Profile Determination for Fed-batch Fermentations in the Presence of Model-plant Mismatch. Proceedings of 1989 American Control Conference, Pittsburgh, PA, June 1989; pp 2006-2009.

Zafiriou, E.; Zhu, J. M. Optimal Control of Semi-batch Processes in the Presence of Modeling Error. Proceedings of 1990 American Control Conference, San Diego, CA, May 1990; pp 1644-1649.

Zhou, T. J.; Tits, A. L. User’s Guide for FSQP Version 3.3. Technical Report, Institute for Systems Research, University of Maryland, College Park, MD, 1993.

Received for review August 17, 1995
Revised manuscript received February 28, 1996
Accepted April 1, 1996
IE950518P

Abstract published in Advance ACS Abstracts, May 15, 1996.