2736
Ind. Eng. Chem. Res. 2004, 43, 2736-2746
Control of Copolymer Properties in a Semibatch Methyl Methacrylate/Methyl Acrylate Copolymerization Reactor by Using a Learning-Based Nonlinear Model Predictive Controller Myung-June Park and Hyun-Ku Rhee* School of Chemical Engineering and Institute of Chemical Processes, Seoul National University, Kwanak-ku, Seoul 151-742, Korea
This paper is concerned with the production of copolymers with uniform copolymer composition and desired weight-average molecular weight by using a learning-based nonlinear model predictive control (NLMPC). For an effective control in a semibatch copolymerization system, the successive linearization method used in NLMPC is modified. The nonlinear model for the copolymerization system is linearized by using the previous batch data within the prediction horizon in such a way that the linear time-varying model is obtained as a function of the increment of the inputs between the two consecutive batches. The learning algorithm is incorporated into the controller by virtue of the model structure, and the effectiveness of the proposed controller is shown by comparison with a conventional nonlinear model predictive controller as well as a linear model-based learning controller. Through the implementation of the controller to a semibatch methyl methacrylate/methyl acrylate copolymerization reactor, it is proven that copolymers with desired properties are produced effectively using the algorithm proposed in this study. 1. Introduction Chemical processes are characterized by the sluggish response and the existence of several kinds of constraints on input and output. The slow dynamic behavior of the chemical processes makes the discretization of control input possible, and thus the optimal control input can be calculated in the form of a piecewise trajectory. The optimization based on the matrix algebra and consideration of constraints has introduced the algorithm called model predictive control (MPC).1-3 Since its advent, the MPC has become a powerful tool for the optimization and control of chemical processes by virtue of its advantage in that it incorporates the constraints effectively into the algorithm and a feedback control algorithm is realized in a manner of receding horizon implementation. In recent years, a nonlinear MPC (NLMPC) has been suggested for the control of nonlinear systems. Gattu and Zafiriou4 adopted the successive linearization method to predict the future behavior of nonlinear systems and solved the optimization problem via nonlinear programming. Lee and Ricker5 suggested the augmentation of an unmeasured disturbance model and also applied the successive linearization method. As reviewed in Henson’s work,6 a number of variants of NLMPC have been developed, and they may be categorized into three types according to the nonlinear programming solution technique used in the optimization: nonlinear programming, successive linearization of model equations, and sequential model solution. Qin and Badgwell7 made a substantial effort for an overview of the industrial application of MPC. Control strategy has been applied to a batch or semibatch polymerization reactor for temperature control purposes, in which the reaction temperature was * To whom correspondence should be addressed. Tel.: +822-880-7405. Fax: +82-2-888-7295. E-mail:
[email protected].
controlled to follow the optimal temperature trajectory in such a way that the product with specified properties is produced at the final operation time.8 However, because this strategy is delicate to a disturbance, the property control strategy based on the feedback of the measurement of the property has been suggested.9,10 For this strategy, more effort should be made because this type of reactor is hard to control because of its transient nature and severe nonlinearity. Though various types of strategies11,12 have been suggested for property control in a polymerization reactor, the feedback within a batch was found to be insufficient for a satisfactory control performance. Crowley et al.13 showed that batchto-batch strategy is needed for control of the particle size distribution in an emulsion copolymerization reactor because only the increase of the particle size is possible within a batch. In this regard, many researchers have focused their efforts on the application of batchwise adaptation and developed the iterative learning control (ILC).14 Learning control updates the control input trajectory in the current batch by using the trajectories of input and error in the previous batch. This adaptation mechanism brings about a great deal of improvement in the control performance. In addition, the ILC with any positive learning gain was found to guarantee the convergence of the input trajectory even when it is applied to an unstable system.15 Since Amann et al.16 provided the optimal learning gain by using a linear time-invariant model, the ILC has been extended to various types of controllers,17-19 among which the ILC combined with a predictive control algorithm has gotten into the spotlight. Bone20 combined the generalized predictive control with the ILC, while Amann et al.21 proposed a predictive optimal ILC. After the modelbased ILC with a quadratic criterion was developed,22 the algorithm was improved to the level of MPC combined with ILC by Lee et al.23 In the algorithm, they
10.1021/ie0301379 CCC: $27.50 © 2004 American Chemical Society Published on Web 04/17/2004
Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004 2737
used the linear time-varying model based on the nominal trajectory for the construction of the prediction equation. In this study, for effective property control in a semibatch copolymerization reactor, the successive linearization of a nonlinear model within the prediction horizon is carried out by using the previous batch data to incorporate the learning algorithm into the MPC. From the viewpoint of the learning algorithm, unlike other learning control algorithms in which optimal learning gains are calculated on the basis of a linear model, a nonlinear model is directly used in the algorithm. Because the algorithm has a structure of the MPC combined with the ILC, it inherits all of the merits of both controllers. The effectiveness of the controller is corroborated by conducting experiments in a semibatch methyl methacrylate (MMA)/methyl acrylate (MA) solution copolymerization reactor for control of copolymer properties. 2. Semibatch MMA/MA Copolymerization Reactor In this section, we consider a semibatch reactor in which solution copolymerization of MMA and MA occurs. The reaction kinetics are assumed to follow the free-radical copolymerization mechanism including chaintransfer reactions to solvent and both monomers. The moments of living and dead polymer concentrations are defined and used in the calculation of copolymer properties. The adoption of the pseudo kinetic rate constant method (PKRCM)24 reduces the complex kinetic rate expressions to simpler ones so that mass balances for various species in a semibatch MMA/MA copolymerization reactor give rise to the following equations:
d(IV) ) qfIf - kdIV dt
(1)
d(SV) ) qfsSf - kfSSG0V dt
(2)
d(M1V) ) qfM1f - (fkdI + kp1M1 + kf1mM1 + dt kf1SS)G0V (3) d(M2V) ) -(fkdI + kp2M2 + kf2mM2 + kf2SS)G0V dt d(G0V) ) (2fkdI - ktG02)V dt
in which I, M1, M2, S, Gj, and Hj are the molar concentrations of the initiator (AIBN), monomer 1 (more reactive monomer, MMA), monomer 2 (MA), and solvent (toluene) and the jth moments of living and dead polymer concentrations, respectively. Concerning the details about the mechanism, the definition of the moment, and mass balance equations, one may refer to Park et al.25 While the equations were derived in the same form as those for the homopolymerization, the pseudo kinetic rate constants are defined as a function of elementary reaction rate constants and the volume fraction, and hence this system shows quite a severe nonlinearity. Because the feed is composed of initiator and monomer 1, the feed rate of both components was included in the mass balance equations. Because we installed an additional pump and injected the solvent at a fixed rate in order to prevent the viscosity from increasing severely, the feed rate of the solvent qfsSf was added to the mass balance equation for the solvent.26 To determine the polymer volume, we consider the rate of polymer volume change given by the following equation:
dVp/dt ) -[WM1rM1V + WM2rM2V + WSrSV]/Fp WM1 WM 2 WS , VM2 ) (M2V) , VS ) (SV) FM 1 FM 2 FS (11)
VM1 ) (M1V)
where VX, WX, FX, and rX denote the volume, the molecular weight, the density, and the formation rate of species X, respectively. Note that the right-hand side of eq 11 represents the mass rate of polymer production divided by the copolymer density. On the basis of the above equations, the properties of the copolymer product are defined and calculated. One of the most important properties that determine the end user properties of a copolymer is the copolymer composition Fi, which is defined as the mole fraction of monomer i in the copolymer chain. It is calculated by using Mayo’s equation, which is given in terms of the monomer fractions f1 and f2 in the remaining monomers and the reactivity ratios r1 and r2, i.e.,
(5)
}
1 ktd + ktc G02 + (kfMM + kfSS)G0 V (8) 2
)
d(H1V) ) (ktG0 + kfMM + kfSS)G1V dt
Mn )
Mw )
∫0t
d(H2V) ) {(ktG0 + kfMM + kfSS)G2 + ktcG12}V (10) dt
{
{ ∫{
∫0t
t
}
d(G1V) d(H1V) + Mav(t) dt V dt V dt d(H0V) t d(G0V) + dt 0 V dt V dt
∫
0
(9)
(12)
Another property of our concern is the average molecular weights. One can calculate the average molecular weights by using the moments of the polymer concentration and average molecular weight, Mav(t) ) WM1F1 + WM2F2, as follows:
d(G2V) ) {2fkdI + kpM(G0 + 2G1) - ktG0G2 + dt (kfMM + kfSS)(G0 - G2)}V (7)
{(
f1(r1f1 + f2) + f2(r2f2 + f1)
(4)
d(G1V) ) {2fkdI + kpMG0 - ktG0G1 + dt (kfMM + kfSS)(G0 - G1)}V (6)
d(H0V) ) dt
f1(r1f1 + f2)
F1 )
{
} }
}
d(G2V) d(H2V) Mav2(t) dt + V dt V dt d(G1V) d(H1V) Mav(t) dt + V dt V dt
(13)
(14)
3. Learning-Based Nonlinear Model Predictive Controller The algorithm is developed under the assumption that the current state is available by feedback or is estimated
2738 Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004
by using an optimal state estimator. The nonlinear model is expressed by the following nonlinear differential equation:
x˘ (t) ) f[x(t),u(t)]
(15)
y ) h(x)
(16)
where x, u, and y represent the state vector, the manipulated input vector, and the controlled output vector, respectively. To incorporate the learning algorithm into the controller, the nonlinear model in eq 15 is linearized at the state and input of the previous batch, i.e.,
f[xi(t),ui(t)] ) f[xi-1(t),ui-1(t)] +
|
|
∂f [x (t) - xi-1(t)] + ∂x x)xi-1(t),u)ui-1(t) i
∂f [u (t) - ui-1(t)] + HOT (17) ∂u x)xi-1(t),u)ui-1(t) i Here, xi(t) and ui(t) represent the state and input at time t of the ith batch. Equation 17 with the higher order term neglected is inserted into eq 15, and then the equation is integrated from k to k + 1 to obtain the one-step-ahead prediction equation at step k:
xi(k+1) ) xi(k) + = xi(k) +
∫kk+1f[xi(t),ui(t)] dt ∂f [xi(t) ∫kk+1{f[xi-1(t),ui-1(t)] + ∂x
matrices should involve numerical integrals that require a large computational burden. However, because the response of a polymerization reactor is sluggish, the assumption of constant matrices for the sample interval is used. Because Fts[xi-1(k),ui-1(k)] is equal to xi-1(k+1), the increment of state between the current and previous batches is obtained as follows:
∆xi(k+1) ) xi(k+1) - xi-1(k+1) ) [Ai-1(k) + I]∆xi(k) + Bi-1(k) ∆ui(k) )A h i-1(k) ∆xi(k) + Bi-1(k) ∆ui(k)
(22)
The one-step-ahead prediction equation, eq 22, has the increment of input between the consecutive batches, and thus the learning algorithm is involved in the prediction equation. Unlike the existing NLMPC, the nonlinear model is not used in the numerical integration but is linearized by using the previous batch data to calculate the Jacobians at every batch. Because the one-step-ahead prediction of the learningbased NLMPC is approximated into the form of a linear time-varying model (cf. eq 22), the development of the prediction equation within the prediction horizon is straightforward. In the same manner as that in the onestep-ahead prediction, the two-step-ahead prediction is derived in terms of the Jacobians and increments of the state and inputs.
∂f [u (t) - ui-1(t)] dt (18) ∂u i
h i-1(k+1) ∆xi(k+1) + ∆xi(k+2) ) A Bi-1(k+1) ∆ui(k+1) (23)
The integration of the nonlinear model is approximated by assuming that u remains constant between two sampling instants and the zero-order hold is used for the transformation of the first-order derivative terms into a discrete model. Hence, we obtain
By insertion of eq 22 into eq 23, the two-step-ahead prediction is obtained as a linear combination of the current state xi(k), undecided inputs ui(k), and ui-1(k):
xi(k+1) = [xi(k) - xi-1(k)] + Fts[xi-1(k),ui-1(k)] +
∆xi(k+2) ) A h i-1(k+1) A h i-1(k) ∆xi(k) + A h i-1(k+1) Bi-1(k) ∆ui(k) + Bi-1(k+1) ∆ui(k+1) (24)
xi-1(t)] +
}
Ai-1[xi(k) - xi-1(k)] + Bi-1[ui(k) - ui-1(k)] (19) where Fts[xi-1(k),ui-1(k)] ) xi-1(k) + ∫k+1 k f[xi-1(t),ui-1(t)] dt represents the terminal state vector obtained by integrating the ordinary differential equation, eq 15, for one sample interval (ts) with the initial condition of xi-1(k) and the constant input of u ) ui-1(k). The matrices Ai-1(k) and Bi-1(k) are calculated as follows:
ˆ i-1(k) ts]; Ai-1(k) ) exp[A A ˆ i-1(k) ) Bi-1(k) )
|
∂f(x,u) ∂x
x)xi-1(k),u)ui-1(k)
The preceding steps are repeated until one can obtain the p-step-ahead prediction of the state ∆xi(k+p), and all of the predicted terms are rearranged in the form of a matrix as follows:
∆Xi(k+p|k) ) [∆xi(k+1)T, ∆xi(k+2)T, ..., ∆xi(k+p)T]T )A ˜ i-1(k) ∆xi(k) + B ˜ i-1(k) ∆Ui(k+p-1|k) (25)
(20) where
∫0 exp[Aˆ i-1(k)‚τ] dτ‚Bˆ i-1(k); ts
|
∂f(x,u) B ˆ i-1(k) ) ∂u
x)xi-1(k),u)ui-1(k)
∆Ui(k+p-1|k) ) [∆ui(k)T, ∆ui(k+1)T, ..., ∆ui(k+p-1)T]T
(21)
It is worth noting that the matrices A ˆ i-1(k) and B ˆ i-1(k) are assumed to be constant for the sample interval. If this assumption does not hold, the formulas for these
A ˜ i-1(k+p|k) ) 0
∏ Ah i-1(k+j))T]T j)p-1
h i-1(k+1) A h i-1(k)]T, ..., ( [A h i-1(k)T, [A
[
B ˜ i-1(k+p|k) ) Bi-1(k) A h i-1(k+1) Bi-1(k) · · · 1
∏ Ah
i-1(k+j)
Bi-1(k)
j)p-1
2
∏ Ah
0 Bi-1(k+1) · · · i-1(k+j)
0 0 ··
0 0 · · ·
·
Bi-1(k+1) · · · Bi-1(k+p-1)
j)p-1
]
Note that all of the matrices with subscript i - 1 are easily calculated by using the nonlinear model and the previous batch data. Also, ∆xi(k) is available by state feedback (or estimation). Accordingly, we are now able to predict the future behavior of the state within the prediction horizon as a function of the future input sequence. To derive the controlled output prediction, the controlled output function yi(k+j) ) h[xi(k+j)] is linearized by using xi-1(k+j) to obtain
yi(k+j) ) h[xi(k+j)] = h[xi-1(k+j)] + Hi-1(k+j) [xi(k+j) - xi-1(k+j)] j ) 1, 2, ..., p (26)
Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004 2739
δui(k+l) ) ui(k+l) - ui(k+l-1) + [ui-1(k+l) - ui-1 (k+l)] + [ui-1(k+m-1) - ui-1(k+m-1)] l
) ∆ui(k+l) - ∆ui(k+m-1) +
∑ δui-1(k+j)
j)m
(29)
is assumed to be zero. Hence, substituting ∆ui(k+ml δui-1(k+j) for ∆ui(k+l), one can obtain the 1) - ∑j)m prediction equation as a function of ∆Ui(k+m-1|k). To predict the controlled output at k ) kf - p, we make another assumption that the state and input after the final step (kf) remain constant. Of course, such an assumption is not physically meaningful, but it will not deteriorate the control performance because the variation of state variables may not be significant when the prediction horizon is short. Alternatively, when the prediction horizon is so large that the assumption may not be valid, the prediction horizon can be reduced one by one as we approach to the final time. The constraint on the input magnitude
∀ i, k
Umin e Ui(k+m-1|k) e Umax
(30)
where
Hi-1(k+j) )
|
∂h(x) ∂x
(27)
x)xi-1(k+j)
The combination of eq 25 with eq 26 provides the prediction equation as follows: T
T
Yi(k+p|k) ) [yi(k+1) , yi(k+2) , ..., yi(k+p) ]
) Yi-1(k+p|k) + H ˜ i-1(k+p|k) ∆Xi(k+p|k) ˜ i-1(k+p|k) A ˜ i-1(k+ ) Yi-1(k+p|k) + H ˜ i-1(k+ p|k) ∆xi(k) + H p|k) B ˜ i-1(k+p|k) ∆Ui(k+p-1|k)
where
H ˜ i-1(k+p|k) )
[
Hi-1(k+1) 0 · · · 0
0
(28)
∀ i, k
∆Umin e ∆Ui(k+m-1|k) e ∆Umax
(32)
To recast the constraint on the input change in terms of the time index δumin e δui(k+l) e δumax, l ) 0, ..., m - 1, into the one in terms of ∆ui(k+l), the following algebraic manipulation procedure is used:
δui(k+l) ) ui(k+l) - ui(k+l-1)
0
0
Hi-1(k+2) 0
0
· · · 0
Umin - Ui-1(k+m-1|k) e ∆Ui(k+m-1|k) e Umax - Ui-1(k+m-1|k) ∀ i, k (31) The constraint on the input change in terms of the batch index is considered by specifying the upper and lower bounds as follows:
T T
) SY + SU∆Ui(k+p-1|k)
is easily rearranged because the input values in the (i - 1)th batch have been stored and are available in the ith batch:
· · · · 0 Hi-1(k+p)
··
]
Because the control horizon is to be considered shorter than the prediction horizon in the implementation of the conventional MPC, the prediction equation should be rearranged. While the prediction equation of the conventional MPC is easily obtained by putting δui(k+l) ) ui(k+l) - ui(k+l-1) ) 0 for l ) m, m + 1, ..., p - 1, the prediction equation of the learning-based NLMPC requires additional algebraic manipulation because the controller computes the increment of the control input with respect to the batch index. Because ui(k+l), l ) m, m + 1, ..., p - 1, can be replaced by ui(k+m-1), the expression for δui(k+l)
) ui(k+l) - ui-1(k+l) - ui(k+l-1) + ui-1(k+l-1) + ui-1(k+l) - ui(k+l-1) ) ∆ui(k+l) - ∆ui(k+l-1) + δui-1(k+l) (33) Because ∆ui(k-1) is available at step k, we obtain the following matrix inequality:
δUmin - δUi-1(k+m-1|k) + [∆ui(k-1)T, 0, ..., 0]T e IL∆Ui(k+m-1|k) e δUmax - δUi-1(k+m-1|k) + [∆ui(k-1)T, 0, ..., 0]T where
[
I -I IL ) 0 0
0 I -I 0
· · · 0 I ·· ·
0 0 0 ·· ·
∀ i, k (34)
]
The bounds in the constraints, eqs 31, 32, and 34, may not change from batch to batch, but one can handle the case when the bounds change at each batch as occasion
2740 Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004
demands. The controlled output constraint can be changed into the constraint on the input change in terms of the batch index by applying the linear timevarying model, eq 28. However, it is not considered in this study because a severe disturbance may cause the output to violate the constraint and result in an infeasible solution of the optimization problem. On the basis of the controlled output prediction equation, eq 28, the following quadratic function is defined as the performance index:
min
∆Ui(k+m-1|k)
||Λy[Yi(k+p|k) - Ri(k+p|k)]||22 + ||Λu∆Ui(k+m-1|k)||22 (35)
where Ri(k+p|k) ) [ri(k+1)T, ..., ri(k+p)T]T is the future reference vector for y available at step k of the ith batch. Λy and Λu represent the weighting matrices for the controlled output error and the input change in terms of the batch index, respectively. On the basis of the performance index, the optimization problem with constraints (31)-(34) can be solved by the quadratic programming method. From the computed control input sequence, the first input move ∆ui(k) is implemented and the entire procedure is repeated at the next sampling time. It is worth noting that the performance index is composed of errors in the previous batch (cf. eq 28) and the correction terms as a function of ∆Ui(k+m1|k). In the unconstrained case, one can obtain the following optimal input trajectory:
∆Uj(k+p|k) ) ˜ j-1(k+p|k)]TΛy[H ˜ j-1(k+p|k) B ˜ j-1(k+ {[H ˜ j-1(k+p|k) B ˜ j-1(k+p|k) B ˜ j-1(k+p|k)]TΛy[Rj(k+ p|k)] + Λu}-1[H ˜ j-1(k+p|k) A ˜ j-1(k+p|k) ∆xj(k)] p|k) - Hj-1(k+p|k) - H Here, the input trajectory is updated from the trajectory in the previous batch by using the optimal learning gain and the correction terms proportional to the error trajectory in the previous batch. Accordingly, the algorithm inherits the input adaptation method in the ILC and improves the performance of the MPC. The algorithm of the learning-based NLMPC is based on the linearization of a nonlinear model by using the previous batch data at every batch. This means that one obtains a different linear time-varying model at each batch. In case all disturbances are removed from the system or only periodic disturbances exist, the consistent optimal input profile exists and the approximated linear time-varying model converges to the model calculated on the basis of the optimal input trajectory as the input trajectory approaches the optimal trajectory. This feature is similar to the model adaptation algorithm and also plays a role in improving the proposed algorithm. Despite the advantage of the learning method and the model adaptation contained in the algorithm, the algorithm has a limitation that it is delicate to a disturbance, which is unique in the current batch. When each batch may be subjected to a unique sequence of disturbances and the effect of the unique disturbance is strong, the previous input trajectory would be different from the optimal trajectory for the current batch. Hence, the model calculated on the basis of the previous input trajectory will experience the problem of model mismatch, which is not periodic, and then the performance
Figure 1. Periodic disturbance in the reaction temperature.
can be deteriorated. The algorithm is not effective when frequent grade changes are needed in terms of the batch. As in the case of nonperiodic disturbance, a new setpoint in the current batch requires the new optimal trajectory and brings about a model mismatch problem. Therefore, it is recommended to include a disturbance model such as the unmeasured disturbance model.5 The weighting matrix for the control input (Λu) plays a different role in the optimization compared to that of the conventional MPC because of the unique structure of the algorithm, in which the optimal input is calculated in the form of the increment between two consecutive batches. A small value of Λu implies a fast change of the input trajectory, which may make the effect of model/plant mismatch stronger because the higher order terms in eq 17 are neglected under the assumption that the trajectories between the consecutive batches are not much different. In this regard, Λu has to be determined in such a way that the trajectories are not changed abruptly. To establish the effectiveness of the proposed controller, the performance is compared with that of an extended Kalman filter (EKF) based nonlinear model predictive controller.5 The first-principles model in section 2 is used as the plant, and it is assumed that the state is available by feedback. A disturbance in the reaction temperature is considered as a periodic disturbance in this case. As the reaction proceeds and the viscosity increases, the effectiveness of heat transfer is decreased, and thus the actual reaction temperature becomes higher than the desired value. Figure 1 shows the difference between the actual reaction temperature and its desired value, and the difference plays the role of a disturbance in the system. The performance of the conventional NLMPC is presented in Figure 2 when the control aim is to maintain f1 and Mw at 0.85 and 182 000, respectively. In this case, a disturbance model used in our previous work26 is incorporated into the NLMPC for better performance. Prediction and control horizons are specified as 10 and 5, respectively, with a sampling time of 1 min. Weighting matrices are determined by the trialand-error method as Λy ) diag(2.18, 7.0) and Λu ) diag(0.7, 0.1) with the criterion that the control input does not become too aggressive. The feed flow rate and reaction temperature are constrained within the ranges of 0-10 mL/min and 55-90 °C, respectively. The constraints on the rate of input change of the feed flow rate (δqf) and reaction temperature (δTr) in terms of the
Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004 2741
Figure 2. Regulatory performance of the conventional nonlinear MPC (EKF-based NLMPC developed by Lee and Ricker5).
Figure 3. Regulatory performance of the learning-based nonlinear MPC.
time index are -2 to +2 mL/min and -1 to +1 °C, respectively. As shown in Figure 2, the mole fraction of MMA shows an oscillatory response and is not completely regulated because the disturbance in the reaction temperature affects the consumption rate of each monomer. The control input of the reaction temperature shows a somewhat aggressive behavior in the early stage when it drives Mw to its setpoint, but Mw reaches the setpoint before 20 min when the effect of the disturbance is not strong. The deviation is observed as a result of the disturbance, but it is satisfactorily rejected by the controller during the entire reaction time. The learning-based NLMPC proposed without a disturbance model is applied for the same control purpose. The weighting matrix for the control input is Λu ) diag(0.01, 0.1) in this case. It should be emphasized that different weights are used because the control input in this case is determined as an increment between batches while the input in the conventional NLMPC in the previous case is an increment between steps. However, the same criterion that the control input profile should not be too aggressive is also used in this case. The constraints on the input magnitude and the rate of input change in terms of the time index are not changed, but additional constraints on the rate of input change in terms of the batch index are included as -5
mL/min e ∆qf e +5 mL/min and -4 °C e ∆Tr e +4 °C. The initial batch is conducted under constant input conditions (qf ) 3.0 mL/min and Tr ) 80 °C) in an openloop manner to produce initial state trajectories. The initial input trajectory is chosen to be far from the optimal trajectory to evaluate the convergence rate. As shown in Figure 3, outputs are driven to their respective setpoints after the third iteration and inputs converge to their respective optimal trajectories. It is worth noting that only a periodic disturbance is considered in this case. Because the initial input trajectory is not close to the optimal one and constraints are present in the system, it takes more than one iteration to converge, which means that the performance loss is incurred during the first few batches. If the initial trajectory is close enough for fast convergence, one or two iterations would be sufficient for convergence. Despite the performance loss in this algorithm, however, the proposed algorithm shows a better performance than that of the conventional NLMPC, and it also has merit in that the tuning for a disturbance model is not required in the presence of periodic disturbances; this feature significantly reduces the time requirement for the implementation of the controller.27 The computation time for the proposed algorithm is 27.15 s (0.136 s for one sampling time) on the PC with 2.54 GHz Pentium 4 CPU, while the conventional NLMPC requires 125.65
2742 Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004
the nonlinear model using the immediate previous batch data. Therefore, the linear model includes a model mismatch, and the performance of the controller is poor, as shown in Figure 4. The mole fraction of MMA (f1) converges after the 6th iteration, but f1 is driven to its setpoint after 70 min. It is less than 30 min in the case of the learning-based NLMPC, as shown in Figure 3. As for Mw, the output does not reach the setpoint even after the 6th iteration. From the second iteration, a sharp peak appears in the feed flow rate before 10 min and does not disappear as the batch is repeated, while the learning-based NLMPC, which relinearizes the nonlinear model on the basis of the immediate previous batch trajectory, calculates the feed flow rate trajectory without such a sharp peak after the second iteration (cf. Figure 3). The input trajectory of the reaction temperature is even worse. The input of the reaction temperature shows a rather aggressive behavior in the early stage. Such an aggressive behavior may be relaxed if the controller is retuned, but in that case, the convergence of the output would become slower. The performance of the linear-model-based learning controller may be improved if the initial trajectory is close to the operating region. However, the following features will have to be taken into consideration. (1) No information is available about the operating region, and thus it is difficult to find an appropriate initial trajectory. (2) If a strong unique disturbance exists in the current batch and makes outputs deviate from the setpoints, then the controller should make trajectories converge to the optimal trajectories in sequential batches as soon as possible. (3) If a model mismatch exists, the relinearization method used in the proposed algorithm is important and effective because the linear model adequate for the control in the operating region is obtained as the batch is repeated. Figure 4. Regulatory performance of the linear-model-based learning controller.
4. Experimental System with Online Measurement
s for one batch, which corresponds to 0.628 s for one sampling time. Because NLMPC exploits numerical integration for the prediction equation, the computation time may become longer if a long prediction horizon is used. It is also possible that the system is extremely nonlinear and, as a result, the steepest numerical integration may require a long computation time. However, the computation time in the semibatch copolymerization reactor in this study shows that the computation time is not a big problem. If anything, it will be overcome as the performance of the computer improves. Figure 4 shows the performance of the learning controller based on a linear model, which is obtained by linearizing the nonlinear model using the initial trajectory. The same design variables as those in the learning-based NLMPC are used for the purpose of performance comparison. Because the trajectory used in the linearization is identical with the initial trajectory in the previous case (learning-based NLMPC) and the same design variables are used, the performance of the controller at the first batch in this case is in agreement with that of the learning-based NLMPC in the previous case. After the second batch, however, the linear-modelbased learning controller still uses the linear model on the basis of the initial trajectory far from the region of concern, while the learning-based NLMPC relinearizes
The experimental system is shown schematically in Figure 5. The polymerization reactor has a capacity of 1 L. The reaction mixture was circulated by a diaphragm metering pump through the circulation line in which the online densitometer (Anton Paar, model DMA401YH) and viscometer (Sofraser, model MIVIADF) were installed. Measured properties were fed back, and via the estimation and optimization procedure, optimal inputs were calculated. The feed flow rate as a manipulated variable was regulated with a variable-speed, remote setpoint pump in accordance with the controller signal. As the second manipulated variable, the reaction temperature was manipulated by the heating and cooling water flowing through the jacket. The valve stem positions of the hot and cold water lines were regulated in a cascade temperature control configuration in such a way that the jacket inlet temperature was kept equal to the desired value specified by the master controller. In a separate line, the solvent was injected at a fixed rate of 0.5 mL/min to prevent a severe increase of the viscosity as the reaction proceeded.26 The online measured density was directly used in the measured output function of the state estimator, while the weight-average molecular weight Mw was calculated by the correlation equation describing the relationship between the viscosity and Mw. For the online measurement of Mw, the following correlation was used to
Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004 2743
Figure 5. Schematic diagram of the semibatch copolymerization reactor system.
calculate the intrinsic viscosity [η]:
density of the reaction mixture:
ηsp )
F) 0.8
0.8
kH[η]C 2kH[η]C C0.8[η] exp + (C0.8)2[η]2 exp (36) 0.8 1 - bC 1 - bC0.8
W W ) ) V VM1 + VM2 + VS + Vp W (38) M1V M2V SV WM1 + WM2 + WS + Vp FM1 FM 2 FS
where ηsp represents the specific viscosity obtained by using the measured viscosity and C denotes the mass concentration (g/mL) of the copolymer in the reactor that can be obtained by using the online densitometer data (C ) F × solid content). The parameters kH and b were determined as 0.58 and 0.8, respectively, by fitting the experimental data when the viscosities were given with units of centipoise (cP). The weight-average molecular weight was calculated by solving the Mark-Houwink equation
Note that this equation includes the volume of the copolymer Vp, which requires the calculation of the copolymer density Fp (cf. eq 11). Because the information on the copolymer density is rather scarce, unlike in the case of homopolymers, the equation for the hydrodynamic volume was modified and then extended to the copolymer density with the assumption of uniform copolymer composition and dilute solution:
[η] ) KMwa
1/Fp ) [w1(1/Fp1)2/3 + w2(1/Fp2)2/3]3/2
(37)
Here, the Mark-Houwink constants, K and a, were given as 5 × 10-6 and 1.10, respectively. 5. Experimental Results and Discussion In this section, the validity of the proposed controller is evaluated in an experimental study. For state estimation, the EKF28 was used as a state estimator. Because the procedure for the implementation of the EKF is a straightforward extension of the Kalman filter, the detailed procedure will not be presented here. It is worth mentioning that the linear time-varying model in the learning-based NLMPC is not used in the EKF, but the nonlinear model is used, which means that the learning algorithm is not involved in the EKF. For the measured outputs to be used in the EKF, the density and weight-average molecular weight were chosen by using online measurement as discussed in the previous section. Equation 14 was used as the measured output function for Mw, while the following equation was suggested for another measured output, that is, the
(39)
where wi and Fpi denote the weight fraction of monomer i in the copolymer chain and the density of the homopolymer of monomer i, respectively. In our previous work,26 the above equation was compensated for with the unmeasured disturbance model in order that the equation should be valid even when composition drift occurs. This disturbance model, however, was not used in the application of the learning-based NLMPC because the controller could eliminate the effect of the disturbance effectively by virtue of the learning mechanism involved in the prediction equation. The reactor was initially charged with a recipe of 350 mL of MMA, 50 mL of MA, 400 mL of toluene, and 0.4 g of AIBN. The feed was composed of 0.5 g of AIBN and 1 L of MMA. The state variables estimated by the EKF were moles of initiator, two monomers, and solvent, zeroth and second moments of dead polymer concentrations, and copolymer volume. These variables were normalized to reduce the differences among the orders of magnitude of the individual variables. Process and measurement noise covariance matrices and the state error covariance matrix were specified as 1 × 10-4I7×7,
2744 Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004
Figure 6. Estimation performance of the EKF in the initial batch.
1 × 10-2I2×2, and diag(1 × 10-3, 1 × 10-6, 1 × 10-6, 1 × 10-3, 1 × 10-5, 1 × 10-6, 1 × 10-4), respectively, and they are the same for all batches. Concerning the measurement noise covariance, we analyzed the measurement signal and tried to make the values have the same order of magnitude as the actual measurement noise. The process noise measurement covariance was determined with the standard that it should be as small as possible because we assumed that the degree of model mismatch is weak. The initial state error covariance matrix was determined by a trial-and-error method. Figure 6 shows the experimental result under constant inputs, that is, when the feed flow rate was fixed at 0.5 mL/min while the reaction temperature was maintained at 80 °C. For the initial period of 30 min, only the density was used in the state estimation because the online measurement correlation equation for Mw is not valid in this period as a result of the low viscosity as discussed in our previous work29 as well as shown in diagram b of Figure 6. After the viscosity became high enough for the correlation to be valid, the weight-average molecular weight showed the validity of the online measurement correlation and also a satisfactory performance of the state estimation when compared to the offline measurements. These measurements were obtained by analyzing the samples taken from the reactor every 20 min by gel permeation chromatography. Though not shown in the figure, the estimated value of the density, which is one of the two measurements, followed the online measured values very closely. For a detailed procedure for the application of the EKF, one may refer to our previous work.26,29 The mole fraction f1 of MMA in the remaining monomers was calculated by using the estimated state from the EKF, and the effectiveness of the estimator was proven by comparison between the calculated value and the offline measurement by gas chromatography as depicted in diagram a of Figure 6. Because online measurement of the density is used from the beginning
of the reaction, the moles of two monomers are estimated effectively, and thus the calculated values of f1 are in good agreement with the offline measured values. The control aim in this study is to produce a copolymer with a uniform copolymer composition and a desired weight-average molecular weight. In general, composition drift occurs in a batch or semibatch reactor under the constant input conditions as a result of the different reactivities of two monomers. Research works have been focused on the production of copolymers with uniform composition because the copolymer chains with a uniform composition have end-user properties better than those of copolymers with various compositions.30 Therefore, f1 was selected as a controlled variable in this study. In most of the research results concerned with control of the average molecular weight in a batch or semibatch polymerization reactor, the aim was to produce a polymer with a desired Mw at the final time of a batch. In this study, however, the performance index was selected in such a way that Mw was regulated for the entire reaction course. Hoffman et al.31 showed that the regulation of Mw minimizes the breadth of the molecular weight distribution and a narrow distribution improves several properties such as the tensile strength and elongational property. The feed flow rate and the reaction temperature were taken as the control inputs. As the feed consists of the more reactive monomer, the composition drift is compensated for by the feed flow rate. Also, the flow rate affects Mw by exercising an effect on the growth rate of chain length. In thermally initiated polymerization system, Mw is strongly dependent on the reaction temperature and the rate of the composition drift changes as the temperature varies. The trajectories of state and input in Figure 6 were used as the initial trajectories, and then the learningbased nonlinear model predictive controller was applied to the first learning batch operation. The sampling time, the prediction horizon, and the control horizon were specified as 1 min, 10, and 5, respectively. Weighting matrices were used in diagonal form for convenience and determined by trial and error as Λy ) diag(20, 100) and Λu ) diag(0.01, 0.1) for outputs (f1 and Mw) and inputs (qf and Tr), respectively. The constraints on the input magnitude and the input changes in terms of batch and time indices were considered as follows:
Constraints on the input magnitude 0 mL/min e qf e 10 mL/min and 55 °C e Tr e 90 °C Constraints on the input change in terms of the batch index -2 mL/min e ∆qf e 2 mL/min and -10 °C e ∆Tr e 10 °C Constraints on the input change in terms of the time index -0.5 mL/min e δqf e 0.5 mL/min and -1 °C e δTr e 1 °C In this batch, we did not impose any intentional disturbance on the system but assumed that the system naturally includes the periodic disturbances. For example, the initial charge error that may occur during the preparation of the initial charge is inevitable
Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004 2745
Figure 7. Regulatory performance of the learning-based NLMPC in the first batch.
although one may try to prepare the reactant as carefully as one can. Another example would be the model mismatch between the nonlinear model and the plant caused by the violation of the assumption for perfect mixing due to high viscosity or kinetic parameter uncertainty. However, we assumed that this kind of disturbance is periodic and not strong in our system, and thus the disturbance model was excluded to examine the effect of the learning algorithm incorporated in the algorithm. Figure 7 shows the regulatory performance of the learning-based NLMPC at the first batch when the setpoints of f1 and Mw were specified as 0.85 and 175 000, respectively. The reactor was operated in an open-loop manner during the first 30 min, and the state estimation by using the density was carried out. At 30 min, the EKF started to use both of the measurements (density and Mw) and the controller came into action. After the optimal inputs were used, both of the controlled outputs were driven to their respective setpoints within 50 min although no disturbance model was included in the prediction equation. As soon as the optimization started, the feed flow rate was decreased to its lower bound to drive f1 to the setpoint quickly and then increased to prevent composition drift. Without control action, Mw would be decreased because of the polymerization itself (cf. Figure 6) and the continuous injection of the initiator by the feed. Based on the predicted behavior, Mw was maintained at its setpoint satisfactorily. The inputs and outputs were converged just in one iteration because the constant inputs imposed on the system at the initial batch were close to the input trajectories at the first batch. In our previous work in which there had been no disturbance intentionally imposed on the system, the EKF-based NLMPC required the augmentation of the unmeasured disturbance model while the algorithm in this study showed
Figure 8. Regulatory performance of the learning-based NLMPC when the initiator concentration in the feed is unexpectedly doubled in the third batch.
no less remarkable performance by virtue of the learning algorithm. In the sense of learning, the learning algorithm improves the control performance as the batch is repeated. In our study, because the initial trajectories of inputs in Figure 6 were not much different from those in the current batch, the inputs were updated after only one iteration. In addition, because the model obtained in the current batch was calculated by using the trajectories close to the current ones, the model could predict the system well and only one iteration was needed for learning. Except for a short period in the early stage of the reaction, the manipulated variables did not reach their respective lower and upper bounds. Hence, the optimal inputs calculated on the basis of the optimal learning gain in the unconstrained case give rise to a good control performance. The control performance was also investigated with a strong disturbance imposed at the second batch. For the realization of a disturbance, the initiator concentration in the feed was doubled. Because the increased amount of the initiator brings about a high consumption rate of monomer 1, the controller made the feed flow rate increase, as shown in Figure 8. The effect of the disturbance on Mw was also eliminated satisfactorily. As stated in the previous section, the learning algorithm shows a good control performance in the presence of a periodic disturbance but cannot completely eliminate a unique disturbance in the current batch, as observed in Figure 8. Therefore, a disturbance model that accurately reflects the disturbance should be augmented for more satisfactory results. A large deviation for Mw was observed in the latter part of the reaction course. A high reaction rate caused by a large amount of the initiator is responsible for this result. Despite the continuous injection of solvent
2746 Ind. Eng. Chem. Res., Vol. 43, No. 11, 2004
through a separate line, the viscosity increased so severely that the online measurement correlation, eq 36, became ineffective. The problem related to the high viscosity was reported in Park et al.29 6. Conclusions A successive linearization method for a nonlinear MPC was modified to incorporate the learning algorithm into the prediction equation for a better performance of the controller in a semibatch copolymerization reactor. The main idea is to linearize a nonlinear model by using the previous batch data at every batch in such a way that the prediction equation is a function of the increment of inputs between two consecutive batches. The controller was applied to a semibatch MMA/MA copolymerization reactor for the production of a copolymer with a uniform composition and a desired weightaverage molecular weight. Without employment of a disturbance model, a satisfactory performance of the controller for the regulation was clearly observed from both the simulation and experimental results when there is no unique disturbance at each batch. It was also shown by comparison with the linear-model-based learning controller that the relinearization algorithm in the proposed controller is needed. However, the augmentation of a disturbance model was found to be necessary for the elimination of a strong and unique disturbance. In conclusion, the proposed learning-based nonlinear MPC is an effective strategy for the control of nonlinear batch or semibatch processes. Acknowledgment The financial support from the Brain Korea 21 Project is gratefully acknowledged. Literature Cited (1) Richalet, J.; Rault, A.; Testud, J. L.; Papon, J. Model Predictive Heuristic Control: Application to Industrial Processes. Automatica 1978, 14, 413. (2) Cutler, C. R.; Ramaker, B. L. Dynamic Matrix ControlsA Computer Control Algorithm. Proceedings of the Joint Automatic Control Conference, San Francisco, CA, 1980; Paper WP5-B. (3) Garcia, C. E.; Morari, M. Internal Model Control: 1. A Unifying Review and Some New Results. Ind. Eng. Chem. Process Des. Dev. 1982, 21, 308. (4) Gattu, G.; Zafiriou, E. Nonlinear Quadratic Dynamic Matrix Control with State Estimation. Ind. Eng. Chem. Res. 1992, 31, 1096. (5) Lee, J. H.; Ricker, N. L. Extended Kalman Filter Based Nonlinear Model Predictive Control. Ind. Eng. Chem. Res. 1994, 33, 1530. (6) Henson, M. A. Nonlinear Model Predictive Control: Current Status and Future Directions. Comput. Chem. Eng. 1998, 23, 187. (7) Qin, S. J.; Badgwell, T. A. An Overview of Industrial Model Predictive Control Technology. Proc. Int. Conf. Chem. Proc. Control, 5th 1997, 232. (8) Ahn, S.-M.; Chang, S.-C.; Rhee, H.-K. Application of Optimal Temperature Trajectory to Batch PMMA Polymerization Reactor. J. Appl. Polym. Sci. 1998, 69, 59. (9) Yabuki, Y.; MacGregor, J. F. Product Quality Control in Semibatch Reactors Using Midcourse Correction Policies. Ind. Eng. Chem. Res. 1997, 36, 1268.
(10) Gattu, G.; Zafiriou, E. A Methodology for On-line Setpoint Modification for Batch Reactor Control in the Presence of Modelling Error. Chem. Eng. J. 1999, 75, 21. (11) Mutha, R. K.; Cluett, W. R.; Penlidis, A. Nonlinear Modelbased Predictive Control of Nonaffine Systems. Automatica 1997, 33, 907. (12) Soroush, M.; Zambare, N. Nonlinear Output Feedback Control of a Class of Polymerization Reactors. IEEE Trans. Control Syst. Technol. 2000, 8, 310. (13) Crowley, T. J.; Harrison, C. A.; Doyle, F. F., III. Batch-tobatch Optimization of PSD in Emulsion Polymerization using a Hybrid Model. Proc. Am. Control Conf. 2001, 981. (14) Arimoto, S.; Kawamura, S.; Miyazaki, F. Bettering Operation of Robots by Learning. J. Robotic Syst. 1984, 1, 123. (15) Fang, Y.; Chow, T. W. S. Iterative Learning Control of Linear Discrete-Time Multivariable Systems. Automatica 1998, 34, 1459. (16) Amann, N.; Owens, D. H.; Rogers, E. Iterative Learning Control for Discrete-Time Systems with Exponential Rate of Convergence. IEE Proceedings: Control Theory Appl. 1996, 143, 217. (17) Moore, K. L. A Nonstandard Iterative Learning Control Approach to Tracking Periodic Signals in Discrete-time Non-linear Systems. Int. J. Control 2000, 73, 955. (18) Longman, R. W. Iterative Learning Control and Repetitive Control for Engineering Practice. Int. J. Control 2000, 73, 930. (19) Hamamoto, K.; Sugie, T. An Iterative Learning Control Algorithm within Prescribed Input-output Subspce. Automatica 2001, 37, 1803. (20) Bone, G. M. A Novel Iterative Learning Control Formulation of Generalized Predictive Control. Automatica 1995, 31, 1483. (21) Amann, N.; Owens, D. H.; Rogers, E. Predictive Optimal Iterative Learning Control. Int. J. Control 1998, 69, 203. (22) Lee, J. H.; Lee, K. S.; Kim, W. C. Model-based Iterative Learning Control with a Quadratic Criterion for Time-varying Linear Systems. Automatica 2000, 36, 41. (23) Lee, K. S.; Chin, I.-S.; Lee, H. J.; Lee, J. H. Model Predictive Control Technique Combined with Iterative Learning for Batch Processes. AIChE J. 1999, 45, 2175. (24) Hwang, W.-H.; Yoo, K.-Y.; Rhee, H.-K. Modeling of Bulk Copolymerization Reactor Using Chain-Length-Dependent Rate Constants. J. Appl. Polym. Sci. 1997, 64, 1015. (25) Park, M.-J.; Ahn, S.-M.; Rhee, H.-K. Kinetic Parameter Estimation for the MMA/MA Copolymerization System. J. Appl. Polym. Sci. 2000, 78, 2554. (26) Park, M.-J.; Rhee, H.-K. Property Evaluation and Control in a Semibatch MMA/MA Solution Copolymerization Reactor. Chem. Eng. Sci. 2003, 58, 603. (27) Valappil, J.; Georgakis, C. Systematic Estimation of State Noise Statistics for Extended Kalman Filters. AIChE J. 2000, 46, 292. (28) Grewal, M. S.; Andrews, P. Kalman FilteringsTheory and Practice; Prentice Hall: Englewood Cliffs: NJ, 1993. (29) Park, M.-J.; Hur, S.-M.; Rhee, H.-K. Online Estimation and Control of Polymer Quality in a Copolymerization Reactor. AIChE J. 2002, 48, 1013. (30) Arzamendi, G.; Asua, J. M. Copolymer Composition Control of Emulsion Copolymers in Reactors with Limited Capacity for Heat Removal. Ind. Eng. Chem. Res. 1991, 30, 1342. (31) Hoffman, F. F.; Schreiber, S.; Rosen, G. Batch Polymerization: Narrowing Molecular Weight Distribution. Ind. Eng. Chem. 1964, 56, 51.
Received for review February 13, 2003 Revised manuscript received January 12, 2004 Accepted March 3, 2004 IE0301379