Ind. Eng. Chem. Res. 2010, 49, 8209–8218
Simple Nonlinear Predictive Control Strategy for Chemical Processes Using Sparse Kernel Learning with Polynomial Form

Yi Liu,†,‡ Yanchen Gao,†,§ Zengliang Gao,‡ Haiqing Wang,*,† and Ping Li†

† State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Zhejiang University, Hangzhou 310027, P. R. China; ‡ Institute of Process Equipment and Control Engineering, Zhejiang University of Technology, Hangzhou 310032, P. R. China; § Qingdao Mesnac Co., Ltd., Qingdao 266045, P. R. China
A simple nonlinear control strategy using sparse kernel learning (SKL) with a polynomial kernel form is presented and applied to chemical processes. The nonlinear process is first identified by SKL with a polynomial kernel, and then a predictive control performance index is formulated. This index is characterized as an even-degree polynomial function of the manipulated input and has the benefit that the input can be separated from the index because of its special structure. Consequently, the optimal manipulated input can be efficiently obtained by solving a simple root problem of an odd-degree polynomial equation. Moreover, the control parameter directly relates to its performance and can be tuned in a guided manner. All these attributes result in a practicable solution for real-time process control. The novel controller is applied to two chemical processes to evaluate its performance. The obtained results show the superiority of the proposed method compared to a well-tuned proportional-integral-derivative controller in different situations.

1. Introduction

Many chemical processes, such as exothermic chemical reactions, bioreactor systems, and batch processes, are nonlinear and time-varying in nature, where the mechanism between the controlled variable and the manipulated variable depends strongly on the operating conditions. For the control of such complex processes, state feedback control and some advanced linear control techniques might not obtain satisfactory performance. Although a properly tuned proportional-integral-derivative (PID) controller is widely accepted in industrial practice and can usually maintain stable control over a wide range of operating conditions, it is not straightforward to optimize the performance of such a controller.
Furthermore, tuning the PID controller to simultaneously achieve the three critical controller performance attributes (i.e., robustness, set-point tracking, and disturbance rejection) is not an easy task despite the availability of more than 200 tuning methods.1,2 Model predictive control (MPC), as another popular process control scheme, is widely employed for multivariable processes.3,4 Unfortunately, MPC tuning parameters are difficult to choose because of the control scheme's complex structure.1,2 Moreover, developing MPC for general nonlinear processes is still a great challenge.4 Therefore, it is not surprising that interest in the development of advanced control strategies to improve safety and increase profitability has been growing in recent years.1,2

Neural networks (NNs) are useful for process modeling, and thus, many NN-based control (neurocontrol) methods for nonlinear processes have been proposed.5 They involve NN-based internal model control schemes,6,7 one-step-ahead predictive control (OPC) strategies,8-10 MPC methodologies,5,11 adaptive controllers,12 and many others. However, training a NN is time-consuming, and many training examples are required. In addition, NN models are generally not parsimonious, and thus, the related neurocontrol schemes encounter problems in the online adaptation of the weights, as few efficient approaches are available.5,12 Therefore, despite the existence of many nonlinear control strategies in theory, designing a suitable controller for complex processes is still a challenge in practice.

The support vector machine (SVM), a novel machine learning method for pattern recognition and function approximation problems,13-16 has found increasing applications in the data-driven process modeling and monitoring fields.17-19 More recently, some SVM model-based nonlinear control algorithms have been proposed.20-25 For example, Zhong et al.20 proposed an SVM-based OPC strategy with a quadratic polynomial kernel function and showed its superiority to the NN-based OPC approach.9 However, Zhong et al.'s strategy20 becomes unreliable when an SVM model with a quadratic polynomial kernel cannot describe the nonlinear process well. Zhang and Wang21 employed the least-squares SVM (LS-SVM) model and the gradient-descent algorithm to deduce an OPC approach. Nevertheless, the control descent step is difficult to select, even though some related methods can guarantee its convergence in NN-based OPC approaches.8-10 Combinations of SVM models and MPC approaches have also been investigated recently.22-24 However, most of them have large computation loads. Moreover, these SVM models adopt a Gaussian kernel.21,23-25 Thus, the explicit formulation of the control law cannot be obtained in a straightforward manner. For this reason, process control engineers might not understand why these new control schemes work well and may be prone to reject them.

* To whom correspondence should be addressed. Tel.: +86-571-87951442-810. E-mail: [email protected]. † State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Zhejiang University. ‡ Institute of Process Equipment and Control Engineering, Zhejiang University of Technology. § Qingdao Mesnac Co., Ltd.
For process control applications, it is desirable to keep the control scheme as simple as possible.8 Therefore, the aim of this work is to develop a general methodology for designing a simple kernel controller for nonlinear processes. First, an OPC framework using the sparse kernel learning model (i.e., the SKL-OPC framework) is presented. Then, a polynomial kernel is used to derive a novel control strategy as an extension of SKL-OPC
10.1021/ie901548u © 2010 American Chemical Society. Published on Web 07/21/2010
and meanwhile to overcome the disadvantages of the quadratic-polynomial-kernel-based and Gaussian-kernel-based control algorithms. The remainder of this article is organized as follows: After a brief summary of the SKL identification method, the main structure of the SKL-OPC framework is described in section 2. Then, the polynomial-kernel-function-based control scheme is deduced as an algorithmic implementation of the SKL-OPC. Applications of the new control strategy to a simple two-input process and a highly nonlinear continuous stirred tank reactor (CSTR) process are illustrated in section 3. A detailed comparison to other controllers is also discussed. Finally, conclusions are drawn in section 4.

2. Sparse-Kernel-Learning-Based Predictive Control Strategy

2.1. Sparse Kernel Learning Identification Model. For simplicity, the following deduction is first limited to single-input-single-output (SISO) nonlinear processes, as a typical practitioner will not consider applying the SISO control strategy for perhaps only 5-10% of control loops.1 Many SISO nonlinear processes can be described by the following discrete-time model

y(k + 1) = f[y(k), ..., y(k - ny + 1), u(k), ..., u(k - nu + 1)]   (1)

where k is the discrete time index; f(·) is a general nonlinear function; y(k) and u(k) represent the controlled output and the manipulated input, respectively; and ny and nu denote the corresponding process orders. Equation 1 can be rewritten compactly as

y(k + 1) = f[Y(k), u(k), U(k - 1)] = f[x(k)]   (2)

where Y(k) = [y(k) ··· y(k - ny + 1)] and U(k - 1) = [u(k - 1) ··· u(k - nu + 1)] are vectors consisting of the past process outputs and inputs, respectively.

As a representative empirical modeling method, NNs have been intensively studied in the field of nonlinear modeling and control of chemical processes.5-12 However, as mentioned previously, NNs still retain some disadvantages. SVM, being representative of the general SKL methodology, has been recognized as a promising method for solving classification and regression problems. This approach makes predictions based on a linear combination of kernel functions defined on a subset of training samples called support vectors. This sparse kernel representation avoids overfitting to the training data and increases the generalizability of the identification model. Hence, SKL has attracted more attention with the development of SVMs.13 The basic idea of the SKL framework for regression problems is first to project the input vectors onto a high-dimensional feature space, namely, the so-called reproducing kernel Hilbert space, by a nonlinear mapping and then to perform a linear regression in this feature space. The "kernel trick" is employed to counter the curse-of-dimensionality problem.13-16 After identification with the training set SN = {(x1, y1), ..., (xN, yN)} ⊂ X × R, the process prediction ym(k + 1) can be formulated as

ym(k + 1) = SKL[x(k), α] = Σ_{i=1}^{NSV} αi K⟨x(i), x(k)⟩ + b   (3)

where α = [α1 ··· αNSV]^T is an NSV-dimensional weight vector and NSV (which is always much less than N) is the number of support vectors, the subset of the training samples retained by the model; ⟨·,·⟩ denotes the dot product, and K⟨x(i), x(k)⟩ is a kernel function that handles the inner product in feature space, so that the explicit form of the nonlinear mapping does not need to be known; and b is the bias term, which is not always necessary.13-16 SVM regression (denoted simply as SVR) and sparse LS-SVM regression can both be described in this form. Additionally, the relevance vector machine,13 an alternative SKL algorithm based on Bayesian techniques, can also be formulated as in eq 3. Thus, these SKL algorithms can be utilized to identify the process model.

Sparseness is generally regarded as a good feature for a learning machine. Using a sparse model, the predictions for new samples depend only on the kernel function evaluated with a subset of the training samples. This means that the memory requirement is small and, correspondingly, that the computation time on the test samples is very short.13 As a result, a controller based on the SKL model will be simple and easy to implement.

Remark 1. To implement SVR and other SKL algorithms, the regularization parameter and kernel parameters have to be selected. These parameters play a key role in the model's performance. A reliable offline SKL model can be obtained using some efficient model selection methods, such as cross-validation and Bayesian learning.13-17,26 Unfortunately, until now, there have been few parameter selection algorithms aimed at online learning.18,19 For this reason, most SVR-based modeling and control applications employ an offline model. Without loss of generality, an offline SVR model obtained through cross-validation is utilized as the basic implementation of SKL in this work.

2.2. Control Strategy Formulation with Polynomial Kernel. Let yr(k) be the desired process output at time k. It can be directly employed as the set point ysp or in other formulations.4 Then, the control law can be obtained by minimizing the one-step-ahead weighted predictive control performance index (CPI)8

min J[u(k)] = [yr(k + 1) - y(k + 1)]² + λ[u(k) - u(k - 1)]²   (4)

where λ > 0 is the control effort weighting factor, representing a penalty associated with the increment in the manipulated variable ∆u(k) = u(k) - u(k - 1). Actually, y(k + 1) can be substituted by the estimate ym(k + 1) to form a new CPI. A simple and useful strategy is employed here to compensate for mismatch between the SKL identification model and the process, as well as for other unknown disturbances.4 That is, by adding the error e(k) = y(k) - ym(k) to the predictive output ym(k + 1),4 a corrected prediction yp(k + 1) is obtained

yp(k + 1) = ym(k + 1) + he(k)   (5)

where h is the error correction coefficient and, in most cases, h = 1.4 Then, the CPI becomes

J[u(k)] = [E(k + 1)]² + λ[u(k) - u(k - 1)]²   (6)

where E(k + 1) = yr(k + 1) - ym(k + 1) - he(k) is the total error of the SKL model at time k. The basic idea and general framework of SKL-OPC is to obtain the SKL model and other related terms first and then to obtain the manipulated input u(k) by minimizing the CPI in eq 6. It is difficult to find the optimal solution for the CPI because it generally needs to be solved by a nonlinear optimization method. For process control, it is desirable to obtain a simple and explicit expression of the control law, even though it might
be suboptimal, so that the computation requirement can be reduced for real-time implementation.8 Two traditional methods for designing simple controllers have been proposed in the literature.8,10 Tan and Cauwenberghe10 suggested an approximate gradient descent method to iteratively optimize the CPI and then obtain the solution

u(k) = u(k - 1) - [β1/(1 + λβ1)] E(k + 1) ∂E(k + 1)/∂u(k)|u(k)=u(k-1)   (7)

where β1 > 0 is the optimizing step. Tan and Cauwenberghe10 employed a forward NN to identify the process and then investigated its convergence and stability. Gao et al.8 utilized the Taylor linearization method with respect to the argument u(k) at the point u(k - 1) to obtain an approximate control law as follows

u(k) = u(k - 1) + β2 {∂f/∂u(k)|u(k)=u(k-1) / [λ + (∂f/∂u(k)|u(k)=u(k-1))²]} {yr(k + 1) - f[x̃(k)]}   (8)

where β2 is an adjustment parameter to compensate for the approximation error; f[x̃(k)] is the quasi-one-step-ahead predictive output, with x̃(k) defined as x̃(k) = [Y(k), u(k - 1), U(k - 1)]; and ∂f/∂u(k)|u(k)=u(k-1) is the input-output sensitivity function. To implement this control law, ∂f/∂u(k)|u(k)=u(k-1) and f[x̃(k)] must be calculated online. Gao et al.8 employed a diagonal recurrent NN as the identification model and also to estimate these two quantities for the design of the controller. These two approaches can be employed under the SKL-OPC framework to develop corresponding "kernel-like" control strategies and will also be investigated in the following sections. However, both of these approaches can be regarded as indirect methods (i.e., using some kind of approximation to avoid numerical difficulty and obtain the control strategy), regardless of whether an NN or SKL model is used. To overcome this shortcoming, a direct method to design the simple control strategy, also under the SKL-OPC framework, is proposed herein.

The Gaussian kernel and the polynomial kernel are two common kernel functions.15,16 Using the Gaussian kernel, namely, K(x1, x2) = exp(-||x1 - x2||²/σ²), where σ is the kernel width, the SKL model can be formulated as

ym(k + 1) = Σ_{i=1}^{NSV} αi exp(-||x(i) - x(k)||²/σ²) + b   (9)

For the polynomial kernel, namely, K(xi, xj) = (⟨xi, xj⟩ + 1)^p, where the integer p denotes the polynomial degree, the prediction is

ym(k + 1) = Σ_{i=1}^{NSV} αi [⟨x(i), x(k)⟩ + 1]^p + b   (10)

When applying the SKL method to obtain the identification model, it is difficult to get the optimal and analytical solution of eq 6 when the Gaussian kernel is utilized, because its kernel dot product makes it difficult to separate the manipulated variable from the kernel function. To overcome this problem, an approximate method has to be applied.21,24 On the contrary, using the polynomial kernel, the solution is easier to obtain because of its special formulation, as will be further discussed in the following section. Therefore, the polynomial kernel is used here to derive an SKL-OPC algorithm as the dominant extension for practical implementation.

Let x̄(k) = [Y(k), U(k - 1)], and denote by x_{ny+1}(i) the (ny + 1)-th element of the vector x(i), i.e., the element multiplying u(k). Then, eq 10 can be reformulated as

ym(k + 1) = Σ_{i=1}^{NSV} αi [x_{ny+1}(i) u(k) + ⟨x̄(i), x̄(k)⟩ + 1]^p + b
          = Σ_{i=1}^{NSV} αi Σ_{j=0}^{p} C_p^j [x_{ny+1}(i) u(k)]^{p-j} [⟨x̄(i), x̄(k)⟩ + 1]^j + b
          = Σ_{j=0}^{p} C_p^j {Σ_{i=1}^{NSV} αi [x_{ny+1}(i)]^{p-j} [⟨x̄(i), x̄(k)⟩ + 1]^j} [u(k)]^{p-j} + b   (11)

where the combination number C_p^j is given by C_p^j = p!/[j!(p - j)!]. For simplicity, some terms in eq 11 are denoted as

βj = C_p^j Σ_{i=1}^{NSV} αi [x_{ny+1}(i)]^{p-j} [⟨x̄(i), x̄(k)⟩ + 1]^j,   0 ≤ j ≤ p - 1   (12)

βp = Σ_{i=1}^{NSV} αi [⟨x̄(i), x̄(k)⟩ + 1]^p + b + he(k) - yr(k + 1)   (13)

Consequently, the CPI at time k is formulated as

J[u(k)] = {Σ_{j=0}^{p} βj [u(k)]^{p-j}}² + λ[u(k) - u(k - 1)]²   (14)

Interestingly, this CPI is an even-degree polynomial function of the input u(k), and thus, it is easy to derive the optimal solution. Setting ∂J[u(k)]/∂u(k) = 0 gives

∂J[u(k)]/∂u(k) = Σ_{l=0}^{2p-1} A_l [u(k)]^{2p-1-l} = 0   (15)

where the corresponding coefficients are defined as

A_l = Σ_{j+m=l} βj βm (p - m),   l = 0, ..., 2p - 3
A_{2p-2} = Σ_{j+m=2p-2} βj βm (p - m) + λ
A_{2p-1} = Σ_{j+m=2p-1} βj βm (p - m) - λ u(k - 1)   (16)

Notice that eq 15 is a real polynomial equation of odd degree and thus has at least one real root. The desired manipulated input at each sampling time can be directly obtained because efficient numerical methods are available.27 Consequently, the convergence of this control strategy can be guaranteed from the viewpoint of numerical computation. The candidates for the manipulated input at time k can be expressed as u(k) ∈ {ui(k), i = 1, ..., L}, where ui(k) is the ith real root of eq 15 and L is the total number of real roots. Experimental study indicates that a polynomial degree of p = 2 or p = 3 is almost always adequate. This means that at most a degree-5 polynomial equation must be solved, which can be easily achieved by reliable numerical methods.27
Figure 1. SKL-OPC framework and its primary implementation with the polynomial kernel.
In practical applications, the manipulated input and its increment are subject to some constraints

umin(k) ≤ u(k) ≤ umax(k)
∆umin(k) ≤ ∆u(k) ≤ ∆umax(k)   (17)

By considering these constraints, the optimal manipulated input at time k can be formulated as

u(k) = umin(k),                for ui(k) < umin(k)
u(k) = arg min_i J[ui(k)],     for ui(k) ∈ [umin(k), umax(k)]
u(k) = umax(k),                for ui(k) > umax(k)   (18)

where

umin(k) = max[u(k - 1) + ∆umin(k), umin(k)]
umax(k) = min[u(k - 1) + ∆umax(k), umax(k)]

How these inequality constraints can be incorporated into the CPI directly, however, is beyond the scope of this work and is not discussed here. To some extent, the weighting factor λ in eq 14 can be set relatively large to overcome this obstacle.

The whole SKL-OPC diagram is shown in Figure 1, where TDL denotes a tapped delay line and GTDL is defined as a general TDL, through which x(k) can be obtained. SKL-OPC consists of two primary modules: one is the SKL identification module that provides the one-step-ahead prediction of the process, and the other is the controller. The flowchart of this simple strategy is as follows: A corrected prediction yp(k + 1) is first obtained by adding the latest error e(k) to the SKL prediction ym(k + 1). Then, the polynomial CPI is formulated, and finally, the process manipulated input u(k) is obtained. When a polynomial degree of p = 2 and SVR are used, the proposed control strategy under the SKL-OPC framework is equivalent to the strategy proposed by Zhong et al.20 Thus, the proposed SKL-OPC method is more general, and other extended algorithms can be derived based on this framework.

From many results reported in the literature, SKL is suitable for the modeling of nonlinear processes. In addition, the training of SKL is much simpler than that of NNs and other modeling methods. Another advantage of SKL is that it has a sparse solution, which makes the controller easy to design and implement. The proposed control strategy adopts a polynomial kernel and utilizes the rooting method to obtain the optimal process input u(k); thus, this method is denoted SVR-PKR for short. Because the global optimal solution is obtained by the SVR-PKR method, the convergence of the proposed control strategy can also be guaranteed.

Because some SKL modeling methods are available for multi-input-multi-output (MIMO) cases,18,19 the extension of the proposed control strategy to MIMO processes can be formulated in a straightforward manner: solving a simple root problem of an odd-degree polynomial equation for SISO processes (i.e., eq 15) is extended to solving a series of odd-degree polynomial equations for MIMO processes. A simple two-input process is investigated in section 3.1 to demonstrate the validity of the SVR-PKR controller for a multi-input process.

2.3. Related Control Strategies under the SKL-OPC Framework. As mentioned before, the two approaches for designing controllers8,10 can be employed under the SKL-OPC framework to form two corresponding control strategies. Both of them are also developed here using the SVR model and are denoted as SVR-GD (the gradient descent optimization method) and SVR-TL (the Taylor linearization method). These two control strategies are formulated below in comparison with SVR-PKR. Using the Gaussian kernel, the corresponding terms in eqs 7 and 8 can be obtained as

E(k + 1) = yr(k + 1) - Σ_{i=1}^{NSV} αi exp(-||x(i) - x(k)||²/σ²) - b - e(k)   (19)

∂f/∂u(k)|u(k)=u(k-1) = (2/σ²) Σ_{i=1}^{NSV} αi exp(-||x(i) - x(k)||²/σ²)[x_{ny+1}(i) - u(k - 1)] = -∂E(k + 1)/∂u(k)|u(k)=u(k-1)   (20)

Using the polynomial kernel, the related terms in eqs 7 and 8 can also be obtained as

E(k + 1) = yr(k + 1) - Σ_{i=1}^{NSV} αi [⟨x(i), x(k)⟩ + 1]^p - b - e(k)   (21)

∂f/∂u(k)|u(k)=u(k-1) = p Σ_{i=1}^{NSV} αi [⟨x(i), x(k)⟩ + 1]^{p-1} x_{ny+1}(i) = -∂E(k + 1)/∂u(k)|u(k)=u(k-1)   (22)

All three proposed control strategies under the SKL-OPC framework, namely, SVR-PKR, SVR-GD, and SVR-TL, are shown in Figure 2 for comparison. The left part of Figure 2 shows the SVR-PKR control strategy, where only the polynomial kernel can be used. SVR-GD and SVR-TL, the two approximate methods, are shown in the right part of Figure 2. Both the Gaussian kernel and the polynomial kernel can be utilized for these methods, so altogether four control algorithms can be formulated. SVR-PKR is an optimal control strategy because the desired manipulated input, obtained originally by minimizing the CPI, can be directly transformed into a simple root problem of an odd-degree polynomial equation. Conversely, the other two controllers are approximate methods and, thus, suboptimal. Moreover, for SVR-TL and SVR-GD, the selection of the optimizing step β1 (as well as the adjustment parameter β2) is important because it influences the controller's convergence and performance. Consequently, the SVR-TL and SVR-GD controllers need to tune two parameters simultaneously, which is troublesome in practice. Based on this analysis, SVR-PKR is generally superior to and more efficient than both of the other methods. Thus, in the following section, a strong emphasis is placed on the investigation of the SVR-PKR controller to show its advantages for process control.

3. Illustrative Examples and Discussions

The key characteristics of the proposed SVR-PKR controller can be summarized as follows:
Figure 2. Comparison of different SVR-based control strategies within the SKL-OPC framework.
(1) The SVR-PKR controller is simple and reliable. It is easier to obtain a reliable nonlinear predictive model by an SKL modeling method, for example SVR, than by traditional empirical methods. Additionally, the sparse model allows the controller to be implemented and updated efficiently.

(2) It provides an optimal control strategy. The optimal control strategy can be obtained because of the special structure of the polynomial kernel, compared to the Gaussian kernel. The convergence of SVR-PKR is guaranteed inherently, as the problem reduces to root finding. Moreover, SVR-PKR can be simply implemented because efficient numerical approaches are available for solving the roots of a polynomial equation.

(3) It offers transparent tuning. After the identification model has been obtained, only one parameter, namely, λ, is user-defined. Other related methods, such as SVR-GD and SVR-TL, require the tuning of at least two parameters simultaneously. The effect of λ on the controller performance is transparent: a larger λ value implies a heavier penalty on ∆u(k), and vice versa. Therefore, λ can be easily tuned.

Two processes are investigated herein to illustrate the implementation of the SVR-PKR control strategy. All three critical controller performance attributes of robustness, set-point tracking, and disturbance rejection are evaluated through various simulations. The simulation environment is Matlab V7.1 on a CPU with a main frequency of 1.7 GHz and 512 MB of memory. The Matlab function roots is utilized to solve for the polynomial roots. To assess the identification performance more insightfully, two performance indices, namely, the root-mean-square error, RMSE = {Σ_{k=1}^{N} [ym(k) - y(k)]²/N}^{1/2}, and the maximum absolute error, MAE = max|ym(k) - y(k)| for k = 1, ..., N, are considered. Also, the integral absolute error (IAE) of set-point tracking is employed to quantify the performance characteristics of all controllers.

3.1. Simple Two-Input Process.
A simple two-input process is first illustrated to demonstrate the SVR identification procedure and the performance of the SVR-PKR controller. This two-input process is described by28

y(k + 1) = [0.8y(k) + u1(k) + 0.2u2(k)]/[1 + y(k)]²   (23)

The output and input vectors are chosen as y(k + 1) and x(k) = [y(k) u1(k) u2(k)], respectively. A sequence of 100
Figure 3. Training performance comparison of SVR and NN models.
samples constitutes the training set, where both inputs are generated randomly over a large space. In addition, a sequence of 150 samples constitutes the testing set over a larger space, where u1(k) follows a sinusoidal wave and u2(k) follows a cosine wave. In this case, both training and testing sets are noise-free. An offline SVR identification model is obtained using the 10-fold cross-validation approach.15 Using the polynomial kernel, the regularization parameter γ ) 500, the tolerance of ε-loss function ε ) 0.05, and the polynomial degree p ) 3 are chosen, and altogether, 37 samples are support vectors. It takes about 20 min to obtain the SVR model. The well-known multilayer perceptron NN is utilized here for comparison. A three-layer network with five units in the hidden layer is obtained using the efficient Levenberg-Marquardt training method.11 Detailed comparison results, including the RMSE and MAE indices, of SVR and NN for the training set are shown in Figure 3. The SVR and NN methods can both model the training set well, and the NN method can achieve a slightly better identification performance than SVR. However, the SVR method can obtain much better performance than NN for the test set, according to the comparison results of SVR and NN shown in Figure 4. As mentioned
Figure 4. Testing performance comparison of SVR and NN models using different input sequences and a larger space.
previously, the correlation structure of the training data is very different from that of the test data. Thus, neither method performs as well as on the training set. The performance of the NN degrades more when the samples are beyond the range of the training data. The extrapolation of SVR is better than that of the NN because NN models often overfit and suffer difficulties with generalization. In addition, the SVR modeling procedure can be implemented easily. Therefore, compared to other data-driven modeling methods, SVR is a more suitable empirical modeling approach.

The SVR-PKR control strategy can be designed based on the trained SVR model. A single parameter is tuned for SVR-PKR, namely, λ. The SVR-PKR controller yields more conservative performance as λ increases, and vice versa. Thus, the SVR-PKR controller can be tuned in a guided manner. The parameter λ = 0.001 is chosen for this process by simulation. To provide a suitable comparison with traditional techniques, a well-tuned PID controller28 is used. A sequence of Gaussian noise is also introduced into the process output. The set-point tracking results and IAE indices of the SVR-PKR and PID controllers are shown in Figure 5. The SVR-PKR controller has better tracking performance than the PID controller. In addition, both of the manipulated inputs of SVR-PKR are acceptable because no excessive moves exist. Hence, based on the results obtained for the simple two-input process, the SVR identification method is superior to the NN approach, and the SVR-PKR controller generally outperforms a PID controller.

3.2. Highly Nonlinear CSTR Process. To further evaluate the reliability and practicability of the proposed SVR-PKR controller, a challenging CSTR process,6 which is known for its significant nonlinear behavior and exhibits multiple steady states, is considered in this section.
Three scenarios are explored: tracking of normal and low concentrations, tracking of high concentrations, and rejection of unmeasured disturbances. This complex CSTR process has been studied with other control strategies, including an NN-based nonlinear internal model control,7 an NN adaptive controller,12 an SVR-MPC strategy,23 and a multilinear model control strategy.29 Figure 6 presents a schematic of this CSTR process, where an exothermic irreversible first-order reaction takes place. The concentration, Ca, of product leaving the reactor is controlled by manipulating the coolant flow qc through the
Figure 5. Tracking performance comparison of SVR-PKR and PID for a two-input process.
Figure 6. Schematic of the CSTR process.
jacket. Under mass and energy balances, the dynamics of this CSTR process can be described as follows

dCa(t)/dt = (q/V)[Ca0(t) - Ca(t)] - k0 Ca(t) exp[-E/RT(t)]

dT(t)/dt = (q/V)[T0 - T(t)] - (k0 ∆H/ρCp) Ca(t) exp[-E/RT(t)] + (ρc Cpc/ρCpV) qc(t) {1 - exp[-ha/(qc(t) ρc Cpc)]} [Tc0 - T(t)]   (24)
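As a rough illustration, eq 24 can be integrated with a simple explicit Euler scheme. The excerpt defers the nominal constants to ref 6, so the parameter values below are those commonly quoted for this benchmark and should be treated as assumptions; at these values, the nominal point (Ca = 0.1 mol/L, T = 438.54 K, qc = 103.41 L/min) is close to a steady state.

```python
import numpy as np

# Parameter values commonly quoted for the Nahas et al. CSTR benchmark
# (assumed here; the text itself refers to ref 6 for the numbers).
q, V, k0, E_R = 100.0, 100.0, 7.2e10, 1.0e4        # L/min, L, 1/min, K
dH, rho, Cp = -2.0e5, 1.0e3, 1.0                   # cal/mol, g/L, cal/(g K)
rho_c, Cpc, ha = 1.0e3, 1.0, 7.0e5                 # g/L, cal/(g K), cal/(min K)
Ca0, T0, Tc0 = 1.0, 350.0, 350.0                   # mol/L, K, K

def cstr_rhs(Ca, T, qc):
    """Right-hand side of eq 24."""
    r = k0 * Ca * np.exp(-E_R / T)                 # reaction rate, mol/(L min)
    dCa = (q / V) * (Ca0 - Ca) - r
    dT = ((q / V) * (T0 - T) - dH * r / (rho * Cp)
          + (rho_c * Cpc / (rho * Cp * V)) * qc
          * (1.0 - np.exp(-ha / (qc * rho_c * Cpc))) * (Tc0 - T))
    return dCa, dT

def simulate(qc, Ca=0.1, T=438.54, dt=1e-3, t_end=5.0):
    """Crude explicit-Euler integration from the nominal operating point."""
    for _ in range(int(t_end / dt)):
        dCa, dT = cstr_rhs(Ca, T, qc)
        Ca, T = Ca + dt * dCa, T + dt * dT
    return Ca, T
```

Such a simulator is what the open-loop step tests and closed-loop comparisons in this section would run against; increasing qc strengthens the jacket cooling term, lowering dT/dt as expected.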
The nominal conditions for a product concentration of Ca = 0.1 mol/L are T = 438.54 K and qc = 103.41 L/min. The nominal values of the variables in the above equations and other information are given by Nahas et al.6 Under the input constraint 90 L/min ≤ qc ≤ 110 L/min, the control objective is to regulate Ca by manipulating qc. A PID controller well-tuned by Nahas et al.6 with parameters (Kc, Ti, Td) = (190, 0.056, 0.827) is used here for comparison.

3.2.1. Modeling Performance: Training and Test. A training set of only 100 samples, which is much smaller than that used by the NN method,7 consists of a series of open-loop step changes generated with a sampling period of 6 s. Following the literature,6,7,23 the input vector x(k) = [Ca(k), Ca(k - 1), Ca(k - 2), qc(k), qc(k - 1), qc(k - 2)] is chosen. To simulate the real process as closely as possible, significant changes in the feed rate (q), the feed temperature (T0), the feed concentration (Ca0), and the coolant temperature (Tc0) are considered in the modeling. Also, the measured process output (i.e., the product concentration Ca) is corrupted by Gaussian
Figure 7. Training performance comparison of SVR, Zhong et al., and NN models for a CSTR.
Figure 8. Testing performance comparison of SVR, Zhong et al., and NN models for a CSTR.
noise. Moreover, to evaluate the performance of empirical modeling approaches, a sequence of 100 samples is generated. The test set covers a larger space than the training set, and the changes in those variables and the measurement noise are more significant. With the polynomial kernel of SVR, the parameters [γ, ε, p] ) [200, 0.002, 3] are selected. Altogether, 41 samples are defined as support vectors. Two other data-based modeling methods are provided here for comparison. One is SVR with the quadratic polynomial kernel employed by Zhong et al.20 (denoted as the Zhong et al. method), and the other is the NN11 as previously utilized. The corresponding identification results of the SVR, Zhong et al., and NN methods for the training set are shown in Figure 7. All three data-based empirical models can fit the training data in the error-in-variable environment. Nevertheless, as shown in Figure 8, the test results of all of the methods are degraded markedly, compared to the training shown in Figure 7. The predictions might be inaccurate when the testing samples are beyond the training range. From all of the modeling results of the SVR, Zhong et al., and NN methods shown in Figures 7 and 8, one can conclude
Figure 9. Changes in process variables for tracking of normal, low, and high concentrations.
Figure 10. Tracking performance comparison for normal and low concentrations.
that the SVR modeling method outperforms the NN method because of its better ability to extrapolate. Therefore, the SVR-based empirical modeling method can overcome some limitations of traditional NN-based methods. 3.2.2. Scenario I: Tracking of Normal and Low Concentrations. As investigated by Nahas et al.,6 this CSTR process exhibits strongly nonlinear dynamical behavior. Moreover, the steady-state gain for +10% changes is about 55% greater than that for -10% changes. This means that the normal and low zones of Ca can be controlled relatively easily. The changes of the process variables are shown in Figure 9, where the magnitudes are almost the same as in the test data. Under these operating conditions, the tracking results of the SVR-PKR and PID controllers for the normal and low zones are shown in Figure 10. In this case, λ = 1 × 10-5. The SVR-PKR controller yields overall better performance for all set points. In contrast, the response of the PID controller usually exhibits oscillatory behavior for set-point changes. In addition, the manipulated input of SVR-PKR does not make excessive moves, whereas the input of the PID controller changes frequently. Details of the control performance indices are presented in Table 1. For this CSTR process, a sparse identification model
Table 1. Control Performance Comparison of SVR-PKR and Other Strategies for the CSTR Process

                     normal and low zones^a    high zones^b             noise and disturbance rejection^c
control strategy     IAE      CPU time (s)     IAE      CPU time (s)    IAE      CPU time (s)
SVR-PKR              1.397    0.13             1.870    0.13            0.835    0.13
Zhong et al.         1.613    0.14             2.609    0.14            1.442    0.14
PID                  1.582    0.03             2.342    0.03            1.211    0.03

^a Plotted in Figure 10. ^b Plotted in Figure 11. ^c Plotted in Figure 13.
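The IAE values reported in Table 1 follow the standard definition, the integral of |r(t) - y(t)| over time, approximated by a sum over sampling instants. A minimal sketch with toy data (the helper name and the dt scaling of the reported values are assumptions, not stated in the paper):

```python
# Minimal IAE (integral absolute error) computation; the data are toy values.
import numpy as np

def iae(setpoint, output, dt):
    """Approximate the integral of |setpoint - output| by a rectangle rule."""
    e = np.abs(np.asarray(setpoint) - np.asarray(output))
    return float(np.sum(e) * dt)

sp = np.array([0.10, 0.10, 0.12, 0.12])   # set-point trajectory (mol/L)
ca = np.array([0.10, 0.11, 0.11, 0.12])   # measured concentration (mol/L)
print(iae(sp, ca, dt=6.0))                # sampling period 6 s, as in the CSTR study
```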
Figure 11. Tracking performance comparison for high concentrations.
with a 41% sparsity rate is obtained. Consequently, computing one control move takes the SVR-PKR and PID controllers about 0.13 and 0.03 s, respectively, both of which are much shorter than the sampling period (i.e., 6 s). The SKL-model-based controller can thus be implemented with a small computational load and is applicable for real-time process control. 3.2.3. Scenario II: Tracking of High Concentrations. As mentioned in the preceding section, this CSTR process exhibits more severely nonlinear behavior in the high-concentration zone. The changes of the process variables for this case are almost the same as those shown in Figure 9. As can be observed in Figure 11 and Table 1, the SVR-PKR controller still yields better tracking performance than the PID controller, even though this set-point tracking problem is more difficult than that of scenario I. The tracking ability of the PID controller degrades when the product concentration is larger than 0.11 mol/L. Additionally, the SVR-PKR controller shows favorable performance because the moves of the manipulated variable are acceptable. These results again verify that the SVR-PKR controller can perform well when several types of uncertainties are considered simultaneously. Here, the tuning parameter λ is chosen as 3 × 10-5, somewhat greater than the value used for scenario I. Thus, the controller can be easily designed by adjusting a single parameter λ, resulting in a more practical implementation for process engineers. In contrast, the tuning of the PID controller's parameters is more complex, especially for severely nonlinear processes. 3.2.4. Scenario III: Rejection of Unmeasured Disturbances. To mimic the unmeasured disturbances in real processes, two process variables are varied (i.e., the feed rate q becomes larger and the feed concentration Ca0 becomes smaller) in this scenario.
In addition, the changes in the feed temperature (T0) and the measurement noise in the product concentration are more significant. As shown in Figure 12, changes of these two
Figure 12. Changes in process variables to show an unmeasured disturbance.
Figure 13. Performance comparison of unmeasured disturbance rejection.
variables (i.e., q and Ca0) take on the role of unmeasured disturbances, unlike in scenarios I and II, which are both shown in Figure 9. The responses of the three controllers (SVR-PKR, Zhong et al., and PID) are shown in Figure 13. The SVR-PKR controller again achieves the best performance in both noise attenuation and unmeasured disturbance rejection. The parameter λ = 5 × 10-4 is larger than the values employed in scenarios I and II. Therefore, it is suggested that a larger λ value be selected to reject noise and disturbances. From these three scenarios, it can be confirmed that the adjustment of λ is systematic and simple. Because the identification model with the quadratic polynomial kernel is not very suitable for this process (mainly shown
in Figure 8), the response of the Zhong et al. controller is unsatisfactory, especially when the unmeasured disturbance becomes marked. Therefore, the controller developed by Zhong et al.20 is valid only when the identification model with the quadratic polynomial kernel can describe the nonlinear process well. From all of the obtained results shown in Figures 9-13 and Table 1, it is demonstrated that the proposed SVR-PKR controller not only achieves better system response than the conventional PID controller but also retains the desired attributes when unmeasured disturbances exist. Therefore, the SVR-PKR method provides an alternative solution for process control. Remark 2. The objective of this work was to show the main attributes, namely, robustness, set-point tracking, disturbance rejection, and practical implementation, of the simple SVR-PKR controller. Thus, the well-known PID controller suitable for process control (with related well-tuned parameters reported in the literature) was mainly utilized for comparison. Even though other controllers with better performance than PID controllers might exist, they were not considered here. 4. Conclusions This work has addressed the design of efficient and simple controllers for chemical processes, especially those with severe nonlinearities and unmeasured disturbances. A reliable SKL model can be obtained with less training data than NN methods require. A nonlinear control framework named SKL-OPC is then proposed, and different strategies are developed within it. The proposed SVR-PKR control strategy is a natural extension and special implementation of the SKL-OPC framework. Because of the special structure of the polynomial kernel, the manipulated input can be separated from the CPI. Consequently, the optimal manipulated input can be obtained directly by minimizing the CPI, that is, by finding the roots of an odd-degree polynomial equation using existing efficient numerical methods.
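The rooting idea can be sketched numerically. Assuming, purely for illustration, a one-step CPI of the form J(u) = (ŷ(u) - r)² + λ(u - u_prev)², where the polynomial-kernel predictor makes ŷ(u) a polynomial in u, J is an even-degree polynomial and its stationary points are roots of the odd-degree derivative dJ/du. All coefficients below are invented, not taken from the paper:

```python
# Sketch of the rooting step: J(u) is an even-degree polynomial in u, so its
# minimizer is found among the real roots of the odd-degree dJ/du and the
# input constraints. All numbers here are illustrative.
import numpy as np

yhat = np.array([0.5, -1.0, 0.2])     # assumed predictor yhat(u) = 0.5u^2 - u + 0.2
r, u_prev, lam = 0.1, 0.0, 1e-3       # set point, previous input, tuning parameter
u_min, u_max = -2.0, 2.0              # input constraints

# J(u) = (yhat(u) - r)^2 + lam*(u - u_prev)^2, built as polynomial coefficients
err = yhat.copy()
err[-1] -= r
J = np.polyadd(np.polymul(err, err), lam * np.array([1.0, -2.0 * u_prev, u_prev**2]))
dJ = np.polyder(J)                    # odd-degree (here cubic) polynomial

roots = np.roots(dJ)
real = roots[np.abs(roots.imag) < 1e-9].real
cand = np.clip(np.concatenate([real, [u_min, u_max]]), u_min, u_max)
u_opt = cand[np.argmin(np.polyval(J, cand))]
print("optimal input:", float(u_opt))
```

Note that a larger λ penalizes input moves more heavily, which is consistent with the larger values used above for noise and disturbance rejection.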
Furthermore, the tuning parameter, which is directly related to the performance attributes, can be adjusted transparently. The obtained results for a simple two-input process and a severely nonlinear CSTR process demonstrate that the SVR-PKR controller generally outperforms PID and other related control strategies under the SKL-OPC framework. The three critical controller performance attributes of robustness, set-point tracking, and disturbance rejection were all evaluated, and the aspect of practical implementation was also investigated. Therefore, this simple SVR-PKR controller provides an alternative control approach for nonlinear chemical processes. Acknowledgment This work was supported by the National High Technology R&D Program ("863" Program) of China (No. 2009AA04Z126), the National Natural Science Foundation of China (No. 20776128), and the Alexander von Humboldt Foundation of Germany (Dr. Haiqing Wang). The authors also thank the reviewers for helpful comments that greatly improved the quality of this article. Abbreviations
CPI = control performance index
CSTR = continuous stirred tank reactor
GD = gradient descent
IAE = integral absolute error
LS-SVM = least-squares support vector machine
MAE = maximum absolute error
MIMO = multi-input-multi-output
NN = neural network
OPC = one-step-ahead predictive control
PID = proportional-integral-derivative
PKR = polynomial kernel and rooting
RMSE = root-mean-square error
SISO = single-input-single-output
SKL = sparse kernel learning
SVM = support vector machine
SVR = support vector regression
TL = Taylor linearization
Literature Cited
(1) Ogunnaike, B. A.; Mukati, K. An alternative structure for next generation regulatory controllers. Part I: Basic theory for design, development and implementation. J. Process Control 2006, 16, 499–509.
(2) Mukati, K.; Rasch, M.; Ogunnaike, B. A. An alternative structure for next generation regulatory controllers. Part II: Stability analysis, tuning rules and experimental validation. J. Process Control 2009, 19, 272–287.
(3) Froisy, J. B. Model predictive control: Building a bridge between theory and practice. Comput. Chem. Eng. 2006, 30, 1426–1435.
(4) Qin, S. J.; Badgwell, T. A. A survey of industrial model predictive control technology. Control Eng. Practice 2003, 11, 733–764.
(5) Himmelblau, D. M. Accounts of experiences in the application of artificial neural networks in chemical engineering. Ind. Eng. Chem. Res. 2008, 47, 5782–5796.
(6) Nahas, E. P.; Henson, M. A.; Seborg, D. E. Nonlinear internal model control strategy for neural network models. Comput. Chem. Eng. 1992, 16, 1039–1057.
(7) Lightbody, G.; Irwin, G. W. Nonlinear control structures based on embedded neural system models. IEEE Trans. Neural Networks 1997, 8, 553–567.
(8) Gao, F. R.; Wang, F. L.; Li, M. Z. A simple nonlinear controller with diagonal recurrent neural network. Chem. Eng. Sci. 2000, 55, 1283–1288.
(9) Kambhampati, C.; Mason, J. D.; Warwick, K. A stable one-step-ahead predictive control of non-linear systems. Automatica 2000, 36, 485–495.
(10) Tan, Y. H.; Cauwenberghe, A. V. Nonlinear one-step-ahead control using neural networks: Control strategy and stability design. Automatica 1996, 32, 1701–1706.
(11) Nørgaard, M.; Ravn, O.; Poulsen, N. K.; Hansen, L. K. Neural Networks for Modelling and Control of Dynamic Systems; Springer-Verlag: London, 2000.
(12) Krishnapura, V. G.; Jutan, A. A neural adaptive controller. Chem. Eng. Sci. 2000, 55, 3803–3812.
(13) Bishop, C. M. Pattern Recognition and Machine Learning; Springer-Verlag: New York, 2006.
(14) Suykens, J. A. K.; Van Gestel, T.; De Brabanter, J.; De Moor, B.; Vandewalle, J. Least Squares Support Vector Machines; World Scientific: Singapore, 2002.
(15) Schölkopf, B.; Smola, A. J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond; MIT Press: Cambridge, MA, 2002.
(16) Vapnik, V. The Nature of Statistical Learning Theory; Springer-Verlag: New York, 1995.
(17) Yan, W. W.; Shao, H. H.; Wang, X. F. Soft sensing modeling based on support vector machine and Bayesian model selection. Comput. Chem. Eng. 2004, 28, 1489–1498.
(18) Liu, Y.; Hu, N. P.; Wang, H. Q.; Li, P. Soft chemical analyzer development using adaptive least-squares support vector regression with selective pruning and variable moving window size. Ind. Eng. Chem. Res. 2009, 48, 5731–5741.
(19) Wang, H. Q.; Li, P.; Gao, F. R.; Song, Z. H.; Ding, S. X. Kernel classifier with adaptive structure and fixed memory for process diagnosis. AIChE J. 2006, 52, 3515–3531.
(20) Zhong, W. M.; He, G. L.; Pi, D. Y.; Sun, Y. X. SVM with polynomial kernel function based nonlinear model one-step-ahead predictive control. Chin. J. Chem. Eng. 2005, 13, 373–379.
(21) Zhang, H. R.; Wang, X. D. Nonlinear systems modeling and control using support vector machine technique. In Computer Science - Theory and Applications; Grigoriev, D.; Harrison, J.; Hirsch, E. A., Eds.; Lecture Notes in Computer Science Series; Springer-Verlag: New York, 2006; Vol. 3967, pp 660-669.
(22) Bao, Z. J.; Pi, D. Y.; Sun, Y. X. Nonlinear model predictive control based on support vector machine with multi-kernel. Chin. J. Chem. Eng. 2007, 15, 691–697.
(23) Iplikci, S. Support vector machines-based generalized predictive control. Int. J. Robust Nonlinear Control 2006, 16, 843–862.
(24) Zhang, R. D.; Wang, S. Q. Support vector machine based predictive functional control design for output temperature of coking furnace. J. Process Control 2008, 18, 439–448.
(25) Kulkarni, A.; Jayaraman, V. K.; Kulkarni, B. D. Control of chaotic dynamical systems using support vector machines. Phys. Lett. A 2003, 317, 429–435.
(26) Cawley, G. C.; Talbot, N. L. C. Preventing over-fitting during model selection via Bayesian regularisation of the hyper-parameters. J. Mach. Learn. Res. 2007, 8, 841–861.
(27) Karris, S. T. Numerical Analysis Using MATLAB and Spreadsheets, 2nd ed.; Orchard Publications: Fremont, CA, 2004.
(28) Liu, J. K. Advanced PID Control and MATLAB Simulation; Publishing House of Electronics Industry: Beijing, 2004.
(29) Du, J. J.; Song, C. Y.; Li, P. Multilinear model control of Hammerstein-like systems based on an included angle dividing method and the MLD-MPC strategy. Ind. Eng. Chem. Res. 2009, 48, 3934–3943.
Received for review October 2, 2009
Revised manuscript received April 30, 2010
Accepted July 2, 2010
IE901548U