Multiple-Model Based Linear Parameter Varying Time-Delay System

Jun 2, 2014 - Local models varying from one operating point to another are first described by finite impulse response (FIR) models. To handle missing ...
0 downloads 0 Views 671KB Size
Article pubs.acs.org/IECR

Multiple-Model Based Linear Parameter Varying Time-Delay System Identification with Missing Output Data Using an Expectation-Maximization Algorithm Weili Xiong,†,‡ Xianqiang Yang,¶,‡ Biao Huang,*,‡ and Baoguo Xu† †

Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, Jiangsu 214122, China ‡ Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 2G6, Canada ¶ Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin, Heilongjiang 150080, China ABSTRACT: This paper is concerned with the identification problems of the linear parameter varying (LPV) system with missing output in the presence of the time-delay. A multiple-model approach is adopted. Local models varying from one operating point to another are first described by finite impulse response (FIR) models. To handle missing output and time-delay, the expectation-maximization (EM) algorithm is utilized to estimate the unknown parameters and the time-delay simultaneously. Output Error (OE) models are widely used in controller design. Therefore, the auxiliary model principle is employed to recover the OE models based on the initially identified FIR models. The EM algorithm is then used again to refine the unknown parameters of the OE models with the complete data set to obtain the final global model. Simulation examples are presented to demonstrate the performance of the proposed method. directly to estimate the unknown parameters. Laurain et al.5 transformed the LPV Box−Jenkins model into a multiple-input single-output (MISO) LTI model, and the refined instrumental variable (RIV) method was extended to estimate the model parameters. However, the work mentioned above typically requires the sufficient manipulation of the input signal within the whole operating range which can be costly or even not possible to realize in practice. The multiple-model approach has been proposed to handle this problem.6 In the multiple-model approach, one local linear model is identified for the process operating around each preselected working point and the global model is constructed by aggregating the outputs from the local models using a certain composition strategy. Meanwhile, by assuming smooth transitions between different operating points, the input signal is only required to excite the local process dynamic persistently which is easier to implement in practice. Venkat et al.7 considered the identification and control problems of the complex nonlinear processes. The multiplemodel approach was employed to identify a model for the process, and a fuzzy aggregation strategy was proposed for model interpolation which enabled smooth transitions between working points. Zhu and Xu8 put forward a method to identify a multiple-model LPV model, which was suitable for model predictive control (MPC) of the nonlinear process. Cubic spline functions were used for model interpolation to derive a global LPV model and the parameters of the cubic spline functions were estimated based on all the data by minimizing the output error cost function. The work by Jin and Huang6 is

1. INTRODUCTION The growing scale and complexity as well as automation degrees of industrial processes to meet various performance requirements, such as economic benefits, product quality, and safety as well as reliability, have made modern industrial process modeling and control a challenging task.1 Owing to the inherent nonlinearity and parameter varying property of modern processes, only one locally linearized model is not sufficient to describe the process operating in a wide range.2 The process model changes with time as the working point shifts from one point to another in order to fulfill different production tasks. For process analysis and controller design, it is vital to derive a model describing the global process dynamics. To achieve this target, great efforts have been made to develop the modeling methodology that makes the derived process model effective within a wider operating range. Among the available nonlinear modeling approaches in the literature, the linear parameter varying (LPV) modeling approach, in which the nonlinear system is approximately described by a model with parameters that vary as a function of the working points or scheduling variables, has drawn great attention.2 Many LPV modeling approaches have been reported in the literature. An LPV modeling method was proposed by Shamma et al.3 in their studies of gain scheduled control for LPV plants. The main idea was to estimate a local linear time-invariant (LTI) model for each selected working point in the range of the process dynamics, and the parameters were interpolated to account for the dynamics between neighboring working points. The final global LPV model was a linear model with varying model parameters. Bamieh and Giarre4 proposed to rewrite the LPV model with polynomial dependence of the parameters in inner product form so that the least mean squares (LMS) and the recursive least-squares (RLS) algorithms could be used © 2014 American Chemical Society

Received: Revised: Accepted: Published: 11074

January 13, 2014 April 23, 2014 June 2, 2014 June 2, 2014 dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

to have a finite impulse response (FIR) time-delay model structure and the identification problems are formulated using the EM scheme. Both missing data and time-delay are considered here, and the local model parameters, validity widths of the weighting functions and the time-delay are estimated simultaneously in this algorithm. Since the order of the FIR time-delay model is typically high and the model coefficients are sensitive to the noise, the estimated global FIR model is only an intermediate model. Output Error (OE) models are widely used in controller design.9 Therefore, the EM algorithm is used again to estimate parameters of the local time-delay OE models based on the estimated FIR models and process data. The auxiliary model principle is adopted and the estimated global FIR model serves as the auxiliary model to give the noise-free output data in OE modeling. The remainder of this paper is organized as follows: The identification problem of LPV time-delay systems with missing data is described in section 2. A brief review of the EM algorithm and the description of LPV time-delay model identification with local FIR model structure and missing data are given in section 3. Section 4 presents the mathematical derivations of a global LPV time-delay OE model based on the identified LPV model in section 3. Section 5 employs three numerical examples to demonstrate the effectiveness of the proposed method. Conclusions are drawn in section 6.

an important step toward multiple-model identification of LPV systems. The modeling problems were handled using the expectation-maximization (EM) algorithm in which the model identity of the output was treated as a hidden variable. The local models were assumed to have an ARX model structure and the normalized exponential function was used for model interpolation. Parameters of each local model and the validity width of the exponential functions were estimated simultaneously in this algorithm. The missing data problem and the time-delay in data are commonly encountered in the process industry and should be handled carefully because of their negative effects on process modeling. The missing data problem could result in the incomplete data set and loss of information which may further lead to the bias estimation in the system identification. The improper treatment of the time-delay in identification process may lead to an overparameterization problem, ill-conditioned matrix, and biased estimates and so on. Moreover, the controller designed based on the process model with inaccurate time-delay may not meet the performance requirement of the control system or even make the closed loop system unstable. The conventional multimode modeling methods cannot be applied to the imperfect data set caused by missing data and the time-delay problem directly. There are many reasons behind the missing data problem such as sudden mechanical faults, hardware measurement failures, data transmission malfunctions, and losses in network communication.10 The common ways to deal with missing data are case deletion, mean substitution, EM algorithm, last observation carried forward method, etc.11 Among them, due to the good feature of statistical properties and simplicity to realize, the EM algorithm has been widely studied.6 For example, Deng and Huang10 addressed the identification problem of an LPV system. The local model was taken to have a nonlinear state space model structure and the missing data was handled using the EM algorithm. A particle filter was applied to compute the approximations of the expectation functions in the E-step. Because of the use of particle filtering, the method is complicated and time-consuming. Another factor that is commonly experienced in the modeling process is the time-delay which can be caused by material transport, signal transmission, measurement delay, and so on. The common methods to handle time-delay are nonparametric methods (such as step test, correlation analysis, etc.), grid searching method, and so on. Essentially, time-delay and other model parameters are estimated separately, not simultaneously, in these methods. Xie et al.12 proposed a new EM algorithmbased approach to estimate an FIR model for multirate processes. The input data were sampled in fast rate while the output were only available in slow rate with time-delay. Their method estimated model parameters and the time-delay of the measurements simultaneously. The estimated FIR model was then used to generate estimates of the unknown noise-free output based on which an OE model was recovered by using the EM algorithm again. However, they only considered the multiple sampling rate time invariant or parameter invariant system with no missing data. The current article extends the work of Xie et al.12 by considering the identification of multimode LPV systems with constant time delay and missing data that can be used to approximate practical nonlinear systems. The multiple-model approach is adopted here and the normalized exponential function is chosen as the weighting function which allows for smooth transitions between neighboring working points. The local models are first assumed

2. PROBLEM STATEMENT A large number of chemical and biological processes often operate on a predesigned operating trajectory with multiple working points. The variable of the working points determines the current dynamic behavior of the process and it can be referred to as the scheduling variable. Here, we use {wt,t = 1,···,N} to denote the measured value of the scheduling variable at each time instant and use {Ti,i = 1,···,M} to denote the predesigned working points, where M is the number of working points and N is the sampling number. Consider the following multiple mode LPV input−output model: M

yt =

∑ ηi(wt )fi (ut− τ) + et

(1)

i=1

where f i(·) denotes the ith local deterministic model, ηi(·) is the corresponding weighting function, and et is the Gaussian white noise with zero mean and variance σ2. To achieve smooth transitions between working points, the normalized exponential function is selected as the weighting function which is defined as

(

exp ηi(wt ) =

M

2 −(wt − T) i

2(oi)2

(

∑i = 1 exp

)

2 −(wt − T) i

2(oi)2

)

(2)

where oi denotes the validity width of the ith local model bounded from omin to omax. Since part of the output data is missing completely at random, the FIR model is first selected as the model structure of the local model at each working point. Therefore, the local deterministic model f i(ut−τ) can be expressed by 11075

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

can be constructed as Cobs = {Yobs,U,W} and the missing data set Cmis can be constructed as Cmis = {Ymis,I,τ}. The likelihood function of the complete data set is first decomposed based on the chain rule:

fi (ut − τ ) = a1iu(t − τ − 1) + ··· + aniu(t − τ − n) = [u(t − τ − 1)···u(t − τ − n)][a1i···ani]T = ϕτT(t )θi

(3)

p(Cobs , Cmis|Θ) = p(U , Y , W , I , τ |Θ)

where

= p(Y |U , W , I , τ , Θ)p(U , W , I , τ |Θ)

ϕτ (t ) ≜ [u(t − τ − 1)···u(t − τ − n)]T ∈ n × 1

= p(Y |U , W , I , τ , Θ)p(I |U , W , τ , Θ)

θi ≜ [a1i···ani]T ∈ n × 1

× p(τ |U , W , Θ)p(U , W |Θ)

Here, ϕτ(t) denotes the information vector, θi is the parameter of the ith local FIR model and n is the order of the local FIR model. Since the measurement delay is considered here, all the local models have a common time-delay. We assume that the time-delay, τ, follows a prior uniform distribution in the range of [τmin, τmax] and we have p (τ = d ) =

1 , τmax − τmin + 1

d ∈ [τmin , τmax ]

Since τ follows a uniform distribution and it is independent of input U, scheduling variable W, and model parameters Θ, the probability of τ taking any value in [τmin, τmax] is a constant; the input U and the scheduling variable W are measurable and they are independent of model parameters Θ, p(U,W|Θ) is also a constant. Therefore, eq 7 can be simplified as p(U , Y , W , I , τ |Θ)

(4)

= p(Y |U , W , I , τ , Θ)p(I |U , W , τ , Θ)C

Therefore, the LPV time-delay model identification problem is transformed into estimating all the unknown parameters Θ = {θi,oi,σ2}i=1,···,M and time-delay τ based on the measured input and output data.

3. M-step: Calculate the new parameter estimate Θ maximizing Q(Θ|Θs) with respect to Θ; that is, Θ s + 1 = arg max Q (Θ|Θ s)

p(Y |U , W , I , τ , Θ) = p(y1: N |u1: N , w1: N , I1: N , τ , Θ) = p(yN |yN − 1:1 , u1: N , w1: N , I1: N , τ , Θ) × p(yN − 1:1|u1: N , w1: N , I1: N , τ , Θ) = p(yN |yN − 1:1 , u1: N , w1: N , I1: N , τ , Θ) × p(yN − 1|yN − 2:1 , u1: N , w1: N , I1: N , τ , Θ)··· × p(y2 |y1 , u1: N , w1: N , I1: N , τ , Θ) × p(y1|u1: N , w1: N , I1: N , τ , Θ)

N

by

p(Y |U , W , I , τ , Θ) =

∏ p(yt |ut− 1:1, It , τ , Θ) t=1

(6)

Similarly, p(I|U,W,τ,Θ) can be decomposed into

The procedure is carried out iteratively until the change in parameters after each iteration is within a specified tolerance level. The convergence of the EM algorithm has been proven by Wu.15 Identification Problem of LPV Time-Delay Model Based on EM Algorithm. The EM algorithm is employed to identify the LPV time-delay model given in section 2. We denote the input sequence U = {ut,t = 1,···,N}, the output sequence Y = {yt,t = 1,···,N} and the scheduling variable sequence W = {wt,t = 1,···,N}. We also introduce a hidden variable I = {It,t = 1,···,N} in which It denotes the model identity of yt. Since parts of the output data are missing completely at random, the output data Y is divided into two parts: the observed output Yobs = {yt,t = t1,···,tα} and the missing output Ymis = {yt,t = m1,···,mβ}. Therefore, the observed data set Cobs

p(I |U , W , τ , Θ) = p(I1: N |u1: N , w1: N , τ , Θ)

Θ

(9)

Based on eqs 1 and 3, the output yt is only dependent on the previous input data ut−1:1, time-delay τ, the model identity It and the model parameters Θ. Therefore, eq 9 can be rewritten as

(5) s+1

(8)

where C ≜ p(τ|U,W,Θ)p(U,W|Θ) which does not play a role in the following derivations. The density function p(Y|U,W,I,τ,Θ) can be further decomposed into

3. PARAMETER ESTIMATION FOR LPV TIME-DELAY MODEL WITH LOCAL FIR MODEL STRUCTURE USING EM ALGORITHM EM algorithm revisited. The EM algorithm is an effective approach to derive the maximum likelihood estimates of process model parameters in dealing with the incomplete-data problem. The core idea behind the EM algorithm is to introduce hidden or missing variables to make the maximum likelihood estimates tractable.13 We denote Cmis as the missing data set and Cobs as the observed data set. To find new estimates, the expectation step (E-step) and the maximization step (M-step) should be applied iteratively. The EM algorithm proceeds as follows:14 1. Initialization: initialize the value of the model parameter vector Θ0 2. E-step: Given the parameter estimate Θs obtained in the previous iteration, calculate the Q-function Q (Θ|Θ s) = ECmis|Cobs , Θs{log p(Cobs , Cmis)|Θ)}

(7)

(10)

= p(IN |IN − 1:1 , u1: N , w1: N , τ , Θ) × p(IN − 1|IN − 2:1 , u1: N , w1: N , τ , Θ)··· × p(I2|I1 , u1: N , w1: N , τ , Θ) × p(I1|u1: N , w1: N , τ , Θ) N

=

∏ p(It|It− 1:1, u1:N , w1:N , τ , Θ) t=1 N

=

∏ p(It|wt , Θ) t=1

11076

(11)

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

Q (Θ|Θ s) =



× log p(yt |ut − 1:1 ,It = i , τ = d , Θ) mβ

τmax

+



s

×



∑ log p(yt |ut− 1:1, It , τ , Θ) mβ

N

+

t=1



= EYmis , I , τ|Cobs , Θ {∑ log p(yt |ut − 1:1 ,It , τ , Θ) s

t = t1 mβ

∑ log p(yt |ut− 1:1,It , τ , Θ) t = m1 N

p(yt |ut − 1:1 ,It = i , τ = d , Θ)

(13)

t=1

(15)

To calculate Q(Θ|Θs), the following unknown terms should be calculated: 1. p(yt|ut−1:1, It = i, τ = d, Θ) 2. p(τ = d|Cobs,Θs) 3. p(It = i|Cobs, τ = d, Θs) 4. ∫ p(yt|Cobs, τ = d, It = i, Θs) log p(yt|ut−1:1 It = i, τ = d, Θ) dyt. On the basis of eq 3 and the Gaussian white noise assumption, the probability p(yt|ut−1:1, It = i, τ = d,Θ) can be written as

Q (Θ|Θ s) = ECmis|Cobs , Θs{log p(U , Y , W , I , τ |Θ)}

∑ log p(It|wt , Θ) + log C}

t=1 i=1

+ log C

Then, the conditional expectation of the log complete data density Q(Θ|Θs) can be written as

⎧ (yt − ϕdT(t )θi)2 ⎫ ⎪ ⎪ ⎬ − exp⎨ 2 ⎪ ⎪ 2 2σ 2πσ ⎩ ⎭ 1

=

Since the time-delay τ and the model identity It are discrete variables, the conditional expectation can be first taken over It and τ:

(16)

The posterior probability of time-delay τ taking value d can be obtained as

M

p(τ = d|Cobs ,Θ s) ∑ ∑

d = τmin

p(τ = d|Cobs ,Θ ) ∑ ∑

× p(It = i|Cobs ,τ = d , Θ s) log p(It = i|wt ,Θ)

(12)





M

s

d = τmin

+ log C

τmax

N

τmax

∑ log p(yt |ut− 1:1, It , τ , Θ) + ∑ log p(It|wt , Θ) t = m1



∫ p(yt |Cobs ,τ = d , It = i , Θs)

× log p(yt |ut − 1:1 ,It = i , τ = d , Θ)dyt

t = t1

+

t = m1 i = 1

× p(It = i|Cobs ,τ = d , Θ s)

t=1

+

M

∑∑

p(τ = d|Cobs ,Θ )

d = τmin

= log ∏ p(yt |ut − 1:1 , It , τ , Θ)p(It |wt , Θ)C

Q (Θ|Θ s) =

t = t1 i = 1

× p(It = i|Cobs ,τ = d , Θ s)

N

+

M

p(τ = d|Cobs ,Θ s) ∑ ∑

d = τmin

log p(U , Y , W , I , τ |Θ)

=



τmax

The derivation of the last equation in eq 11 is based on the fact that the model identity is only determined by the scheduling variable wt and parameters Θ according to eq 2. By substituting eqs 10 and 11 into eq 7 and taking the missing output into account, we can derive the log likelihood function of the complete data set as

p(τ = d|Cobs ,Θ s)

t = t1 i = 1

=

s

× p(It = i|Cobs ,τ = d , Θ )

p(Yobs|U , W , Θ s ,τ = d)p(τ = d|U , W , Θ s) p(Yobs|U , W , Θ s ,τ = d)p(τ = d|U , W , Θ s)

τ ∑dmax = τmin t

× log p(yt |ut − 1:1 ,It = i , τ = d , Θ)

=

τmax

+ EYmis|τ , I , Cobs , Θs{



×

p(τ = d|Cobs ,Θ ) =

M

N

M

t

∏tα= t ∑i = 1 p(yt , It = i|ut − 1:1,τ = d , wt , Θ s) 1

M

∏tα= t ∑i = 1 p(yt |ut − 1:1,τ = d , It = i , Θ s)p(It = i|wt ,Θ s) 1

τ

∑dmax =τ

M

t

min

∏tα= t ∑i = 1 p(yt |ut − 1:1,τ = d , It = i , Θ s)p(It = i|wt ,Θ s) 1

(17)

M

s

The posterior probability p(It = i|Cobs,τ = d,Θ ) can be derived from

p(τ = d|Cobs ,Θ s) ∑ ∑

d = τmin

t=1 i=1

p(It = i|Cobs , τ = d , Θ s) = p(It = i|Yobs ,U , W ,τ = d , Θ s)

s

× p(It = i|Cobs ,τ = d , Θ ) log p(It = i|wt , Θ) + log C

1

τ

∑dmax =τ

t

=

× log p(yt |ut − 1:1 , It = i , τ = d , Θ)} τmax

1

M

∏tα= t ∑i = 1 p(yt , It = i|ut − 1:1,τ = d , wt , Θ s) min

∑ ∑ p(It = i|Cobs ,τ = d , Θs)



t

∏tα= t p(yt |ut − 1:1,τ = d , wt , Θ s)

t

t = m1 i = 1

+

1

τ

∑dmax =τ

min

s

d = τmin mβ

∏tα= t p(yt |ut − 1:1,τ = d , wt , Θ s)

(14)

Finally, the conditional expectation is further taken over the missing observation Ymis: 11077

=

p(Yobs|U , W , τ = d , It = i , M ∑i = 1 p(Yobs|U , W ,τ = d , It =

=

p(yt |ut − 1:1, τ = M ∑i = 1 p(yt |ut − 1:1, τ

Θ s)p(It = i|U , W ,τ = d , Θ s) i , Θ s)p(It = i|U , W ,τ = d , Θ s)

d , It = i , Θ s)p(It = i|wt , Θ s) = d , It = i , Θ s)p(It = i|wt , Θ s)

(18)

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

Based on the definitions of the first and second moment, the integral term in eq 15 can be calculated as

Therefore, the Q-function can be rewritten as τmax



Q (Θ|Θ s) =

p(yt |Cobs ,τ = d , It = i , Θ s)

∫ p(yt |Cobs , τ = d , It = i , Θs) log

1 2πσ



τmax

+



p(τ = d|Cobs , Θ s)

d = τmin

+

1 1 log(2πσ 2) − ((σ s)2 + (ϕdT(t )θis)2 ) 2 2σ 2 1 1 (ϕT(t )θi)2 + 2 (ϕdT(t )θi)(ϕdT(t )θis) − σ 2σ 2 d 1 1 ((σ s)2 = − log(2πσ 2) − 2 2 2σ =−

1 log(2πσ 2) 2

1 ((σ s)2 + (ϕdT(t )θi − ϕdT(t )θis)2 ) 2σ 2

}

τmax



N

M

p(τ = d|Cobs , Θ s) ∑ ∑ t=1 i=1

d = τmin s

× p(It = i|Cobs , τ = d , Θ ) log p(It = i|wt , Θ) + log C

ϕdT(t )θis)2 )

(20)

To calculate the parameter estimates, the derivative of the Qfunction Q(Θ|Θs) is taken with respect to {θi,σ2}i=1:M and set to zero. Then we have

(19) t

τ

θî =

{

× p(It = i|Cobs , τ = d , Θ ) − −

M

∑∑ t = m1 i = 1

s

×(yt − ϕdT(t )θi)2 dyt



t = t1 i = 1

× log p(yt |ut − 1:1, It = i , τ = d , Θ)

2



+

M

× p(It = i|Cobs , τ = d , Θ s)

⎛ −(y − ϕT(t )θ )2 ⎞ i d t ⎟ dy × exp⎜⎜ ⎟ t 2 σ 2 ⎝ ⎠ 1 1 2 p(yt |τ = d , It = i , Cobs , Θ s) = − log(2πσ ) − 2 2 2σ

(ϕdT(t )θi



p(τ = d|Cobs , Θ s) ∑ ∑

d = τmin

× log p(yt |ut − 1:1 , It = i , τ = d , Θ) dyt =



∑dmax {p(τ = d|Cobs , Θ s)[∑tα= t p(It = i|Cobs , τ = d , Θ s)ϕd(t )y(t ) =τ min

1

N

τ

∑dmax {p(τ = d|Cobs , Θ s)[∑t = 1 p(It = i|Cobs , τ = d , Θ s)ϕd(t )ϕdT(t )]} =τ min

m

∑t =β m p(It = i|Cobs , τ = d , Θ s)ϕd(t )(ϕdT(t )θis)]} 1

+

N

τ

∑dmax {p(τ = d|Cobs , Θ s)[∑t = 1 p(It = i|Cobs , τ = d , Θ s)ϕd(t )ϕdT(t )]} =τ min

σ̂2 =

+

τ t M ∑dmax p(τ = d|Cobs , Θ s) ∑tα= t ∑i = 1 p(It = i|Cobs , τ = d , Θ s)(y(t ) − ϕdT(t )θi)̂ 2 =τ

τ ∑dmax = τmin

min

1

N p(τ =

m M d|Cobs , Θ s) ∑t =β m ∑i = 1 p(It 1

= i|Cobs , τ = d , Θ s)((ϕdT(t )θis − ϕdT(t )θi)̂ 2 + (σ s)2 ) N

τ ̂ = arg max p(τ = d|Cobs , Θ s) d

4. PARAMETER ESTIMATION FOR LPV TIME-DELAY MODEL WITH LOCAL OE MODEL STRUCTURE USING EM ALGORITHM Considering that the order of the FIR model is typically high and the model coefficients are sensitive to noise, the FIR models derived in the previous section are only suitable as intermediate models. Since the OE models are widely used in controller design, we will estimate the global LPV model, in which the local model has an OE model structure, based on the estimated global LPV model with local FIR model structure. Consider the following time-delay OE model

(23)

It is not possible to derive the analytical expressions for calculating the optimal local model validity width {Oi}i=1,2,···, M. The optimization problem searching for the optimal value of validity widths can be expressed as τmax



oi , i = 1,2, ···, M d = τ min N M

×

p(τ = d|Cobs , Θ s)

∑ ∑ p(It = i|Cobs , τ = d , Θs) log p(It = i|wt , Θ)

xt =

t=1 i=1

s.t.

omin ≤ oi ≤ omax ,

(22)

This constrained optimization problem can be solved using a function “fmincon” in the optimization toolbox of Matlab software.6

The time-delay can be obtained as

max

(21)

i = 1, 2, ··· , M

B(z −1) ut − τ A(z −1)

yt = xt + et

(24) 11078

(25)

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

where xt denotes the noise-free output. A(z−1) and B(z−1) are defined as

where ηi(wt) is calculated using eq 2 with the estimated {Oi}i=1,2,···, M determined in the previous section. Substituting xt with x̂t in φτ(t) and denoting the derived new regressor as φ̂ τ(t), we have

A(z −1) = 1 + a1z −1 + a 2z −2 + ··· + anaz −na

M

B(z −1) = b1z −1 + b2z −2 + ··· + bnbz −nb

(26)

yt ̂ =

The model parameters Θ̃ = {ϑi, σ̃2}i=1,2,···, M are estimated again using the EM algorithm. Since we have derived estimates of validity width O = {oi}i=1,2,···, M and time-delay τ in the previous section, these parameters are set to their estimates {Ô i}{i=1,2,···, M} and τ̂, respectively. Similar to eq 20, the Q-function is given by

xt = −a1xt − 1 − ··· − anaxt − na + b1ut − τ − 1 + ··· + bnbut − τ − nb = [−xt − 1··· − xt − naut − τ − 1···ut − τ − nb][a1···anab1···bnb]T

yt =

φτT(t )ϑ

s Q (Θ̃|Θ̃ ) =

+ et



(28) +

M

{

∑ ∑ p(It = i|Cobs , τ ̂, Ô , Θ̃s) t = m1 i = 1

− +



1 log(2πσ ̃ 2) 2

1 ((σ ̃ s)2 + (φ̂τT̂ (t )ϑi − φ̂τT̂ (t )ϑis)2 ) 2σ ̃ 2

}

N

M

∑ ∑ p(It = i|Cobs , τ ̂, Ô , Θ̃s) log p(It = i|wt , Ô) t=1 i=1

+ log C

M

∑ ηi(wt )φτT̂ (t )ϑî

(32)

The posterior probability of the model identity It can be calculated by

(29)

i=1

However, the noise-free output xt in φτ (t) is unmeasurable, so we cannot calculate ŷt by using eq 29 directly. Here, we adopt the auxiliary model principle to handle this problem and the estimated global LPV model from the previous section is used as the auxiliary model to predict the noise-free output xt. On the basis of eqs 1 and 3, we have

s p(It = i|Cobs , τ ̂, Ô , Θ̃ )

=

s p(yt |ut − 1:1, τ ̂, It = i , Θ̃ )p(It = i|wt , Ô ) s M ∑i = 1 p(yt |ut − 1:1, τ ̂, It = i , Θ̃ )p(It = i|wt , Ô )

(33)

Then, the parameter estimates of {ϑi, σ̃ }i = 1,2,···,M are determined to be eq 34 and 35, which are obtained by maximizing the new Q-function 32. 2

M

∑ ηi(wt )ϕτT̂ (t )θî

(30)

i=1

ϑî =

s s m t ∑tα= t p(It = i|Cobs , τ ̂, Ô , Θ̃ )φ̂τ (̂ t )yt + ∑t =β m p(It = i|Cobs , τ ̂, Ô , Θ̃ )φ̂τ (̂ t )(φ̂τT̂ (t )ϑis) 1

1

N ∑t = 1 p(It

2 σ∼̂ =

s = i|Cobs , τ ̂, Ô , Θ̃ )φ̂τ (̂ t )φ̂τT̂ (t )

1

M t ∑i = 1 ∑tα= t p(It 1

N s ̂ = i|Cobs , τ ̂, O , Θ̃ )(yt − φ̂τT̂ (t )ϑî )2

1 + a(wt )q−1

u t − τ + et

(35)

N

a(wt ) = −0.95 + 0.5wt + 0.1wt2

5. SIMULATION EXAMPLES A Numerical Simulation. Consider the LPV system described by the following discrete time LPV time-delay model: b(wt )q−1

(34)

s M m ∑i = 1 ∑t =β m (It = i|Cobs , τ ̂, Ô , Θ̃ )((φ̂τT̂ (t )ϑis − φ̂τT̂ (t )ϑî )2 + (σ ̃ s)2 )

+

yt =

M

× log p(yt |φ̂τ (̂ t ), It = i , Θ̃)

where φτ(t) = [−xt−1 ··· −xt−na ut−τ−1 ··· ut−τ−nb] . For the multiple mode LPV input−output model 1, f i(ut−τ) is assumed to have an OE model structure. Since the local model parameters are different for each local model, we use ϑi to denote the parameter vector for the ith local OE model. Therefore, the prediction ŷt of the process output can be expressed as

xt̂ =



∑ ∑ p(It = i|Cobs , τ ̂, Ô , Θ̃s) t = t1 i = 1

(27)

T

yt ̂ =

(31)

i=1

where na and nb are the order of the denominator and numerator polynomials and they are assumed to be known a priori. The time-delay OE model expressed in eq 25 can be rewritten as

= φτT(t )ϑ

∑ ηi(wt )φ̂τT̂ (t )ϑî

b(wt ) = 0.5 − 0.4wt + 0.01wt2

(37)

and the true time-delay τ is 10. The input signal is selected as the pseudo-random binary signal (PRBS). The system is tested at three operating points: wt = 0.1, wt = 0.5, and wt = 0.9; 1300 data samples are generated. The zero mean Gaussian white noise is added to the output and the signal-to-noise ratio

(36)

where 11079

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

∼̂ bias (BN) ∥Θ̃ − E(Θ)∥2 and variance (VN) of the estimates ∼̂ ∼̂ ∥E(Θ − E(Θ))∥2, are calculated and the results are presented in Table 3. It can be seen from this table that the bias of the

Table 3. Norm of the Bias and Variance of the Parameter Estimates at Different SNRs

(SNR) is 15dB. The scheduling variable data are presented in Figure 1. The proposed method is used to estimate an LPV time-delay model with local OE model structure. The estimated parameter estimates with output data missing at different percentages after 20 iterations are presented in Table 1. The relative error (RE)

0%

12.5%

25%

50%

a11 b11 a21 b21 a31 b31 τ RE

−0.8990 0.4601 −0.6750 0.3025 −0.4190 0.1481 10 X

−0.9233 0.4598 −0.6628 0.3061 −0.4408 0.1445 10 0.0265

−0.9158 0.4514 −0.6494 0.3102 −0.4504 0.1443 10 0.0343

−0.9180 0.4468 −0.6763 0.3123 −0.4650 0.1503 10 0.0395

−0.9357 0.4839 −0.7150 0.3084 −0.4406 0.1512 10 0.0478

Table 4. Mean and Standard Deviation of Estimates under Color Noise Condition a11 b11 a21 b21 a31 b31

between the estimates and the true value of the parameters is ∼̂ calculated by using the equation ∥Θ − Θ̃∥2/∥Θ̃∥2. It can be seen from this table that the estimates approach the true value of the parameters and the convergence speed decreases with the increase of the percentage of the missing output data. To test the performance of the proposed method under different noise levels, 100 Monte Carlo simulations are performed on the complete data set with different SNRs (5dB, 10dB, and 15dB). In each simulation, different noise realization is added to the output data. The mean value and standard deviation of the parameter estimates at SNR = 5dB are given in Table 2. In

true value

mean

SD

−0.8990 0.4601 −0.6750 0.3025 −0.4190 0.1481

−0.9251 0.4552 −0.6926 0.3145 −0.3518 0.1476

0.0156 0.0208 0.0356 0.0188 0.0779 0.0180

true value

mean

SD

−0.8990 0.4601 −0.6750 0.3025 −0.4190 0.1481

−0.9470 0.4586 −0.6682 0.3231 −0.4861 0.1298

0.0385 0.0349 0.0758 0.0356 0.1818 0.0403

It can be seen that the proposed method can also provide satisfying parameter estimates under color noise condition. Continuous Stirred Tank Reactor (CSTR). The CSTR is typically used as a benchmark example for the nonlinear process modeling problem. The first principle model of the CSTR can be expressed as2 ⎛ −E ⎞ q(t ) dCA(t ) = (CA0(t ) − CA(t )) − k 0CA(t ) exp⎜ ⎟ dt V ⎝ R T(t ) ⎠ (39)

⎛ −E ⎞ q(t ) (ΔH )k 0CA(t ) dT(t ) (T0(t ) − T(t )) − exp⎜ = ⎟ dt V ρCp ⎝ R T(t ) ⎠

Table 2. Mean and Standard Deviation of Estimates at SNR = 5 dB a11 b11 a21 b21 a31 b31

VN 0.0007 0.0022 0.0062

is added to the output. Here, the variance of white noise et is set to the same value as the simulations that generated the results shown in Table 2. 100 Monte Carlo simulations are performed with different realizations of vt. The mean value and standard deviation of the parameter estimates are presented in Table 4.

Table 1. Parameter Estimates with Data Missing at Different Percentage after 20 Iterations true value

BN 0.0356 0.0373 0.0755

estimates is close to zero, which indicates that the estimates converge to the true value of the parameters. To test the performance of the proposed method under color noise condition, the filtered white noise vt described by2 1 vt = et −1 1 − q + 0.2q−2 (38)

Figure 1. Scheduling variable data.

parameters

SNR 15 dB 10 dB 5 dB

+

⎧ ⎛ − hA ⎞⎫ ⎪ ⎪ ⎜⎜ ⎟⎟⎬ − qc(t )⎨ 1 exp ⎪ ⎪ q t C ( ) ρCpV ρ ⎝ c p ⎠⎭ ⎩ ρc Cpc

× (Tc0(t ) − T(t ))

(40)

where T(t) is the temperature of the reactor, CA(t) denotes the product concentration, and q(t) and qc(t) denote the inlet flow and the coolant flow rate, respectively. The values of the CSTR model parameters can be seen in Table 5. CA(t) and qc(t) are chosen to be the output and input variable, respectively. The objective is to estimate an LPV model between the coolant flow rate and product concentration. The scheduling variable is the

order to evaluate the quality of the estimates in the Monte Carlo simulations, two performance indexes, the norm of the 11080

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

Table 5. Steady State Values of CSTR Model Parameters10 parameters

steady state value

process flow rate (q) feed concentration of component A, CA0 feed temperature, T0 specific heats, Cp, Cpc liquid density, ρ, ρc heat of reaction, ΔH activation energy term, E/R reaction rate constant, k0 heat transfer term, hA the reactor volume, V inlet coolant temperature, Tc0

100 L/min 1 mol/L 350.0 K 1 cal/(g K) 1 × 103 g/L −2 × 105 cal/mol 1 × 104 K 7.2 × 1010 min−1 7 × 105 cal/(min K) 100 L 350.0 K

Table 6. Steady State Working Conditions of the CSTR2 working point 1: working point 2: working point 3:

Figure 3. Self-validation of the estimated global LPV model.

qc = 97, CA = 0.0795, T = 443.4566 qc = 100, CA = 0.0885, T = 441.1475 qc = 103, CA = 0.0989, T = 438.7763

coolant flow rate. Three working points are chosen and the steady state working conditions are given in Table 6. Since it takes time to measure the product concentration, a measurement delay of 6 min is added to the output. The process data for identification are generated by simulation using eqs 39 and 40. The measurement noise with zero mean and variance 4 × 10−7 is added to the output. The simulation data are shown in Figure 2. The proposed method is

Figure 4. Normalized weight of each local model.

Figure 2. Input and output data.

used to identify a multiple mode LPV model for this process. In this simulation, 25% output data is missing and the range of time-delay is set to [2, 10]. The estimated time-delay is 6 min which is consistent with the true time-delay. The self-validation results are presented in Figure 3 and the corresponding normalized weight is shown in Figure 4. The actual step responses and step responses of the estimated local models are presented in Figure 5. Note that the time-delay is not drawn in this figure. To further verify the accuracy of the estimated LPV model, cross-validation, in which three other operating points, T1 = 98 L/min, T2 = 101 L/min, T3 = 104 L/min are chosen, is performed and the results are shown in Figure 6. The comparisons are performed with the casewise deletion (CD) method and the regression substitution (RS) method. Since these two methods can only be used to estimate the LTI model,

Figure 5. Step responses of the estimated local models. The solid lines present the actual step responses and the dashed lines show the estimated step responses.

data at the operating point T1 = 97 L/min are used for model identification and the self-validation results are shown in Figure 7. It can be seen from this figure that these three methods perform well in self-validation. The cross-validation is then performed to test the prediction capability of the models when the operating points shifting happens and the results are shown in Figure 8. The data at operating point T1 = 101 L/min are used in cross-validation. It can be seen from this figure that the identified LPV model by using the proposed method 11081

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research

Article

described by the following system of nonlinear differential equations: dC b/dt = −FdC b + μC b dCs/dt = Fd(Cfs − Cs) − μC b/YCb/Cs dCp/dt = −FdCp + (2.2μ + 0.2)C b

(41)

where Cb, Cs, Cp denote biomass, substrate and product concentration, respectively. The dilution rate Fd, and the concentration of feed substrate Cfs, are available as manipulated inputs. μ is the specific growth rate which is defined as μ=

Figure 6. Cross-validation of the estimated global LPV model.

μm (1 − Cp/Pm)Cs k m + Cs + Cs2/K i

(42)

The description of the process variables and the values are summarized in Table 7. The biomass concentration (Cb) is Table 7. Initial Setting Conditions and Model Parameters parameters

value

the dilution rate, Fd feed substrate concentration, Cfs Concentration of biomass, Cb Concentration of substrate, Cs Concentration of production, Cp yield of cell mass, YCb/Cs

input1 input2 output1 output2 output3 0.4 g/g

maximum specific growth rate, μm product saturation constant, Pm substrate saturation constant, km substrate inhibition constant, Ki

0.48 h−1 50 g/L 1.2 g/L 22 g/L

selected as the output variable while the input variable is the feed substrate concentration (Cfs). The objective is to model the system dynamics within the whole operating region from 0 g/L to 40 g/L. The dilution rate Fd is fixed at the value of 0.1636 h−1. The measurement delay with a period of 7.5 h is added to the output. Five working points are selected according to Venkat et al.7 which are 6.6 g/L, 16.5 g/L, 22.7 g/L, 28.1 g/L, 35.7 g/L, respectively. Assume the output data are disturbed by white noise with zero mean and variance of 0.01. The training set data are shown in Figure 9. The proposed method is used to

Figure 7. Performance comparison of different missing data treatment methods in self-validation.

Figure 8. Performance comparison of different missing data treatment methods in cross-validation.

achieves better prediction performance compared with the models identified by using the other two methods. Continuous Fermentation Reactor. The fermentation process is another nonlinear biochemical process, and the fermenter is a common unit in fermentation industry. It is assumed that the volume of the fermenter is constant, and the contents are mixed sufficiently by a completely stirred agitation system.16 The continuous fermentation process can be

Figure 9. Input and output data in this simulation.

estimate a multiple-model LPV model for this process. In this simulation, 20% output data are missing and the delay 11082

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083

Industrial & Engineering Chemistry Research



Article

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Tel.: +1-780-492-9016. Fax: +1780-492-2881. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS The authors acknowledge financial support from the National Natural Science Foundation of China (Nos: 21206053, 21276111, 61273131), and partial support by the 111 Project (B12018), International Joint Research Laboratory for Process Control at the Jiangnan University, and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). We also gratefully acknowledge the support of NSERC and AITF.

Figure 10. Self-validation of the estimated global LPV model.



searching range is set to [1.5, 15]. The estimated time-delay is 7.5 h which is same as the true measurement delay. The results of the self-validation are presented in Figure 10. Crossvalidation is also performed and five other operating points, T1 = 9.5 g/L, T2 = 18 g/L, T3 = 21 g/L, T4 = 29.1 g/L, T5 = 37 g/L, are selected. The cross-validation results are shown in Figure 11. All of them demonstrate good performance of the proposed identification approach.

REFERENCES

(1) Yin, S.; Luo, H.; Ding, S. X. Real-time implementation of faulttolerant control systems with performance optimization. IEEE Trans. Ind. Electron. 2013, 61 (5), 2402−2411. (2) Zhao, Y.; Huang, B.; Su, H.; Chu, J. Prediction error method for identification of LPV models. J. Process Control 2012, 22 (1), 180−193. (3) Shamma, J.; Athans, M. Guaranteed properties of gain scheduled control for linear parameter varying plants. Automatica 1991, 27 (3), 559−564. (4) Bamieh, B.; Giarre, L. Identification of linear parameter varying models. Int. J. Robust Nonlinear Control 2002, 12 (9), 841−853. (5) Laurain, V.; Gilson, M.; Toth, R.; Garnier, H. Refined instrumental variable methods for identification of LPV Box-Jenkins models. Automatica 2010, 46 (6), 959−967. (6) Jin, X.; Huang, B.; Shook, D. S. Multiple model LPV approach to nonlinear process identification with EM algorithm. J. Process Control 2011, 21 (1), 182−193. (7) Venkat, A. N.; Vijaysai, P.; Gudi, R. D. Identification of complex nonlinear process based on fuzzy decomposition of the steady state space. J. Process Control 2003, 13 (6), 473−488. (8) Zhu, Y.; Xu, Z. A method of LPV model identification for control. 17th IFAC World Congress, Seoul, Korea, July 6−11, 2008, pp 5018− 5023. (9) Xu, Z.; Zhao, J.; Qian, J.; Zhu, Y. Nonlinear MPC using an identified LPV model. Ind. Eng. Chem. 2009, 48 (6), 3043−3051. (10) Deng, J.; Huang, B. Identification of nonlinear parameter varying systems with missing output data. AIChE J. 2012, 58 (11), 3454−3467. (11) Khatibisepehr, S.; Huang, B. Dealing with irregular data in soft sensors: Bayesian method and comparative study. Ind. Eng. Chem. Res. 2008, 47 (22), 8713−8723. (12) Xie, L., Yang, H., Huang, B. (2013). FIR Model Identification of Multirate Processes with Random delays using EM algorithm. AIChE J. DOI 10.1002/aic.14147. Published online in Wiley Online Library (wileyonlinelibrary.com). (13) Dempster, A. P.; Laird, N. M.; Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc., Ser. B 1977, 39 (1), 1−38. (14) McLachlan, G. J.; Krishnan, T. The EM Algorithm and Extensions, 2nd ed.; John Wiley and Sons, Inc.: Hoboken, NJ, USA, 2008. (15) Wu, J. On the convergence properties of the EM algorithm. Ann. Stat. 1983, 11 (1), 95−103. (16) Henson, M. A.; Seborg, D. E. Nonlinear control strategies for continuous fermenters. Chem. Eng. Sci. 1992, 47 (4), 821−835.

Figure 11. Cross-validation of the estimated global LPV model.

6. CONCLUSION The multiple-model identification of LPV time-delay systems with missing output data is considered. The identification problem is solved using the EM algorithm. The parameter varying process property, time-delay, and missing data are handled simultaneously. The local model is first taken as an FIR model structure. The estimated local FIR models are aggregated using the normalized exponential functions to construct a global LPV model. Owing to the advantage of the OE model to represent process dynamics, the estimated local FIR models are utilized as a starting point for the OE modeling. Thereafter, OE models are recovered using the EM algorithm again. In the simulations, two benchmark examples are adopted to illustrate the effectiveness of the proposed method. The results show that the proposed method provides satisfactory results for identifying the LPV time-delay systems with missing output data. 11083

dx.doi.org/10.1021/ie500175r | Ind. Eng. Chem. Res. 2014, 53, 11074−11083