Detecting Model−Plant Mismatch of Nonlinear Multivariate Systems Using Mutual Information

Gui Chen,† Lei Xie,*,† Jiusun Zeng,*,‡ Jian Chu,† and Yong Gu†

†State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, People's Republic of China
‡College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou 310018, People's Republic of China
ABSTRACT: For model-based control systems such as model predictive control (MPC), the model plays a key role in controller design and tuning. The performance of the controller relies heavily on the model quality, so that detection of model−plant mismatch (MPM) becomes an important task. For linear systems, many approaches have been proposed, while little work has been done on MPM detection for nonlinear systems. Considering the widespread nonlinearities, mutual information (MI), as a general correlation measure, is utilized to detect MPM in this paper. The MI between dithering signals and model residuals is directly estimated to reveal process MPM. It is found that changes in the disturbance dynamics do not affect the MI, so that changes in the plant and in the disturbances can be easily distinguished. For large scale multiple-input−multiple-output (MIMO) systems, reidentification of the model is costly, as keeping a large number of inputs in a perturbed or excited state leads to loss of normal production time. Hence it is highly desirable to locate the mismatched subsets so that only those subsystems have to be perturbed for model updating. In this work, an MI matrix corresponding to the model error is defined to unveil mismatched submodels. This provides important information to narrow the number of potential root causes. The performance of the proposed approach is illustrated by application studies involving two simulation examples and an industrial polypropylene (PP) production process.
1. INTRODUCTION
Control performance assessment (CPA) or monitoring plays an important role in modern control systems. Many methods have been developed during the past two decades since the groundbreaking study by Harris.1 An excellent overview was published by Jelali,2 including a summary of the available commercial packages. Roughly speaking, CPA quantifies the gap between the current control performance and the optimal or user-acceptable performance. Once unacceptable or degraded performance is detected, root cause diagnosis is an essential step for performance improvement. In model-based controllers, models are commonly employed to approximate the characteristics of interest of the controlled process. However, the process characteristics may change over time so that the original model can no longer capture the behavior of the process, which leads to mismatch between the model and the plant. Hence it is important to detect model−plant mismatch (MPM) efficiently in model-based control systems. In plants running advanced control schemes, MPM has become an essential factor in the performance of the control loops.3 For MPM detection, a benchmark specific to model predictive controllers was presented by Patwardhan and Shah,4 which compared the achieved and designed objective functions of the model predictive controller. In contrast, prediction errors before and after controller detuning were compared to detect MPM by Kesavan and Lee.5 It was found that the prediction error is affected in the case of MPM while remaining unaffected by controller detuning. However, none of these methods can distinguish between process model mismatch and disturbance model mismatch.

In a real industrial process, both the process and the disturbance dynamics may change from time to time, and distinguishing a plant change from a disturbance change is a challenging problem.6 A method that distinguishes between plant and disturbance changes can considerably narrow the number of possible causes of performance deterioration: model reidentification is unnecessary if only a disturbance change is detected, since the disturbance characteristics can be estimated from normal operational data during system maintenance. For this purpose, the two-model-divergence algorithm initially developed by Basseville and Nikiforov7 was used for model validation by Jiang et al.6 and Huang and Tamayo.8 A method based on the analysis of the Kalman innovations to discriminate between disturbance change and plant−model mismatch was also proposed by Harrison and Qin9 under the state space framework. Once a significant gap between the plant and the model is detected, the model should be updated through reidentification. In practice, it is highly desirable to identify the part or subsystem of the plant (or model) where significant mismatch occurs, so that only the inputs or set points related to that part need to be perturbed if the demand for reidentification arises.10 The cross-correlation method proposed by Stanfelj et al.11 was extended to multiple-input−multiple-output (MIMO) systems12 to detect specific mismatched submodels. Partial correlation was used to decorrelate a specific channel from the rest of the manipulated variables (MVs),10 which successfully removed the influence of the interrelationships among the MVs in MIMO systems.
The methods mentioned above work well in linear systems, but none of them were designed for nonlinear systems. In many cases, nonlinear models are used to improve the performance of nonlinear model predictive control (MPC) systems. On the other hand, even for linear MPCs, optimization over inputs subject to hard constraints immediately leads to nonlinear control.13 For some kinds of MPM, e.g., MPM caused by valve stiction, a nonlinear relationship may exist between the inputs and the residuals even when the process and model are both linear. To deal with system nonlinearity, a method based on mutual information is proposed in this study. The method has the following properties: (i) it is designed for closed loop conditions; (ii) it is sensitive to changes in the plant dynamics and insensitive to changes in the disturbance dynamics; (iii) the subset of mismatched models can be found through the proposed procedures. The rest of the article is organized as follows. Section 2 gives the preliminaries of mutual information (MI), its estimation, and the determination of a critical value for MI. In section 3, we develop an MPM detection and localization method for nonlinear MIMO systems, and an intuitive example of MI as a nonlinear correlation measure is also presented. After two numerical examples in section 4, section 5 presents an industrial application. Finally, concluding remarks appear in section 6.

2. PRELIMINARIES

In many areas such as engineering, economics, and physics, it is important to measure the correlation between variables. One of the most commonly used measures is Pearson's correlation coefficient, which reflects the linear relationship between two variables; for nonlinear relationships it does not work well. Zhang et al.14,15 argued that there are four types of nonlinearities and used omnidirectional cross-correlation functions (ODCCFs) to detect nonlinear relationships. However, since there are many kinds of nonlinearity, restricting them to four types is problematic. In this study, we consider a more general measure, mutual information, for the purposes of MPM detection and localization. Given two random variables X and Y, with samples {x(1),x(2),...,x(t)} and {y(1),y(2),...,y(t)}, the mutual information between X and Y is defined as16

I(X;Y) = H(X) + H(Y) − H(X,Y)    (1)

where H(·) denotes the Shannon entropy or joint entropy with the corresponding (joint) probability density functions (PDFs) p(x), p(y), and p(x,y):

H(X) = −∫ p(x) log(p(x)) dx,  H(Y) = −∫ p(y) log(p(y)) dy    (2)

H(X,Y) = −∫∫ p(x,y) log(p(x,y)) dx dy    (3)

In eq 2, and throughout this paper unless noted otherwise, "log" means the logarithm with base 2. The mutual information defined in eq 1 is the average information (in bits) of X that can be predicted by Y. Note that MI corresponds to the intersection of the information in X with the information in Y. It measures dependence in the sense that I(X;Y) ≥ 0, with equality if and only if X is independent of Y.

For Gaussian random variables, the mutual information reduces to17

I(X;Y) = −(1/2) log(1 − ρ²)

where ρ denotes the correlation coefficient between X and Y. If ρ = 0, X is independent of Y and the mutual information is 0; if ρ = ±1, X and Y are perfectly correlated and the mutual information is infinite.

2.1. MI Estimation. Due to the extensive usage of MI, its estimation has attracted much attention. The most commonly used estimators include the k-nearest neighbor (KNN) estimator,18 the kernel density estimator (KDE),19 and adaptive partitioning of the XY plane. A detailed review can be found in Khan et al.20 and Papana and Kugiumtzis,21 where the performances of different estimators are compared and analyzed. Khan et al.20 argue that KNN is the best for short data at relatively low noise levels. Papana and Kugiumtzis21 conclude that KNN is the most stable method; compared to other estimators, it is insensitive to method-specific parameters.

The basic idea of KNN is to use the distances to the k nearest neighbors to estimate the joint and marginal densities, estimating the entropy from the average distance to the kth nearest neighbor over all data points. For two random variables X and Y, let ε(i)/2 be the distance between (xi,yi) and its kth neighbor, denoted (xik,yik). Let εx(i)/2 and εy(i)/2 denote the distances between the same points projected into the X and Y subspaces, with ε(i) = max{εx(i),εy(i)}. nx(i) and ny(i) are the numbers of points with ∥xi − xj∥ ≤ εx(i)/2 and ∥yi − yj∥ ≤ εy(i)/2, respectively. Then MI can be estimated as18

Î(X;Y) = ψ(k) − 1/k − (1/N) Σi=1..N [ψ(nx(i)) + ψ(ny(i))] + ψ(N)    (4)

Here ψ(x) is the digamma function, ψ(x) = Γ(x)⁻¹ dΓ(x)/dx, which satisfies ψ(x + 1) = ψ(x) + 1/x and ψ(1) = −C, where C = 0.5772156... is the Euler−Mascheroni constant and Γ(x) is the ordinary gamma function. As shown by Kraskov et al.,18 the KNN estimator is data efficient, adaptive, and has minimal bias. Moreover, it is computationally efficient: in our experiments, it took less than 1 s to compute the normalized MI value between two variables with a data length of 2000 on a PC (Intel Core Duo, 2.5 GHz).
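To make eq 4 concrete, the following is a minimal Python sketch of this KNN estimator (Kraskov's second variant, to which eq 4 corresponds). It is a brute-force O(N²) illustration, not the OpenTSTOOL implementation used later in this paper; the function name and the sanity check are illustrative.

```python
import numpy as np
from scipy.special import digamma

def knn_mi(x, y, k=4):
    """KNN estimate of I(X;Y) as in eq 4 (result in nats; divide by log 2 for bits).
    Brute-force O(N^2) version kept deliberately simple."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    # Chebyshev (max-norm) distances in each marginal space and in the joint space
    dx = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=-1)
    dy = np.max(np.abs(y[:, None, :] - y[None, :, :]), axis=-1)
    d = np.maximum(dx, dy)
    np.fill_diagonal(d, np.inf)                      # never count the point itself
    knn_idx = np.argsort(d, axis=1)[:, :k]           # k nearest joint-space neighbours
    # eps_x(i)/2, eps_y(i)/2: marginal extents of the k-neighbour rectangle
    eps_x = np.take_along_axis(dx, knn_idx, axis=1).max(axis=1)
    eps_y = np.take_along_axis(dy, knn_idx, axis=1).max(axis=1)
    # n_x(i), n_y(i): neighbours within those marginal distances (self excluded)
    np.fill_diagonal(dx, np.inf)
    np.fill_diagonal(dy, np.inf)
    nx = (dx <= eps_x[:, None]).sum(axis=1)
    ny = (dy <= eps_y[:, None]).sum(axis=1)
    return digamma(k) - 1.0 / k - np.mean(digamma(nx) + digamma(ny)) + digamma(n)

# sanity check against the Gaussian formula I = -(1/2) log(1 - rho^2)
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = 0.8 * x + 0.6 * rng.standard_normal(2000)        # correlation rho = 0.8
print(knn_mi(x, y))   # should be close to -0.5*np.log(1 - 0.64) ~ 0.51 nats
```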
Figure 1. Closed loop system: IMC structure.
2.2. Determination of Critical Value for MI. In practice, the estimated mutual information between two independent variables will be close to, but not exactly, zero. Hence, as in the case of the Pearson linear correlation coefficient, a critical value should be determined before using mutual information as a test statistic. The problem of critical value determination is closely related to the independence test, and much work has been done on this issue.22,23 For example, Paluš and Vejmelka23 used a surrogate data method for the independence test. Roughly speaking, surrogate data are artificially generated data that preserve statistical properties of the original data but are randomized so that any possible coupling is removed. For surrogate time series construction, the iterative amplitude adjusted Fourier transform (iAAFT)24,25 is used in this paper. The surrogate data generated by this method preserve the power spectrum as well as the distribution of the original data set.24

For a time series {x(t), t = 1, ..., N}, the linear properties are specified by the squared amplitudes of the Fourier transform:

|S(k)|² = |(1/N) Σt=1..N x(t) exp(i2πkt/N)|²    (5)

The algorithm starts with a random shuffle of x(t), denoted r0(t). At each iteration stage n (n = 0, 1, ...), there are three steps:

Step 1. Perform the Fourier transform of rn(t):

Rn(k) = (1/N) Σt=1..N rn(t) exp(i2πkt/N)    (6)

Step 2. Perform the inverse transform, replacing the actual amplitudes by the amplitudes of the original data x(t) while retaining the phases exp(iφn(k)) = Rn(k)/|Rn(k)|:

sn(t) = (1/N) Σk=1..N exp(iφn(k)) |S(k)| exp(−i2πkt/N)    (7)

Step 3. Derive cn(t) by sorting sn(t) by magnitude in ascending order, and let

rn+1(t) = cn(t)    (8)

Repeat the above steps until a convergence criterion is reached, i.e., rn+1(t) = rn(t) to within a certain accuracy. Compared to the standard AAFT algorithm, the iAAFT algorithm requires heavier computation; however, it reduces the false rejections of the null hypothesis caused by autocorrelation in the data, and hence it is more suitable for the dynamic data commonly confronted in the process industries. Surrogate time series of x(t) and y(t) can be derived using the above algorithm. Denoting the surrogate time series Xs and Ys, the MI between Xs and Ys can be estimated using the KNN method. For Ms couples of surrogate time series, we obtain Is(m), m = 1, ..., Ms. The confidence interval for the independence test between X and Y is then defined as

S(X;Y) = [0, I(X;Y) − μs + 6σs]    (9)

where μs and σs are the mean and standard deviation of Is(m), respectively. A 6σ threshold is selected to give a robust decision, rather than a 2σ or 3σ test, as discussed by Schreiber and Schmitz.25

Determination of the critical value using the surrogate data method is somewhat computationally expensive. Similarly to the linear correlation coefficient, the estimated mutual information can be normalized into the range [0, 1]; such a value gives an intuitive indication before the surrogate data procedure is carried out to produce an accurate critical value. Joe26 proposed a linear-correlation-like measure for MI, which scales from 0 to 1:

IJoe(X;Y) = [1 − exp(−2Î(X;Y))]^(1/2)    (10)

Other normalized mutual information (NMI) measures, such as that proposed by Estévez et al.,27 can also be candidates:

NMI = Î(X;Y)/min{Ĥ(X), Ĥ(Y)}    (11)

This measure was modified by Vinh et al.28 for feature selection.
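A compact sketch of this surrogate test is given below, assuming the knn_mi function sketched in section 2.1 is available from a hypothetical helper module mpm_mi; the number of surrogates and iterations are illustrative choices rather than values prescribed by the paper.

```python
import numpy as np
from mpm_mi import knn_mi   # hypothetical module holding the eq 4 sketch

def iaaft(x, n_iter=100, rng=None):
    """iAAFT surrogate of x (eqs 5-8): keeps the amplitude spectrum and the value
    distribution of x while destroying any coupling with other signals."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    amp = np.abs(np.fft.rfft(x))                  # |S(k)| of the original series (eq 5)
    sorted_x = np.sort(x)
    r = rng.permutation(x)                        # random shuffle r0(t)
    for _ in range(n_iter):
        phases = np.angle(np.fft.rfft(r))                       # step 1: current phases
        s = np.fft.irfft(amp * np.exp(1j * phases), n=len(x))   # step 2: impose |S(k)|
        ranks = np.argsort(np.argsort(s))
        r_new = sorted_x[ranks]                                 # step 3: impose distribution
        if np.array_equal(r_new, r):                            # convergence r_{n+1} = r_n
            break
        r = r_new
    return r

def mi_threshold(x, y, k=4, n_surrogates=20, rng=None):
    """Surrogate-based 6-sigma critical value (one common reading of the eq 9 rule)."""
    rng = np.random.default_rng(rng)
    i_s = np.array([knn_mi(iaaft(x, rng=rng), iaaft(y, rng=rng), k)
                    for _ in range(n_surrogates)])
    return i_s.mean() + 6.0 * i_s.std()

# usage: declare X and Y dependent when knn_mi(x, y) exceeds mi_threshold(x, y)
```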
3. MPM DETECTION FOR NONLINEAR MIMO SYSTEMS

For model-based control systems, an internal model control (IMC) structure (Figure 1) is usually considered since it can be conveniently transformed into MPC. In Figure 1, G and G0 represent the plant and the model, respectively, Q is the controller, ud is the dithering signal injected into the system, and v is the unmeasured disturbance. r is the set point, which can be assumed to be zero in regulatory control systems. To detect MPM, dithering signals and correlation analysis are commonly employed.10−12 In other literature, system identification algorithms are used when sufficient excitation is available.6,8,29

In the IMC structure, design of the dithering signal is an important issue. For linear systems, pseudorandom binary sequence (PRBS) signals are commonly used. For nonlinear systems the case is different: strictly speaking, the chosen signal depends on the properties of the nonlinearity that one wants to excite. From the viewpoint of system identification, it is difficult to give sufficient input signal conditions for identification of arbitrary (nonpolynomial) nonlinear system models.30 In general, the excitation signal should be adequately rich in amplitude as well as spectrally. For nonlinear systems, PRBS or modified PRBS signals can also be used; see, for example, refs 31 and 32. Other signals can also be used, e.g., generalized multiple-level noise (GMN), pseudorandom multilevel sequences (PRMS),33 and filtered white uniform noise.34 Without any special design, a bounded independent and identically distributed (iid) random process, usually a uniformly distributed signal,35,36 is a common choice. However, this does not mean that PRBS signals have to be rejected absolutely for nonlinear system excitation. The use of PRBS for nonlinear system identification is advocated by Sutter.37 Zhu38 argued that, for Wiener model identification, binary test signals applied at the process input can also be used. The choice of excitation signal depends on what kind of nonlinearity users are concerned about. We demonstrate the effects of different excitation signals on the estimated mutual information values in section 3.1. For the sake of simplicity, PRBS signals are used throughout the application studies in this paper.

3.1. Intuitive Demonstration of Nonlinear/Linear Correlation. In this subsection, we give a simple demonstration of the function of MI. As a comparison, the time-delayed cross-correlation functions are also presented. Consider the following nonlinear dynamic system (model 1):

y(t) = −0.8y(t−1) + 0.08(y²(t−1) + 4.5u(t−1) + u(t−1)u(t−2)) + d(t)    (12)

The following four residual equations possibly result from poor model structure selection or inadequately estimated model parameters:

ε1 = 0.7d(t−1) + d(t)
ε2 = 0.06u(t−1)u(t−2) + d(t)
ε3 = 0.2y(t−1) + d(t)
ε4 = 0.03y²(t−1) + d(t)    (13)

where u(t) is a PRBS signal with amplitudes from −1 to 1 used as the system input sequence. The disturbance is an ARIMA(2,0,0) process:

d(t) = e(t)/(1 − 1.6q⁻¹ + 0.8q⁻²)    (14)

Here e(t) is a white noise sequence with zero mean and variance 0.01. All these data sequences have a length of 2047. The cross-correlation functions (CCFs) and MI between the input sequence and the residual processes εn, n = 1, 2, 3, 4, are plotted in Figures 2 and 3, respectively. Estimation of MI is implemented with the help of OpenTSTOOL (version 1.2 on Windows 32 bit).39

Figure 2. Cross-correlation between input and residual functions for model 1.

Figure 3. Mutual information between input and residual functions for model 1.

It can be seen from Figure 2 that no significant correlation is observed between u(t) and ε1, ε2, and ε4, since all the CCF values are within the control limits. Only between u(t) and ε3 can some violations be observed. However, according to eq 13, there is a clear, nonlinear relation between u(t) and both ε2 and ε4. This shows that the CCF fails to measure the nonlinear correlation between u(t) and ε2 and ε4. In contrast, Figure 3 gives a different picture. Significant violations can be observed between u(t) and all the residuals except ε1, which is in perfect accordance with eq 13. This shows the superiority of MI over CCF as a nonlinear correlation measure.
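The above experiment can be reproduced with a short script. The sketch below is not the authors' code: it generates an 11-stage maximum-length PRBS (period 2047, matching the data length used here), simulates eqs 12−14, forms the residuals of eq 13, and scores each residual against the input with the knn_mi sketch of section 2.1 (assumed available from the hypothetical module mpm_mi); the feedback taps and the single-lag scoring are illustrative choices.

```python
import numpy as np
from mpm_mi import knn_mi   # hypothetical module holding the eq 4 sketch

def prbs11(n):
    """11-stage maximum-length PRBS (recurrence a[t] = a[t-9] XOR a[t-11],
    a standard primitive feedback), period 2^11 - 1 = 2047, mapped to {-1, +1}."""
    a = [1] * 11                        # any nonzero initial state
    out = np.empty(n)
    for t in range(n):
        bit = a[-9] ^ a[-11]
        a.append(bit)
        out[t] = 2.0 * bit - 1.0
    return out

N = 2047
rng = np.random.default_rng(1)
u = prbs11(N)
e = 0.1 * rng.standard_normal(N)        # white noise with variance 0.01

d = np.zeros(N)                         # disturbance of eq 14, zero initial conditions
for t in range(N):
    d[t] = e[t] + 1.6 * d[t - 1] - 0.8 * d[t - 2]

y = np.zeros(N)                         # plant output of model 1, eq 12
for t in range(N):
    y[t] = (-0.8 * y[t - 1]
            + 0.08 * (y[t - 1] ** 2 + 4.5 * u[t - 1] + u[t - 1] * u[t - 2])
            + d[t])

eps = {                                 # residuals of eq 13 (wrap-around at t = 0 ignored)
    "eps1": 0.7 * np.roll(d, 1) + d,
    "eps2": 0.06 * np.roll(u, 1) * np.roll(u, 2) + d,
    "eps3": 0.2 * np.roll(y, 1) + d,
    "eps4": 0.03 * np.roll(y, 1) ** 2 + d,
}

# MI between u(t) and each residual at t + 1; only eps1 should look independent of u
for name, r in eps.items():
    print(name, knn_mi(u[:-1], r[1:]))
```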
To further evaluate the KNN-based MI estimator, the relationship between data length and the estimate of mutual information is also considered. Using the same model and considering residual ε2 only, the data length was increased as 2ⁿ (n = 9, 10, ..., 15), i.e., from 512 upward. A total of 100 Monte Carlo simulations were carried out for each case, with the time delay of the time series set to 1. The mean and the standard deviation of the estimated MI are presented in Figure 4. From Figure 4 it can be seen that the standard deviation decreases as the data length increases, while the mean values increase. Hence it can be concluded that, as the data length increases, the estimation of MI becomes more stable.

Figure 4. Mean values and standard deviation of estimated values with different lengths of data.

In the meantime, the effect of noise on the estimated values is presented in Figure 5. From Figure 5 we can see that the estimated values approach zero rapidly when the signal-to-noise ratio is greater than 80%. A more detailed analysis shows that the MI estimator performs well even for highly contaminated signals, i.e., signals with a signal-to-noise ratio of 50%.

Figure 5. Effect of signal-to-noise ratio on estimated mutual information.

Different excitation signals may have different abilities to excite the various kinds of nonlinearities. Two important conclusions obtained by Leontaritis and Billings40 state that white noise with a Gaussian distribution is optimal if the power of the signal is constrained, and that white noise with a uniform distribution is the optimal test signal if the amplitude is constrained. We give a simple simulation here to evaluate which excitation signal is an appropriate candidate. White noise with a Gaussian distribution, white noise with a uniform distribution, and a PRBS signal are used as the excitation signal u. All the signals have unit variance. Figure 6 displays the different excitation signals and the corresponding estimated mutual information values. With the same variance of the excitation signals, the largest values are obtained when white noise with a uniform distribution is used. This means that uniformly distributed white noise may give our method more sensitivity. However, this result is not decisive: optimality is meaningful only under a concrete system structure and a known nonlinearity. Usually, uniformly distributed white noise is a good candidate, as used in much of the literature; on the other hand, a PRBS signal is easy to generate and simple to apply in practice.

Figure 6. Different excitation signals with the corresponding estimated mutual information values.

3.2. MPM Detection and Localization for Nonlinear MIMO Systems. Suppose that G0 is the nominal model representing the m × n MIMO plant G. Usually, the model G0 can be presented as a matrix whose element gij(ui(t)) denotes the submodel of the channel from the ith manipulated variable (MV) to the jth controlled variable (CV). Here gij(ui(t)) is assumed to be an invertible nonlinear function. Without any confusion, we abbreviate gij(ui(t)) to gij. It is also important to point out that using gij to denote the submodel is more convenient for subsequent analysis. In many situations, a submodel may correspond to a specific physical part in practice. It must be noted that the nominal model G0 can represent not only explicit models such as a transfer function matrix, but also implicit models which are difficult to quantify by parameters, e.g., valve constraints. From Figure 1 we can get

e = G(u) + v − G0(u)    (15)

Let ΔG(u) ≜ G(u) − G0(u) be the model−plant mismatch, or model error; ΔG(u) will be written as ΔG in the following. Only when G and G0 present the same properties will ΔG be zero.

3.3. Problem Definition of MPM for MIMO Systems. For the MIMO control system shown in Figure 1, the controlled object is actually the model error ΔG and the disturbance v. We have reason to assume that the model−plant mismatch, defined as follows, has the same form as the model:

ΔG ≜ [ Δg11  Δg12  ···  Δg1n ;
       Δg21  Δg22  ···  Δg2n ;
        ⋮                ⋮   ;
       Δgm1  Δgm2  ···  Δgmn ]    (16)

Note that Δg = g − g0 here because the systems G and G0 are in parallel.
Figure 7. Three cases of model−plant mismatch: (a) no model error; (b) ith (i = 1, ..., n) column of model matrix has no model error; (c) jth (j = 1,...,m) row of model matrix has no model error.
With the dithering signals, two signal components act on the system at time instant t: the dithering signal ud(t) and the operating signal given by the controller, denoted uc(t):

uc(t) = Q(r(t−1), r(t−2), ..., e(t−1), e(t−2), ...)    (17)

If some model−plant mismatch exists, the model residual e(t) contains some information about the past of the dithering signal, ud(t−1), ud(t−2), ..., and about the disturbance v(t), v(t−1), .... Therefore, the controller output at time instant t can be written as a function of the past of r, ud, and v, and eq 17 can be rewritten as

uc(t) = Q(r(t−1), r(t−2), ..., ud(t−2), ud(t−3), ..., v(t−1), v(t−2), ...)    (18)

From eq 18 we can see that only past information in ud can be transferred to the output and appear in the future of the model residual e. For dynamic systems, the value of mutual information at a single time lag cannot give full information about the relationship between two variables. With some abuse of notation, but without any confusion, I(x(t);y(t)) denotes the estimated value of mutual information from the samples x(t) and y(t). In this paper, a more general MI concept, termed the "norm of mutual information" (NMI),41 is employed. For assessing the dependence of x(t) and y(t), we compute the mutual information I(x(t);y(t + tmax)) and find a tmax such that, for t′ ≥ tmax, I(x(t);y(t + t′)) ≈ 0. This is always feasible because the signals x(t) and y(t + tf) become independent in a practical sense when tf becomes large.42 Here y(t + tmax) denotes the time series y(t) advanced by tmax. Then the NMI can be defined as

Ī(X;Y) = [Δt/(tmax − tmin + Δt)] Στ=tmin..tmax I(x(t); y(t + τ))    (19)

with tmin = Δt = 1 sample as the usual choice. It should be pointed out that the following relationship holds due to the asymmetry of mutual information with respect to time ordering:

I(x(t); y(t − τ)) ≠ I(x(t); y(t + τ))

This is an important property for our application, since we are interested in whether the future of the residuals contains any information about the past dithering signals. This time-ordering relationship clearly presents the direction of information transfer in the system. In this paper, the following index is defined to detect MPM:

Īij = [1/(tmax + 1)] Στ=0..tmax I(udi(t); ej(t + τ))    (20)

Therefore, the following MI matrix can be obtained:

MI = [ Ī11  Ī12  ···  Ī1n ;
       Ī21  Ī22  ···  Ī2n ;
        ⋮               ⋮  ;
       Īm1  Īm2  ···  Īmn ]    (21)

Assume that tmax has been appropriately chosen following Abarbanel's method.42 Consequently, the MI matrix defined in eq 21 can be used as a mathematical expression to reveal model−plant mismatch in nonlinear multivariate systems. In section 3.4, any element below the critical value is replaced by zero.

3.4. MPM Detection and Localization. Proposition 1. If the MI matrix is not a zero matrix, then ΔG ≠ 0.

Proof. As exhibited in Figure 7a, if ΔG = 0, then the model residuals become

e = G(uc, ud) − G0(uc, ud) + v = ΔG(uc, ud) + v = v    (22)

Intuitively, the model residuals e then contain no information about the dithering signals ud. ■

Proposition 2. If all the elements in one or some of the columns of ΔG are zero, then the elements of the corresponding columns of the MI matrix equal zero.

Proof. As exhibited in Figure 7b, suppose all the elements of the ith column of ΔG are zero; then the channel from the ith manipulated variable to every model residual can be written as

ej = gji(uci, udi) − g0,ji(uci, udi) + vj = Δgji(uci, udi) + vj = vj  (j = 1, ..., m)    (23)

This implies that none of the model residuals contain information about the dithering signal udi, which is equivalent to I(udi;e) = 0. ■
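As an illustration of how eqs 20 and 21 can be assembled from logged dithering signals and model residuals, a sketch is given below; the knn_mi and mi_threshold helpers are the hypothetical sketches from section 2 (collected in the module mpm_mi), and t_max, k, and the thresholding of the averaged index are illustrative choices.

```python
import numpy as np
from mpm_mi import knn_mi, mi_threshold   # hypothetical module collecting the section 2 sketches

def mi_matrix(ud, e, t_max=10, k=4):
    """MI matrix of eq 21. ud: (N, n) dithering signals, e: (N, m) model residuals.
    Entry (j, i) holds the lag-averaged index of eq 20 for the channel MV i -> CV j,
    zeroed when it does not exceed a surrogate-based critical value (cf. eq 9)."""
    N, n = ud.shape
    m = e.shape[1]
    M = np.zeros((m, n))
    for i in range(n):
        for j in range(m):
            # I_bar_ij = 1/(t_max + 1) * sum_{tau = 0..t_max} I(ud_i(t); e_j(t + tau))
            vals = [knn_mi(ud[:N - tau, i], e[tau:, j], k) for tau in range(t_max + 1)]
            i_bar = float(np.mean(vals))
            if i_bar > mi_threshold(ud[:, i], e[:, j], k):
                M[j, i] = i_bar
    return M

# Zero columns of M point at columns of Delta G without mismatch (proposition 2);
# zero rows point at mismatch-free rows (proposition 3).
# Example: M = mi_matrix(ud_log, e_log); print(np.round(M, 3))
```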
Due to the impact of the feedback controller, the MI matrix does not correspond exactly to the model error matrix ΔG. More precisely, an element of the MI matrix may depart significantly from zero even if the corresponding subsystem has no MPM. However, an exclusion procedure can be employed to locate the submodels which have a mismatch. Suppose that the ith column of ΔG is zero; then udi is eliminated through the multiplication ΔG·ud. In other words, the information of the ith dithering signal cannot be transferred through the model errors Δgji (j = 1, ..., m). Hence no significant mutual information can be observed between udi and any of the model residuals.

Proposition 3. If all the elements in one or some of the rows of ΔG are zero, then the elements of the corresponding rows of the MI matrix equal zero.

Proof. As exhibited in Figure 7c, assume all the elements in the jth row of ΔG are zero, that is, Δgji = 0 for all i = 1, ..., n; then the model residual ej can be written as

ej = Δgj1(uc1, ud1) + ... + Δgjn(ucn, udn) + vj = vj    (24)

This implies that no function exists for transferring the information of the dithering signals to the model residual ej. Thus, we have I(ud;ej) = 0. ■

In some cases, dithering signals only need to be added to part of the MVs, since operators may be concerned with only part of the channels. In other cases, some MVs cannot be excited by external signals for practical reasons. For example, in the polymerization process, one of the manipulated variables, the jacket temperature of the reactor, cannot bear a high-magnitude oscillation, so this MV cannot be excited. In such cases, the above propositions remain true for the subsystem that does not contain the excluded MVs.

Proposition 4. Assume that all the operating variables are held constant except ui and only udi is excited; if Δgji = 0, then no mutual information exists between udi and ej.

Proof. If all the operating variables are held constant except ui and only udi is excited, then

ej = Δgji(ui, udi) + vj

If Δgji = 0, then, because the other MVs are held constant, the information of udi cannot propagate through feedback to the controller output and hence to the model residual e. Therefore no mutual information can be detected between udi and ej. ■

Remark. Proposition 4 shows that no mutual information between udi and ej would be observed if Δgji = 0. Conversely, if the mutual information between udi and ej is not equal to 0, then Δgji ≠ 0 and mismatch is detected in the channel from the ith MV to the jth CV, since ui and udi are independent. Moreover, ud is independent of the disturbance term v and of the set point r; hence a change in the disturbance cannot affect the relationship between ud and e. From this point of view, the proposed MPM detection method is not sensitive to changes in the disturbance.

4. SIMULATION EXAMPLES

4.1. Nonlinear System with Linear MPC. The first simulation example is a continuous nonlinear process with three input and two output variables. The plant can be expressed as

[y1; y2] = [0, 0; (s + 2)/(s² + 2s + 1), 3/(s² + s + 1)] [u2; u3] + [a/(s + 1); a(s + 2)/(s² + 2s + 1)] u1²    (25)

where a = 0.2 in the nominal condition. It should be noted that the second term on the right-hand side of eq 25 is nonlinear. The following colored noise was added to the system:

v = diag[1/(40s + 1), 1/(20s + 1)] e    (26)

where e is a 2 × 1 column vector of zero-mean white noise with covariance matrix Σe = diag(0.1, 0.1). The plant was controlled by an MPC controller using a model linearized by the MATLAB function linearize. An MPM is simulated by changing the parameter a from 0.2 to 0.8, so that MPM is introduced only into the channels from u1 to y1 and y2. This is a mild change in light of the introduced disturbance: in an open loop test with a white Gaussian input of zero mean and unit variance, the contributions of the parameter change to the variances of y1 and y2 are 0.0623 and 1.1788, whereas for the disturbance the values are 0.0001 and 1.03455. When considering the prediction error, which influences the control performance directly, the impacts of the parameter change are 0.0374 and 0.0918, smaller than that (0.1) introduced by the disturbance.

Similar to the study of model 1 (eq 12), PRBS signals with amplitudes from −1 to 1 were used as dither signals. Again, the MI and cross-correlation functions between the dithering signals and the model residuals were estimated and are plotted in Figures 8 and 9. The horizontal axis denotes the lag of the inputs, while the vertical axis corresponds to the MI and CCF between the inputs and the model residuals. Figure 8 shows that the proposed method based on MI correctly detects the mismatch between the model and the plant, as the MI values between ud1 and both e1 and e2 violate the control limits. On the contrary, the cross-correlation functions fail, as is shown in Figure 9, where none of the CCFs exceed the control limits. In this example, the nonlinear map from the manipulated variable u1 to e1 and e2 is a parabola, which makes the CCF method fail.

Figure 8. Mutual information plots for the continuous system.
4.2. Nonlinear System with Nonlinear MPC. To further illustrate the characteristics of the method, a nonlinear process with three input and three output variables is considered. The plant can be expressed as

[y1; y2; y3] = G(q⁻¹) [u1; u2²; sin(u3)] + [1/(1 − 0.5q⁻¹); 1/(1 − 0.8q⁻¹); 1/(1 − 0.1q⁻¹)] a    (27)

with the entries of G(q⁻¹) given by

g11 = (0.1q⁻¹ + 0.2q⁻²)/(1 − 1.2q⁻¹ + 0.35q⁻²),  g12 = q⁻¹/(1 − 0.7q⁻¹),  g13 = (0.3q⁻¹ + 0.2q⁻²)/(1 − 0.8q⁻¹)
g21 = (q⁻¹ + 0.5q⁻²)/(1 + 0.4q⁻¹),  g22 = (−0.11q⁻¹ − 0.058q⁻²)/(1 − 1.05q⁻¹ + 0.136q⁻²),  g23 = 1/(1 − 0.48q⁻¹)
g31 = 0.59q⁻¹/(1 − 0.607q⁻¹),  g32 = 0.85q⁻¹/(1 − 0.717q⁻¹),  g33 = 0.098q⁻¹/(1 − 0.95q⁻¹)

Equation 27 is a simplified Hammerstein model with strong nonlinearity. Colored noise is added to the system: a is a 3 × 1 column vector of zero-mean white noise, and the variance of each channel is 0.1. The plant was controlled by a nonlinear MPC controller using the YANE software.43 Now consider that the submodels change as follows:

g12 = q⁻¹/(1 − 0.7q⁻¹)  ⇒  2q⁻¹/(1 − 0.2q⁻¹)    (28)

g31 = 0.59q⁻¹/(1 − 0.607q⁻¹)  ⇒  0.59/(1 − 0.607q⁻¹)    (29)

As shown in eqs 28 and 29, MPMs of two subsystems exist in the system. Again, PRBS signals with magnitudes of −1 and 1 are used, and a total of 2047 samples were used in this simulation. The values of mutual information between the dithering signals and the model residuals were estimated and are plotted in Figure 10. The horizontal axis denotes the lag of the inputs, while the vertical axis corresponds to the MI values between the dithering signals and the model residuals. Figure 10 shows that the proposed method based on MI successfully detects the mismatch between the model and the plant. As a comparison, the linear correlation coefficients between the dithering signals and the model residuals were calculated and are plotted in Figure 11. As shown in Figure 11, the linear correlation coefficients failed to detect the mismatch of the subsystem g21. After the MPM was detected, controller retuning was performed, and the results are shown in Figure 12. From Figure 12 it can be seen that retuning both channels results in a better improvement of the controller performance. This further demonstrates the superiority of our method over the CCF method.

To validate the effect of a time-varying disturbance, we further changed the dynamics of the colored noise to

v = diag[1/(1 + 0.7q⁻¹), 1/(1 − 0.2q⁻¹), 1/(1 + 0.9q⁻¹)] a    (30)

The results are presented in Figure 13. From Figure 13, the same conclusion can be drawn; the estimated mutual information values show only slight differences, and no violation of the confidence limits is observed.
Figure 9. Cross-correlation plots for the continuous system.
Figure 10. Mutual information plots for the discrete system.
Figure 11. Linear correlation coefficient plots for the discrete system.
This clearly shows that the proposed method is not sensitive to changes in the disturbance dynamics.

Remark. It should be mentioned that the partial correlation method of Badwe et al.10 cannot be used without modification in such an example, due to the strong nonlinearity introduced: it is hard to find a proper linear dynamic model to decorrelate the effect of the other MVs in such a typical nonlinear system. If the process nonlinearity can be ignored, the well-established methods proposed by Webber and Gupta12 and Badwe et al.10 can be employed for their simplicity.

Figure 12. Performance comparison among the cases: before MPM, with MPM, and after controller retuning.

5. INDUSTRIAL CASE STUDY

The application of the proposed method to a polypropylene (PP) production process is introduced in this section. A schematic of the considered PP process is given in Figure 14. During the process, fresh catalyst (containing TEAL and DONOR) is fed to reactor R200 through the precontacting tank D200. Propylene and hydrogen are fed to the prehomopolymerization reactor R200 simultaneously. The reactant and catalyst from R200, with the majority of the catalyst, propylene, and hydrogen, are fed to reactor R201. Then a portion of the slurry, together with pure propylene, is injected into reactor R202. After a series of chemical reactions, the final product, the homopolymer, is delivered to reactor D301 to produce copolymers.
Figure 13. Mutual information plots for the discrete system with changed disturbance dynamics.
Figure 14. Schematic of the process flow of polypropylene production process.
As is shown in many research articles, the process is very complicated: it has nonlinear dynamics, is highly coupled, and is often dominated by time delays. As part of an advanced process control (APC) project for a plant of SINOPEC Inc., MPC controllers were designed for the tubular reactors R201 and R202. The two MPC controllers were designed similarly; the manipulated variables and controlled variables are listed in Table 1.

Table 1. MVs and CVs of the MPC for Tubular Reactors

variable    description
MVs
  MV1       flow of hydrogen
  MV2       flow of propylene monomer
CVs
  CV1       concentration of hydrogen
  CV2       density of slurry

Normally, manipulation of the reactor temperature is not desirable even if it is designed as a manipulated variable; in our system, the reactor temperature is controlled by a cascade control loop. During the process, the solid content is an important quality index; it depends on the density of the slurry and can be calculated through a first-principles model. After running online for 7 months, the two MPC controllers for R201 and R202 suffered performance deterioration. The variance of the solid content increased significantly, as is shown in Figure 15.

Figure 15. (a) Solid content (%) during the early APC project phase; (b) solid content (%) after 7 months.

Figure 16. Mutual information plots for R201.
The proposed method was then employed to identify the root cause of the performance decrease during the maintenance period. To guarantee the stability of the system, appropriate dithering signals were added by experienced operators. The data for MPM detection were sampled with a period of 5 min, and a total of 1458 samples were collected. The test was first carried out on R201, and then the same task was performed on R202. The time-delayed mutual information and cross-correlation functions are plotted in Figures 16−19. For R201, the channels from MV1 and MV2 to CV1 exhibited significant MPM, which was correctly detected by the proposed MI-based technique, as is shown in Figure 16. In contrast, the cross-correlation functions detected mismatch only for the first submodel; the mismatch in the second submodel remained undetected, as is shown in Figure 17. For the tubular reactor R202, no MPM was detected by either method, as is shown in Figures 18 and 19. Although the performance decrease of the two MPC controllers emerged simultaneously, MPM arises only in the first MPC; because the cascaded reactors are strongly coupled, the fluctuation in R201 acts as an increased disturbance for R202. It must be noted that the cross-correlation function method is used here not only as a reference for comparison: its results can also offer important diagnostic information to the operators, who can retune the MPC controllers based on the MPM detection results, both linear and nonlinear. For reidentification of the model, however, 1458 samples are insufficient. On the basis of the results obtained using the proposed MPM technique, maintenance was performed on R201, and the performance of the MPC controllers for both R201 and R202 became much better, as is shown in Figure 20.
Figure 17. Cross-correlation plots for R201.
6. CONCLUSION

In this paper, mutual information was proposed to detect model−plant mismatch in nonlinear MIMO systems. Dithering signals are used to excite the process, and the mutual information between the dithering signals and the model residuals is used to detect MPM. For a MIMO system, an MI matrix can be obtained, and a scheme is designed to indicate which subsets of models are the root cause of the performance deterioration and should be regarded as candidates for reidentification. The properties of MI as a detection tool for MPM have been discussed. The proposed detection method was tested on two simulation examples and an industrial polypropylene production process. The application results show that, compared to the cross-correlation methods in the literature, the MI-based detection approach can better detect MPM in systems exhibiting nonlinearity.
Figure 18. Mutual information plots for R202.
Figure 19. Cross-correlation plots for R202.
AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected] (L.X.); [email protected] (J.Z.).

Notes
The authors declare no competing financial interest.
ACKNOWLEDGMENTS

The authors are grateful for financial support from the Natural Science Foundation of China (Grant Nos. 60904039, 60974100, and 61203088), the Zhejiang Provincial Natural Science Foundation (Grant No. LQ12F03015), and the Fundamental Research Funds for the Central Universities.
Figure 20. Solid content (%) after system maintenance.
REFERENCES

(1) Harris, T. J. Can. J. Chem. Eng. 1989, 67, 856−861.
(2) Jelali, M. Control Eng. Pract. 2006, 14, 441−466.
(3) Selvanathan, S.; Tangirala, A. K. Ind. Eng. Chem. Res. 2010, 49, 4210−4229.
(4) Patwardhan, R. S.; Shah, S. L. J. Process Control 2002, 12, 413−427.
(5) Kesavan, P.; Lee, J. H. Ind. Eng. Chem. Res. 1997, 36, 2725−2738.
(6) Jiang, H. L.; Huang, B.; Shah, S. L. J. Process Control 2009, 19, 644−655.
(7) Basseville, M.; Nikiforov, I. Detection of Abrupt Changes: Theory and Application; Prentice-Hall: Englewood Cliffs, NJ, 1993.
(8) Huang, B.; Tamayo, E. C. Chem. Eng. Sci. 2000, 55, 2315−2327.
(9) Harrison, C. A.; Qin, S. J. J. Process Control 2009, 19, 1610−1616.
(10) Badwe, A. S.; Gudi, R. D.; Patwardhan, R. S.; Shah, S. L.; Patwardhan, S. C. J. Process Control 2009, 19, 1305−1313.
(11) Stanfelj, N.; Marlin, T. E.; Macgregor, J. F. Ind. Eng. Chem. Res. 1993, 32, 301−314.
(12) Webber, J. R.; Gupta, Y. P. ISA Trans. 2008, 47, 395−400.
(13) Rawlings, J. B. IEEE Control Syst. Mag. 2000, 20, 38−52.
(14) Zhang, L. F.; Zhu, Q. M.; Longden, A. Int. J. Syst. Sci. 2007, 38, 47−60.
(15) Zhang, L. F.; Zhu, Q. M.; Longden, A. IEEE Trans. Neural Networks 2009, 20, 1−13.
(16) Shannon, C.; Weaver, W. Bell Syst. Tech. J. 1948, 27, 379−423.
(17) Cover, T.; Thomas, J. Elements of Information Theory, 2nd ed.; Wiley-Interscience: Hoboken, NJ, 2004.
(18) Kraskov, A.; Stögbauer, H.; Grassberger, P. Phys. Rev. E 2004, 69, 066138.
(19) Moon, Y. I.; Rajagopalan, B.; Lall, U. Phys. Rev. E 1995, 52, 2318−2321.
(20) Khan, S.; Bandyopadhyay, S.; Ganguly, A. R.; Saigal, S.; Erickson, D. J.; Protopopescu, V.; Ostrouchov, G. Phys. Rev. E 2007, 76, 026209.
(21) Papana, A.; Kugiumtzis, D. Int. J. Bifurcation Chaos 2009, 19, 4197−4215.
(22) Roulston, M. S. Physica D: Nonlinear Phenom. 1997, 110, 62−66.
(23) Paluš, M.; Vejmelka, M. Phys. Rev. E 2007, 75, 056211.
(24) Schreiber, T.; Schmitz, A. Phys. Rev. Lett. 1996, 77, 635−638.
(25) Schreiber, T.; Schmitz, A. Physica D: Nonlinear Phenom. 2000, 142, 346−382.
(26) Joe, H. J. Am. Stat. Assoc. 1989, 84, 157−164.
(27) Estévez, P. A.; Tesmer, M.; Perez, C. A.; Zurada, J. A. IEEE Trans. Neural Networks 2009, 20, 189−201.
(28) Vinh, L. T.; Lee, S.; Park, Y.-T.; d'Auriol, B. J. Appl. Intell. 2012, 37, 100−120.
(29) Huang, B. J. Process Control 2001, 11, 699−715.
(30) Nowak, R. D. Circuits, Syst., Signal Process. 2002, 21, 109−122.
(31) Norquay, S. J.; Palazoglu, A.; Romagnoli, J. A. Chem. Eng. Sci. 1998, 53, 75−84.
(32) Wilson, S. S. Proceedings of the IEEE Southeast Conference; IEEE: Piscataway, NJ, 2005; pp 39−44.
(33) Nowak, R. D.; Vanveen, B. D. IEEE Trans. Signal Process. 1994, 42, 2124−2135.
(34) Arefi, M. M.; Montazeri, A.; J., P.; Jahed-Motlagh, M. R. Chem. Eng. J. 2008, 138, 274−282.
(35) Vörös, J. Syst. Control Lett. 2001, 44, 363−372.
(36) Hasiewicz, Z.; Mzyk, G. IEEE Trans. Autom. Control 2004, 49, 1370−1375.
(37) Sutter, E. Advanced Methods of Physiological Systems Modeling; University of Southern California: Los Angeles, CA, 1987; Vol. 1, pp 303−315.
(38) Zhu, Y. Multivariable System Identification for Process Control; Elsevier Science: Oxford, 2001.
(39) Merkwirth, C.; Parlitz, U.; Wedekind, I.; Engster, D.; Lauterborn, W. OpenTSTOOL, 2009.
(40) Leontaritis, I.; Billings, S. Int. J. Syst. Sci. 1987, 18, 189−202.
(41) Paluš, M.; Komárek, V.; Hrnčíř, Z.; Štěrbová, K. Phys. Rev. E 2001, 63, 046211.
(42) Abarbanel, H. Analysis of Observed Chaotic Data; Springer-Verlag: New York, 1996.
(43) Grüne, L.; Pannek, J. Nonlinear Model Predictive Control: Theory and Algorithms; Springer-Verlag: London, 2011.