A New Subspace Identification Approach Based on Principal

Apr 19, 2015 - Consider the following linear time-invariant discrete system (1) (2)where ... Define the past and future stacked vectors and Hankel dat...
0 downloads 0 Views 2MB Size
Article pubs.acs.org/IECR

A New Subspace Identification Approach Based on Principal Component Analysis and Noise Estimation Ping Wu,*,† HaiPeng Pan,† Jia Ren,† and Chunjie Yang‡ †

Faculty of Mechanical Engineering & Automation, Zhejiang Sci-Tech University, Hangzhou 310018, Zhejiang, China State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, Zhejiang, China



ABSTRACT: In this paper, a new subspace identification approach based on principal component analysis (PCA) and noise estimation is developed for multivariable dynamic process modeling. In contrast to typical subspace identification methods based on standard PCA with instrumental variables, the noise term is first estimated and naturally eliminated in the proposed approach, and then a PCA procedure is used to determine system observability subspace and extract system matrices A, B, C, and D from the estimated observability subspace. For comparison with other typical subspace identification methods based on PCA, numerical simulation and activated sludge process benchmark modeling are included to demonstrate the superiority of the proposed approach and reveal the probable reason for unsatisfied B and D estimations derived by some subspace identification methods based on PCA.

1. INTRODUCTION Subspace identification methods have gained great interest from both academy and industry in the past decades.1−9 Compared with prediction error methods (PEM),10 the most attractive merits of subspace identification methods (SIM) are ease-ofextension for multi-input multi-output (MIMO) systems and simple-of-computation without nonlinear optimization through robust numerical linear algebra tools such as Singular Value Decomposition (SVD) and QR factorization.11 The general SIM procedures for open loop are as follows:12 (1) Identification of the extended observability matrix, block triangular Toeplitz matrix, or realized state sequence through projection and model reduction processes. (2) Estimation of the system parameters from estimate of extended observability matrix and block triangular Toeplitz matrix by regression. Recently, the works of analyzing statistical properties of several SIMs have been carried out.13−16 The dynamic principal component analysis (DPCA) method has been used for system identification.17 Subspace identification based on the indirect DPCA approach is provided for modeling the errors-in-variables (EIV) process.18 However, in ref 19, the authors pointed out that there are some limitations for SIM based on DPCA, such as all variables must have identical noise variance for deriving consistent estimate. A new subspace identification approach called SIMPCA or SMIPCA that employs the PCA procedure to determine the extended observability subspace and block triangular Toeplitz matrix is proposed in the literature.19 In the SIMPCA algorithm, standard PCA is modified with an instrumental variable consisting of past inputs and past outputs. The modified PCA is adopted to determine the observability matrix and the Toeplitz matrix from the parity space equation, and then system matrices A, B, C, and D are extracted through the estimated observability matrix and Toeplitz matrix. Later, another subspace identification-based PCA denoted as SIMPCA-Wc is developed by introducing column weighting for solving the closed-loop identification problem.20 The abovementioned algorithms SIMPCA and SIMPCA-Wc can be © 2015 American Chemical Society

formulated in a framework that includes different instrumental variables. In ref 21, the authors revised subspace identification methods based on PCA and developed a new closed-loop subspace identification algorithm (SOPIM) by extending the SIMPCA algorithm. Recently, the application of SIMPCA has been generalized from fault detection22 to fault-tolerant control.23 The SIMPCA approach requires instrumental variables to be highly correlated with future inputs and future outputs and uncorrelated to the noise term. However, the condition is not always stringently satisfied in the industrial process. The errors caused by adopting instrumental variables may be vital for estimating zeros of the identified process with finite data length. In this paper, we present a new subspace identification method based on basic SIMPCA. The proposed method does not use instrumental variables to remove the noise term. We first estimate the noise term and then the estimated noise sequences are treated as known variables similar to IEM algorithms.24 Then, the subsequent PCA procedure is implemented to determine the system observability subspace and Toeplitz matrix. The proposed method in this paper will be denoted as SIMPCA-E, which means the noise term is estimated and eliminated. The SIMPCA-E method can avoid the usage of instrumental variables. Simulations are conducted to demonstrate the performance of the SIMPCA-E method with comparison to other SIMs. This paper is focused on subspace identification based on PCA for open loop and organized as follows. In Section 2, the problem formulation and assumptions are introduced. Section 3 gives a brief review of the basic SIMPCA algorithm and presents the SIMPCA-E method. Simulations are given on the SIMPCA-E method compared with other methods in Section 4. Conclusions are provided in the final section. Received: Revised: Accepted: Published: 5106

December 17, 2014 March 16, 2015 April 19, 2015 April 19, 2015 DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research

2. PROBLEM FORMULATION AND ASSUMPTIONS Consider the following linear time-invariant discrete system x(k + 1) = Ax(k) + Bu(k) + w(k)

(1)

y(k) = Cx(k) + Du(k) + v(k)

(2)

3. THE PROPOSED METHOD 3.1. Brief Review of SIMPCA Algorithm. Equation 4 is the cornerstone in the subspace identification algorithm based on PCA [I − Hf ]Zf = Γf X f + Gf Wf + Vf

where Zf = [YTf UTf ]T. By denoting Ef = Gf Wf + Vf, we can derive

where x(k)ϵ9 n is the state variables, u(k)ϵ9 m and y(k)ϵ9 l are vector input and vector output, and w(k)ϵ9 n and v(k)ϵ9 l are the process noise and output measurement noise, respectively. System matrices A, B, C, and D are with appropriate dimensions. To establish the statistical consistency of the identification problem, we introduce the following assumptions: A1: The eigenvalues of A are strictly inside the unit circle. A2: (A, C) is observable. A3: The noise processes are assumed to be zero-mean white noises, and they are assumed to be statistically independent of the past input u(k). A4: These noise sequences are uncorrelated. The sampled observations from identification experiments are the measured input uk and output yk. The object is to estimate system matrices A, B, C, and D given finite measurement input− output sequences {u(k)} and {y(k)}. Define the past and future stacked vectors and Hankel data matrices ⎡ y(k − p) ⎤ ⎢ ⎥ ⎢ y(k − p + 1)⎥ yp (k) = ⎢ ⎥ ⋮ ⎢ ⎥ ⎢ ⎥ ⎣ y(k − 1) ⎦

[I − Hf ]Zf = Γf X f + Ef

Denote as the orthogonal complement of Γf with full column rank. Equation 5 can be transformed to eq 6 by multiplying (Γ⊥f )T on both sides (Γ ⊥f )T [I − Hf ]Zf = (Γ ⊥f )T Ef

(6)

The SIMPCA method developed in ref 19 applied an instrumental variable Iv to remove the noise term Ef (Γ ⊥f )T [I − Hf ]Zf Iv = (Γ ⊥f )T Ef Iv

(7)

The instrumental variable Iv needs to satisfy the following conditions ⎧ C1. lim Zf Iv ≠ 0 ⎪ N →∞ ⎨ ⎪ C2. lim Ef Iv = 0 ⎩ N →∞

(8)

On the basis of assumptions A3 and A4, the selected instrumental variables are Iv = 1/N[YTp UTp ] = 1/N ZTp , where ZTp = [YTp UTp ] in SIMPCA19 and Iv = 1/√N ZTp (ZpZTp )−1/2 in SIMPCA-Wc.20 Then, the PCA decomposition will be processed in SIMPCA algorithm 1 T Zf ZpT = PpTpT + PT r r N

Yp = [yp (k)yp (k + 1) ··· yp (k + N − 1)]

(9)

or in SIMPCA-Wc algorithm

Y f = [yf (k)yf (k + 1) ··· yf (k + N − 1)]

1 T Zf ZpT(ZpZpT )−1/2 = PpTpT + PT r r N

where p is the past horizon and f is the future horizon. The other matrices Up, Uf, Wf, and Vf are formulated similarly to Yp, Yf. By iterating eqs 1 and 2, it is straightforward to obtain the extended input−output model in Hankel form

(10)

where Pp and Tp are the loading matrix and score matrix of the principal subspace, and Pr and Tr are the loading matrix and score matrix of residual subspace, respectively. Upon assumption A2, we can derive the following equation from eqs 9 and 10

(3)

with Xf = [x(k) x(k + 1) ··· x(k + N − 1)]. Γf is the extended observability matrix with rank n according to the assumption A2

⎡ ⎤ Γ ⊥f ⎢ ⎥ = PM r ⎢ T ⊥⎥ ⎣ − (H f ) Γ f ⎦

Γf = [CT(CA)T ···(CA f − 1)T ]T

Two lower block triangular matrices Hf and Gf are formulated as follows, ⎡ D 0 ⎢ CB D Hf = ⎢⎢ ⋮ ⋮ ⎢ f 2 − ⎣CA B CA f − 3B ⎡ 0 0 ⎢ 0 C Gf = ⎢⎢ ⋮ ⋮ ⎢ f −2 ⎣CA CA f − 3

(5)

Γ⊥f

⎡ ⎤ y(k) ⎢ ⎥ ⎢ y(k + 1) ⎥ yf (k) = ⎢ ⎥ ⋮ ⎢ ⎥ ⎢ ⎥ ⎣ y(k + f − 1)⎦

Y f = Γf X f + Hf Uf + Gf Wf + Vf

(4)

(11)

where M is any constant nonsingular matrix. In most case, M is chosen as an identity matrix. Considering M=I, separate eq 11,

⋯ 0⎤ ⎥ ⋯ 0⎥ ⋱ ⋮⎥ ⎥ ⋯ D⎦ ⋯ 0⎤ ⎥ ⋯ 0⎥ ⋱ ⋮⎥ ⎥ ⋯ 0⎦

⎡ ⎤ Γ ⊥f ⎡ Pr Γ ⎤ ⎢ ⎥ ⎢ ⎥ P = = r ⎢ T ⊥⎥ ⎣ PrU ⎦ ⎣ − (H f ) Γ f ⎦

(12)

where PrΓ and PrU are matrices with appropriate dimensions. Then, we can estimate Γf and Hf,

5107

Γ ⊥f = Pr Γ

(13)

−(Hf )T Γ ⊥f = PrU

(14) DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research Y f − Ef̂ = Γf X f + Hf Uf

The remaining problem is to extract the system matrices A, B, C, and D from the estimate of Γf and Hf. Many methods are available for this purpose. We refer to ref 12 for a discussion on the detailed solution procedure. In this paper, we employ the system parameters extraction process in ref 19 for comparison purpose. 3.2. The Proposed SIMPCA-E Algorithm. As shown in the previous subsection, the most important step in the SIMPCA algorithm is the PCA procedure. In order to obtain a consistent estimate, instrumental variables are used to eliminate the effect of the noise term Ef. However, the estimate of system matrices B and D are less accurate, although the identified A and C are quite well. The probable cause is that the rigid conditions C1 and C2 are not met well for finite length data. We pursue to derive the loading matrix of residual subspace with small eigenvalues in order to extract the extended observability matrix and block triangular Toeplitz matrix. Because the noise term is not completely removed, this loading matrix of residual subspace may be not appropriate enough and accurate for extracting system matrices. Instead of using the instrumental variable to remove the noise term, we present an intuitive way to achieve the same goal. We first estimate the noise sequences and then directly eliminate the noise term. The advantage of this manipulation is that the errors brought by the PCA procedure can be reduced relative to the instrumental variable method. The noise term can be consistently estimated from the high-order autoregressive model with the exogenous input (ARX) model. 3.2.1. Estimate of Noise Term. For long past horizon p, it is straightforward to derive the following relation12

X f ≈ LzZp

Defining Ỹ f = Yf − Ê f, we obtain the following equation ⎡Y ̃ ⎤ f [I − Hf ]⎢ ⎥ = Γf X f ⎢Uf ⎥ ⎣ ⎦

⎡Y ̃ ⎤ f (Γ ⊥f )T [I − Hf ]⎢ ⎥ = 0 ⎢Uf ⎥ ⎣ ⎦ ⎡Y ̃ ⎤ T ⎢ f ⎥ = PTT + PT ̅ ̅ ⎢Uf ⎥ ⎣ ⎦

min J =

⎡ ⎤ ⎡ PΓ̅ ⎤ Γ ⊥f ⎢ ⎥ = PM ̅ =⎢ ⎥ ⎢ ⎥ T ⊥ ⎢⎣ PU̅ ⎥⎦ ⎣ − (H f ) Γ f ⎦

⎛ ⎞ 1 rank⎜ lim Z ZT ⎟ = mf + n ⎝N →∞ N f p ⎠

2

(17)

(18)

Then we derive, Ef̂ = R33Q 3

(24)

(25)

needs to be satisfied because PCA will perform on 1/N Zf ZTp to estimate Γf and Hf, which means 1/N Zf ZTp should have lf − n zero scores. For the high signal noise ratio (SNR), the condition will be easily satisfied with a large N. However, we can see that the rank condition is not strictly met, while including an instrumental variable for medium SNR with a less large N will lead to an unsatisfied result. On the contrary, the rank condition in SIMPCA-E is easier to meet. Remark 2. The algorithms SIMPCA and SIMPCA-Wc need to deal with conditions C1 and C2, and the input signals should not be white noise. A similar viewpoint has been proposed specific to SIMs with instrumental variables in ref 25. Through avoiding the usage of the instrumental variable to vanish the noise term before performing PCA, we could solve the identification problem as a deterministic case in the SIMPCA-E method. Therefore, there is no requirement for input signals. Remark 3. In general, subspace identification based on PCA would be considered as an approach for solving the errors-invariable (EIV) identification problem because the use of PCA naturally falls into the framework of EIV formulation.19 The noise term Ef could not be consistently estimated for an EIV formulation by ordinary least squares (OLS). The bias-eliminated least-squares (BELS) method26 would be a feasible alternative way to consistently estimate system parameters for improving the SIMPCA-E algorithm. But the SIMPCA-E method could

where ∥•∥2F denotes the Frobenius norm operator. Ê f is the leastsquares residual. QR factorization is a robust numerical tool widely used in the subspace identification method.11 In order to solve eq 17, we perform QR factorization ⎡ Zp ⎤ ⎡ R ⎤ ⎡Q 1⎤ ⎢ ⎥ ⎢ 11 ⎥⎢ ⎥ ⎢Uf ⎥ = ⎢ R 21 R 22 ⎥ ⎢Q 2 ⎥ ⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎢⎣ Y f ⎥⎦ ⎣ R31 R32 R33 ⎦ ⎢⎣Q 3 ⎥⎦

(23)

where M is any constant nonsingular matrix. By choosing M as the identity matrix, the system parameters can be extracted from P̅Γ and P̅U as the procedure described in ref 19. Remark 1. The main difference between SIMPCA-E and SIMPCA or SIMPCA-Wc is that SIMPCA-E does not need instrumental variables to eliminate noise term. In SIMCPA, the rank condition

(16)

F

(22)

where P and T are the loading matrix and score matrix of the principal subspace, and P̅ and T̅ are the loading matrix and score matrix of the residual subspace, respectively. We can derive the following equation for estimating Γf and Hf

From assumption A1 and A4, Ef is uncorrelated with Uf and Zp in open loop, we can derive the unbiased estimate of the noise term Ef in a least-squares way through eq 16 as N tends to infinity. Then, we can derive the noise term estimation Ê f through minimizing the object J

[Γf LzHf ]

(Γ⊥f )T

Perform PCA

where Lz is defined as in ref 13. Then, we can obtain eq 16 by integrating eq 4 and eq 15

⎡ Zp ⎤ Y f − [Γf Lz Hf ]⎢ ⎥ ⎢⎣Uf ⎥⎦

(21)

By multiplying both sides of eq 21 by

(15)

Y f = Γf LzZp + Hf Uf + Ef

(20)

(19)

3.2.2. Estimate of Loading Matrix of Residual Subspace. Substitute Ê f in eq 16 and move Ê f to the left side 5108

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research obtain acceptable results for experiments in which observations are sampled with medium to high signal-to-noise ratios. 3.2.3. Summary of SIMPCA-E Algorithm. The proposed algorithm “SIMPCA-E” can be summarized as follows: Step 1. Estimate the noise term Ê f from eqs 18 and 19. Step 2. Construct eq 21 from eq 20. Step 3. Perform PCA on the left term of eq 23 to derive the loading matrices of residual subspace. Step 4. Extract system parameters from eq 24 with the similar procedure in reference.19

4. SIMULATIONS In this section, a MIMO numerical example and an activated sludge process benchmark modeling example are used to evaluate the performance of the proposed method compared to other SIMs. 4.1. MIMO Numerical Example. The benchmark example is a MIMO dynamic system presented in the literature.19 The system matrices of the process are described as follows:

Figure 1. Singular values distribution in PCA procedure with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (“o”: singular value): Case 1.

⎡ 0.67 0.67 0 0 ⎤ ⎢ ⎥ −0.67 0.67 0 0 ⎥ ⎢ A= ⎢ 0 0 −0.67 − 0.67 ⎥ ⎢ ⎥ ⎣ 0 0 0.67 −0.67 ⎦ ⎡ 0.6598 −0.5256 ⎤ ⎢ ⎥ 1.9698 0.4845 ⎥ ⎢ B= ⎢ 4.3171 −0.4879 ⎥ ⎢ ⎥ ⎣−2.6436 −0.3416 ⎦ ⎡−0.5749 1.0751 − 0.5225 0.1830 ⎤ C=⎢ ⎥ ⎣−0.2977 0.1543 0.1159 0.0982 ⎦ ⎡−0.7139 − 0.1174 ⎤ D=⎢ ⎥ ⎣ −0.3131 0.2876 ⎦

In order to demonstrate the performance compared to SIMPCA and SIMPCA-Wc, here we consider two types of input signals. The first is white noise input, and the second signal is a combined sinusoidal wave input. In each experiment, we generate 2000 samples of input and output data to identify the system matrix. The system order can be determined by the characteristic polynomial (CP) method or reconstruction error (RE) method.19 Here, we assume the system order 4 is a prior known. The past horizon and future horizon are chosen as 5. A total of 50 experiments are conducted for each case. Case 1: White Noise Inputs with Low Variance Process Noise and Measurement Noise. In this case, w(k) and v(k) are Gaussian processes with zero mean and with standard deviation: σw = 0.01 and σv = 0.01. The input excitation signal is Gaussian processes with zero mean and unit variance. As shown in Remark 1, the PCA procedure would indicate lf − n (6 in this simulation) zero scores in SIMPCA, SIMPCAWc, and SIMPCA-E algorithms in eqs 9, 10, and 23. The comparison results of singular values distributions are show in Figure 1. From this figure, it is shown that all three algorithms can meet the rank requirement well and derive an accurate loading matrix of the residual subspace in the PCA procedure for low variance noises. Figure 2 shows that the estimated poles are accurate and close. The identified zeros are pictured in Figure 3. The SIMPCA-E method can derive more accurate zeros estimation.

Figure 2. Pole estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 1.

Case 2: White Noise Inputs with Medium SNR. In this case, w(k) and v(k) are Gaussian processes with zero mean and with the large standard deviation: σw = 0.1 and σv = 0.4. The input excitation is the same as Case 1. The comparison results of the distribution of singular values of left term in eqs 9, 10, and 23 are shown in Figure 4. From this figure, it is shown that the proposed SIMPCA-E can meet the rank requirement well and derive a more accurate loading matrix of residual subspace in the PCA procedure than the SIMPCA and SIMPCA-Wc methods. The identified poles are pictured in Figure 5. Although the other two SIMs can derive accurate poles estimates, SIMPCA-E can derive more excellent results in zero estimates as shown in Figure 6. The results show that SIMPCA 5109

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research

Figure 5. Pole estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 2.

Figure 3. Zero estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 1.

Figure 4. Singular values distribution in PCA procedure with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (“o”: singular value): Case 2. Figure 6. Zero estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 2.

and SIMPCA-Wc would fail in zero estimates with medium SNR. The probable reason could be deduced from the less accurate extraction in the PCA procedure. Case 3: Sinusoidal Waves Inputs. In this case, w(k) and v(k) are Gaussian processes with zero mean and with the following standard deviation: σw = 0.1 and σv = 0.4.The excitation inputs are the combination of 10 sinusoidal waves with different frequencies and scaled to unit variance: u(1,k) = ∑10 i sin(0.3898πik), u(2,k) = ∑10 i cos(0.4536πik). As shown in Remark 2, compared with the white input, the results of the combination sinusoidal waves input can derive better performance. From Figure 7, it is shown that the

distribution of singular values makes it more feasible though it is not still perfect. The identified poles are pictured in Figure 8. The pole estimations are still quite accurate for all three algorithms. The comparison results of zeros estimation are shown in Figure 9, and the performance of zero estimation is improved compared to Case 2. But the proposed method SIMPCA-E still obtains better performance than the SIMPCA and SIMPCA-Wc algorithms. 5110

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research

Figure 7. Singular values distribution in PCA procedure with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (“o”: singular value): Case 3. Figure 9. Zero estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 3.

Figure 8. Pole estimation with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 3.

Figure 10. Singular values distribution in PCA procedure with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (“o”: singular value): Case 4.

Case 4: Sinusoidal Waves Inputs for EIV Problem. As shown in Remark 3, the proposed SIMPCA-E algorithm can achieve a fair performance for the EIV identification problem. In this case, w(k) and v(k) are Gaussian processes with zero mean and with the following standard deviation: σw = 0.1 and σv = 0.4. In this experiment, the noise-free input is the same as Case 3. The noise-free input is contaminated by a white noise with standard deviate 0.4. The comparison results of singular values distribution are show in Figure 10. From this figure, the performance for EIV problem is similar to Case 3. The identified poles are pictured in Figure 11. Figure 12 shows the estimated poles. The performance

falls slightly compared to Case 3. Generally speaking, subspace identification methods based on PCA can solve the EIV identification problem in an elegant way with medium to high SNR noises. The SIMPCA-E can gain better performance than the SIMPCA and SIMPCA-Wc methods. 4.2. Industrial Process Benchmark Example. In this work, the Activated Sludge Wastewater Treatment PlantUniversity of Sao Paulo (ASWWTP-USP) benchmark27 is used as a data generator. A three-order LTI discrete-time states pace multivariable model that describes the nitrate concentrations in the anoxic and aerobic zones around an operating point is estimated by SIMs. Because of ease-of-implementation and 5111

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research

Figure 13. Process data of ASWWTP-USP benchmark27

Figure 11. Pole estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 4.

Figure 12. Zero estimates with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms for 50 Monte Carlo experiments (red “+”: model value. blue “x”: estimated value): Case 4.

Figure 14. Comparison results for poles estimates (up) with SIMPCA, SIMPCA-Wc, and SIMPCA-E algorithms and singular values distribution (bottom) in PCA procedure.

similarity of white noises, the pseudorandom binary sequences (PRBS) signals are widely used in system identification modeling. In this case, we use PRBS as the excite signals. The preprocessed data are shown in Figure 13. We use first 1000 sampled data to estimate the model and keep 400 sampled data to validate. In the simulation, the order is chosen as 3, and the past and future horizons are chosen as 6.

The identified poles by SIMPCA, SIPMCA-Wc, and SIMPCA algorithms and singular values distribution in PCA are pictured in Figure 14. It shows that the poles estimates are close; however, the singular values distribution of SIMPCA and SIMPCA-Wc are not satisfied (lf − n should be 9). The SIMPCA-E can derive a better estimate of singular values distribution. 5112

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research

decomposition and eliminated before the PCA procedure. Then, the identification problem can be handled in a noise-free circumstance. Compared with other SIMs based on PCA, the PCA procedure of the proposed method can obtain a more accurate loading matrix of residual subspace. Simulations based on benchmark problems are used to compare the proposed algorithm SIMPCA-E with other SIMs. The comparison results show the superiority of the proposed method. From the simulation results, we also noted that the cause of the unsatisfied results by SIMCA and SIMPCA-Wc would be the poor estimate of loading the matrix of the residual subspace in the PCA procedure, especially for a white noise input exciting signal. However, it should be noted that we do not use this SIMPCA-E method for a closed loop because the noise term would be biased under a closed loop. Future interests will focus on the statistical analysis of the SIMPCA-E method and modification for a consistent estimate for the EIV problem and closed-loop data.

However, as Remark 2 demonstrated, SIMPCA and SIMPCAWc are not similar enough to white noises-type excited inputs. Figure 15 shows the prediction outputs by the SIMPCA,



AUTHOR INFORMATION

Corresponding Author

*Tel: +86 15658086539. E-mail: [email protected]. Notes

The authors declare no competing financial interest.



Figure 15. Response comparison for output 1 and output 2 with SIMPCA-Wc, SIMPCA-E, and SIMPCA algorithms.

ACKNOWLEDGMENTS This work is supported by the Science Foundation of Zhejiang Sci-Tech University (ZSTU) under Grant No. 14022086-Y, Foundation of Zhejiang Educational Committee (Grant Y201326698), and National Natural Science Foundation of China (Grant 61203177).

SIMPCA-Wc, and SIMPCA-E methods. The comparison results show that SIMPCA-E can derive better performance than the other methods. SIMPCA-Wc and SIMPCA cannot capture the change of outputs, and SIMPCA-Wc gets better performance than SIMPCA. In order to demonstrate the performance of SIMPCA-E, we use the widely adopted mean relative squared error (MRSE) index in the literature27 as the performance indicator MRSE =

1 l

l

∑ i=1



(1) Van Overschee, P.; De Moor, B. Subspace Identification for Linear Systems Theory-Implementation-Applications; Kluwer Academic: Dordrecht, The Netherlands, 1996. (2) Van Overschee, P.; Favoreel, W.; De Moor, B. Subspace state space system identification for industrial processes. J. Process Control 2002, 10, 149−155. (3) De Moor, B.; Van Overschee, P. A unifying theorem for three subspace system identification algorithms. Automatica 1995, 31, 1853− 1864. (4) Larimore, W. E. Canonical Variate Analysis in Identification, Filtering and Adaptive Control. In Proceedings of the 29th Conference on Decision and Control: Honolulu, HI, 1990; pp 596−604. (5) Verhaegen, M.; Dewilde, P. Subspace model identification, part i: The output-error state-space model identification class of algorithms. Int. J. Control. 1992, 56, 1187−1210. (6) Van Overschee, P.; De Moor, B. N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica 1994, 30, 75−93. (7) Huang, B.; Kadali, R. Dynamic Modeling, Predictive Control and Performance Monitoring: A Data-Driven Subspace Approach. In Lecture Notes in Control and Information Sciences; No.374; Springer Verlag: London, 2008. (8) Olofsson, K. E. J.; et al. Subspace identification analysis of RFX and T2R reversed-field pinches. Control. Eng. Pract. 2013, 21, 917−929. (9) Danesh, P. N.; Huang, B.; Shah, S. L. Subspace approach to identification of step-response model from closed-loop data. Ind. Eng. Chem. Res. 2010, 49, 8558−8567. (10) Ljung, L. System Identification − Theory for the User, 2nd ed; Prentice Hall: Upper Saddle River, NJ, 1999. (11) Katayama, T. Subspace Methods for System Identification; Springer Verlag: London, 2005. (12) Qin, S. J. An overview of subspace identification. Comput. Chem. Eng. 2006, 30, 1502−1513.

N

∑ j = 1 (yi (j) − yi ̂ (j))2 N

∑ j = 1 (yi (j))2

where yi is the ith real output, and the ŷi is the ith estimated output. In order to demonstrate the performance of the SIMPCA-E algorithm, we compare the results with other methods used in ref 27. The results are included in Table 1. Table 1. MSRE Indexes by Subspace Identification algorithm

identification (%)

validation (%)

SIMPCA-E MOESP5 N4SID1 uCCA4 cCCA29 DSR28 “robust” N4SID1

44.32 31.81 44.49 40.4417 40.1652 34.2450 34.8394

50.57 57.58 72.92 69.9404 69.2404 50.9904 57.7508

REFERENCES

From Table 1, we see that the proposed SIMPCA-E approach can produce a satisfied estimate model compared with other typical methods, especially for validation data.

5. CONCLUSIONS In this paper, a new subspace identification method (SIMPCA-E) based on PCA and noise estimation is proposed. The noise term is first derived through the least-squares method by QR 5113

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114

Article

Industrial & Engineering Chemistry Research (13) Knudsen, T. Consistency analysis of subspace identification methods based on a linear regression approach. Automatica 2001, 37, 81−89. (14) Bauer, D. Asymptotic properties of subspace estimators. Automatica 2005, 41, 359−376. (15) Chiuso, A. On the asymptotic properties of closed loop CCA-type Subspace Algorithms: equivalence results and choice of the future horizon. IEEE Trans. Autom. Control 2010, 3, 634−649. (16) Chiuso, A. The asymptotic variance of subspace estimates. J. Econometrics 2004, 118, 257−291. (17) Ku, W. F.; Storer, R. H.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179−196. (18) Li, W.; Qin, S. J. Consistent dynamic PCA based on errorsinvariables subspace identification. J. Process Control 2001, 11, 661−678. (19) Wang, J.; Qin, S. J. A new subspace identification approach based on principal component analysis. J. Process Control 2002, 12, 841−855. (20) Wang, J.; Qin, S. J. Closed-loop subspace identification using the parity space. Automatica 2006, 42, 315−320. (21) Huang, B.; Ding, S. X.; Qin, S. J. Closed-loop subspace identification: an orthogonal projection approach. J. Process Control 2005, 15, 53−66. (22) Ding, S. X.; Zhang, P.; Naik, A. S.; Ding, E. L.; Huang, B. Subspace method aided data-driven design of fault detection and isolation systems. J. Process Control 2009, 19, 1496−1510. (23) Ding, S. X. Data-driven design of monitoring and diagnosis systems for dynamic processes: A review of subspace technique based schemes and some recent results. J. Process Control 2014, 24, 431−449. (24) Lin, W.; Qin, S. J.; Ljung, L. On Consistency of Closed-Loop Subspace Identification with Innovation Estimation. In 43rd IEEE Conference on Decision and Control: Lost Angeles, CA, 2004; pp 2195− 2200. (25) Chou, C. T.; Verhaegen, M. Subspace algorithms for the identification of multivariable dynamic errors-in-variables models. Automatica 1997, 33, 1857−1869. (26) Zheng, W. X. A bias correction method for identification of linear dynamic errors-in-variables models. IEEE Trans. Autom. Control 2002, 47, 1142−1147. (27) Sotomayor, O. A. Z.; Park, S. W.; Garcia, C. Multivariable identification of an activated sludge process with subspace-based algorithms. Control Eng. Pract 2003, 11, 961−969. (28) Di Ruscio, D.; Subspace system identification-theory and applications (6th ed.). Lectures notes; Telemark Institute of Technology, Norway, 2000. (29) Peternell, K.; Scherrer, W.; Deistler, M. Statistical analysis of novel subspace identification methods. Signal Process. 1996, 52, 161−177.

5114

DOI: 10.1021/ie504824a Ind. Eng. Chem. Res. 2015, 54, 5106−5114