Gross Error Detection in Serially Correlated Process Data. 2. Dynamic

Accepted September 12,1991. Gross Error Detection in Serially Correlated Process Data. 2. Dynamic. Systems. Chen-Shan Kao, Ajit C. Tamhane, and Richar...
0 downloads 0 Views 1MB Size
254

Ind. Eng. Chem. Res. 1992,31, 254-262

Cygnarowin, M. L.; Seider, W. D. Effect of retrograde solubility on the design optimization of supercritical extraction processes. Znd. Eng. Chem. Res. 1989,2%, 1497-1503. Dahl, S.; Rasmussen, P.; Fredenslund, Aa. The MHV2 model: A UNIFAC-based equation of state model for prediction of gas solubility and vapor-liquid equilibria at low and high pressures. Znd. Eng. Chem. Res. 1991,30,1936-1944. Duff, I. S. A set of FORTRAN subroutines for sparse unsymmetric linear equations. U.K.Atomic Energy Authority, Harewell Laboratory, [Report] AERE-R; AERE-R8730; Her Magisty’s Stationary Office, HMSO London, 1979. Fredenslund, Aa.; Gmehling, J.; Rasmussen, P. Vapor-Liquid Equilibria Using UNIFAC. Elsivier: Amsterdam, 1977. Gani, R.; Cameron, I. T. Extension of dynamic models of distillation columns to steady state sirnulation. Comput. Chem. Eng. 1989, 13 (3), 271-280.

Gani, R.; Perregaard, J.; Johanaen, H. Simulation strategies for design and analysis of complex chemical processes. Trans. Znst. Chem. Eng. 1990,68 (PartA), 407-417. Hillestad, M.; Hertzberg, T. Dynamic simulation of chemical engineering systems by the sequential modular approach. Comput. Chern. Eng. 1986,10,377-388. Pantelides, G. C.; Gritsis, D.; Morison, K. R.; Sargent, R. W. H. The mathematical modelling of transient systems with differentialalgebraic equations. Comput. Chem. Eng. 1988,12 (5), 449-454.

Perregaard, J.; Pedersen, B. S.; Gani, R. Steady state and dynamic simulation of complex chemical processes. Presented at the 4th International Symposium on Process Systems Engineering,Montebello, Canada, Aug 5-9, 1991. SkjoldJlargensen,S. Group contribution equation of state (GCEOS): A predictive model for phase equilibrium computations over wide ranges of temperature and pressures up to 30 MPa. Znd. Eng. Chem. Res. 1988,27,110-118. Soave, G. Equilibrium constants from a modified Redlich-Kwong equation of state. Chem. Eng. Sci. 1972,27,1197-1203. Smensen, E. L. Dynamic simulation of chemical processes with accurate thermodynamic models. Ph.D. Thesis, The Technical University of Denmark, September, 1991. Ssrensen, E. L.; Johansen, H.; Gani, R.; Fredenslund, Aa. A dynamic simulator for design and analysis of chemical processes. In Process Technology Proceedings;Bussemaker, H. Th.,Iedema, P. D., E&.; Elsivier: Amsterdam, 1990; Vol. 9, pp 13-18. Tiegs, D.; R m m w e n , P.; Fredenslund, Aa. Vapor-liquid equilibria by UNIFAC group contribution. 4. Revision and extension. Znd. Eng. Chem. Res. 1987,26, 159-161. Received for review March 11, 1991 Revised manuscript received September 3, 1991 Accepted September 12,1991

Gross Error Detection in Serially Correlated Process Data. 2. Dynamic Systems Chen-Shan Kao, Ajit C. Tamhane, and Richard S. H. Mah* Robert R. McCormick School of Engineering a n d Applied Science, Northwestern University, Evanst on, I1linois 60208- 3120

The importance of gross error detection and identification is widely acknowledged, but so far relatively few publications in the chemical engineering literature have addressed this problem for stochastic dynamic systems. Also, statistical process control (SPC) chart techniques have been developed and used extensively to detect shifts (a change in steady state) in the process means but have been hitherto ignored in applications involving detection of gross errors in chemical process data. In this paper a composite test procedure for detecting and identifying gross errors is proposed: we first make use of the CUSUM test (which underlies the CUSUM chart used in quality control applications) to detect the presence of gross errors. We then apply the generalized likelihood ratio (GLR) method to identify as well as to estimate the magnitudes of the gross errors. We concentrate our treatment on discrete-time, linear dynamic systems operated around a nominal steady state and corrupted by process and measurement noises, which are assumed to be Gaussian. These noises can be white or serially correlated. Computer simulation studies and some analytical results are presented. Methods for gross error detection and identification in steady-state chemical processes have been studied extensively in the past two decades (Tamhane and Mah, 1985; Mah, 1990). But in reality, even for steady-state operations, process conditions vary continually about a nominal steady state. A truly steady state is almost never attained. A dynamic system will be a better representation of a real process. A mathematical model which describes the dynamic behavior of the process under normal operating conditions based on physical information and statistical data is a very powerful tool in gross error detection and in process performance evaluation. Detection of gross errors in a dynamic system is more difficult than in a steady-state system since the effects of gross errors at any time depend on the dynamic behavior of the process, and thus may change with time. In a previous paper (Kao et al., 1990),we investigated the effect of serial correlations of process measurements

* To whom correspondence should be addressed. 0888-5885 f 92/2631-O254$03.Oo f 0

in the steady-state case. We showed the extreme sensitivity of the measurement test (MT) for gross error detection to serial correlations. We also proposed two methods which work for serially correlated data. The first one involves suitably adjusting the variance of the test statistics. The second, prewhitening, involves filtering out the serial correlations and then applying the desired test based on the independence assumption. In dynamic systems, serial correlations can arise from process and measurement noises. Therefore it is of interest to extend the previously developed methods of gross error detection (particularly the prewhitening method) to the dynamic case. This paper represents a major step in this direction. Only a few publications in the chemical engineering literature have addressed gross error detection for dynamic systems (Behgham and Lees, 1977;Watanabe and Himmelblau, 1982;Narasimhan and Mah,1988;Gertler and Luo, 1989). We begin by considering a discrete-time,linear dynamic system operated around a nominal steady state and cor@ 1992 American Chemical Society

Ind. Eng. Chem. Res., Vol. 31, No. 1, 1992 255 rupted by process and measurement noises, which are assumed to be Gaussian and white (serially uncorrelated). We assume that the system is completely observable and completely controllable and the noises are stationary. These conditions are necessary for a steady-state filter and a steady-state optimal feedback control action (Madregor, 1973; Maybeck, 1979,1982). In this case the innovation sequence is also white. We propose applying the different statistical process control (SPC)methods to this sequence to detect gross errors. Next we consider situations where the noises are not white. Serial correlations in the process and measurement noises can arise due to many reasons, e.g., misspecification of the dynamic structures of the process and measurement models, residual instrument errors, etc. We propose a prewhitening approach for serially correlated process and measurement noises, which can be modeled using the autoregressive moving average (ARMA) time series family of models (Box and Jenkins, 1976). After prewhitening, the innovation sequence becomes white and the standard SPC testa can be applied to it. Our proposed tests are evaluated using simulation methods applied to a liquid level control process example. Formulation of the Models for Dynamic Systems and Gross Errors In this paper we focus on the discrete-time linear dynamic systems in the following standard-state space form as follows: state equation: x(t+l) = Ax(t) + Bu(t) + w(t) (1) where x(t) is an n X 1 vector of state variables at time t , u(t) is an r X 1 vector of control input variables at time t, w(t) is an n X 1 vector of process noises at time t, A is an n X n state transition matrix, and B is an n X r input matrix. This equation describes how certain fundamental quantities such as mas, energy, etc. vary with time. These fundamental quantities are known as state variables, and their values describe the behavior of the process. This behavior is observed through measured variables, y(t), which are related to the state variables via measurement equation: y(t) = Hx(t) + v(t) (2) where y(t) is an m X 1 vector of output variables (measurements) at time t, v(t) is an m X l vector of measurement noises at time t, and H is an m X n measurement matrix. Initially we will assume that the random vectors w(t) and v(t) are Gaussian and white with mean 0 and covariance matrices Q and R,respectively. For the system described by eqs 1 and 2, the Kalman filter (Kalman, 1960) provides the minimum variance unbiased estimate f(t-1 of the state vector x(t) based on all the measurements up to time t - 1. The corresponding innovations (also known as measurement residuals, measurement adjustments) are defined by r(t) = y(t) - 9(t) = y(t) - Hf(t-)

(3)

It is well-known (Maybeck, 1979) that the innovations r(t) form a zero mean Gaussian white noise sequence with covariance matrix V ( t )= (HP((t-l)-)HT+ R)

(4)

where P((t-l)-) is the covariance matrix of the state estimate error vector, Le., x(t) - t ( t - ) ,at time t. The innovations r(t) are used for gross error detection and identification since they represent the discrepancy between the observed and predicted data. In addition to the above estimator, we have a linear closed-loop control law

u(t) = -Lt(t+)

(5)

where i(t+) are the updated state estimates at time t given all measurements up to time t which are calculated from the Kalman filter and L is an r X n control matrix which can be obtained by making use of the linear quadratic Gaussian (LQG) algorithm (Stengel, 1986, Maybeck, 1982). The following assumptions are made in modeling gross errors. (a) We assume that each gross error is a step function and that the constant magnitude 6 of each gross error is not known. A gross error in any measurement can occur at any arbitrary time and will remain in place until it is detected and removed. (b) At any time we allow at most one gross error to be present in the set of measurements. In this study three types of gross error models previously proposed by Narasimhan and Mah (1988) are considered. They are as follows. Measurement Bias. A bias of magnitude 6 in the ith measurement can be modeled by the measurement equation y(t) = Hx(t) + ~ ( t + ) hi,, (6) where ei,,,is the ith unit vector of dimension m. Process Leak. A leak of magnitude 6 in process unit i can be modeled by the state equation x(t+l) = Ax(t) + Bu(t) + ~ ( t+)6hi (7) where hi = [0, 0, ..., hi, ..., OIT is the gross error vector for leak in unit i with zeros in every position except the ith place. For more detail regarding how to determine hi,see Narasimhan and Mah (1988). Controller Bias. A bias of magnitude 6 in the ith controller can be modeled by the control equation u(t) = -Lf(t+) + hi,,

(8)

Effect of a Gross Error on the Innovations It is well-known (Maybeck, 1979) that, under the null hypothesis of no gross errors, the innovations are serially independent, normally distributed with conditional mean vector and covariance matrix given by E[r(t)lY(t-l)] = 0 Cov[r(t)lY(t-l)] = (HP((t-l)-)HT+ R) = V ( t )

(9)

(10)

where Y(t-1) is the set of measurement vectors yo') from - 1, i.e., Y(t-1) = (y(l),y(2), ...,y(t-111. When a gross error has occurred, it can be shown (Narasimhan, 1987) that ita effect on the innovations can be expressed as

j = 1 to t

r(t) = rl(t) + 6G(t)fi

(11)

where rl(t) is the innovation vector obtained if no gross errors are present. The matrix G(t) is referred to as the signature matrix. Formulas for computing the signature matrices for different types of gross errors are given in Appendix A. The vector fi depends on the type of gross error and is given by fi =

i

ei,

eim

hi

for bias in the ith controller, i = 1,2, ..., r for bias in the ith measurement, i = 1, 2, ..., m for leak in the ith process unit, i = 1, 2, ..., q (12)

Narasimhan and Mah (1988) have shown that when a gross error is present, the innovations are still serially independent, normally distributed with conditional mean

256 Ind. Eng. Chem. Res., Vol. 31, No. 1, 1992

vector and covariance matrix given by E[r(t)lY(t-l)] = SG(t)fi

(13)

covariance matrix of f(tM (t = 1,2, ...). The resulting test will declare a gross error in the ith measurement at time tN if

Cov[r(t)lY(t-l)] = Cov[rl(t)JY(t-l)] = (HP((t-l)-)HT + R) = V(t) (14) In other words, the expected value of r(t) changes to a nonzero value (which is also a function of time), but the covariance matrix remains unchanged. In the steady-state case (Kao et al., 19901, the expected value of r(t) changes to a constant nonzero value. The hypotheses for the presence of a gross error can thus be formulated as

I

Ho:E[r(t)IY(t-111 = 0 HI:.E[&) IY(t-1)l Z 0

(15)

Detecting Gross Errors Using Statistical Process Control Techniques There are a variety of statistical process control (SPC) techniques available which are graphical implementations (time order plots) of the corresponding testa. Different SPC testa are compared on the basis of their average run lengths (ARL's), where the ARL of an SPC test is the expected number of sampling periods before the test gives an out-of-control signal. If Hois true, then the ARL (denoted by ARL(0)) should be large so that the frequency of false signals is low. If HIis true, then the ARL (denoted by ARL(b/u)) should be small. Here S/u is the magnitude expressed in the units of true gross error present under HI, of the standard deviation. The problem is to find a test, which for a specified value of the in-control ARL(0) (taken to be 200 in the present paper) minimizes the out-of-control ARL(G/u) for a specified magnitude S / u of the gross error. The Measurement Test (MT) In steady-state chemical processes, methods based on the measurement test (MT) have been widely used to detect measurement biases (Mah and Tamhane, 1982; Iordache et al., 1985; Serth and Heenan, 1986; Rosenberg et al., 1987; Kao et al., 1990). The MT corresponds to the SPC test underlying the Shewhart f chart (Montgomery, 1990). The major advantages of the MT are that it is simple to use and can identify the measurement biases directly. Narasimhan and Mah (1987) proved that the generalized likelihood ratio (GLR) test is equivalent to the MT for identifying measurement biases. Therefore, any comments made about MT also apply to the GLR test. We can apply the MT in the dynamic case as follows. For the ith innovation at time t, the ratio z,(t) = ri(t)/(uii(t))l12

i = 1, 2,

..., m

(16)

can be used as a standardized normal test statistic under the hypothesis of no gross errors at time t; here uii(t)is the ith diagonalelement of V ( t ) .The MT declares that a gross error is present in the ith measurement at time t if lzi(t)l > zPI2 i = 1, 2, ..., m (17) where @ = 1 - (1- a ) l / m .This choice of the critical constant controls the overall type I error probability Sa,and hence ARL(0) 1 1/a, because of the statistical dependence among the zi(t)'s statistics for i = 1, 2, ..., m; see Sidak (1967). In this case, we require ARL(0) 1 200. Hence a = 0.005 and j3 = 1- (0.995)1/mis used. For convenience, we can apply the MT once every N sampling periods to the r(t)'saveraged over those periods. W N )dpotes the averaged innovation vector at time tN, and V(tN), the

where D i i ( t N ) is the ith diagonal element of O(tN). Tests Based on the Global Test (GT) Statistic in Dynamic Systems The main disadvantage of applying MT directly is that it may have difficulty in identifying the location of a gross error since the expected value of zi(t) under HI(eq 16) changes with time. Thus the gross errors get smeared over all the components of the innovation vector r(t) as time passes. On the other hand, the GT statistic (Almasy and Sztano, 1975; Tamhane and Mah, 1985) is designed to detect the presence of a gross error but not its location. Therefore it may be more efficient to detect the presence of a gross error using the GT statistic, and if a gross error is indicated, then identify its location using another technique. Several testa can be formulated based on the GT statistic, which is a quadratic form in r(t):

d t ) = (r(t))T(v(t))-lr(t) (19) It is known that under Ho, ~ ( tis)a central x2 variable with m degrees of freedom (dfj. (The degrees of freedom of eq 19 are equal to m since the covariance matrix V(t) of r(t) is nonsingular and ita rank is equal to m). This statistic underlies the x2 chart used in multivariate quality control (Montgomery, 1990). We now describe several tests based on this test statistic. A Test Based on Individual Data Vectors. This test (referred to as the x2 test) declares a gross error if 7(t) > X m , 2

(20)

where the critical value xm,a2,the 1 - a quantile of the central x2 distribution with df = m, is chosen such that the ARL(0) under Ho is controlled around 200. (In other words, the value of a = l/m = 0.005 and the critical value x ~ , is used.) ~ . ~The ~ test is applied at every sampling time until it first rejects He The Moving Average (MA) Test. The previous test has the feature that it uses only the current data and ignores the past data. Since the effect of a gross error on the innovation vector varies from time to time in any dynamic system, it seems reasonable to develop methods which take into account the historical data in some way. The moving average (MA) test with the moving window size, say N, uses the N most recent data values with equal weight. The test is applied after every k samplings of the data (1 Ik S N). Each new sampling of k seta of data forces the oldest k seta of data in the group out of the computation. The test statistic is given by J.(tk) =

5 5

T(S)

=

s=tk-Nil

s=tk-Nil

(r(s))T(v(s))-lr(s) t = 1,2, ... (21)

It is easily shown that +(tk) follows a central x2 distribution with df = mk under He We declare a gross error if + ( t k ) > Xmk,az

(22)

where the critical value x , k , 2 is chosen such that ARL(0) under Hois controlled around 200. When k < N, succeasive $(tk) statistics are correlated because of the overlap (where

Ind. Eng. Chem. Res., Vol. 31, No. 1, 1992 257 the overlap is defined as N - k). Therefore if we use the with a = 1/200, it will not give ARL(0) critical value xmk,2 = 200. The necessary critical value must be found by trial and error. However, when k = N (no overlap), the successive test statistics are independent and a = 1/200 will give the desired ARL(0) = 200. The Exponentially Weighted Moving Average (EWMA) Test. The EWMA is a statistic that gives less weight (in geometric proportion) to the data as they become older. The EWMA is defined by r*(t+l) = (1- h ) T * ( t ) + h T ( t ) (23)

apply the GLR test and estimate the magnitude of the gross error only if the GLR test is also positive. Otherwise,

the GT-GLR method will continue until both testa are positive.

Systems with Serially Correlated Noises: Prewhitening Procedure For a system with white process and measurement noises, the Kalman filter provides optimal (minimum variance unbiased) estimates of the state variables. Since the resulting innovations are serially independent, the GLR Here h (0 < h < 1) is a fixed weighting constant, and ~ ( t ) test can be derived easily and gives accurate estimates of the gross errors. However, in reality due to the model is the x2 statistic defined by eq 19. We may take the uncertainty, linearization, etc., the noises may not be white starting value T*(O) = 0. The EWMA test concludes that but may be serially correlated. Such serially correlated a gross error is present at time t if noises may be modeled by autoregressive moving average 7 * ( t ) > c1 (24) (ARMA) time series (Box and Jenkins, 1976). In such situations, if we apply the standard Kalman filter algorwhere c1 is chosen such that the ARL(0) is controlled ithm directly to the system, it will give a suboptimal esaround 200. Note that, for h 1,the EWMA test behaves timate and the resulting innovations r(t) will not be serially like the test for individual data vectors (eq 20). Typically independent. We propose a prewhitening procedure for h is chosen between 0.1 and 0.3 (Hunter, 1986). such cases which involves first prewhitening the system The CUSUM Test. Another way to deal with the noises and then applying the standard methods as before. historical data is to use the CUSUM test, which gives equal For simulation studies, we only consider a system with a weight to all the data up to the present time t. The CUserially correlated measurement noise sequence, which SUM statistic a t time t may be defined by follows an ARMA(1,l) time series model.

t

'Y(t)= CdS) s=l

(25)

Measurement Bias Model after Prewhitening

The test statistic we will use is somewhat different from eq 25. We know that under Ho, T ( S ) is a central x2 variable with df = m. Therefore

Now we will consider a measurement bias of magnitude 6 (eq 6) in the ith measurement. After prewhitening the system becomes

E[T(s)] = m (26) Instead of computing the cumulative sum of ~ ( s ) ,we will compute the cumulative sum of the deviations given by d(s) = T ( S ) - m (27)

state equation: x*(t+l) = A*x*(t) + B*u(t) + C*w(t)

(30)

measurement equation: y*(t) = H*x*(t) + a(t)+ 6*ei,,

(31)

which has expectation zero under Ho, so the cumulative s u m does not become too large. Thus the test statistic is defined by

control equation:

t

'YW = C d ( 4 s=1

(28)

(32)

The definitions of the vectors x*(t) and y*(t) and the matrices A*, B*, C*, and H*are given in Appendix B for the ARMA(1,l) model. L* is an r X [2n + m] matrix which is given by

instead of eq 25. We then declare a gross error if

'Ye)>

u(t) = -L*x*(t)

-L* = [-LlO,lO,]

(33)

(29) Again the critical value c2 is chosen (by trial and error) to control the ARL(0) around 200.

where 0, and 0, are null matrices of dimensions r X n and r X m, respectively. Also 6* is given by

Identification and Estimation of Gross Errors Using the GLR Method If one of the aforementioned testa based on the GT statistic (global testa) declares a gross error, then additional testing schemes are needed to identify the type and location of the gross error, as well as to estimate ita magnitude. To achieve this,we make use of the generalized likelihood ratio (GLR) method (Willsky and Jones, 1974; Narasimhan and Mah, 1987,1988) since it is the only method available with the capability of simultaneously identifying the location of the gross error we modeled and estimating ita magnitude. The procedure is as follows: One of the global testa is applied periodically (e.g., a t every sampling period) until it rejects the null hypothesis, say, at time t. A rejection will indicate that a gross error has been detected. We then

where, for the ARMA(1,l) model, @(Z)and e(Z) are m m diagonal matrices given by

c2

a*

= [I - e(~)l-1[1-@(ala

(34) X

m . 0

0 -

0

(36)

258 Ind. Eng. Chem. Res., Vol. 31, No. 1, 1992 Table I. Simulated ARL of the Measurement Test (Equation 17)" N=l N=2 N=10 N=20 ZB/2 = 3.020 Z p / z 2.810 ~ 8 1 2 2.235 ~ ~ =1 1.950 2 AFtL(0) 192.38 201.99 202.28 197.75 126.25 50.25 16.05 ARL1(3) 34.55 302.49 344.90 339.54 mLz(3) 300.52 (37)

N = averaging period; z8/* = critical value.

Table 11. Signature Matrix, G ( t ) ,at Different Times t G(t) t G(t) 1

1.oooo o.oo00

2

10 20 X

30

-

TU

1- eii

Sei,

i = 1, 2,..., m

(38)

From eq 38 we see that the measurement bias after prewhitening still remains in the ith measurement; however, the magnitude of the bias is changed to ((1- 4ii)/(l- &))6 after prewhitenjng. Substituting eq 38 into eq 31 we obtain measurement equation: 1 - dii y*(t) = H*x*(t) + a(t) + -6eip (39) 1 - eii Similar to eq 11, the innovations in the case of prewhitening when a measurement bias is present in the ith measurement can be expressed as

0.8258 0.0183 0.1917 0.0094 0.0505 0.0021 0.0310 0.0011

o.oo00 l.m 0.0478 0.4653 0.2522 0.0087 0.2999 0.0106 0.3065 0.0105

40 50 60 80 100

0.0283 0.0010 0.0280 0.0010 0.0280 0.0010 0.0279 0.0010 0.0280 0.0010

0.3074 0.0105 0.3075 0.0105 0.3075 0.0105 0.3075 0.0105 0.3075 0.0105

Table 111. Simulated ARL of the x2 Test (Equation 20) (Critical Value x ~ , = ~10.60) . ~ ~ gross error simulated 6/u ARL@/d no gross error 0 197.88 bias in stream 1 3 76.62 bias in stream 2 3 54.79

implemented using eq 5, and the resulting innovations r(t) were used for gross error detection. If a specified gross error of magnitude 6 was present at time t, then the values of the state variables x(t),measurements y ( t ) , and control inputs u(t) were obtained by using an appropriate type of the gross error model according to eqs 6 8 . The simulation studies were performed on the level control problem. This benchmark problem, which was the same as that used by Narasimhan and Mah (1988), is summarized here. The state equation for this example is given by dt+l)=

and, as pointed out earlier, the innovations are serially independent whether a gross error is present or not. Any one of the GT-GLR methods described before can be applied to this prewhitened model to detect, identify, and estimate the magnitudes of the gross errors. Simulation Studies For simulation studies, we consider the dynamic system described by eqs 1and 2. We assume that the covariance matrices Q and R, initial values x(O),initial state estimates f(O+), and covariance matrix P(O+)of errors in the estimates f(O+) are known. The process noise w(t) and measurement noise v(t) are white noises which are obtained by using random numbers from multivariate normal distributions with mean vector 0 and covariance matrices Q and R, respectively. A random number generator was used to generate uniform random numbers, which were transformed into N(0,l) random variates by using the Box-Muller algorithm; for descriptions of these methods, see Knuth (1980). The N(0,l)random variates were then converted into multivariate N(0,Q) and N(O,R), respectively. When no gross errors were present, the values of the state variables x(t) and the measurements y ( t ) at time t were generated according to eqs 1 and 2, respectively. The Kalman filter was used to obtain the optimal estimates of the state variables. The control inputs were then

(41)

The control equation is given by ~ ( t= ) [0.0326 -l.OO]f(t+) and the measurement equation is given by

(42)

where xl(t) and z&) are deviations from the nominal value of the level in the tank and the valve position, reapectively. The process noises wl(t) and w2(t)are independently and normally distributed with variances 0.03 and 0.0025,respectively. The variances of the measurement noises, ul(t) and u 2 ( t ) , are both equal to 0.01. For a more detailed description of this example, see Narasimhan (1987). For each simulation trial, one run was made for the case of no gross errors and other runs were made for the case of specified measurement biases. A single specified gross error of a given magnitude was generated at time 0. The gross error detection methods were then applied at a specified time t to detect the presence of a gross error. The time was reset to 0, and the SPC tests were reinitialiied every time a gross error was correctly detected and removed. Note although the signature matrices for all three

Ind. Eng. Chem. Res., Vol. 31, No. 1, 1992 259 Table IV. Simulated ARL of the Moving Average Test (Equation 22) window size, N 2 3 4 sampling period, k 2 3 4 overlap 0 0 0 critical value, x ~ , o , s e 6 2 13.28 15.80 18.17 194.52 198.48 202.04 ARL(0) 60.98 44.85 35.56 ARLi(3) 47.82 44.88 42.52 ARLz(3) window size, N 2 3 3 1 2 1 sampling period, k 1 1 2 overlap 16.44 17.36 critical value,” ~ ~ ~ , ~ - a 2 14.25 ARL(0) 200.80 199.88 199.41 ARLi(3) 53.69 47.96 49.50 ARLz(3) 46.72 40.92 41.28

” The a in the critical value

5 5 0

20.48 201.45 38.90 41.75 4 3 1

18.65 198.84 35.97 41.31

8 8 0 27.10 196.72 35.20 37.12 4 2 2 19.40 199.20 37.98 40.56

10 10 0 31.41 198.50 38.10 37.20 5 4 1 20.90 200.80 38.72 40.92

20 20 0 51.81 196.40 54.20 37.60 5 2 3 22.10 199.66 37.10 37.64

is adjusted by trial and error to obtain ARL(0) around 200.

Table V. Simulated ARL of the EWMA Test (Equation 24) with Different Weighting Constants” X = 0.1 X = 0.2 X = 0.3 X = 0.4 ~1 = 3.10 ~1 4.00 ~1 = 4.80 ~1 = 5.65 198.58 194.60 199.22 ARL(0) 200.77 80.74 63.43 68.27 ARLl(3) 120.24 42.62 43.59 46.52 ARLz(3) 41.42

” X = weighting constant; c1 = critical value. types of gross errors were generated, i.e., the GLR test was set up to detect and identify all three types of gross errors, only measurement biases were actually simulated in this study.

Discussion of Results All results are based on 5000 simulations for each run. We first apply the MT test based on the test statistic given by eq 16. Table I gives the results of simulated ARL where ARL1(3) is the simulated ARL when a bias of magnitude 3al occurs in measurement 1 and ARL2(3)is the simulated ARL when a bias of magnitude 3u2 occurs in measurement 2. Note that ARL(0) is not subscripted since it is the average run length to detect a gross error in any stream when none of the streams have gross errors. From Table I we see that the MT performs poorly for all cases for detecting the bias in measurement 2. (The m(3is)even larger than A R L ( O ) . ) A partial explanation of why the MT test gives such a poor performance is as follows: From Table I1 we see that the signature matrix G(t) for the measurement bias changes with time. The diagonal elements of G(t)decrease rapidly from 1to 0.0279 and 0.0105, respectively. As a result, the expected values of the test statistics given by eq 16 decrease as time goes by. (Note that the covariance matrix of the innovations eventually becomes constant for a stable filter.) In other words, the net effect of the measurement bias reflected by the ith innovation, ri(t),becomes smaller and smaller with time (due to the smearing of the gross error over all innovations). Therefore, the component tests (MT, GLR, for example) do not have adequate power. Next we compare the performances of the global tests. Table III gives the results of simulated ARL for the x2 test (eq 20). Table IV gives the results for the MA test (eq 22). Table V gives the results for the EWMA test (eq 24)for different weighting constants A. Table VI gives the results for the CUSUM test (eq 29)for different magnitudes of the measurement bias. Among the above tests we see that the x2 and EWMA tests give the poorest performances; the MA test gives the intermediate performance, and the CUSUM test gives the best performance. An explanation of why the x2 and EWMA testa give poor performances is as follows: From Table I1 we have seen that the signature matrix G(t)for the measurement bias changes with

Table VI. Simulated ARL of the CUSUM Test (Equation 29) (Critical Value, c a = 6.97) gross error simulated ratio of (6/0) ARL no gross error 0 206.02 bias in stream 1 3 7.08 bias in stream 2 3 11.46 no gross error 0 211.56 bias in stream 1 2 50.91 bias in stream 2 2 24.15 no gross error 0 203.44 bias in stream 1 1 146.54 bias in stream 2 1 101.08 Table VII. Performance of the CUSUM-GLR Method average gross error estimate of simulated 6/o ARL(6/u) (6/0) AEE/% no gross error 0 202.15 bias in stream 1 3 7.21 3.22 7.43 3 11.87 bias in stream 2 3.40 13.41 207.56 no gross error 0 bias in stream 1 2 61.30 2.24 12.08 bias in stream 2 2 28.96 2.41 20.58 no gross error 0 209.00 237.32 bias in stream 1 1 1.18 18.20 124.16 1.30 20.58 bias in stream 2 1

time. The diagonal elements of G(t)decrease rapidly from 1 to 0.0279 and 0.0105,respectively. Therefore, the x2 statistic given by eq 19 decreases with time. In other words, the overall effect of the gross error on the innovations becomes smaller and smaller as time goes by. The x2 test has the feature that it uses only the information in the current innovations and ignores the past history and therefore is not very sensitive for detecting a gross error since the effect of a gross error diminishes with time. By the same token, the EWMA test has the feature that it gives less weight to the old data and more weight to the recent data, i.e., it weighs the data in the “wrong” direction (in this case) and therefore is not so powerful. On the contrary, the CUSUM test gives equal weight to all the data up to the present time, and therefore it gives superior performance. We also performed simulation runs for the aforementioned global testa for different magnitudes of the measurement biases, but only the results for the CUSUM test are reported in Table VI. Those simulation runs indicated that the CUSUM test gives the best performance of all SPC tests. Therefore we chose the CUSUM test to detect gross errors. Once gross errors have been detected, we use the GLR method to identify and estimate their magnitudes. We use the combination of CUSUM and GLR, the so called CU-

260 Ind. Eng. Chem. Res., Vol. 31, No. 1, 1992 Table VIII. Performance of the CUSUM-GLR Test When Measurement Noises Follow ARMA(1,l) Models (0.8, 0.2) (0.2, 0.8) (0.2, 0.8) (0.8, 0.2) values of ( hell) (0.2, 0.8) (0.8, 0.2) (0.2, 0.8) (0.8, 0.2) values of ($I~,eZ2) no Yes no Yes no no prewhitening Yes Yes -1.42 7.15 -1.55 9.95 6.97 6.50 40.0 6.97 critical value, cq 196.16 204.47 194.07 194.23 204.73 203.38 210.69 214.28 no gross error, ARL(0) 3.81 1.36 97.39 17.14 126.93 8.81 74.35 1.31 bias in stream 1,ARLl(3) 15.07 24.55 1.71 17.19 23.92 39.61 37.50 1.68 bias in stream 2, ARL2(3) 3.00 3.55 4.36 4.29 3.53 3.64 3.55 3.00 average estimate of (a/u), 4.48 3.77 4.01 3.76 3.00 4.60 3.67 3.00 average estimate of (6/uI2 16.67% 0.10% 17.67% 21.33% 45.33% 43.00% 18.26% 0.04% 49.33% 25.67% 25.33% 33.67% 53.33% 0.10% 22.25% 0.01% (AEE)q

SUM-GLR method. Table VI1 gives the results for the CUSUM-GLR method for detecting different magnitudes of measurement biases. The estimated magnitudes of measurement biases are also given. The AEE is the average relative percentage error in the estimated magnitude of the gross error and is defined by (44)

where N I is the total number of simulation trials in each case. From Table VI1 we see that as 6 increases, the ARL(G/u) decreases. This is expected since it is easier to detect a larger measurement bias. Also, the M E becomes smaller as S_ becomes larger for the same amount of absolute error 16, - 61. Comparing Tables VI and VII, we see that the GLR method has difficulty identifying the gross error when the magnitude of the gross error is small (see, for example, the 6 = 1 . 0 case). ~ Next we consider the case when the measurement noises are not white but are serially correlated and are modeled as ARMA(1,l) stochastic proteases. For studying this case, many combinations of values of (4ii,OJ are possible. The main thing of concern seems to be whether dii > Oii (denoted by “+”) or 4ii < Oii (denoted by “-”). Suppose we assign some numerical values to (&, OiJ, say (0.8,0.2)for the + case and (0.2,0.8)for the - case. There are 22 = 4 combinations possible, which would correspond to a full factorial design for the two-measurements-level control problem. Table VI11 gives the results for different ARMA(1,l) models for the measurement noises with and without prewhitening. Note that, for comparison of these two methods, the ARL(0) is controlled around 200, which was achieved by adjusting the critical value of the test statistic using trial and error via simulation. We see that the ARLi(6/a)’s for prewhitened data are smaller than those for the case without prewhitening when 4ii< Oii, and ~ Oii. This is expected since the the reverse holds when c $ ~ > magnitudes of measurement biases are magnified (see eq 39) after prewhitening when r # ~