Ind. Eng. Chem. Res. 2007, 46, 4531-4548
4531
Online Temporal Signal Comparison Using Singular Points Augmented Time Warping Rajagopalan Srinivasan*,†,‡ and Mingsheng Qian† Laboratory for Intelligent Applications in Chemical Engineering, Department of Chemical and Biomolecular Engineering, National UniVersity of Singapore, 10 Kent Ridge Crescent, Singapore 119260, and Process Sciences and Modeling, Institute of Chemical and Engineering Sciences, 1 Pesek Road, Jurong Island, Singapore 627833
Advances in instrumentation and data storage technologies have allowed the process industries to collect extensive operating data which can be used to extract information about the underlying process and provide online decision support. One of the fundamental problems in data-based decision support is comparison of time-series data. Many signal comparison methods require signals that are of the same length and synchronized. Synchronization of varying length signals is usually achieved using dynamic time warping (DTW). Major limitations of DTW include computational cost and the tendency to link operationally different points. Previously, we proposed singular points augmented time warping to overcome these shortcomings during offline signal comparison (Srinivasan and Qian Ind. Eng. Chem. Res. 2005, 44, 4697). Landmarks in process data such as extreme values and sharp changes, called singular points, are used to segment a signal into regions with homogeneous properties, called episodes. Singular points of two signals are linked by dynamic programming; time warping is used to synchronize the episodes. A locally optimal equivalent of DTW called extrapolative time warping (XTW) with better computational performance was also proposed. In this paper, we present the extension of this approach to online signal comparison. The online signal comparison problem is a generalization of the offline problem and has two additional challenges: (1) the reference signal for comparison is not known a priori and has to be selected from a library, and (2) the starting point of the reference and real-time signal would not coincide in general, and the corresponding points have to be identified. The approach proposed here addresses these by extending dynamic locus analysis (Srinivasan and Qian Chem. Eng. Sci. 2006, 61, 6109) and singular point augmented XTW using anchoring and flanking strategies. The application of the proposed approach is illustrated using two different case studies: online operating mode identification in the Tennessee Eastman process and online fault identification in a lab-scale distillation column. The results show that the proposed approach is robust, efficient, and suitable for online signal comparison. 1. Introduction As a result of significant advances in data collection and storage, vast amounts of historical operating data are becoming commonly available in the chemical process industry. This data is a rich source of information about the process that can be used to improve the plant operation. Given the parallel developments in pattern classification3 and statistical, information, and systems theories,4 data-based approaches have been gaining in popularity. Potential areas of application of datadriven methods include regulatory and sequence control, visualization, operation improvement, state identification, and fault diagnosis. Despite these developments in extracting information and knowledge from data, many important and challenging problems persist in data analysis and knowledge extraction. In this paper, we address one such problem: online temporal signal comparison. Temporal signals with time-varying properties commonly arise in chemical plants during transition states such as startup, grade change, shutdown, maintenance, and other abnormal operations. The precept of signal comparison-based approaches is that similar states result in similar temporal signals. So, if a historical database of representative signals is available, the root cause of a process change occurring in real-time can be * Corresponding author. E-mail:
[email protected]. Tel.: +65 65168041. Fax: +65 67791936. † National University of Singapore. ‡ Institute of Chemical and Engineering Sciences.
determined. The challenge, however, is that, as a result of the nature of industrial operations, signals from different instances of the same state are not exact replicates; there would be significant differences in magnitude and duration of the variable profiles as a result of run-to-run variations arising from differing initial conditions, impurities, seasonal affects, and operator actions. A direct comparison of a real-time signal with a reference signal in the historical database would therefore be incorrect, and signal comparison methods try to account for these normal operational variations. 1.1. Previous Work in Online Signal Comparison. Three classes of signal comparison methods can be differentiated. The first class of methods is based on multivariate statistical aggregates of the signal such as principal component analysis (PCA). PCA-based approaches have been popular in process monitoring. For comparing two multivariate signals, Krzanowski5 proposed a PCA similarity factor that compares the reduced PCA subspaces of the original signals. Raich and Cinar6 used the PCA similarity factor for diagnosing process disturbances. Singhal and Seborg7 modified the PCA similarity factor by weighing the principal components with the square root of their corresponding eigenvalue, λ. The PCA similarity factor is only applicable for statistically stationary signals whose properties do not change with time. To extend it to non-stationary signals, Srinivasan et al.8 proposed a dynamic PCA-based similarity factor SDPCAλ that accounts for the temporal evolution of the signal. The main advantage of the PCA-based methods is their inherent ability to deal with multivariate signals and
10.1021/ie060111s CCC: $37.00 © 2007 American Chemical Society Published on Web 05/10/2007
4532
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 1. Typical signal and its singular points.
their low computational requirements. Their main shortcomings are that (1) they do not explicitly consider the synchronization between the model and the real-time signal because time is accounted for implicitly in these models; (2) they are nonintuitive, especially for plant operators, because the comparison is based on a derived quantity with no physical significance; and (3) they consider the data as monolithic and arising from a single process state with specified statistical properties. This last requirement makes them unsuitable for online applications in multi-mode processes that can operate in multiple states. Also, for operator decision support, it is important to not only calculate the extent of similarity but also identify the point of divergence, that is, the point in time from when the two signals start to deviate from one another. Because the PCA-based methods consider the whole of the data as a single block they cannot directly detect the point of divergence. The second class of signal comparison methods uses a qualitative abstraction of the actual signal.9,10 Rengaswamy and Venkatasubramanian11 used a set of qualitative primitives, such as increase, decrease, and so forth, to represent the evolution of a signal. Libraries of qualitative trends corresponding to various process states are calculated offline. The patterns of the qualitative trends in the real-time signal are compared with those in the library and used for fault diagnosis. The interested reader is referred to the reviews by Venkatasubramanian et al.12 and Maurya et al.13 for further details. A simple trend comparison does not suffice for non-stationary processes because the mapping between the trend and the process state is one to many; that is, the same trend may correspond to normal operation in one state but a fault in another. Sundarraman and Srinivasan14 proposed an enhanced trend analysis approach to overcome this by considering the duration and magnitude along with the qualitative shape of the trend. The triplet of shape, duration, and magnitude of the trend enables state-specific comparisons because the enhanced trend-to-state mapping is one-to-one. The main shortcomings of the trend-based approaches are their univariate nature and larger time lags in state identification arising from the qualitative abstraction of the signal and the consequent loss of information. The third class of methods uses a direct comparison between an offline annotated library of signals and the real-time evolution
of the process to overcome the latter shortcoming. However, it is normal for signals from different runs of the same state to be slightly different and not match each other perfectly. Therefore, methods to compare signals by adjusting their time scales, called time warping, have been developed. One such is dynamic time warping (DTW), which originated as a robust method for calculating the difference between unsynchronized speech signals.15 Let T and R denote two time-sampled signals of lengths t and r to be synchronized and let j and i denote the time index of their trajectories, respectively. Let the superscript * denote the optimal value of the variable. DTW finds a sequence F* of P points on an r × t grid such that a total distance measure between the two trajectories is minimized.
F* ) {c(1), c(2), ..., c(p), ..., c(P)} c(p) ) [i(p) j(p)] D(r, t) )
1
(1) (2)
P
∑d[i(p), j(p)] w(p) N(w)p)1
D*(r, t) ) min[D(r, t)] F
(3) (4)
where (i(p), j(p)) is the warping assignment at step p, d[i(p) j(p)] is the local distance between the point j(p) of T and point i(p) of R, D(r, t) is the normalized total distance between the two signals, and D*(r, t) is the minimum normalized distance between them. Constraints are often used to define and restrict the search space and find an alignment that optimizes some criterion. They are motivated by physical considerations, to avoid excessive compression or expansion, speed up the calculation, or enforce other problem-specific limits on the alignment. A common global constraint is to set the end point of T and R to coincide, that is,
c(1) ) (1, 1)
(5)
c(p) ) (r, t)
(6)
Local constraints determine local features around this point. For example, the Itakura local constraint defines (i - 1, j), (i - 1,
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4533
Figure 2. Algorithm for real-time signal tracking.
j - 1), and (i - 1, j - 2) as the set of predecessors to (i, j) and results in a local slope in [1/2 2]. Let DA(i, j) be the minimum accumulated total distance between the two signals from (1, 1) to (i, j). The optimization problem in eq 4 reduces to
DA(i, j) ) DA(i - 1, j) + d(i, j) or [∞ if condition (A*)] min DA(i - 1, j - 1) + d(i, j) DA(i - 1, j - 2) + d(i, j)
{
}
(7)
where DA(1, 1) ) d(1, 1) and condition (A*) indicates that the predecessor of point (i - 1, j) is the point (i - 2, j). More details of DTW can be found in Sankoff and Kruskal.15 Kassidas et al.16 used DTW for synchronizing batch trajectories. A major shortcoming of DTW is its computational complexity (O(t‚r)) which precludes its use with long signals common in the process industry. Also, the entire assignment history has to be stored in memory with the concomitant memory requirements. Further, in many cases, the minimum distance criterion used in DTW does not guarantee comparison and synchronization of operationally equivalent points of the signal. To overcome the above limitations, Srinivasan and Qian1 augmented time warping by using landmarks in the signal, called singular points. We summarize this approach and related developments in section 2. Previous applications of DTW have focused on offline signal comparison, where the two signals to be compared are entirely available beforehand. For online state identification or fault diagnosis, a signal that is evolving in real-time has to be compared. Also, the reference signal is not known a priori. This results in two challenges: (1) locating the optimum reference signal from a library of signals and the point from which comparison between the two signals should start and (2) performing robust signal comparisons with a computational requirement suitable for online use. We address these two challenges in this paper. In section 3, we propose a flanking strategy for efficiently identifying the reference signal as well
an anchoring strategy for updating the signal comparison when a new sample becomes available. In section 4, the application of the method to state identification in the Tennessee Eastman simulation and fault diagnosis in a lab-scale distillation column is reported. Summary and conclusions from this work are presented in section 5. 2. Singular Points Augmented Time Warping Information content is not homogenously distributed throughout a signal; rather, the majority of the features of the signal are concentrated in a small number of points. Such points, which are landmarks in the signal evolution, are called singular points.1 The singular points of a sample signal are shown in Figure 1. Mathematically, each singular point is a triplet TSP ) {Γ, Ω, Ψ} where Γ is the time of occurrence of the singular point, Ω is the magnitude of the variable at the singular point, and Ψ is its type. Three types of singular points can be differentiated corresponding to sharp changes, extrema, and trend changes; thus, singular points are broadly points of local extrema in the variable and its first and second derivatives. The different types of singular points are not mutually exclusive, and the same point can correspond to more than one type. This is illustrated in Figure 1: P1 is both an extreme point and a sharp change point; similarly, P2 and P4 are trend change points as well as sharp change points. In such cases, the list of all matching types is noted. Methods for singular points identification are reported in Srinivasan and Qian1 and Qian.17 The segment of a signal between adjoining singular points is called a singular episode. An episode thus consists of regions of nearly constant slope, small oscillations, and so forth. Clearly, a signal can be deconstructed into adjoining singular episodes; equivalently, a signal can be described through an ordered set of singular points. The representation of a signal by its singular points enables its efficient synchronization and comparison with another signal. This is achieved by linking the singular points or episodes of the two signals using dynamic programming. One
4534
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 3. Algorithm for optimal reference signal R* identification.
Figure 4. Flanking segments used for reference signal identification.
important criterion for linking two sets of singular points is that their sequences should match. A sequence Violation occurs SP SP SP between a pair of singular points (TSP m , Rm ) and (Tk , Rl ) in SP signals T and R if m < k (that is, Tm temporally precedes TSP k )
SP but m > l (that is, RSP m does not temporally precede Rl ). As described in detail in Srinivasan and Qian,1 the singular points of two signals are said to correspond and can be linked if they are of the same type and if the linkage does not result in any sequence violations. For a given linkage of singular points, the distance between the two signals is calculated as the sum of episode-wise distances. Time warping is used to calculate the distance between the corresponding episodes so as to account for magnitude and duration differences between the episodes. Among the various linkages possible between two singular point sequences, the linkage that results in minimum signal distance is considered optimal. Srinivasan and Qian1 also proposed a new time warping strategy called extrapolative time warping (XTW), which is a greedy search modification of classical DTW with Itakura local constraint. The XTW method obviates dynamic programming for each local point by optimizing each point locally. In contrast to DTW, search in XTW proceeds in the forward direction starting from the first point of the signal to the last. Given the warping assignment (i, j), the optimal warping for the subsequent step, that is, the location of j* that corresponds to (i + 1), is based only on the previous decision and the current distance.
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4535
Figure 5. Temporal development and translation of flanking segments during optimal reference signal identification.
Three possible successors, (i + 1, j), (i + 1, j + 1), and (i + 1, j + 2), are considered. In lieu of eq 7, the optimal search path in XTW is therefore defined by
DA(i + 1, j*) ) min DA(i, j) + d(i + 1, j) or [∞ if condition (B*)] j* ) j DA(i, j) + d(i + 1, j + 1) j* ) j + 1 DA(i, j) + d(i + 1, j + 2) j* ) j + 2 (8)
{
}
with initial condition DA(1, 1) ) d(1, 1). Condition (B*) indicates that the predecessor of point j* ) j. Thus, for each step, the decision for the corresponding point for i is based only on three comparisons: to increase j by 0, 1, or 2. Following the Itakura local constraint, if the previous decision was to increase j by 2, then the successor would not have this option (so as to maintain the local slope in the [1/2 2] range), and j can increase only by 0 or 1. Similarly, in the preceding step, if j did not increase, the successor would not have the option to remain at the same j, and j* ) j + 1 or j* ) j + 2. Because any decision is based only on the previous decision and the current difference, dynamic programming is obviated. The search space of the XTW is the same as DTW with the Itakura local constraint. However, unlike DTW, in XTW once a match has been assigned, future assignments will not affect it. The assignment history tree, which is the origin for the large computational storage requirements of DTW, is not necessary in XTW; instead only the list of assignments needs to be maintained. The greedy extrapolative search for any point decreases the computational time and provides a major advantage when XTW is used in online signal tracking as explained in section 3. However, because global information is not used at each step, an optimal solution is not guaranteed by XTW. This issue comes to the foreground when long signals have to be compared. In our approach, this is addressed by combining XTW with singular point linkage. The singular points based time warping approaches enforce the optimal linkage of the major landmarks of the two signals
Table 1. TE Process Measurements and Their Base Values variable name A feed (stream 1) D feed (stream 2) E feed (stream 3) A and C feed (stream 4) recycle flow (stream 8) reactor feed rate (stream 6) reactor pressure reactor level reactor temperature purge rate (stream 9) product separator temperature product separator level product separator pressure product separator underflow (stream 10) stripper level stripper pressure stripper underflow (stream 11) stripper temperature stripper steam flow compressor work reactor cooling water outlet temperature condenser cooling water outlet temperature
variable number
base case value
units
XMEAS (1) XMEAS (2) XMEAS (3) XMEAS (4) XMEAS (5) XMEAS (6) XMEAS (7) XMEAS (8) XMEAS (9) XMEAS (10) XMEAS (11)
0.25052 3664.0 4509.3 9.3477 26.902 42.339 2705.0 75.0 120.40 0.33712 80.109
kscmh kg h-1 kg h-1 kscmh kscmh kscmh kPa gauge % °C kscmh °C
XMEAS (12) XMEAS (13) XMEAS (14)
50.000 2633.7 25.160
% kPa gauge m3 h-1
XMEAS (15) XMEAS (16) XMEAS (17)
50.000 3102.2 22.949
% kPa gauge m3 h-1
XMEAS (18) XMEAS (19) XMEAS (20) XMEAS (21)
65.731 230.31 341.43 94.599
XMEAS (22)
77.297
°C kg h-1 kW °C °C
using dynamic programming. The global optimality of the episode-level comparison is therefore not a critical requirement because the optimal assignment of each time point within an episode has no physical significance and is rarely necessary in practical applications. The two-step comparison also leads to significant improvements in speed, memory requirement, and efficiency of signal comparison. Another important advantage is that because the singular points have physical meaning such as the beginning or ending of a process event, they can be directly used for state identification, monitoring, and supervision. Singular point augmented time warping is suited for signals whose endpoints are known to correspond as these are used in the initial warping assignment (1, 1). This is not the case in
4536
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 6. Test signal T and reference signals (R1 and R2) for the illustrative example.
real-time signal comparison where the online signal is available starting from an unknown state. We use dynamic locus analysis (DLA) to identify the endpoints in the library signal that correspond to those of the real-time signal. 2.1. Dynamic Locus Analysis for Finding the Best Matching Segment. DLA is an efficient method to compare a short signal with a long reference signal and identify the best matching segment from the reference.2 Consider the short signal X ) {x1, x2, x3, ..., xm} which is the last m samples from an online sensor. Here m is the size of the evaluation window. Let Y ) {y1, y2, y3, ..., yn} be a long reference signal. For every segment of Y, say Z ) {yl, yl+1, ..., yj}, a segment Z* ) {yl*, yl*+1, ..., yj*}, is called the locus of X if
D*(X, Z*) ) min{D*(X, Z)} l,j
(9)
Also, yl* is called the corresponding point of x1, and yj* is the corresponding point of xm. The brute force approach of considering each possible l in Y and performing an independent comparison will result in an unacceptable computational load. DLA overcomes this by extending Smith and Waterman’s18 dynamic programming approach for comparing protein sequences. In DLA, the locus of X is identified by using a dissimilarity matrix, DS. Let i and j be the time indices of X and Y, respectively. The (i, j) element of DS measures the minimal difference between the sub-segment {x1, x2, x3, ..., xi} in X and the sub-segment {yl, yl+1, yl+2, ..., yj} in Y. In the general case, l is unknown and is determined using dynamic programming. i
DS(i, j) ) min{ F
∑∆(xd, yj(d))}
d)1
(10)
where yj(d) is the time warped point that matches with xd and ∆(xd, yj(d)) ) |yj(d) - xd| is the difference between xd and yj(d). Because the optimal search should allow for compression and elongation in Y relative to X, time warping is used to synchronize X and Y. Following DTW with Itakura local constraint, eq 10 reduces to
DS(i, j) ) min{DS(i - 1, j - 1) + ∆(xi, yj), Fi,j, Gi,j}i ∈ [2 m] j ∈ [2 n] (11) Fi,j ) DS(i - 1, j - 2) + ∆(xi, yj) Gi,j ) DS(i - 1, j) + ∆(xi, yj) or ∞ if G* where G* indicates that the predecessor of point (i - 1, j) is the point (i - 2, j). Note that DS(i, j) is not the total minimum distance between {x1, x2, x3, ..., xi} and{y1, y2, ..., yj}, rather it is the total minimum distance between X and its locus in Y. Segments of Y that are similar to X would lead to small values of DS. The optimal match between X and the locus in Y is given by DS(m, j*) where j* ) argminj{DS(m, j)} j ∈ [1 n]. More details of DLA can be found in Srinivasan and Qian.2 In this paper, we extend DLA and singular point augmented time warping for online signal comparison. 3. Online Signal Comparison Using Singular Points Augmented Time Warping The online signal comparison problem can be stated as follows: Given a set of reference signals K and a real-time signal T emanating from the process operating at an unknown state, (1) identify the reference signal that best matches the current state of the process and (2) identify the progress of the process
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4537
Figure 7. Comparison of real-time signal T at τT ) 8 with R1 (shown in part b) reveals a minimum at τR1 ) 199 (shown in part c). Similar comparison with R2 depicted in part d shows a minimum at τR2 ) 1083 as shown in part e.
with respect to the reference signal. The former is called the optimal reference signal identification problem while the latter is referred to as real-time signal (or state) tracking. The first step involves comparison of the real-time signal with many reference signals and is computationally more intensive than the second step; hence, although the first step could be repeated at every instant, it is not tenable for online application, where the requirement is that the calculation time at every sample must be less than the sampling interval ∆τ. New signal comparison strategies that extend XTWSP and DLA are needed for these purposes. 3.1. Flanking Strategy for DLA. The DLA provides an efficient way to identify the locus: a sub-segment of a long signal that best matches another (short) signal. Although DLA can be directly used for optimal reference signal identification, because its computational cost is proportional to the length of the two signals (real-time and reference), its efficiency would continually decrease as the real-time signal evolves and becomes longer. The flanking strategy bounds the length of the signal used for locus identification by decomposing the real-time signal. Flanking segments, signal segments of fixed length from the beginning and end of the signal, are used for this purpose. Consider a signal X ) {x1, x2, ..., xm}, where m g 2ν. In the flanking strategy, X is decomposed into three segments: the anterior flanking segment, the core segment, and the posterior flanking segment. The anterior flanking segment of X is defined
as its first ν points, that is, XA ) {x1, x2, ..., xν} where ν is the flank length. Similarly, the posterior flanking segment of X is defined as the last ν points, that is, XP ) {xm-ν+1, xm-ν+2, ..., xm}. The inner m - 2ν points of X comprise the core segment as illustrated in Figure 4 for a sample signal. The flanking segments of the signal can be used to identify the locus of X efficiently based on the recognition that any locii of X should also have segments that have high similarity with the flanking segments XA and XP. The flanking strategy exploits this property. Given a long reference signal Y, all the segments in Y, say YA, that closely match XA can be identified. Similarly, all adequate matches for XP, say YP, can also be identified. Note that the lengths of YA and YP may not be equal to the flank length due to run-to-run variations between the real-time and the reference signals. Each pair of YA and YP where YA precedes YP in Y can be used to construct a unique composite segment Z sandwiched by YA and YP, Z ⊂ Y. Each composite segment is a possible locus of X. By eq 9, the locus of X in Y is the composite segment Z* which has the least difference with T. The flanking strategy is thus a generalization of the DLA for identifying the locus of a longer signal. It recognizes that the computational complexity of any signal comparison method depends directly on the length of the two signals to be compared. The flanking strategy is computational efficient because the two flanking segments are short; therefore, the cost of the first phase of comparisons with all the reference signals is small and
4538
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 8. Snapshot at τT ) 226 of signal comparison between T and R.
Figure 9. Snapshot at τT ) 682 of signal comparison between T and R.
bounded by ν. In general, any method can be used to identify the locii YA and YP; we use DLA as described in section 3.4. Also, any method can be used to compare the composite segments Z with X; we use XTWSP for this purpose. 3.2. Anchoring Strategy for Time Warping. The anchoring strategy also seeks to improve the computational efficiency of
comparing a signal that is evolving in time. Consider X ) {x1, x2, x3, ..., xm}, the last m samples from an online sensor, and Y ) {y1, y2, y3, ..., yn}, a long reference signal. Let a segment of Y, say {y1, ..., yj} be known to match{x1, x2, x3, ..., xm}. The base point at time m is defined as the point in the Y that corresponds to xm. In this case, the base point at time m is yj.
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4539
Figure 10. Comparison of real-time signal T at τT ) 685 with R1 (shown in part b) reveals a minima at τR1 ) 185 (shown in part c). Similar comparison with R2 depicted in part d shows a minimum at τR2 ) 1087 as shown in part e.
The portion of X used for comparison is called the eValuation window and notated by the subscript w. The matching segment in Y is also notated by the subscript w. Thus, if the entire X is used for comparison, Xw ) {x1, ..., xm} and Yw* ) {yl, ..., yj}. When a new sample xm+1 becomes available from the online sensor, Xw is extended by appending xm+1, and the base point has to be recalculated. Any time warping method can be used for this purpose. As mentioned above, the computational expense of signal comparison increases with signal length; therefore, as new samples are obtained, it takes ever-increasing time to calculate the base point because the evaluation window increases monotonically. The anchoring strategy provides a systematic way of trimming the evaluation window without sacrificing accuracy. It relies on the insight that the entire X does not need to be compared with Y at every instant. If at time m, {x1, ..., xm} has matched {y1, ..., yj}, then any future divergence of X from Y can be detected using only recent observations of X in the evaluation window. Anchor points are 2-tuples from X and Y that are known to correspond. In the above, (x1, y1) can be considered an anchor point. In general, any pair of singular points in X and Y, say XSP and YSP, that have been linked by dynamic programming can be used as anchor points. Also, if tXSP and tYSP are the times of occurrence of XSP and YSP, respectively, then two nonoverlapping segments can be differentiated in X comprising the
observations {x1, ..., xtXSP-1} before the anchor point and the ones after, {xtXSP, ..., xm}. Similarly, Y is split by the anchor point into two segments: {y1, ..., ytYSP-1} and {ytXSP, ..., yn}. Denoting the difference between X and Y by ∆(X, Y),
∆(X, Y) ) ∆({x1, ..., xm}, {y1, ..., yj})
(12)
Using the linear additivity property,
∆(X, Y) ) ∆({x1, ..., xtXSP-1}, {y1, ..., ytYSP-1}) + ∆({xtXSP, ..., xm}, {ytXSP, ..., yn}) (13) Because XSP and YSP are linked singular points, ∆({x1, ..., xtXSP-1}, {y1, ..., ytYSP-1}) ≈ 0 and ∆(X, Y) ≈ ∆({xtXSP, ..., xm}, {ytXSP, ..., yn}). So, for updating the base point, the evaluation window can be shortened to begin from the anchor point, Xw ) {xtXSP, ..., xm} and Yw* ) {ytXSP, ..., yj}. The proposed online signal comparison approach uses the anchoring and flanking strategies for real-time signal tracking and optimal reference signal identification sub-problems as described next. 3.3. Real-Time Signal Tracking Using Anchoring Strategy. This step uses the optimal reference signal R* calculated a priori (see section 3.4) and seeks to confirm that the process continues to operate in the same state (i.e., same reference signal). It thus calls only for resynchronization of the real-time signal T based on the additional sample with R*. In the following, for ease of
4540
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 11. Schematic of the Tennessee Eastman19 process with control system. Table 2. Disturbance Profiles for TE Process (a) XD1 (b) XD2 (c) XD3 (d) XD4 (e) XD5. target
time (min)
target
XD1-A XD1-B XD1-C
1.20‚base value 1.15‚base value 1.10‚base value
180 240 300
1.40‚base value 1.35‚base value 1.30‚base value
XD2-A XD2-B XD2-C
1.03‚base value 1.025‚base value 1.02‚base value
180 240 300
XD3-A XD3-B XD3-C
1.05‚base value 1.045‚base value 1.04‚base value
XD3-A XD3-B XD3-C XD5-A XD5-B XD5-C
time (min)
target
time (min)
target
time (min)
(a) XD1 190 254 318
1.60‚base value 1.55‚base value 1.50‚base value
200 268 336
1.0‚base value 1.0‚base value 1.0‚base value
780 900 1020
1.05‚base value 1.045‚base value 1.04‚base value
(b) XD2 190 254 318
1.07‚base value 1.065‚base value 1.06‚base value
200 268 336
1.0‚base value 1.0‚base value 1.0‚base value
1020 1080 1200
180 240 300
1.10‚base value 1.09‚base value 1.08‚base value
(c) XD3 190 254 318
1.15‚base value 1.135‚base value 1.12‚base value
200 268 336
1.0‚base value 1.0‚base value 1.0‚base value
780 900 1020
1.05‚base value 1.045‚base value 1.04‚base value
180 240 300
1.10‚base value 1.09‚base value 1.08‚base value
(d) XD4 190 254 318
1.15‚base value 1.135‚base value 1.12‚base value
200 268 336
1.0‚base value 1.0‚base value 1.0‚base value
780 900 1020
0.95‚base value 0.955‚base value 0.96‚base value
180 240 300
0.90‚base value 0.91‚base value 0.92‚base value
(e) XD5 190 254 318
0.85‚base value 0.865‚base value 0.88‚base value
200 268 336
1.0‚base value 1.0‚base value 1.0‚base value
780 900 1020
explanation, the time variable for the real-time signal is denoted as τT and that of reference signal R as τR. The two signals can be compared starting from the beginning (i.e., τT ) 1); alternatively a smaller evaluation window can be used on the basis of the anchoring strategy described above. We pursue the latter option for computational efficiency purposes. Consider the real-time signal T ) {t1, ..., ti, ..., tm-1}. The evaluation window is defined as per the anchoring strategy based
on singular points. Let the last singular point in T be at time τT SP ) τSP T and let the value of T at that time be ttT . The corresponding singular point in R* can be obtained using XTWSP. These singular points in T and R* for the anchor point of the evaluation window. So, the evaluation window Tw at τT ) m - 1 is Tw ) {ttTSP, ttTSP+1, ..., tm-1}. Let the corresponding segment of R* that matches Tw be R/w. When a new sample tm becomes available at τT ) m, Tw is updated, and the task in
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4541
Figure 12. Three runs of XD1 in the TE case study with different magnitudes and duration.
real-time signal tracking is to update the base point and R/w through resynchronization as well as confirm that R/w continues to match Tw. In our approach, this is achieved in two steps, as shown in Figure 2: Step A: Efficient Calculation of the Difference between the Real-Time Signal and the Reference Signal. Any signal comparison method can be used for calculating the difference between Tw and R/w. We use XTW for this purpose because of its advantageous time and space requirements. The normalized time-warped distance between Tw and R/w is calculated as follows: m
η˜ (Tw, R/w) )
∑
|rj(i) - ti|
m-
τSP T
i)τTSP
of XTW comes at a price: a large η˜ (Tw, R/w) does not guarantee that Tw is dissimilar from R/w because the local greedy search in XTW can lead to an overestimate of the difference when signals become long.1 Such artifacts can be eliminated by a more accurate calculation using XTWSP, when necessary. Step B: Accurate Calculation of the Difference between the Real-Time Signal and the Reference Signal. XTWSP links the singular points in the real-time and reference signals and thus calculates an accurate difference between Tw and R/w. So if eq 15 is not satisfied, a more accurate difference is calculated using XTWSP, and a condition analogous to eq 15 is evaluated to confirm divergence of T from R*.
(14)
(η(Tw, R/w) < ηmax) and (∆(tm, rτR*) < 2 × ηmax) (16)
The numerator is the XTW difference between Tw and R/w while the denominator is the length of the evaluation window. The difference between the latest real-time sample tm and its warping assignment in R* as per XTW, notated as ∆ ˜ (tm, rτR*), is also calculated. If
Here, η(Tw, R/w) is the difference between Tw and R/w while ∆(tm, rτR*) is the distance between tm and the base point as calculated using XTWSP. Note that η(Tw, R/w) e η˜ (Tw, R/w) because XTWSP relies on singular point linkage. If eq 16 is satisfied, the process is considered to continue in the same state (i.e., no change to the reference signal), and the base point is updated. One byproduct of performing XTWSP is that new singular points could have been identified in Tw and linked with R*. The last singular point in T and its corresponding linked singular point in R* are subsequently used as the new anchor point which results in the shortening of the evaluation window. So, future Step A and Step B calculations become more efficient and accurate. If condition 16 is not satisfied, the process is considered to have moved to a new state, and another optimal
+1
(η˜ (Tw, R/w) < ηmax) and (∆ ˜ (tm, rτR*) < 2 × ηmax) (15) is satisfied, the process is considered to continue in the same stage of operation, and only the base point τR* is updated. The first condition ensures that the broad overall trend of the reference and real-time signals is the same while the second condition allows for larger local variations in the signal due to noise or run-to-run differences. The computational efficiency
4542
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Figure 13. Schematic of the lab-scale distillation column.
reference signal has to be identified, starting at τT ) m as the point of divergence τPOD T . 3.4. Optimal Reference Signal Identification. The task in this stage is to identify the reference signal R* that best matches the state of the process at time tm. This would be required at τT ) 1 when the reference signal is not known and when the realtime signal has diverged from the previous reference (i.e., eq 16 is not satisfied). Consider a diVergent segment of T ranging from the point of divergence to the latest sample:
TD )
{
{tτTPOD, ..., tm}
if m g τPOD +ν T
{tm-ν+1, ..., tm} if ν e m < τPOD +ν T
(17)
Note that τPOD ) 1 at τT ) 1. The process state is identified by T comparing the divergent segment with all the reference signals in the library (K). Consider a reference signal R ) {r1, r2, r3, ..., rn} from K with time index j. Let η(TD, R) be the normalized difference between TD and R. The optimal reference signal R* is defined as
R* ) arg min (η(TD, R)) R∈K
(18)
A tradeoff exists between the speed and accuracy of reference signal identification. If the divergent segment used for identify-
ing R* is small, several good matches may exist in K. We quantify the accuracy of the located optimal reference signal in terms of the inseparability ratio R, defined as the ratio of the normalized difference of the best matching reference signal to that of the second-best one:2
R)
η(TD, R*) min
R∈K,R*R*
(η(TD, R))
(19)
The inseparability ratio thus reflects the uniqueness of R*. Following Srinivasan and Qian,2 the criterion for the optimal reference signal at τT ) m with base point τR* is
(η(TD, R*) < ηmax) and (R < Rmax)
(20)
where the first condition ensures low difference between TD and R* while the latter ensures its uniqueness. If eq 20 is not satisfied at τT ) m, an unknown state is flagged, and the calculation of R* and τR* is repeated when the next real-time sample becomes available at τT ) m + 1. Next, we describe how the difference η(TD, R) is calculated. Because the suitable start and endpoints of TD in R are not known a priori, the locus of TD in R has to be first calculated. The DLA can be used for this purpose but as described in section
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4543
Figure 14. Process signals from Run-03 of the lab-scale distillation column.
3.1 it is efficient as long as the length of the divergent segment is small. The flanking strategy proposed earlier becomes necessary if the divergent segment becomes long. We therefore consider two different phases as shown in Figure 3. En Bloc Comparison Phase. When the divergent segment < 2ν, and can be used in its entirety, is small, that is, m - τPOD T DLA is directly used to identify the locus of the divergent segment. The normalized difference is calculated as m
∑
η(TD, R) )
|rj(i) - ti|
i)m-ν+1
ν
(21)
where (i, j) is the warping assignment between TD and its locus in R. As mentioned above, if eq 20 is not satisfied and the optimal reference signal cannot be clearly determined, comparison is repeated when the next real-time sample becomes available. Thus the length of the divergent segment would increase with time. Once the length of TD becomes large, DLA becomes computationally expensive. We minimize the computational requirement for such cases using the flanking-based comparison strategy. Flanking Phase. When the divergent segment is large, m τPOD g 2ν and the flanking strategy becomes necessary. Short T flanking segments from the start and end of TD provide the basis for identifying the locus of TD in R. The anterior flanking segment of the divergent segment is XA ) {tτTPOD, tτTPOD+1, ..., tτTPOD+ν-1} and the posterior flanking segment is XP ) {tm-V+1, tm-ν+2, ..., tm}; thus, TD is sandwiched by the flanking segments.
DLA is used to identify all the matching segments YA in R such that η(XA, YA) < ηmax. [One significant point needs to be noted here. In contrast to the original DLA where only the best matching segment is identified, here several candidate segments may be identified. To optimize the calculation, candidates that are not at least one singular point away from a better candidate (in terms of smaller η) are rejected, so as to yield a wellseparated set of candidate segments (see Qian17).] The same criterion is applied to obtain all reference signal segments YP that match XP. Each pair of YA and YP defines a different composite segment in R. Because in general these composite segments can be long, XTWSP is used to synchronize them with TD. The difference between TD and R is then calculated as follows: m
∑
η(TD, R) )
|rj(i) - ti|
i)τTPOD
m - τPOD +1 T
(22)
Figure 5 shows how the en bloc comparison and the flanking phase are integrated in the proposed method for reference signal identification. When the process moves away from the previous state as indicated by eq 16, the immediately preceding ν points {tτT-ν+1, ..., tτT} are used as the divergent segment and en bloc comparison performed to identify the reference signal. If the optimal reference signal, as per eq 20, cannot be identified by τT ) τPOD + 2ν the divergent segment is split into anterior and T posterior flanking segments, each of length ν, and the flanking
4544
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Table 3. Offline Signal Difference between Different Disturbance Instances in the TE Process Using Direct Comparison (×10-1)
XD1-A XD1-B XD1-C XD2-A XD2-B XD2-C XD3-A XD3-B XD3-C XD4-A XD4-B XD4-C XD5-A XD5-B XD5-C
XD1-A
XD1-B
XD1-C
XD2-A
XD2-B
XD2-C
XD3-A
XD3-B
XD3-C
XD4-A
XD4-B
XD4-C
XD5-A
XD5-B
XD5-C
0
0.0686 0
0.074 0.0629 0
0.2228 0.2235 0.2241 0
0.2052 0.2028 0.2009 0.1446 0
0.1983 0.1953 0.1885 0.166 0.11 0
0.1735 0.1735 0.1847 0.2327 0.2631 0.2465 0
0.1647 0.1564 0.1587 0.2309 0.2342 0.2424 0.1991 0
0.161 0.1476 0.1383 0.1892 0.2413 0.2157 0.1816 0.1736 0
0.3631 0.3739 0.3694 0.4558 0.4555 0.4556 0.4143 0.4482 0.4294 0
0.3589 0.3483 0.357 0.4544 0.4219 0.4351 0.4318 0.3953 0.4175 0.2139 0
0.3443 0.3387 0.3278 0.4107 0.4247 0.4001 0.4113 0.4007 0.3697 0.2909 0.2003 0
0.0977 0.1136 0.1114 0.2004 0.1914 0.1875 0.1591 0.176 0.1681 0.3386 0.3616 0.3589 0
0.1108 0.0903 0.1018 0.2015 0.1791 0.1786 0.1718 0.1415 0.1521 0.3605 0.3262 0.337 0.1005 0
0.1054 0.1023 0.0824 0.1739 0.1801 0.1624 0.1626 0.1565 0.124 0.3812 0.3425 0.3063 0.0868 0.0864 0
Table 4. Disturbance Identification during Run-4 of the TE Case Study η τT
R1
R3
R4
R5
R
1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044
0.0501 0.0592 0.0659 0.0693 0.0723 0.0755 0.0782 0.0779 0.0773 0.0767 0.0760 0.0752 0.0745 0.0737 0.0729
0.0511 0.0593 0.0656 0.0654 0.0604 0.0556 0.0510 0.0506 0.0508 0.0507 0.0506 0.0504 0.0500 0.0505 0.0504
0.0514 0.0589 0.0654 0.0572 0.0408 0.0240 0.0072 0.0070 0.0068 0.0065 0.0064 0.0061 0.0061 0.0060 0.0060
0.0512 0.0581 0.0647 0.0683 0.0702 0.0723 0.0742 0.0739 0.0732 0.0729 0.0723 0.0716 0.0712 0.0707 0.0703
0.9804 0.9864 0.9970 0.8746 0.6755 0.4317 0.1412 0.1383 0.1339 0.1282 0.1265 0.1210 0.1220 0.1188 0.1190
strategy is used. In this flanking phase of reference identification, only the posterior flank is translated when a new sample is added to the divergent segment. The anterior flanking segment remains anchored at XA ) {tτTPOD, tτTPOD+1, ..., tτTPOD+ν-1}. The length of the divergent real-time signal increases with time; however, the optimal reference identification is calculated in a computationally efficient fashion suited for online use. A detailed illustration is given next to explain the above-described signal comparison algorithm. 3.5. Illustrative Example. Consider the test signal T and the two reference signals R1 and R2 shown in Figure 6. Online data have been collected starting at τT ) 1. Because the optimal reference signal is unknown initially, it has to be identified first following the description in section 3.2. τPOD ) 1, and signal T comparison starts at τT ) 8 () ν). The en bloc difference calculation strategy is first used for finding the locus in reference signals R1 and R2 since the signal length at τT ) 8 is less than 2ν. At τT ) 8, the DLA difference for R1 and R2 are as shown in Figure 7. DS(ν, τR1) has a clear minimum at τR1 ) 199 and η(T, R1) ) 0.0151 ( ηmax). Therefore, R ) 0.1810 (< Rmax ) 0.70). So at τT ) 8, R1 is confirmed to be the optimal reference signal with τR1 ) 199 as the base point. From the next sample, τT ) 9, real-time signal tracking as described in section 3.3 is performed to confirm that T progresses as per reference signal R1. A snapshot of the real-time signal tracking at τT ) 226 is shown in Figure 8. At this time, the anchor for comparison between T and R1 is the previous corresponding singular points SP with τSP T ) 164 and τR1 ) 359. The base point is at τR1 ) 413.
Therefore, the evaluation window Tw ) {t164, ..., t226} is compared with R/w ) {r359, ..., r413} using XTW. η˜ (Tw, R/w) ) 0.0180 (< ηmax) and ∆ ˜ (tm, rτR*) ) 0.0086 (2 × ηmax), so eq 15 is violated. XTWSP is then used for accurate calculation of the difference between Tw and R/w. η(Tw, R/w) ) 0.0213 and ∆(tm, rτR*) ) 0.3633 (>2 × ηmax), so eq 16 is violated too. This confirms that the real-time signal is no longer similar to R1 and the ) 682 (see reference signal has to be re-identified with τPOD T Figure 9). The divergent segment TD ) {t675, t676, ..., t682} is used to find the new reference signal through DLA. Initially, m - τPOD T < 2ν, so en bloc difference calculation is performed. At τT ) 682, η(TD, R2) ) 0.0641, R ) 0.9596, and eq 20 does not hold, so the reference signal cannot be conclusively identified. Comparison is therefore repeated when subsequent samples becomes available. As shown in Figure 10, at τT ) 685, η(TD, R2) ) 0.0388 (< ηmax), and R ) 0.6039 (< Rmax). So the new reference signal is confirmed to be R2 with τR2 ) 1082 as the anchor point and τR* ) 1087 as the base point. In the next section, we evaluate the proposed online signal comparison on two case studies. 4. Case Studies 4.1. Case Study 1: Online Disturbance Identification in the Tennessee Eastman Process. The Tennessee Eastman (TE) process19 is a popular test bed for process control, fault diagnosis, and signal comparison. In this section, data from this simulated plant are used to test the accuracy of the proposed method. The TE process produces two products (G and H) and a byproduct (F) from reactants A, C, D, and E. The process flowsheet is shown in Figure 11. The process has five units: a two-phase reactor, a product condenser, a flash separator, a recycle compressor, and a product stripper. There are 53 variables in the TE plant: 22 of these are process measurement variables, 19 are component compositions, and 12 are processmanipulated variables. The closed-loop process simulator used here was developed by Singhal20 on the basis of the base control structure of McAvoy and Ye.21 During the simulation, variable values are recorded every minute. The 22 process measurements used in this paper for process state identification are shown in Table 1. In this case, we use dynamic signal matching to identify process disturbances online. Five disturbance classes, called XD1-XD5, that affect the A feed flowrate, reactor pressure, reactor level, reactor temperature, and compressor work are
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4545 Table 5. Online Process Disturbance Detection in the TE Process
Run-1 Run-2 Run-3 Run-4 Run-5 Run-6 Run-7 Run-8 Run-9 Run-10 Run-11 Run-12 Run-13 Run-14 Run-15 average
disturbance introduction time
disturbance identification time
bestmatching reference
9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
15 13 13 17 13 16 16 13 13 14 13 17 13 13 13
XD1 XD1 XD1 XD2 XD2 XD2 XD3 XD3 XD3 XD4 XD4 XD4 XD5 XD5 XD5
identification delay (sample)
second disturbance introduction time
disturbance identification time
bestmatching reference
6 4 4 8 4 7 7 4 4 5 4 8 4 4 4 5.1333
1020 670 680 1030 1100 420 860 970 1100 820 380 400 500 600 1080
1025 675 684 1034 1107 427 867 974 1105 825 387 407 505 604 1084
XD4 XD4 XD4 XD4 XD5 XD5 XD1 XD1 XD1 XD2 XD2 XD2 XD3 XD3 XD3
Table 6. Standard Operating Procedure (SOP) for Lab-Scale Distillation Column Startup
1. 2. 3. 4. 5. 6. 7. 8. 9. 10.
identification delay (sample)
average time cost (CPU s)
average time cost of DTW (CPU s)
5 5 4 4 7 7 7 4 5 5 7 7 5 4 4 5.3333
0.0501 0.0699 0.1233 0.0273 0.0734 0.1289 0.0367 0.0689 0.2509 0.0373 0.0535 0.2024 0.0784 0.0712 0.2115 0.0989
55.3446 55.2385 55.2589 55.2721 55.3044 55.3314 55.3578 55.3446 55.3374 55.2985 55.3051 55.2517 55.2919 55.3051 55.2517 55.2996
Table 7. Process Disturbances During Distillation Column Operation
distillation column startup SOP
case
disturbance
type
set all controllers to manual fill reboiler with liquid bottom product open reflux valve and operate the column on full reflux establish cooling water flow to condenser start the reboiler heating coil power wait for all of the temperatures to stabilize start feed pump activate reflux control and set reflux ratio open bottom valve to collect product wait for all the temperatures to stabilize
DST01 DST02 DST03 DST04 DST05 DST06 DST07 DST08 DST09 DST10
reboiler power low reboiler power high feed pump high feed pump low Tray temperature sensor T6 fault reflux ratio high reflux ratio low bottom valve cooling water low cooling water flow and feed pump malfunction
step step step step random variation step step sticking slow drift step
studied here (see Table 2). Different instances (runs) of the same disturbance class have different start times, duration, and magnitude. For example, during XD1-A, the flowrate of A feed from upstream is increased from the base case value of 0.25052 kscmh to 0.3902 kscmh (a 60% change) in three steps starting at t ) 180 min as shown in Table 2a. After the process recovers from these, the inverse change, decreasing the A feed flow, is introduced at t ) 780 min. The process is then allowed to return to a steady state. The effect on the A flow rate (XMEAS(1)) and the downstream pressures (XMEAS(13) and XMEAS (16)) is shown in Figure 12. Two other instances XD1-B and XD1-C with changes of magnitude of 55% and 50% were also introduced. As described in Srinivasan et al.8 similar changes were introduced to bring forth the other disturbance classes. One instance from each of the classes, XD1-B, XD2-B, XD3B, XD4-B, and XD5-B, was used to develop the reference disturbance database for online disturbance identification. The difficulty in identifying the disturbance online can be estimated by a preliminary analysis. Table 3 shows the difference between the 15 disturbances calculated as the average difference among signals. Comparison is made after signals have been normalized to [0 1] on the basis of the range of the sensor (see Srinivasan et al.8). As can be seen from the table, the minimum inter-class distance is 0.0824 (between XD1-C and XD5-C) and the maximum intra-class distance is 0.2909 (class XD4). Therefore, difference between the classes by direct comparison, even if the complete signal is available, is a nontrivial exercise. In this work, we consider the even more difficult task of differentiating between the disturbances as they evolve. The proposed online signal comparison method is used for online disturbance identification as follows. Consider Run-4 where the process is in state XD2 until τT ) 1030. An unknown disturbance occurs starting at τT ) 1030 which is initially
indicated when η˜ (Tw, R/w) ) 0.0028, ∆ ˜ (tm, rτR*) ) 0.3147, and eq 14 is violated. An accurate difference is calculated using XTWSP and η(Tw, R/w) ) 0.0022 and ∆(tm, rτR*) ) 0.3065. Because this is larger than the 2 × ηmax threshold even after resynchronization, as per eq 15 it is evident that the real-time signal does not confirm to reference signal R2 starting from τT ) 1030 () point of divergence τPOD T ). The disturbance can be identified by calculating the new optimal reference signal R* and the base point τR* using the divergent segment TD ) {tm-V+1, tm-ν+2, ..., tm}. In the first iteration, at time τT ) 1030, η(T, R1) ) 0.0501, η(T, R3) ) 0.0511, η(T, R4) ) 0.0514, and η(T, R5) ) 0.0512 (see Table 4). Because the η values for all the reference signals are similar (as indicated by the inseparability ratio R ) 0.9804 > Rmax), R* cannot be identified at this point and further iterations are necessary. In each subsequent iteration, as the real-time signal T evolves, the evaluation window is updated (as shown in Figure 5) and the analysis repeated. As the disturbance becomes more evident with time, R decreases (see Table 4), and at τT ) 1034, η(T, R4) falls below ηmax. The optimal reference R* is then identified as R4 (i.e., disturbance XD4). The base point at τR* ) 314 in R4 is found to correspond to τT ) 1034. Real-time signal tracking is then resumed for subsequent samples. The average time cost for this run was 0.0273 CPU seconds with the proposed method as against 55.2721 CPU seconds with DTW as summarized in Table 5 (depicted as the second disturbance in Run-4). This brings out the computational advantages of the proposed strategy. Similar disturbance identification studies were performed for 14 other runs. Details of these are presented in Qian17 and only a summary is presented here. Similar high accuracies were found in all test runs as shown in Table 5. In each run, two disturbances were introduced. In all cases, the proposed method correctly identified the disturbance with an average delay of 5.23 min.
4546
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
Table 8. Faults Diagnosis Results for Lab-scale Distillation Column Case Study
case
time fault introduced (sample)
detection time (sample)
Run-01 Run-02 Run-03 Run-04 Run-05 Run-06 Run-07 Run-08 Run-09 Run-10
1 1 359 356 425 350 345 470 1 300
6 6 370 357 426 353 346 472 6 302
detection delay (sample)
identification time (sample)
5 5 11 1 1 3 1 2 5 2
average
τy (sample)
6 6 371 360 430 355 347 473 6 309
6 6 13 5 6 4 3 4 7 5
3.6
5.9
Table 9. Effect of Noise on Fault Diagnosis Delay in Lab-Scale Distillation Column Case Study identification time (sample) fault introduction time 1% Run-1 Run-2 Run-3 Run-4 Run-5 Run-6 Run-7 Run-8 Run-9 Run-10 average
1 1 359 356 425 350 345 470 1 300
6 6 371 360 430 354 347 473 6 308
identification delay (sample)
2%
3%
4%
5%
6 6 371 360 430 355 347 473 6 308
6 6 371 360 430 355 347 473 6 309
6 6 371 360 430 355 347 473 6 309
6 5 5 5 5 5 6 5 5 5 5 5 371 12 12 12 12 12 360 4 4 4 4 4 438 5 5 5 5 13 355 4 5 5 5 5 348 2 2 2 2 3 473 3 3 3 3 3 6 5 5 5 5 5 311 8 8 9 9 11 5.3 5.4 5.5 5.5 6.6
1%
2%
3%
4%
5%
The average time cost for online signal comparison is only 0.0989 CPU seconds (on a Pentium IV, 2.4 GHz cpu) in contrast to 55.3 s for DTW. This factor of 559 speedup in computation over DTW clearly shows the efficiency of the proposed method and illustrates its suitability for large-scale applications. 4.2. Case Study 2: Online Fault Diagnosis during Startup of a Lab-Scale Distillation Column. In this section, the proposed methodology is illustrated on a lab-scale distillation unit. The schematic of the unit is shown in Figure 13. The distillation column is of 2 m height and 20 cm inner-diameter and has 10 trays. The feed enters at tray 4. The system is well integrated with a control console and data acquisition system. A total of 19 variables comprising all tray temperatures, reboiler and condenser temperatures, reflux ratio, top and bottom column temperatures, feed pump power, reboiler heat duty, and cooling water inlet and outlet temperatures are measured at 10-s intervals. Cold startup of the distillation column with an ethanol-water 30% (v/v) mixture is performed following the standard operating procedure shown in Table 6. The feed passes through a heat exchanger before being fed to the column. The startup normally takes 2 h, and different faults such as sensor fault, failure to open pump, too high a reflux ratio, and so forth can be introduced at different states of operation. The reference database is first populated using data from 11 runs of the process: one normal startup and the 10 faults summarized in Table 7. The online signal comparison algorithm was then used for fault diagnosis and decision support during subsequent startups of the column. Consider one run (Run-3; Figure 14) when a fault was introduced at τT ) 3590 s when the operators introduced too large a feed pump flowrate (200 rpm) to the column. This causes instability in the column resulting in a drastic drop in the column’s temperatures. Results from this run show that the real-time signal is initially close to normal.
best matching reference
η
R
DST01 DST02 DST03 DST04 DST05 DST06 DST07 DST08 DST09 DST10
0.0162 0.0097 0.0101 0.0217 0.0109 0.0268 0.0319 0.0100 0.0103 0.0119
0.2817 0.1091 0.3456 0.6663 0.4113 0.2379 0.6940 0.2330 0.1582 0.6910
0.0156
0.3828
identification delay (sample) 5 5 12 4 5 5 2 2 5 9 5.4
time cost (s) 0.1716 0.1028 0.0317 0.0342 0.0256 0.0368 0.0264 0.0256 0.0556 0.0840 0.0594
The difference between the real-time signal and all other references is much higher; R ) 0.2235 at τT ) 6. Starting from t ) 3700 s, the difference between the real-time signal and the normal reference increases, indicating that there is a fault during the startup operation. The difference η between the real-time signal and DST03 is less than ηmax (0.05). Also, the R falls below Rmax (0.70), so the fault is identified as DST03. Similar tests were done for all other cases. In the interest of space, only a summary of the findings is presented in Table 8. Faults in all the test runs were correctly identified with an average delay of 3.6 samples (and maximum detection delay of 11 samples for Run-03). All faults could be accurately identified within an average of 5.4 samples of their occurrence. The maximum identification delay was about 12 samples for Run-03. The average R at the time of identification is 0.3828 against the Rmax threshold of 0.7, which shows the clear identification of the faults. The average computation time cost at each sample was 0.0594 s, which is much less than the sampling rate of 10 s. The proposed method is therefore clearly suitable for online fault diagnosis in this case as well. 4.3. Robustness and Parameter Tuning. In this section, the robustness of the proposed online signal comparison method is studied. Varying amount of noise levels were added to the online signal to investigate robustness to noise. The affect of the tuning parameter settings on online signal comparison was also studied. The interested reader is referred to Srinivasan and Qian1,2 for robustness studies of singular point detection and DLA. The robustness of the proposed method to sensor noise is reported in this part using data from the lab-scale distillation column case study. Additional measurement noises ranging from 1% to 5% were added to all the original signals and fault diagnosis performed. The results are shown in Table 9. As has to be expected, there is an increase in the fault identification delay with increasing noise; however, the effect is minimal with the average delay increasing from 5.3 samples to 6.6 samples. A similar study was performed for the TE case study as well (see Table 10), and the average identification delay increased from 5.1333 samples to 17.7333 samples at 5% noise level. This larger variation in the TE case arises from the inherent complexity and larger noise levels in the base process. Overall, the proposed method online signal comparison method is robust to noise. The proposed method uses two tunable parameters: Rmax and ηmax. While in the general case different process signals may require different values of these parameters, we have found that the same parameter settings can be used across variables and case studies. The results of decreasing Rmax from 0.80 to 0.60 for the distillation column as well as the TE case studies show that Rmax has no significant affect on fault detection delay. A smaller Rmax would require a clearer separation between the
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007 4547 Table 10. Effect of Noise on Disturbance Identification in TE Case Study identification time (sample)
Run-1 Run-2 Run-3 Run-4 Run-5 Run-6 Run-7 Run-8 Run-9 Run-10 Run-11 Run-12 Run-13 Run-14 Run-15 average
identification delay (sample)
disturbance introduction time
1%
2%
3%
4%
5%
1%
2%
3%
4%
5%
181 241 301 181 241 301 181 241 301 181 241 301 181 241 301
187 245 305 189 245 308 188 245 305 186 245 309 185 245 305
192 246 306 196 244 316 199 246 317 191 253 314 188 245 313
194 245 313 198 254 317 201 249 318 205 253 314 196 247 315
194 245 313 201 256 323 195 256 326 220 253 315 193 247 314
199 246 313 205 265 316 203 255 329 220 253 314 198 249 316
6 4 4 8 4 7 7 4 4 5 4 8 4 4 4 5.1333
11 5 5 15 3 15 18 5 16 10 12 13 7 4 12 10.0667
13 4 12 17 13 16 20 8 17 24 12 13 15 6 14 13.6000
13 4 12 20 15 22 14 15 25 39 12 14 12 6 13 15.7333
18 5 12 24 24 15 22 14 28 39 12 13 17 8 15 17.7333
Table 11. Effect of rmax on Identification Delay in the TE Case Study identification time (sample)
Run-1 Run-2 Run-3 Run-4 Run-5 Run-6 Run-7 Run-8 Run-9 Run-10 Run-11 Run-12 Run-13 Run-14 Run-15 average
identification delay (sample)
disturbance introduction time
Rmax ) 0.80
Rmax ) 0.75
Rmax ) 0.70
Rmax ) 0.65
Rmax ) 0.60
Rmax ) 0.80
Rmax ) 0.75
Rmax ) 0.70
Rmax ) 0.65
Rmax ) 0.60
181 241 301 181 241 301 181 241 301 181 241 301 181 241 301
187 245 305 189 245 304 186 245 305 185 245 305 184 245 305
187 245 305 189 245 305 187 245 305 185 245 308 184 245 305
187 245 305 189 245 308 188 245 305 186 245 309 185 245 305
187 245 311 190 245 313 189 249 315 186 247 313 185 245 313
187 245 313 191 267 313 189 249 316 186 247 327 185 282 355
6 4 4 8 4 3 5 4 4 4 4 4 3 4 4 4.3333
6 4 4 8 4 4 6 4 4 4 4 7 3 4 4 4.6667
6 4 4 8 4 7 7 4 4 5 4 8 4 4 4 5.1333
6 4 10 9 4 12 8 8 14 5 6 12 4 4 12 7.8667
6 4 12 10 26 12 8 8 15 5 6 26 4 41 54 15.8000
Table 12. Effect of rmax on Identification Delay in Lab-scale Distillation Column Case Study identification time (sample)
Run-1 Run-2 Run-3 Run-4 Run-5 Run-6 Run-7 Run-8 Run-9 Run-10 average
identification delay (sample)
fault introduction time
Rmax ) 0.80
Rmax ) 0.75
Rmax ) 0.70
Rmax ) 0.65
Rmax ) 0.60
Rmax ) 0.80
Rmax ) 0.75
Rmax ) 0.70
Rmax ) 0.65
Rmax ) 0.60
1 1 359 356 425 350 345 470 1 300
6 6 371 359 430 355 347 473 6 308
6 6 371 359 430 355 347 473 6 308
6 6 371 360 430 355 347 473 6 309
6 6 371 360 432 355 347 473 6 310
6 6 371 361 432 355 348 473 6 311
5 5 12 3 5 5 2 3 5 8 5.3
5 5 12 3 5 5 2 3 5 8 5.3
5 5 12 4 5 5 2 3 5 9 5.5
5 5 12 4 7 5 2 3 5 10 5.8
5 5 12 5 7 5 3 3 5 11 6.1
reference signal and the optimal reference signal before a fault is confirmed. This would lead to a delay in fault identification. The average identification delay for the TE case changed from 4.333 to 15.80 samples when Rmax was reduced from 0.80 to 0.60 (see Table 11) while for the distillation column case study the delay increased from 5.3 to 6.1 samples (see Table 12). The extent of robustness of Rmax is further revealed by the fact that for the distillation column case study, even setting Rmax ) 0.30 results in an average identification delay of only 9.1 samples. A similar result was obtained for ηmax as well (see Qian17 for details). Overall these results clearly establish the robustness of the proposed method to the two tuning parameters.
5. Summary Online signal comparison is important for process monitoring, fault diagnosis, and process state identification. In this paper, we have proposed a signal comparison-based strategy for online disturbance or fault identification. Given a suitably annotated historical database of process states, normal and abnormal, the proposed method finds the best matching state at any given time by comparing the real-time sensor measurements with the signals in the database. In contrast to signal comparison strategies reported in literature, which are designed for offline signal comparison, the proposed method does not require any a priori knowledge about the online signal; specifically, the beginning and end of the real-time signal do not need to coincide with
4548
Ind. Eng. Chem. Res., Vol. 46, No. 13, 2007
those of the library signals. The endpoints of the two signals are synchronized automatically using the DLA methodology. DLA is inherently computationally efficient when the real-time signal is small; the flanking strategy proposed here reduces the search complexity tremendously when a long segment of the real-time signal has to be compared. The real-time signal tracking strategy based on XTW and XTWSP further reduces the computational load required when the process essentially follows a previously determined reference signal. These endow the main advantage of the proposed method, which is that it is significantly faster in comparison with other time warping methods. This has been illustrated clearly using two different case studies: disturbance identification in the Tennessee Eastman challenge plant and fault online diagnosis during startup of a lab-scale distillation column. As shown in section 4, the method is also robust to noise as well as parameter settings. The time warping-based methods proposed here are inherently multivariate, although they have been used in a uni-variate form in the case studies. The choice between multivariate2 versus univariate1 signal comparison is based on the features of the application. Multivariate time warping relies on the premise that there is no desynchronization between variables so that the same warping can be applied to all the variables. This premise is not valid in processes undergoing transitions where the inherent variability of the manual operations would lead to singular points and inflections in the different variables occurring at different times. In such cases, additional process-specific logic can be used to synchronize cross variable differences. Notation i, j Time index DA(i, j) ) minimum accumulated DTW distance from point (1, 1) to point (i, j) DS ) DLA dissimilarity matrix between signal X and signal Y K ) collection of reference signals R ) reference signal R ) {r1, r2, ..., rj, ..., rn} R* ) optimal reference signal R/w ) locii of Tw in R* T ) real-time signal T ) {t1, t2, ..., ti, ...tm} TD ) divergent segment of T Tw ) evaluation window in T X ) signal X ) {x1, x2, ..., xi, ..., xm} XA ) anterior flanking segment XP ) posterior flanking segment Y ) signal Y ) {y1, y2, ..., yj, ...yn} YA ) matching segment of XA YP ) matching segment of XP Z ) segment of signal Y, {yl, yl+1, ..., yj} ∆(xi, yj) ) difference between xi and yj ∆ ˜ (tm, rτR*) ) difference between tm and its warping assignment in R* R ) inseparability ratio of reference signals 0 e R e 1 Rmax ) minimum inseparability threshold V ) flank length η(Tw, R/w) ) normalized difference between Tw and R/w η˜ (Tw, R/w) ) approximate XTW difference between Tw and R/w ηmax ) threshold for signal similarity
τR ) time index in R τT ) time index in T ) time index of point of divergence in T τPOD T τSP T ) time index of last singular point in T Literature Cited (1) Srinivasan, R.; Qian, M. S. Offline temporal signal comparison using singular points augmented time warping. Ind. Eng. Chem. Res. 2005, 44 (13), 4697-4716. (2) Srinivasan, R.; Qian, M. S. Online fault diagnosis and state identification during process transition using dynamic locus analysis. Chem. Eng. Sci. 2006, 61, 6109-6132. (3) Webb, A R. Statistical pattern recognition; Wiley: West Sussex, U.K., 2002. (4) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault detection and diagnosis in industrial systems; Springer: London, New York, 2001. (5) Krzanowski, W. J. Between-group comparison of principal components. J. Am. Stat. Assoc. 1979, 74 (367), 703-707. (6) Raich, A.; Cinar, A. Diagnosis of process disturbances by statistical distance and angle measures. Comput. Chem. Eng. 1997, 21 (6), 661-673. (7) Singhal, A.; Seborg, D. E. Pattern matching in historical batch data using PCA. IEEE Control Syst. Mag. 2002, 22 (5), 53-63. (8) Srinivasan, R.; Wang, C.; Ho, W. K.; Lim, K. W. Dynamic PCA based methodology for clustering process states in agile chemical plants. Ind. Eng. Chem. Res. 2004, 43 (9), 2123-2139. (9) Cheung, J. T-Y.; Stephanopoulos, G. Representation of process trends - Part I. A formal representation framework. Comput. Chem. Eng. 1990, 14 (4/5), 495-510. (10) Bakshi, B. R.; Stephanopoulos, G. Representation of process trends - IV. Induction of real-time patterns from operating data for diagnosis and supervisory control. Comput. Chem. Eng. 1994, 18 (4), 303-332. (11) Rengaswamy, R.; Venkatasubramanian, V. A syntactic pattern recognition approach for process monitoring and fault diagnosis. Eng. Appl. Artif. Intell. 1995, 8 (1), 35-51. (12) Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S. N.; Yin, K. A review of process fault detection and diagnosis Part III: Process history based methods. Comput. Chem. Eng. 2003, 27, 327-346. (13) Maurya M. R.; Rengaswamy, R.; Venkatasubramanian, V. Fault diagnosis using dynamic trend analysis: A review and recent developments. Eng. Appl. Artif. Intell. 2007, 20, 133-146. (14) Sundarraman, A.; Srinivasan, R. Monitoring transitions in chemical plants using enhanced trend analysis. Comput. Chem. Eng. 2003, 27 (10), 1455-1472. (15) Sankoff, D.; Kruskal, J. B. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison; Addison-Wesley: Reading, MA, 1983. (16) Kassidas, A.; MacGregor, J. F.; Taylor, P. A. Synchronization of batch trajectories using dynamic time warping. AIChE J. 1998, 44 (4), 864875. (17) Qian, M. S. Ph.D. thesis, National University of Singapore, Singapore, 2004. (18) Waterman, M. S.; Eggert, M. A New Algorithm for Best Subsequence Alignments with Application to tRNA-rRNA Comparisons. J. Mol. Biol. 1987, 197, 723-728. (19) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17 (3), 245-255. (20) Singhal, A. Tennessee Eastman simulation model. http://www. chemengr.ucsb.edu/∼ceweb/computing/TE/tesimulation.htm (accessed 2003). (21) McAvoy, T. J.; Ye, N. Base control for the Tennessee Eastman problem. Comput. Chem. Eng. 1994, 18 (5), 383-413.
ReceiVed for reView January 25, 2006 ReVised manuscript receiVed March 26, 2007 Accepted April 2, 2007 IE060111S