Improvements in Fault Tolerance Characteristics for Large


Ind. Eng. Chem. Res. 2008, 47, 5464–5481

PROCESS DESIGN AND CONTROL

Improvements in Fault Tolerance Characteristics for Large Chemical Plants: 1. Waste Water Treatment Plant with Decentralized Control

David Zumoffen† and Marta Basualdo*,†

Group of Applied Informatic to Process Engineering (GIAIP), French-Argentine Internacional Center of Information Science and Systems (CIFASIS-CONICET-UPCAM III-UNR), BV. 27 de Febrero 210 Bis, S2000EZP Rosario, Argentina

A decentralized control scheme integrated with a recently developed fault detection, isolation, and estimation (FDIE) system, applied to a wastewater treatment plant (WWTP) benchmark, is discussed here. This work belongs to a series whose objective is to demonstrate quantitatively the improvements in fault-tolerance characteristics that can be achieved through integration with FDIE systems. In the previous work, a decentralized structure applied to a larger benchmark (the pulp mill plant) was presented. Specific aspects related to that case study were successfully handled by this methodology. Here, the WWTP presents new scenarios based on the occurrence of faults and their effect on the dynamics of the process. In this work, faults such as extra dead time at actuators and bias and slow drifts in sensors (which, in some cases, can cause saturation) are analyzed. Under these conditions, a real need exists for turning the existing control scheme into a fault-tolerant (FT) scheme. This is done through reconfiguration of the controllers, using online identification and model-based control (MBC) tools. As the main contribution of this paper, a rigorous quantitative analysis is performed over a complete set of simulation cases covering different scenarios. The comparison is made through several indexes associated with the WWTP benchmark, together with others proposed here that are useful for analyzing the methodology.

1. Introduction

Currently, an important need exists internationally to enhance the safety and reliability of chemical plants in ways that reduce their vulnerability to serious failures.
These failures could be defects or malfunctions in process equipment, sensors, and actuators, or failures in the controllers or in the control loops, which, if not appropriately handled in the control system design, can cause undesired economic, environmental, and safety problems that seriously degrade the operating efficiency of the plant. These considerations provide a strong motivation for the development of systematic methods and strategies for improving fault-tolerant (FT) control systems, and they have motivated many research studies in this area.1 The work presented here is part of a series whose main purpose is to offer new results on this subject. In this context, one must consider aspects such as the size of the problem, that is, (a) a single-unit plant with few actuators and sensors (centralized structure) or (b) a larger plant with several interconnected processing units and a larger number of actuators and sensors (distributed hierarchical structure). Therefore, the fault detection, isolation, and estimation (FDIE) system must be designed according to the size of the plant. The FDIE system must be responsible for monitoring the different units, and it must be able to give the classification and magnitude of the detected abnormal events in a timely manner. Finally, the FDIE outputs must be well integrated with the existing control structure (conventional or advanced) and coordinated in a way that minimizes the propagation of the failure effects. Most of these topics, together with a deep bibliographic analysis, have been reported in previous papers. In the work of Zumoffen et al.,2 an FDIE module, designed for a single unit and linked with adaptive predictive control (APC) and adaptive predictive robustness filter control (APRFC), allowed an important improvement in FT capacity. The overall approach was applied to a nonlinear chemical plant, namely a continuously stirred tank reactor (CSTR) system with a jacket. There, bias in sensors and extra dead time (delay) in actuator actions were considered as the most common faults. The FDIE system was designed using wavelet decomposition for the sensor faults and identification techniques for the actuator faults. A new FDIE architecture, suitable for large plants, was successfully applied to the pulp mill process with decentralized control in the work of Zumoffen and Basualdo.3 In Zumoffen et al.,4 the FDIE system is integrated with a model predictive control (MPC) scheme applied to the same pulp mill process. In the present work, the study is completed by analyzing the FDIE behavior when bias in the sensors produces saturation in certain actuators and when extra dead time appears in actuators. Hence, a new challenge is imposed for both the FDIE and the control scheme applied to the WWTP.

* To whom correspondence should be addressed. Tel.: +54-341-482-1771/6300 Int 104. Fax: +54-341-482-1772. E-mail: basualdo@cifasis-conicet.gov.ar.
† Facultad Regional Rosario, Universidad Tecnológica Nacional, Argentina.

In the literature, some works have focused only on monitoring systems for detecting typical faults in a WWTP. One can mention the work of Lee et al.,5 where a system for reconstructing noisy and faulty measurements was proposed. This approach suggested cross-correlation data analysis. Nonsimultaneous noise and bias in sensors were considered. The fault identification system is proposed for dynamic processes based on time-lagged principal component analysis (PCA). The approach is based on multivariable statistical methods, iterative

10.1021/ie800098t CCC: $40.75 © 2008 American Chemical Society. Published on Web 07/01/2008

Ind. Eng. Chem. Res., Vol. 47, No. 15, 2008 5465

optimization for fault reconstruction, and a sensor validity index. In the work of Yoo et al.,6 a model-based monitoring system was presented. There, the modeling process is based on statistical tools (PCA) and a classification system (Fuzzy-c-means and adaptive-Fuzzy-c-means). Another application of an expert system was proposed in the work of Punal et al.,7 where a knowledge base and if-then rules on the process variable measurements are developed for the diagnosis of an anaerobic WWTP. Thus, information, advice, actions, and recommendations are generated. Some applications of WWTP monitoring and control have appeared in the literature. In the work of Ekman et al.,8 a cascade proportional-integral (PI) control and a switching strategy are proposed for a predenitrifying activated sludge process, to evaluate a supervisory aeration volume control. No faults were considered in either of the monitoring aspects. A method to characterize newly installed online sensors or to evaluate monitoring data that may contain systematic errors was proposed in the work of Rieger et al.9 Their approach was based on linear regression and statistical tests to construct an expert system that allows one to differentiate between trueness and precision for sensors. On the other hand, in the work of Christofides and El-Farra,1 a complete chapter is dedicated to fault-tolerant control (FTC) that accounts for the delay problem. Their approach is based on the assumption that a real chemical process involves significant time delays, which often occur because of a transportation lag, such as that which occurs in flow through pipes; dead times associated with measurement sensors (measurement delays) and control actuators (manipulated input delays) are the most common causes. They presented a methodology for the synthesis of nonlinear output feedback controllers for nonlinear systems that include time delays in the states, the control actuator, and the measurement sensor. 
The development was performed by accounting for the models of different processes. From the FTC theory point of view, the study is very powerful, although, in practice, accurate models for large chemical plants may not be available. Recent results on the integration of fault detection, feedback, and supervisory control have been presented in the work of Mhaskar and co-workers.10,11 In these works, the authors proposed a sound theoretical methodology for FTC in both the state and output feedback control design cases. A family of candidate control configurations, characterized by different manipulated inputs, is considered when actuator failures occur. Each control configuration has a Lyapunov-based controller design. In the report of Mhaskar et al.,12 similar ideas (stability region characterizations, control configurations, and switching policy) were demonstrated to be useful for sensor fault problems (complete failure or intermittent unavailability of measurements). The application results were given for one or two process units that are considered to be perfectly modeled. For large chemical plants, implementing this methodology would probably have to account for other, stronger limitations. A robust predictive FTC design is presented in Mhaskar et al.,13 extending the ideas presented in previous works. Tools such as multivariate statistics and incidence graphs are used to perform detection and diagnosis tasks in nonlinear systems subject to actuator faults and disturbances.14 There, the design of nonlinear model-based state-feedback control laws was presented. The FDIE system used here, based on input-output historical data, has, in essence, a structure similar to the one described in detail in the work of Zumoffen and co-workers.3,4 These hybrid designs are intended to overcome typical problems that are found in fault detection and diagnosis (and FTC), such as the

problem-dependent solutions. This approach becomes independent of questions such as process dimension (small/medium/large scale) and type (linear/nonlinear), fault types (sensors/actuators), and disturbances (input/output), and it generalizes a methodology for solving the integration between FDIE and FTC techniques (decentralized/MPC). In the work presented here, it must be redesigned to account for the control scheme characteristics and the specific aspects involved in the WWTP, such as the dynamics and the selection of the faults to analyze, because of their impact on the plant. In this case, the FDIE integrated with the FTC method is designed to handle fault types such as extra dead time in actuators and offset and slow drift in sensors. In addition, saturation management is presented with this methodology. Note that extra delays impose severe limitations on the achievable control quality and cause serious problems in the behavior of the closed-loop system, including poor performance (e.g., sluggish response, oscillations) and instability. In this context, it is crucial to make the existing decentralized control strategy fault-tolerant (FT). This is done by applying MBC theory to obtain proper proportional-integral-derivative (PID) tunings. Thus, the controllers are updated online, improving the overall process performance in both the actuator and sensor fault cases. The WWTP presents a complicated transient response that is due to the diurnal periodic disturbance. This scenario presents a challenge for the FDIE and FTC designs. The efficacy of the proposed approach is demonstrated through several simulation results. In addition, typical indexes used for the WWTP and others used in this series of works are presented here for quantitative comparison purposes.

2. Case Study: Waste Water Treatment Plant (WWTP)

The application case considered here is based on the COST benchmark presented by Copp.15 It is a fully defined simulation protocol that was developed for evaluating activated sludge wastewater treatment control strategies. It includes the plant layout, specific model parameters, and a detailed description of the common disturbances that affect the system, which are considered for testing the real effectiveness of the developed strategies. The benchmark is based on two internationally accepted process models: the Activated Sludge Model No. 1 (ASM1), which is used as the biological process model, and the double-exponential settling velocity function, which is used as the settling process model. The first model has 13 components (state variables) and 8 process units. To ensure consistent application of the model in benchmarking studies, all of the kinetic and stoichiometric model parameters have been defined. The second model is based on the solids flux concept and is applicable to both hindered and flocculent settling conditions. Together, the interconnected models give a complex biological process that must be controlled under permanent disturbance conditions. The "dry weather" disturbance file is used in this work and depicts what is considered to be normal diurnal variations in flow and chemical oxygen demand (COD) load. Correct process monitoring and control are essential, because any abnormal event could produce inadmissible control actions or performance index values. The objective of the activated sludge process is to achieve, at a minimum cost, a sufficiently low concentration of biodegradable matter in the effluent, together with minimal sludge production. To do this, the process must be controlled. The process consists of five biological tanks (Ri, for i = 1, ..., 5) in series with a nonreactive secondary settler (S). Reactors R1 and R2 are anoxic tanks, but are fully mixed, and the three last


Figure 1. WWTP layout and decentralized control strategy.

Table 1. Process Element Attributes and Variables of the WWTP

attribute | value
Elements
volume of tanks 1 and 2 | 1000 m3
volume of tanks 3, 4, and 5 | 1333 m3
area of settler | 1500 m2
volume of settler | 6000 m3
Variables
influent flow rate | 18446 m3/day
recycle flow rate | 18446 m3/day
internal recycle flow rate | 55338 m3/day
wastage flow rate | 385 m3/day
KLa for tanks 3 and 4 | 10 h-1
KLa for tank 5 | 3.5 h-1

tanks (R3, R4, and R5) are aerated. Two internal recycles exist in the process: the nitrate internal recycle from the fifth tank to the first tank, and the recycle from the underflow of the secondary settler to the front end of the plant (because no biological reaction occurs in the settler, the oxygen concentration in the recycle is the same as that in the fifth tank reactor). The wastage is pumped continuously from the secondary settler underflow. The complete process layout and the control strategy are represented in Figure 1. The attributes of the physical process elements, as well as the nominal flow rates and aeration coefficient values, are shown in Table 1. The decentralized control strategy is implemented by two loops. The first one involves the dissolved oxygen (DO) level as the controlled variable in the fifth reactor, by means of a PI that manipulates the aeration coefficient for this reactor (hereafter referenced as KLa5). This is performed using an ideal DO sensor. Analogously, the second control loop involves the nitrogen (N) level control in the second reactor, by means of a PI manipulating the internal recycle flow (Qintr). In this case, a more-realistic and fault-free sensor is considered, because the N measurement is both noisy and delayed. The original controller settings and the process variables are shown in Table 2. In this work, two fault types in four process elements are proposed.

Sensor Faults. Both the DO and N sensors are assumed to work with possible faults characterized by an abrupt offset in the measurement, giving an erroneous process observation. The studied offset magnitudes have the following characteristics:

DO Fault Magnitude:
(1) Range of [0, 4] g/m3: produces minimum degradation in the performance indexes if 0 < offset < 0.7 g/m3; if offset > 0.7 g/m3, some type of compensation must be considered, because higher energy consumption is reported.
(2) Offset > 4 g/m3: produces saturation in the manipulated variable (MV), causing considerable degradation.
(3) Offset < 0: produces saturation in the MV with degradation.

N Fault Magnitude:
(1) Range of [0, 1] g/m3: produces minimum degradation in the performance indexes if 0 < offset < 0.3 g/m3; when the nitrogen violation in the effluent presents a high value (offset > 0.3 g/m3), some type of compensation must be considered.
(2) Offset > 1 g/m3: produces saturation in the MV; considerable degradation occurs.
(3) Range of [-8, 0] g/m3: causes a decrease in the effluent violation, and small offsets are negligible; when the energy has a high value, some type of compensation must be considered.
(4) Offset < -8 g/m3: produces saturation in the MV; considerable degradation occurs.

Actuator Faults. Both the KLa5 and Qintr actuators can suffer malfunctions. The faults in this case are characterized by an extra delay in the control actions.

Small values of the sensor offsets cause minimum degradation in the performance indexes; for example, an offset in the sensors of ±0.5 g/m3 changes the effluent violation and the pumping energy indexes by ±10%. The minimum detectable offset in the sensors is set by considering several aspects, such as the signal-to-noise relationship, the safety margin for avoiding false alarms, and the speed of detection, among others. An FDIE that detects a small offset is very sensitive to noise and disturbances. The main results chosen for this part are related to the worst case, when the offset magnitudes cause saturation of the valves. Other results, considering other faults (such as small offsets and slow drifts in the sensor), are presented at the end of this work. It is well-known that time delays exist widely in practice, induced by long-distance transportation and communication or by mechanical faults in the valves. Another reason for this selection is that a delay might cause instability in the closed-loop systems and control performance deterioration, so the FTC represents a valuable support tool for these types of abnormal events. 
However, only a few works have analyzed these types of faults.2,16–18 These faults represent a real challenge for the FDIE system design and its integration with FT strategies. A blockade in the valves is another type (extreme case) of actuator fault, which is addressed in the work of Zumoffen and Basualdo,3 in the context of the plant-wide control case. A control reconfiguration and redesign is essential to guarantee FT characteristics. The control policy reconfiguration may be performed by online computation or by selecting a predefined one (designed offline). An abrupt offset in the sensors is proposed here as a common fault type in process elements. When the plant is under control, these fault types are not easy to detect, isolate, and estimate, because the control masks the fault. The controlled variable shows the same behavior as in a disturbance rejection. If the sensor fault were a slow temporal drift to another incorrect

Table 2. Controller Settings

 | First Loop | Second Loop
Variables
controller type | continuous PI with AW | continuous PI with AW
controlled variable | DO (g/m3) | SNO (gN/m3)
manipulated variable | KLa5 (h-1) | Qintr (m3/d)
set point | 2 (g/m3) | 1 gN/m3
Parameters
proportional gain, K | 500 d-1 ((g COD)/m3)-1 | 15000 m3 d-1 (gN/m3)-1
integral time constant, Ti | 0.001 d | 0.05 d
antiwindup time constant, Tt | 0.0002 d | 0.03 d
manipulated saturations [min, max] | [0, 10] h-1 | [0, 92230] m3/d
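The "continuous PI with AW" structure of Table 2 can be illustrated with a discrete-time sketch. The back-calculation form of the anti-windup, the explicit Euler discretization, the class name PIAntiWindup, and the step size are assumptions made only for illustration; they are not taken from the benchmark definition. The numbers used are the second-loop settings listed in Table 2.

```python
from dataclasses import dataclass


@dataclass
class PIAntiWindup:
    """Discrete PI controller with back-calculation anti-windup (illustrative)."""
    K: float       # proportional gain
    Ti: float      # integral time constant [d]
    Tt: float      # anti-windup (tracking) time constant [d]
    u_min: float   # lower saturation limit of the manipulated variable
    u_max: float   # upper saturation limit of the manipulated variable
    integral: float = 0.0

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        error = setpoint - measurement
        u_unsat = self.K * error + self.integral
        u = min(max(u_unsat, self.u_min), self.u_max)
        # Integrate the error; the tracking term pulls the integrator back
        # whenever the output is clipped, which limits windup.
        self.integral += dt * (self.K / self.Ti * error + (u - u_unsat) / self.Tt)
        return u


# Second-loop (N) settings from Table 2, time expressed in days.
n_loop = PIAntiWindup(K=15000.0, Ti=0.05, Tt=0.03, u_min=0.0, u_max=92230.0)
q_intr = n_loop.step(setpoint=1.0, measurement=0.8, dt=0.001)
```

With an explicit Euler update, the step dt should stay well below Tt; otherwise the tracking term itself becomes numerically unstable.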

Table 3. Automatic Rules Extraction ∆x1

fault type

KLa5

Qintr

F1 F1 F2 F3 F4

N N N satmax N

N H H N satmin

F1 F1 F2 F2

satmin satmax N N

satmax N satmax satmin

∆x2

∆x3

∆x4

∆x5

∆x6

L L L H N

L L H H N

N N L L N

N H H N N

L H H L

L H H L

H L L H

L H H L

Under Control L L L H N

H H H L N

Loss of Control L H L L

H L H L

Table 4. FDIE Parameters

APCA: Ts = 1/96 d; A(0) = 4; Nw = 1345 samples; Naux = 100 samples; CPV = 98%
ANN: KLa5: 2HU, 1OU; Qintr: 3HU, 1OU; DO: 2HU, 1OU; N: 4HU, 1OU
Fuzzy Logic: ai = [Lmin i + 0.1, Lmax i - 0.1]; bi = [Lmin i - 0.1, Lmax i + 0.1]; ci = [sati- + 0.1, sati+ - 0.1]; di = [sati- - 0.1, sati+ + 0.1]; ei = [-1000, 1000]

Table 5. Initial IMC Tuning

control loop | km | τm | τf
DO | 4 × 10-3 | 0.001 | 0.001/2
N | 1.33 × 10-4 | 0.05 | 0.05/2
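As a worked check of Table 5, the sketch below applies the usual IMC-based PI rule for a first-order model km/(τm s + 1) with a first-order filter of time constant τf, that is, Kc = τm/(km τf) and Ti = τm. This rule is an assumption here, since the tuning expressions are not given in this part of the text, and the function name imc_pi is hypothetical. Under that assumption, the Table 5 values give a gain of 500 for the DO loop and about 1.5 × 10^4 for the N loop, with integral times of 0.001 d and 0.05 d, which agrees with the settings reported in Table 2.

```python
from typing import Tuple


def imc_pi(km: float, tau_m: float, tau_f: float) -> Tuple[float, float]:
    """PI settings from the assumed first-order IMC rule.

    km    : model gain
    tau_m : model time constant [d]
    tau_f : IMC filter time constant [d]
    """
    Kc = tau_m / (km * tau_f)  # proportional gain
    Ti = tau_m                 # integral time constant
    return Kc, Ti


# Values from Table 5.
Kc_do, Ti_do = imc_pi(km=4e-3, tau_m=0.001, tau_f=0.001 / 2)
Kc_n, Ti_n = imc_pi(km=1.33e-4, tau_m=0.05, tau_f=0.05 / 2)
```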

value, the APCA module from the fault detection and isolation (FDI) system is able to detect this abnormal event. The treatment of this fault can be considered as successive abrupt steps with variable magnitude (increasing or decreasing). The additive compensation also would be useful in this case. Note that the selection of the faults considered here is based on information taken from the Abnormal Situation Management (ASM) website.19 There, a detailed presentation about several types of abnormal events, potential sources, impact, solutions, etc. for real chemical process applications is available.

3. FDIE Approach

In this section, only the main aspects related to the faults considered for the WWTP are included in the FDIE description. For more details about the theoretical topics of each subsystem, the reader is referred to the work of Zumoffen and Basualdo.3

3.1. Monitoring and Fault Detection. The online monitoring begins with the next process data sample x(k), which is modified by certain scaling factors to get x̄(k). It is evaluated by the T2(k) and SPE(k) statistics, and the combined statistic z(k) is computed with the previous control limits δT2(k - 1) and δSPE(k - 1). If z(k) ≤ δz is true, the normal data matrix X̄n(k) is updated by deleting the oldest data and concatenating the new data sample x̄(k); otherwise, this sample is not included in the normal data matrix.

If z(k) > δz, a warning alert appears; if four consecutive warning alerts are active, then a fault alert advisory is shown. In this case, the actual samples x(k) are stored in the auxiliary matrix Xaux during the next Naux samples. Note that the moving window is directly related to the maximum time required to achieve steady state (inherent to the process). In addition, if k ≥ Naux (all faulty data have been included in the auxiliary matrix), the scaling factors are updated to b(k) and s(k), using the auxiliary matrix. The next step is to update the control limits δT2(k) and δSPE(k), using the actual normal data matrix and the 99% (ν = 3) confidence level criterion. This is followed by updating the correlation information contained in the original PCA model. First, the SVD of the correlation data matrix from X̄n(k) is performed; second, the number of principal components retained, A(k), is selected using the CPV approach for 98%. Note that, with these two modifications, the PCA model and A are also updated at each sampling time k. Thus, the number of principal score vectors can vary throughout the monitoring process. This can be seen clearly in the example shown in Figure 2e. For the WWTP, a rigorous analysis is performed to compare the classical PCA and the APCA methods. In Figure 2, the behavior of both algorithm types is shown graphically. In this simulation, two consecutive faults in plant elements are proposed. The first is a positive offset of 1 g/m3 at the DO sensor, which appears abruptly at t = 3 days. Analogously, the second consecutive fault, of magnitude -1 g/m3, appears at t = 10 days. Figure 2a shows the combined statistic z(k) for the classical PCA, which can detect only the first fault; its statistic then diverges without giving information about the second fault. On the other hand, in Figure 2b, the combined statistic z(k) for the APCA detects both faults perfectly. 
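The monitoring steps of Section 3.1 (CPV-based selection of the retained components, evaluation of T2 and SPE for a scaled sample, and the rule of four consecutive warnings) can be sketched as follows. This is a minimal illustration and not the authors' APCA implementation: the function and class names are hypothetical, the loadings and eigenvalues are assumed to come from a PCA already fitted on the normal data matrix, and the combined statistic z(k) and its limit are taken as given inputs because their definition is not reproduced in this section.

```python
from typing import Sequence, Tuple


def select_components(eigvals: Sequence[float], cpv: float = 0.98) -> int:
    """Smallest number of components whose cumulative percent variance reaches cpv."""
    total = sum(eigvals)
    running = 0.0
    for count, value in enumerate(eigvals, start=1):
        running += value
        if running / total >= cpv:
            return count
    return len(eigvals)


def t2_and_spe(x: Sequence[float],
               loadings: Sequence[Sequence[float]],
               eigvals: Sequence[float]) -> Tuple[float, float]:
    """Hotelling T2 and SPE of one scaled sample.

    loadings[a] is the a-th retained (orthonormal) loading vector and
    eigvals[a] is the corresponding eigenvalue.
    """
    scores = [sum(p_i * x_i for p_i, x_i in zip(p, x)) for p in loadings]
    t2 = sum(t * t / lam for t, lam in zip(scores, eigvals))
    # Reconstruct the sample from the retained components; SPE is the residual.
    x_hat = [sum(t * p[i] for t, p in zip(scores, loadings)) for i in range(len(x))]
    spe = sum((x_i - xh_i) ** 2 for x_i, xh_i in zip(x, x_hat))
    return t2, spe


class FaultAlarm:
    """Raises a fault alert after a number of consecutive warnings (four in the text)."""

    def __init__(self, needed: int = 4) -> None:
        self.needed = needed
        self.streak = 0

    def update(self, z: float, limit: float) -> bool:
        # A warning is active when z exceeds its limit; any normal sample resets the streak.
        self.streak = self.streak + 1 if z > limit else 0
        return self.streak >= self.needed
```

In an adaptive scheme, select_components would be called again each time the normal data window is updated, which is how the number of retained components can change during monitoring.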
Figure 2c shows the contribution to T2, where the first fault is not so evident. However, Figure 2d shows how the SPE of the APCA captures both faults in a timely manner. This justifies the importance of using both the T2 and SPE statistics. Finally, an interesting result is shown in Figure 2e, where the number of principal components retained must change when the second fault appears, under the APCA methodology. Such a conclusion can be reached in this example because the statistics for APCA are capable of making new detections each time the system requires them.

3.2. Isolation: FL Module and Automatic Rule Extraction. The FL module uses the information signals given by the APCA to classify the abnormal event. For example, these signals can be the contributions to the T2(k) and SPE statistics, together with the manipulated variables. The decision about the best selection is directly related to each plant and fault considered. It then is possible to derive the linguistic values and rules corresponding to the pattern found for each specific fault. Therefore, both the normal and abnormal process behavior databases are taken into account and analyzed. Under normal operation, all variables are inside the range bounded by their maximum and minimum limits. The first database has the


Figure 2. (a) PCA algorithm and (b, c, d, e) APCA algorithms.

information about the overall process variable limits at the normal operation point. In the presence of an abnormal event, specific process variables exceed their limits in a way that allows this behavior to be recorded as a well-defined pattern for each specific fault. The set of those patterns allows development of the so-called rules matrix in FL language, which is essential for performing a correct fault classification. Thus, the FL evaluation begins only when a fault is detected by the APCA. The inputs to the FL system are the variables used to analyze the T2 and SPE contributions (and the specific measured process variables, if they are needed) and are self-scaled with respect to the normal operation behavior. Taking into account the exact time when the fault occurs (Td), which is given by the APCA system, an analysis zone [Tiz, Tfz] is defined for each component. Figure 3 shows the difference between normal and abnormal behavior for the same variable when an N sensor fault occurs at time Tf. The linguistic values adopted in each case for the set of variables involved in the fault are useful for performing the correct fault classification. This set defines a pattern (signature) for each abnormal event; these patterns are called the rules associated with each fault. Finally, each FL output is computed by means of the defuzzification procedure, which involves evaluating the corresponding rule support from the rules matrix. When the manipulated variables saturate because of a malfunction at the sensors, the automatic rules extraction from the abnormal behavior database gives the rules matrix, which can be

observed in Table 3. In addition, the fault patterns given by the contributions to the SPE (∆xi, where i = 1, ..., 6) and by the MV (KLa5 and Qintr) are given. The FL system classifies the different abnormal events by considering whether the MV reaches saturation values or not. To decide on the control actions needed to compensate for these fault effects, it is necessary to know the magnitude of the faults, so that the corresponding controller reconfiguration can be performed to maintain both performance and stability conditions for the overall process.

3.3. Fault Estimation: ANN Approach. The problem here is to determine the process variables that are closely related to the fault type and allow estimation of its magnitude. The ANN approach learns the specific mappings between the mean variations and the offset and between the standard deviation variations and the dead time. Assuming that enough information about previous faults has been recorded, a set of training patterns for the ANN subsystem can then be obtained. Therefore, the training data consist of two vectors that contain the patterns φi, yi (inputs and outputs, respectively). The network output prediction is represented by eq 1. The input-output relationship must be obtained, and the weights and biases [wjl, Wij] must be adjusted during the training period. Thus, it is basically a conventional identification problem, where the weights and biases are estimated from the data based on some cost function to be minimized.
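The one-hidden-layer prediction referred to as eq 1, and a squared-error cost of the kind used for training, can be sketched in a few lines. The hyperbolic tangent hidden activation, the linear output, the storage of each bias as the last entry of its weight row, and the names predict and cost are assumptions made for illustration only; the example weights are arbitrary and correspond to a network with two hidden units and one output unit, the size listed in Table 4 for the KLa5 and DO estimators.

```python
import math
from typing import List, Sequence


def predict(phi: Sequence[float],
            w_hidden: Sequence[Sequence[float]],
            w_out: Sequence[Sequence[float]]) -> List[float]:
    """Network output for one input pattern phi.

    Each row of w_hidden holds the weights w_jl of one hidden unit followed by
    its bias w_j0; each row of w_out holds the weights W_ij of one output unit
    followed by its bias W_i0.
    """
    hidden = [math.tanh(sum(w * p for w, p in zip(row[:-1], phi)) + row[-1])
              for row in w_hidden]
    return [sum(W * h for W, h in zip(row[:-1], hidden)) + row[-1]
            for row in w_out]


def cost(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Half mean squared prediction error, (1 / 2N) * sum of squared errors."""
    n = len(targets)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(targets, predictions)) / (2 * n)


# Arbitrary weights for a 1-input, 2-hidden-unit, 1-output network.
w_hidden = [[0.5, 0.0], [-1.0, 0.2]]
w_out = [[1.0, 2.0, 0.1]]
estimate = predict([0.3], w_hidden, w_out)[0]
```

Training then amounts to searching for the weights that minimize cost over the recorded fault patterns; the optimizer itself is outside the scope of this sketch.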

Figure 3. Fuzzy logic method.

Table 6. Performance Indexes

case | effluent violation, NV [% of operation time] | effluent violation, AV [% of operation time] | pumping energy, PE [kWh/day] | Integral Absolute Error, IAE, DO loop [g days/m3] | Integral Absolute Error, IAE, N loop [g days/m3]
Normal Behavior | 18.30 | 17.26 | 1488.14 | 0.0075 | 1.4818
DO Sensor Fault, offset = 1 g/m3, without FTC | 13.09 | 24.70 | 1687.57 | 6.9978 |
DO Sensor Fault, offset = 1 g/m3, with FTC | 18.00 | 16.81 | 1487.24 | 0.2060 |
DO Sensor Fault, offset = 4 g/m3, without FTC | 26.04 | 97.76 | 4246.63 | 13.7085 |
DO Sensor Fault, offset = 4 g/m3, with FTC | 18.30 | 17.55 | 1496.33 | 0.0078 |
N Sensor Fault, offset = -1 gN/m3, without FTC | 12.35 | 17.26 | 1844.37 | | 7.115
N Sensor Fault, offset = -1 gN/m3, with FTC | 18.00 | 16.81 | 1486.54 | | 1.4186
N Sensor Fault, offset = 4 gN/m3, without FTC | 96.13 | 14.28 | 1753.24 | | 5.8241
N Sensor Fault, offset = 4 gN/m3, with FTC | 15.77 | 16.81 | 1555.94 | | 2.0328
KLa5 Actuator Fault, delay = 0.6 min, without FTC | 17.85 | 16.81 | 1492.00 | 0.7626 |
KLa5 Actuator Fault, delay = 0.6 min, with FTC | 17.85 | 16.81 | 1489.84 | 0.0276 |
Qintr Actuator Fault, delay = 20 min, without FTC | 18.89 | 17.11 | 1487.97 | | 2.9837
Qintr Actuator Fault, delay = 20 min, with FTC | 23.51 | 16.81 | 1364.49 | | 2.9901

$$\hat{y}_i^{nn}(k) = F_i\left[ \sum_{j=1} W_{ij} f_j\left( \sum_{l=1} w_{jl} \phi_l(k) + w_{j0} \right) + W_{i0} \right]$$ (1)

A set of data $Z^N = [\phi(k), y^{nn}(k)]$, with the input patterns and output targets, respectively, and k = 1, ..., N, must then be given. The network prediction is defined as $\hat{y}^{nn}(k, \theta) = g(\phi(k), \theta)$, where k is the sampling time, the function g(·) is represented by eq 1, and θ is the parameter vector that contains the weights and biases [wjl, Wij]. Therefore, all of these elements define a cost function that measures the prediction error $\varepsilon(k, \theta) = y^{nn}(k) - \hat{y}^{nn}(k, \theta)$, as can be seen from eq 2, which displays the squared prediction error.

$$V_N(\theta, Z^N) = \frac{1}{2N} \sum_{k=1}^{N} \varepsilon^2(k, \theta)$$ (2)

The objective here is to determine the weights θ in the network that minimize the criterion given by eq 3:

$$\hat{\theta} = \arg\min_{\theta} V_N(\theta, Z^N)$$ (3)

Figure 4 shows the validation procedure for each ANN. Thus, the actuator delay estimation is performed by means of eqs 4a and 4b,

$$\hat{d}^{e}_{K_La5}(k) = \mathrm{ANN}_{K_La5}(\mathrm{std}_{DO}(k))$$ (4a)

$$\hat{d}^{e}_{Q_{intr}}(k) = \mathrm{ANN}_{Q_{intr}}(\mathrm{std}_{N}(k))$$ (4b)

and the corresponding sensor offset estimation is done by eqs 5a and 5b,

$$\hat{o}^{f}_{DO}(k) = \mathrm{ANN}_{DO}(\mathrm{mean}_{K_La5}(k))$$ (5a)

$$\hat{o}^{f}_{N}(k) = \mathrm{ANN}_{N}(\mathrm{mean}_{Q_{intr}}(k))$$ (5b)

where stdi(k) with i = DO, N are the standard deviation variations of the CV and meani(k) with i = KLa5, Qintr are the mean value variations of the MV. The ANN inputs are given by the APCA system information. The ANN system is "turned on" to estimate the fault magnitude if and only if a fault has been detected by the APCA system and suitably classified by the FL system. In the case where the FL system detects saturation in the MV due to the occurrence of sensor faults, the magnitude estimation process becomes simpler in this range. The MV saturation causes the process to work in open loop (without control), and the CV moves away from the reference trajectory. This difference is assumed to be directly proportional to the offset magnitude introduced by the sensor, and it is not necessary to train any ANN for this purpose. Hence, a slight mismatch

Table 7. FDIE Performance Indexes

event | event value | process element | Tf [days] | Td [days] | DPT [%] | yj | RSPj [%] | MSPEj
F1 | 1 g/m3 | DO level sensor | 3.00 | 3.48 | 31.94 | 1.00 | 62.50 | 8.64 × 10-4
F1 | 4 g/m3 | DO level sensor | 3.00 | 3.02 | 1.38 | 1.00 | 70.32 | 4.00 × 10-8
F2 | -1 gN/m3 | N level sensor | 3.00 | 3.02 | 1.38 | 1.00 | 56.25 | 5.63 × 10-5
F2 | 4 gN/m3 | N level sensor | 3.00 | 3.02 | 1.38 | 1.00 | 65.62 | 2.46 × 10-2
F3 | 0.6 min | KLa5 actuator | 3.00 | 3.03 | 2.08 | 1.00 | 79.68 | 9.36 × 10-4
F4 | 20 min | Qintr actuator | 3.00 | 3.09 | 6.25 | 0.87 | 78.57 | 3.10 × 10-3

produced between the actual measurement and the set point is enough to estimate the fault magnitude. In this case, the sensor offset estimation is performed using eq 6:

$$\hat{o}_i^{f}(k) = \begin{cases} -\max\left(sp_i(k) - meas_i(k)\right) & \text{if } MV_j(k) \ge MV_j^{max} \\ -\min\left(sp_i(k) - meas_i(k)\right) & \text{if } MV_j(k) \le MV_j^{min} \end{cases}$$  (6)

where sp_i(k) is the set-point policy; meas_i(k) is the CV measurement; MV_j(k) is the manipulated variable; and MV_j^max and MV_j^min are the maximum and minimum limits of the MV, respectively, with i = DO, N and j = KLa5, Qintr, respectively. The overall FDIE algorithm used to detect, isolate, and estimate the different fault types is shown in Figure 5. This figure shows the interconnection between the methods previously analyzed. Note that the algorithm is slightly different from that presented in the work of Zumoffen and Basualdo3 for the pulp mill process, because of the specific characteristics of each plant. The FDIE output is an important support for the FTC strategy, as will be shown in the following section.

4. Integration to Fault Tolerant Control

In this section, the integration of the FDIE system with the existing decentralized control strategy of the process is described. The principal objective is to make the control scheme fault-tolerant. This integration is conceived as an online reconfiguration of both the control policy and the controller settings.

Figure 4. ANN validation for (a) delays in the DO loop, (b) delays in the N loop, (c) offsets in the DO sensor, and (d) offsets in the N sensor.


Figure 6. Feedback control approaches: (a) IMC and (b) classical.

Table 8. EIP Index: WWTP

event | event value | EIP [%]
F1 | 1 g/m3 | 97.06
F1 | 4 g/m3 | 99.94
F2 | -1 gN/m3 | 80.06
F2 | 4 gN/m3 | 65.10
F3 | 0.6 min | 96.38
F4 | 20 min | -0.21

Figure 5. Diagram of the proposed fault detection, isolation, and estimation (FDIE) approach.

4.1. Sensors Fault Case. For sensor faults (such as abrupt additive biases), the compensation strategy is based on an online modification of the reference trajectory, according to the information given by the FDIE system, $\hat{o}_i^{f}(k)$. Thus, the set-point profile is given by eq 7:

$$sp_i(k) = sp_i^{0}(k) + \hat{o}_i^{f}(k) \quad (\text{with } i = DO, N)$$  (7)

where sp_i(k) is the actual set point, sp_i^0(k) is the original one, and $\hat{o}_i^{f}(k)$ is the estimated fault magnitude, which is used for the reference correction; i = DO, N. Note that, in the decentralized control strategy, the additive compensation can be done either on the set point or on the abnormal measurement, because both arrive at the same result.2,3 In contrast, under multivariable MPC, only the measurement compensation4 can be done. In the case of a slow drift sensor fault, the compensation is also done on the measurement; the detection and classification procedures are performed by the same tools that have been mentioned previously (APCA and FLS). This fault causes a specific pattern of variations in the directly associated manipulated variable, which makes it possible to identify the abnormal event and perform the correct classification and magnitude estimation. The fault compensation is made using eq 8:

$$y_j^{cm}(k) = y_j^{m}(k) - \hat{o}_j^{f,drift}(k)$$  (8)

where y_j^cm(k) is the compensated measurement, y_j^m(k) is the faulty measurement, and $\hat{o}_j^{f,drift}(k)$ is the slow drift estimation, given by

$$\hat{o}_j^{f,drift}(k) = slope_{MV_j}\,\rho\,T_d + slope_{MV_j}\,\rho\,(t - T_d)$$  (9)

where slope_MVj is the slope in manipulated variable j that is due to the drift, and ρ is a proportional parameter that relates it to the controlled variable drift, according to the relationship slope_CVj = slope_MVj ρ. Thus, the first term, slope_MVj ρ T_d, corresponds to the error introduced by the sensor until the fault detection instant T_d. This first compensation brings the measurement back to the same starting point as the real value. Meanwhile, the term slope_MVj ρ (t - T_d) corresponds to the drift compensation, which keeps the measurement close to the real value from that moment on. The variable j refers to the control loop involved.
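The sensor fault compensation rules (eq 6 for the saturated case, eq 7 for the set-point update, and eqs 8 and 9 for the slow drift) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the max/min in eq 6 is assumed to be taken over a window of recent samples, and the MV limits in the example are placeholder values rather than the benchmark limits.

```python
def saturation_offset(sp, meas, mv, mv_min, mv_max):
    """Eq 6: offset estimate from the set-point/measurement mismatch.

    sp, meas and mv are equal-length sequences over an analysis window
    (the windowing is an assumption of this sketch). Returns None when
    the MV does not reach either limit in the window.
    """
    diffs = [s - m for s, m in zip(sp, meas)]
    if any(u >= mv_max for u in mv):
        return -max(diffs)
    if any(u <= mv_min for u in mv):
        return -min(diffs)
    return None

def corrected_setpoint(sp0, offset_hat):
    """Eq 7: shift the original set point by the estimated offset."""
    return sp0 + offset_hat

def drift_estimate(slope_mv, rho, t, t_d):
    """Eq 9: error accumulated until detection plus drift since then."""
    return slope_mv * rho * t_d + slope_mv * rho * (t - t_d)

def compensated_measurement(y_meas, slope_mv, rho, t, t_d):
    """Eq 8: subtract the estimated drift from the faulty measurement."""
    return y_meas - drift_estimate(slope_mv, rho, t, t_d)

# Illustrative data: MV stuck at its lower limit, measurement above set point
offset = saturation_offset(sp=[2.0, 2.0, 2.0], meas=[5.5, 5.9, 6.0],
                           mv=[0.0, 0.0, 0.0], mv_min=0.0, mv_max=240.0)
new_sp = corrected_setpoint(2.0, offset)
```

Note that the two terms of eq 9 add up to slope_MVj ρ t, so the split only separates the correction applied at detection time from the correction that accumulates afterwards.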

4.2. Actuators Fault Case. For multiplicative faults, such as those that can occur at the actuators (extra dead time), the compensation strategy is based on reconfiguring the controller parameters. Here, an MBC strategy is used to perform the online controller tuning. One of the best-known MBC techniques is the IMC method, which needs a good process model, obtained by applying a proper identification procedure. A particular internal model factorization and filter design, which are described in detail in the work of Rivera20 and are associated with classical controller tunings, are used here to perform the online reconfiguration. This is done by focusing on the robustness, stability, and performance criteria. A brief review follows. The factorized process model in the Laplace domain is defined as $\tilde{g}(s) = \tilde{g}_+(s)\tilde{g}_-(s)$, where $\tilde{g}_+(s)$ is the noninvertible part that contains all the time delays and nonminimum-phase zeros and $\tilde{g}_-(s)$ is the invertible stable part of the model. The IMC design is then given by eq 10:

$$g_c(s) = \tilde{g}_-^{-1}(s)\, f_{lp}(s)$$  (10)

where f_lp(s) is a low-pass filter of suitable order such that g_c(s) is at least a proper function. By means of this model factorization, g_c(s) is physically realizable. For the case study considered here, a first-order model, given as $\tilde{g}(s) = k_m/(\tau_m s + 1)$, is identified and the selected factorization is $\tilde{g}_+(s) = 1$ and $\tilde{g}_-(s) = k_m/(\tau_m s + 1)$. Doing some algebra on the block diagram in Figure 6a, it is possible to arrive at the classical control structure shown in Figure 6b, where

$$C(s) = \frac{g_c(s)}{1 - \tilde{g}(s)\, g_c(s)} = \frac{\tilde{g}_-^{-1}(s)}{f_{lp}^{-1}(s) - \tilde{g}_+(s)}$$  (11)

and, considering the above factorization and the filter transfer function $f_{lp}(s) = 1/(\tau_f s + 1)$, the resultant controller structure is shown in eq 12.

$$C(s) = \frac{\tau_m s + 1}{k_m \tau_f s} = \frac{\tau_m}{k_m \tau_f}\left(1 + \frac{1}{\tau_m s}\right)$$  (12)
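The algebra that leads from eq 11 to eq 12 is short. A worked version, using only the first-order model and first-order filter stated in the text, might read:

```latex
\begin{align*}
g_c(s) &= \tilde{g}_-^{-1}(s)\, f_{lp}(s)
        = \frac{\tau_m s + 1}{k_m\,(\tau_f s + 1)}, \\
1 - \tilde{g}(s)\, g_c(s) &= 1 - \frac{1}{\tau_f s + 1}
        = \frac{\tau_f s}{\tau_f s + 1}, \\
C(s) &= \frac{g_c(s)}{1 - \tilde{g}(s)\, g_c(s)}
      = \frac{\tau_m s + 1}{k_m \tau_f s}
      = \frac{\tau_m}{k_m \tau_f}\left(1 + \frac{1}{\tau_m s}\right).
\end{align*}
```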

Figure 7. Fault in the DO sensor, with and without FTC strategy (offset = 1 g/m3 at t = 3 days): (a) controlled variable; (b) manipulated variable; (c) combined statistic, z(k); and (d) warning and fault signals.

A comparison with the PI controller transfer function (named C) reveals that both parameters are available. Thus, its proportional gain is calculated as τm/(km τf) and the reset time is given by τm. When a perfectly modeled system, $\tilde{g}(s) = g(s)$, is available, the filter time constant can be chosen freely. In practice, a perfect model is rarely achieved; generally, some plant/model mismatch exists and the model is valid only over a specific frequency range. Generally, a process can be well-modeled by means of the first-order-plus-dead-time transfer function, $g(s) = k_m \exp(-\xi s)/(\tau_m s + 1)$. According to the model selection made previously, no information about the delay is taken into account in this case ($\tilde{g}_+(s) = 1$, using a zero-order Padé approximation, exp(-ξs) = 1). The IMC approach allows one to quantify the model validity and delimit the effective frequency range of the closed-loop system. To measure the error of the zero-order Padé approximation, a function called the norm-bounded multiplicative error (described by eq 13) is evaluated.

$$e_m(s) = \frac{g(s) - \tilde{g}(s)}{\tilde{g}(s)}$$  (13)

Usually, |em(s)| approaches a value that is ≥1 at higher frequencies. Setting |em(ω)| = 1 in this case gives ω ≈ 1/ξ, and this frequency bounds the model validity range. For the operational frequency range ω > 1/ξ, the robustness to multiplicative uncertainties is not guaranteed. In addition, in the IMC approach, the closed-loop time constant is proportional to the filter time constant (τf). In this context, a conservative design is achieved by selecting the filter time constant to be τf > ξ. Using the previous result, the online controller reconfiguration can be performed. In the actuator fault case (multiplicative

faults), the estimated delay $\hat{d}_e(k)$, given by the monitoring system, can be used to reconfigure the decentralized PI controllers. In this case, the filter time constant is selected using eqs 14a and 14b, considering eqs 4a and 4b and the robustness condition τf > ξ.

$$\tau_f^{K_La5}(k) = \eta\, \hat{d}_e^{K_La5}(k)\, \mu$$  (14a)

$$\tau_f^{Q_{intr}}(k) = \eta\, \hat{d}_e^{Q_{intr}}(k)\, \mu$$  (14b)

The factor µ, which is defined as µ = 1/1440, is added to perform the proper units conversion of time (from minutes to days). The factor η = 5 is a design parameter that accounts for the robustness criterion τf > ξ. Using these online reconfigurations, the controller is updated according to eq 12.

5. Simulation Results

In this section, a complete set of simulations is presented, considering the different fault types analyzed previously. The implementation is made in a single algorithm that receives the process data at each sampling interval of Ts ≈ 1/96 days. The WWTP model works assuming a dry weather perturbation file. The simulation time used for this test corresponds to two weeks.
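Because the actuator fault simulations below rely on the reconfiguration rule of section 4.2, a minimal Python sketch of that rule may be useful. It combines the filter update of eqs 14a and 14b with the PI settings of eq 12 and adds a numerical check of the multiplicative error of eq 13 for a neglected dead time. The function names are hypothetical, and the model gain and time constant in the example are placeholders, not the identified values of Table 5.

```python
import cmath
import math

MINUTES_PER_DAY = 1440.0  # mu = 1/1440 converts minutes to days
ETA = 5.0                 # design factor eta = 5 from the text

def filter_time_constant(delay_min, eta=ETA):
    """Eqs 14a/14b: tau_f = eta * d_hat * mu (delay in minutes, tau_f in days)."""
    return eta * delay_min / MINUTES_PER_DAY

def imc_pi(k_m, tau_m, tau_f):
    """Eq 12: PI settings for a first-order model.

    Returns (proportional gain, reset time).
    """
    return tau_m / (k_m * tau_f), tau_m

def mult_error_mag(omega, xi):
    """|e_m(j*omega)| of eq 13 when the only mismatch is a dead time xi.

    With g = g_model * exp(-xi*s), e_m(s) reduces to exp(-xi*s) - 1.
    """
    return abs(cmath.exp(-1j * omega * xi) - 1.0)

# Placeholder model (k_m = 2.0, tau_m = 0.01 days) and the delay
# estimate reported for the KLa5 actuator fault (0.6306 min)
tau_f = filter_time_constant(0.6306)
gain, reset_time = imc_pi(2.0, 0.01, tau_f)
```

Evaluating `mult_error_mag(1.0 / xi, xi)` gives 2 sin(0.5), about 0.96, which is consistent with the statement that the error magnitude reaches 1 near ω ≈ 1/ξ. With η = 5, the updated τf is five times the estimated delay, so the condition τf > ξ holds whenever the delay estimate is accurate.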


Figure 8. Fault in the DO sensor, with and without FTC strategy (offset = 4 g/m3 at t = 3 days): (a) controlled variable; (b) manipulated variable; (c) combined statistic, z(k); and (d) warning and fault signals.

The APCA parameters for starting these simulations appear in the first column of Table 4. The initial principal components retained (A(0)), the moving window size (Nw) for the normal data matrix, the auxiliary data matrix size (Naux), and the cumulative percent variance limit for selecting the principal components retained (CPV) are included. In addition, the ANN parameters are displayed in the second column of Table 4. Here, the numbers of hidden and output units (HU and OU, respectively) are configured for the learning procedure. In all cases, the neuron types are chosen as hyperbolic tangent for HU and linear for OU. The training stage is conducted with the neural network toolbox of Matlab, using the Levenberg-Marquardt method as the backpropagation training algorithm. Analogously, the parameters for the FL system are presented in the last column of Table 4. Here, the membership function (MF) parameters are defined; the range for normal behavior, bounded between $L^i_{min}$ and $L^i_{max}$ as the lower and upper limits, respectively, for component i of the normalized input vector to the FL system, is shown. The terms sat_i- and sat_i+ are the normalized saturation limits for the MV (the lower and upper limits, respectively). In the ∆xj(k) case, the MF represents three possible linguistic values: low, normal, and high. For MV analysis and saturation management, two additional MFs are developed: lower saturation and upper saturation.

Table 5 shows the initial parameters selected for each model to be implemented by the IMC approach. The time constant (τm), the gain (km), and the low-pass filter time constant (τf) are also included. These parameters are adjusted for the normal process behavior, resulting in the controller parameters shown in Table 2. If the actuator fault appears, the filter time constant is updated online, as described previously, supporting the reconfiguration of the controller parameters.

The first simulation proposed here considers a fault at the DO sensor (see Figure 7), which produces an abrupt offset of 1 g/m3 at Tf = 3 days. Figure 7a presents both the measured and true DO (CV) evolutions with and without FTC strategy. This figure shows how the FTC strategy modifies the set-point policy and returns the process behavior close to that of normal operation (2 g/m3). In this case, the offset estimation is $\hat{o}_{DO}^{f}$ = 1.0294 g/m3. In addition, the times of fault occurrence (Tf = 3 days), fault detection (Td = 3.4791 days), and reconfiguration (or compensation) (Tr = 4.5104 days) are displayed. The last two times are directly related to the high detection capacity when the combined statistic z(k) and the selected auxiliary matrix dimension Naux, given by the APCA strategy, are taken into account. In Figure 7b, the KLa5(k) (MV) evolutions, with and without FTC strategy, and the normal behavior are shown. The subsequent figures (Figures 7c and 7d) show the combined statistic z(k) and the warning and fault signals, respectively.

Figure 8 shows the same fault considered in the previous case, but now its greater magnitude (offset = 4 g/m3) causes more serious problems. As can be observed in Figure 8b, the MV saturates in this case, which leads to a loss of control when


Figure 9. Fault in the N sensor, with and without FTC strategy (offset = -1 gN/m3 at t = 3 days): (a) controlled variable, (b) controlled variable without FTC strategy, (c) manipulated variable with FTC strategy, (d) combined statistic, z(k); and (e) warning and fault signals.

the FTC strategy is not working. In addition, it is clear how the FTC strategy regains control, although the fault magnitude is large, by means of the reference trajectory updating. Figure 8a also shows how the real process variable returns to normal operation by means of the FTC action. If FTC is not taken into account, the process moves

away from the desired point (2 g/m3). In this case, the offset estimation result is $\hat{o}_{DO}^{f}$ = 4.0002 g/m3. The fault, detection, and reconfiguration instants are, respectively, Tf = 3 days ≈ Td = 3.0208 days, and Tr = 4.0521 days, which indicates fast detection. The combined statistic z(k) is presented in Figure 8c and is also an indicator that the process returns to normal


Figure 10. Fault in the N sensor, with and without FTC strategy (offset = 4 gN/m3 at t = 3 days): (a) controlled variable, (b) manipulated variable, (c) combined statistic, and (d) warning and fault signals.

operation after the instant of reconfiguration. In Figure 8d, the fault signal (top) and warning signal (bottom) given by the FDIE system are displayed. In this case, we observe how the integration strategy proposed here can solve saturation problems that are due to sensor faults in a suitable way.

Next, a nitrate sensor fault is considered; here, a bias in the N sensor with a magnitude of -1 gN/m3 at t = 3 days is presented. The cases with and without FTC strategy are displayed separately in Figures 9a and 9b, respectively. Here, it can be seen how the set-point reconfiguration returns the real process state to the desired operation point, 1 gN/m3. The compensation magnitude is given by the FDIE system using eq 7; its estimation result is $\hat{o}_{N}^{f}$ = -1.0075 gN/m3. Figure 9c presents the MV evolution for this control loop. The MV behaves very closely to the normal case, which is also included. The variations of the combined statistic and the warning and fault signals are presented in Figures 9d and 9e, respectively. The instants of fault, detection, and reconfiguration in this case are Tf = 3 days ≈ Td = 3.0208 days, and Tr = 4.0521 days, respectively, giving good resolution time and proper detection.

Another case for this sensor fault is presented in Figure 10. As was shown for the DO case, here the saturation management due to a large offset in the N sensor is analyzed. Figure 10a shows the advantage of using the FTC strategy when an

offset with a magnitude of 4 gN/m3 appears at t = 3 days. Here again, the set-point reconfiguration returns the real process state to the desired one (1 gN/m3). Without FTC strategy, the process begins working in open loop, because of the saturation effect in the MV (see Figure 10b). Thanks to the FTC strategy, the MV saturation is rapidly avoided. The combined statistic and the fault signals from the FDIE are shown in Figures 10c and 10d, respectively. In this case, the offset estimation results in a value of $\hat{o}_{N}^{f}$ = 4.1568 gN/m3, and this value is used to integrate both the FDIE and FTC systems. In addition, the instants of fault occurrence, detection, and reconfiguration (Tf = 3 days ≈ Td = 3.0208 days, and Tr = 4.0521 days, respectively) are displayed.

To evaluate the actuator-fault-type compensation, a simulation set is proposed, considering an abnormal event at the aeration system for the fifth biological reactor and at the internal recycle flow valve. In these cases, the fault is characterized by extra dead time in the control loop due to mechanical malfunctions. First, the KLa5 actuator fault is presented in Figure 11. Here, the extra delay magnitude is d ≈ 0.6 min, added to the DO control loop. Figure 11a shows how the stability is degraded when only the classical control strategy is acting. Using the dead time estimation given by the FDIE system and the IMC-based tuning technique, the online updating is performed and the controller parameters are reconfigured using eqs


Figure 11. Fault in the KLa5 actuator, with and without FTC strategy (delay = 0.6 min at t = 3 days): (a) controlled variable, (b) manipulated variable, (c) combined statistic, and (d) warning and fault signals.

4a, 4b, 12, 14a, and 14b. The simulations with FTC strategy show that the stability is recovered. The dead time estimation gives a result of $\hat{d}_e^{K_La5}$ = 0.6306 min, and the instants of detection and reconfiguration are Td = 3.0313 days and Tr = 4.0625 days, respectively. In Figure 11b, the evolution of the MV for this control loop is presented. The combined statistic evolution is shown in Figure 11c, indicating the correct fault compensation and the return to normal conditions. The fault and warning signals are displayed in Figure 11d.

The second actuator fault simulation is represented in Figure 12, where the internal recycle flow valve presents mechanical problems at Tf = 3 days and produces additional dead time in the nitrate control loop. In this example, the considered extra delay (d) is 20 min. The correct delay estimation ($\hat{d}_e^{Q_{intr}}$ = 19.9444 min) allows one to select a suitable value of the filter time constant and reconfigure the controller parameters based on the IMC tuning methodology. This can be observed in Figure 12a, where the measured CV (the nitrate level at the second reactor) is displayed with and without FTC strategy. Analogously, in Figure 12b, the temporal evolution of the MV is shown, revealing that the improvement observed with the FTC approach occurs because the saturations are avoided. Note that a delay of 20 min is not as critical as in the other loop, because of the slower dynamics of the N loop; therefore, the controller performance does not show important benefits with respect to the conventional one. Figure 12c shows that the combined statistic does not return below the confidence limits after the reconfiguration is performed. Consequently, the warning and fault signals remain active, as is shown in Figure 12d. This situation suggests that, for this fault, the inclusion of a reconfiguration interval in the APCA algorithm must be taken into consideration after the compensation is done. This interval will be directly related to the slowest process dynamics. The reconfiguration process (in this interval) could involve updates of the scale parameters (mean and standard deviation (std)), the confidence limits of SPE and T2, and the number of principal components retained (A). Hence, this fault test is useful for identifying specific modifications of the FDIE system that are needed to treat delays properly.

To analyze quantitatively the benefits of having (or not having) the FTC strategy, it is important to evaluate the performance indices developed by Copp15 for this WWTP. In Table 6, several indexes that were evaluated during the last simulation week are listed. A representative index for measuring the controller performance is the integral absolute error (IAE) between the CV and the corresponding set-point policy (sp(k)). Another indicator for the controlled process performance is the pumping energy (PE) index, which measures the energy cost


Figure 12. Fault in the Qintr actuator, with and without FTC strategy (delay = 20 min at t = 3 days): (a) controlled variable, (b) manipulated variable, (c) combined statistic, and (d) warning and fault signals.

in the pumping of the internal recycle flow, return sludge recycle flow, and waste sludge flow. Finally, the effluent violation index is considered for quantifying the time that the plant is violating the effluent constraints during the operation time. In this case, two indexes are considered: the total nitrogen level violation (NV), which is limited to 18 gN/m3, and the total ammonia level violation (AV), which is constrained to 4 gN/m3. Several other indexes could be considered;15 however, those chosen here were assumed to have the greatest impact for this specific methodology.

The process operation indexes under normal conditions are presented at the top of Table 6. When different fault types are present, these indexes are evaluated and compared for the classic control strategy, without tolerant characteristics, and for the decentralized FTC. In all the sensor fault cases, the IAE values with FTC are less than the IAE values without FTC. This situation is also observed for the fault in the KLa5 actuator; meanwhile, for the other actuator, the use of FTC gives an IAE value that is slightly higher than that of the strategy without FTC. The general improvement in the IAE values is observed because the reference trajectory update is performed by the decentralized FTC strategy, returning the process closer to the desired working point. In all cases,

the pumping energy index is lower when FTC is used. In particular, it is drastically lower for the fault in the DO sensor when the bias is 4 g/m3. This is due to the working point changes, except for the case where the Qintr sensor saturates to its minimal value (0 m3/day) and a loss of control occurs. In addition, note that, with the FTC approach, the pumping energy index remains near the value given for the normal operation range. These two indexes must also be analyzed together with the effluent violation index. They are very sensitive, mainly with regard to the sensor faults. In the DO sensor fault case, the drastic benefit is observed in the effluent violation of the ammonia level. On the other hand, the N sensor fault case has the stronger impact on the total nitrogen level violation. In the actuator fault cases, the effluent violation index essentially remains close to the normal values.

Other important performance measurements that have not been considered until now for the WWTP have been taken into consideration here. They are the same as those evaluated in the work of Zumoffen and Basualdo;3 hence, only a brief description is given here. For the APCA module, the detection percent time (DPT) is given by eq 15.


$$DPT\,(\%) = \frac{T_d - T_f}{T_{sd}} \times 100$$  (15)

For the FL subsystem, the rule support percent (RSP) index is used, as given by eq 16:

$$RSP_j\,(\%) = \frac{y_j - \dfrac{1}{p-1}\displaystyle\sum_{i=1,\, i \neq j}^{p} y_i}{y_j} \times 100$$  (16)

The fault magnitude estimations given by the ANN subsystem are evaluated using eq 17:

$$MSPE_j = \frac{1}{N} \sum_{k=1}^{N} \left(\hat{s}_j(k) - s_j(k)\right)^2$$  (17)

where s_j(k) is the real value for fault j, $\hat{s}_j(k)$ is the magnitude estimation given by the ANN, and N is the number of samples in the computation time interval. The signal s refers to both the actuator delay ($\hat{d}_e^{K_La5}(k)$ and $\hat{d}_e^{Q_{intr}}(k)$) and the sensor offset ($\hat{o}_{DO}^{f}(k)$ and $\hat{o}_{N}^{f}(k)$). Finally, the error improvement percentage (EIP), which is the index for measuring the overall FTC performance with respect to the classical decentralized strategy, is used. Its calculation is based on the integral absolute error (IAE) for both the conventional and fault-tolerant strategies, as can be observed in eq 18:

$$EIP_j\,(\%) = \frac{IAE_j^{conv} - IAE_j^{ftc}}{IAE_j^{conv}} \times 100$$  (18)

where EIP_j corresponds to fault j, IAE_j^conv is the tracking performance for a specific control loop in the conventional decentralized control strategy, and IAE_j^ftc is the same parameter evaluated for the FTC strategy. According to the indexes given in Table 7, the DPT is close to zero in most cases, which indicates quick fault detection. The higher value for F1 (1 g/m3) is explained by the small bias magnitude, which is more difficult to detect because it does not produce valve saturation. None of the RSPj indexes is less than 56.25%, which indicates good robustness characteristics for the classification aspects. This fact is based on the good values for yj. The MSPEj values are small enough to give very accurate fault magnitude estimations. Finally, Table 8 shows that most of the EIP values are close to 100, which indicates a real advantage of including the FTC. The exception is fault F4, which was explained previously: its EIP value is slightly negative but very close to zero, so the result is at least satisfactory.
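The indexes of eqs 15, 17, and 18, together with the IAE on which the EIP is based, can be computed with a few lines of Python. The sketch below is illustrative: the function names are hypothetical, the IAE is approximated by a rectangular sum, and the value T_sd = 1.5 days used in the example is not stated in the text but inferred from the Td, Tf, and DPT columns of Table 7.

```python
def iae(cv, sp, ts):
    """Integral absolute error, approximated by a rectangular sum.

    cv, sp : controlled variable and set-point samples
    ts     : sampling interval (e.g. 1/96 days)
    """
    return ts * sum(abs(s - y) for y, s in zip(cv, sp))

def dpt(t_d, t_f, t_sd):
    """Eq 15: detection percent time."""
    return (t_d - t_f) / t_sd * 100.0

def mspe(estimates, actual):
    """Eq 17: mean squared prediction error of the magnitude estimates."""
    n = len(actual)
    return sum((e - a) ** 2 for e, a in zip(estimates, actual)) / n

def eip(iae_conv, iae_ftc):
    """Eq 18: error improvement percentage of FTC over conventional control."""
    return (iae_conv - iae_ftc) / iae_conv * 100.0

# Fault F1 (1 g/m3 offset in the DO sensor), using values quoted in the text
dpt_f1 = dpt(3.4791, 3.0, 1.5)   # T_sd = 1.5 days is an inferred assumption
mspe_f1 = mspe([1.0294], [1.0])  # single estimate against the true offset
```

Under these assumptions, `dpt_f1` evaluates to about 31.94 and `mspe_f1` to about 8.64 × 10^-4, which agree with the first row of Table 7.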

Figure 13. Small fault in the N sensor with FTC strategy (offset = 0.5 gN/m3 at t = 3 days): (a) controlled variable, (b) manipulated variable, (c) combined statistic, and (d) offset estimation.


Figure 14. Drift fault in the DO sensor, with and without FTC strategy (slope = 1/5 gN/m3/day, beginning at t = 0 days): (a) measured controlled variable, (b) real controlled variable, (c) manipulated variable, (d) combined statistic, and (e) drift estimation.

Finally, two tests are considered here to check the system behavior when other faults occur. Figure 13 presents the responses when a small offset of 0.5 gN/m3 occurs in the N sensor at t = 3 days, taking into account the range of variations given in section 2. As can be seen from the curves in Figure 13, correct fault management and compensation through set-point

updating is achievable. Although the offset estimation shown in Figure 13d presents a small error, which is due to the low signal-to-noise ratio, the compensation is correct. On the other hand, Figure 14 shows the behavior when the slow drift fault occurs in the DO sensor, with and without FTC strategy. Without FTC strategy, the drift, at some point, produces


saturation in the corresponding MV and eventually a loss of control, as can be seen from Figure 14c. With FDIE/FTC integration, the faulty sensor is detected, classified, and compensated in a suitable way. The drift begins at Tf = 0 and is detected at Td = 3.4896 days; it is classified and reconfigured at Tc ≈ Tr = 4.5208 days. Figure 14e shows that the drift estimation is 0.19. Based on this information, the measurement can be compensated, giving the real value of that variable (see Figure 14b). These last two simulation results give more insight into the capacity of the FDIE system to manage a wide range of abnormal situations.

6. Conclusion

This work has shown how a well-designed fault detection, isolation, and estimation (FDIE) system provides essential tools for achieving real improvements in fault-tolerance characteristics. This work belongs to a series where the FDIE system has been successfully tested for large chemical plants and under different types of faults. In addition, the fault-tolerant control (FTC) strategy that has been integrated with the FDIE system is able to coordinate remedial actions and ensure safe and profitable plant operation. In particular, the wastewater treatment process (WWTP) presents manipulated variable (MV) saturations when the sensor biases are large, and the delays in actuators could produce serious instability problems. On one hand, when offsets at sensors are detected, even those of small magnitude, they are corrected by updating the set-point trajectory online, using the fault magnitude estimation given by the FDIE system. If the sensor faults are slow drifts, the FDIE also proved to be efficient. On the other hand, the actuator faults are compensated online by a model-based control (MBC) tuning methodology, using the delay estimation.

The overall tests presented here seem to indicate that this methodology is sufficiently flexible and could be extended to other types of abnormal events, such as process faults. The main question is to determine the fault pattern properly. The use of the performance indexes shown in this paper clearly quantifies the real improvements produced by the methodology explained here.

Nomenclature

Abbreviations
ASM = abnormal situation management
APCA = adaptive principal components analysis
ANN = artificial neural networks
APC = adaptive predictive control
APRFC = adaptive predictive with robustness filter control
AW = anti-windup
AV = ammonia violation
CSTR = continuously stirred tank reactor
CPV = cumulative percent variance
CV = controlled variable
DO = dissolved oxygen
E = effluent
FDI = fault detection and isolation
FL = fuzzy logic
FTC = fault-tolerant control
FDIE = fault detection, isolation and estimation
I = influent
IAE = integral absolute error
IMC = internal model control
MBC = model-based control
MV = manipulated variable
NV = nitrogen violation
PCA = principal component analysis
PI = proportional integral
PID = proportional integral derivative
Ri = biological reactor i
RAS = recycled activated sludge
RS = residual space
SPE = squared predictive error
TF = transfer function
WWTP = wastewater treatment plant

Variables
A = principal components retained
b = mean vector
d = disturbances signal
d̃ = estimated disturbances
d̃_e^i = estimated delay for actuator i
DO = dissolved oxygen content
Dλ = eigenvalues matrix
em = norm-bounded multiplicative error
k = sample time
km = model gain
KLa5 = aeration coefficient for R5
li = linguistic value for component i
L_j^i = normal behavior limits for component i
Naux = size of auxiliary data matrix
Nw = size of normal data matrix window
ô_i^f = estimated offset for sensor i
ô_j^(f,drift) = estimated drift for sensor j
pi = eigenvector i
P = selected principal components matrix
P̄ = residual principal components matrix
Qintr = internal recycle flow
r = FL input size
R = correlation data matrix
s = standard deviation vector
slopeMVj = slope estimation for MV j
slopeCVj = slope estimation for CV j
sp_i^0 = original set-point policy for loop i
sp_i = updated set-point policy for loop i
SPE = squared prediction error
T2 = Hotelling statistic
Td = fault detection time
Tf = fault occurrence time
Tfz = final analysis zone time
Tiz = initial analysis zone time
Tr = reconfiguration time
uFL = FL input vector
u_FL^m = mean contribution vector
VN = cost function for N samples
wi,j = ANN input weights
Wi,j = ANN hidden weights
x̄ = normalized new sample
X = training data matrix
X̄ = normalized training data matrix
Xaux = auxiliary data matrix
Xn = normal data matrix
y = process output
y_j^cm = compensated measured variable j
y_i^nn = ANN output i
y_j^m = measured variable j
ŷ_i^nn = ANN output prediction i
z = combined statistic

Greek Symbols
δi = control limit for statistic i
∆xj = prediction error for PCA
ε = prediction error for ANN
θ = estimation parameters vector
λi = eigenvalue i
µi = mean value for statistic i
ξ = real process dead time
ρ = proportional parameter
σi = standard value for statistic i
τf = filter time constant
τm = model time constant
φi = ANN input i
ω = frequency variable

Acknowledgment

The authors acknowledge the financial support of CONICET (Consejo Nacional de Investigaciones Científicas y Técnicas) and ANPCYT (Agencia Nacional de Promoción Científica y Técnica) of Argentina. We thank Dr. Ulf Jeppsson, Associate Prof. at the Technology Institute of Lund, Sweden, who developed the rigorous model of the WWTP in Matlab/Simulink and gave it to us. Finally, we acknowledge UTN-FRRO for their support.

Literature Cited

(1) Christofides, P. D.; El-Farra, N. H. Control of Nonlinear and Hybrid Process Systems. In Designs for Uncertainty, Constraints and Time-Delays; Springer-Verlag: Berlin, Germany, 2005.
(2) Zumoffen, D.; Basualdo, M.; Jordán, M.; Ceccatto, A. Robust Adaptive Predictive Fault-Tolerant Control Integrated to a Fault-Detection System Applied to a Nonlinear Chemical Process. Ind. Eng. Chem. Res. 2007, 46 (22), 7152–7163.
(3) Zumoffen, D.; Basualdo, M. From Large Chemical Plant Data to Fault Diagnosis Integrated to Decentralized Fault Tolerant Control: Pulp Mill Process Application. Ind. Eng. Chem. Res. 2008, 47 (4), 1201–1220.
(4) Zumoffen, D.; Basualdo, M.; Molina, G. Improvements in Fault Tolerance Characteristics for Large Chemical Plants: 2. Pulp Mill Process with Model Predictive Control. Ind. Eng. Chem. Res. 2008, 47, 5482–5500.
(5) Lee, C.; Choi, S. W.; Lee, I. B. Sensor Fault Identification Based on Time-Lagged PCA in Dynamic Processes. Chemom. Intell. Lab. Syst. 2004, 70, 165–178.
(6) Yoo, C. K.; Vanrolleghem, P. A.; Lee, I. B. Nonlinear Modeling and Adaptive Monitoring with Fuzzy and Multivariate Statistical Methods in Biological Wastewater Treatment Plants. J. Biotechnol. 2003, 105, 135–163.
(7) Puñal, A.; Roca, E.; Lema, J. An Expert System for Monitoring and Diagnosis of Anaerobic Wastewater Treatment Plants. Water Res. 2002, 36, 2656–2666.
(8) Ekman, M.; Björlenius, B.; Andersson, M. Control of the Aeration Volume in an Activated Sludge Process Using Supervisory Control Strategies. Water Res. 2006, 40, 1668–1676.
(9) Rieger, L.; Thomann, T.; Gujer, W.; Siegrist, H. Quantifying the Uncertainty of On-Line Sensors at WWTPs During Field Operation. Water Res. 2005, 39, 5162–5174.
(10) Mhaskar, P.; Gani, A.; El-Farra, N.; McFall, C.; Christofides, P.; Davis, J. Integrated Fault Detection and Fault-Tolerant Control of Nonlinear Process Systems. AIChE J. 2006, 52, 2129–2148.
(11) Mhaskar, P.; McFall, C.; Gani, A.; Christofides, P.; Davis, J. Isolation and Handling of Actuator Faults in Nonlinear Systems. Automatica 2007, 44, 53–62.
(12) Mhaskar, P.; Gani, A.; McFall, C.; Christofides, P.; Davis, J. Fault-Tolerant Control of Nonlinear Process Systems Subject to Sensor Faults. AIChE J. 2007, 53, 654–668.
(13) Mhaskar, P. Robust Model Predictive Control Design for Fault-Tolerant Control of Process Systems. Ind. Eng. Chem. Res. 2006, 45, 8565–8574.
(14) Ohran, B.; Muñoz de la Peña, D.; Christofides, P.; Davis, J. Enhancing Data-Based Fault Isolation Through Nonlinear Control. AIChE J. 2008, 54, 223–241.
(15) Copp, J. The COST Simulation Benchmark: Description and Simulator Manual; Office for Official Publications of the European Community: Luxembourg, 2002.
(16) Gao, W.; Ma, G.-F.; Zhou, M.-L.; Li, Y.-C.; Li, Y. Parameter Identification and Adaptive Predictive Control of Time-Varying Delay Systems. In Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, PRC, August 18–21, 2005; IEEE: Piscataway, NJ, 2005; pp 609–613.
(17) Jiang, C.; Zhou, D. H. Fault Detection and Identification for Uncertain Linear Time-Delay Systems. Comput. Chem. Eng. 2005, 30, 228–242.
(18) Bjorklund, S.; Ljung, L. A Review of Time-Delay Estimation Techniques. Proc. IEEE Conf. Decision Control 2003, 42, 2502–2507.
(19) http://www.asmconsortium.com.
(20) Rivera, D. Una Metodología Para La Identificación Integrada Con El Diseño de Controladores IMC-PID. Rev. Iberoam. Automat. Inf. Ind. 2007, 4-4, 5–18.

Received for review January 18, 2008
Revised manuscript received April 22, 2008
Accepted April 28, 2008

IE800098T