Fault Diagnosis of Batch Chemical Processes Using a Dynamic Time

ARTICLE pubs.acs.org/IECR

Fault Diagnosis of Batch Chemical Processes Using a Dynamic Time Warping (DTW)-Based Artificial Immune System

Yiyang Dai and Jinsong Zhao*

State Key Laboratory of Chemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing, 100084, People’s Republic of China

ABSTRACT: Fault diagnosis is important for ensuring the stability and safety of chemical processes. The strong nonlinearity and complexity of batch chemical processes make such diagnosis more difficult than that for continuous processes. In this paper, a new fault diagnosis methodology is proposed for batch chemical processes, based on an artificial immune system (AIS) and the dynamic time warping (DTW) algorithm. The system generates diverse antibodies from known normal and fault samples and uses the DTW algorithm to calculate the difference between the test data and each antibody. If the difference for an antibody is lower than a threshold, the test data are deemed to be of the same fault type as that antibody. Application to a simulated penicillin fermentation process demonstrates that the proposed AIS can meet the requirements for online dynamic fault diagnosis of batch processes and can diagnose new faults through self-learning. Compared with dynamic locus analysis and artificial neural networks, the proposed method has better capability in fault diagnosis of batch processes, especially when the number of historical fault samples is limited.

1. INTRODUCTION

As modern equipment and plants become more complex and larger in scale, the chance of mishaps and faults also increases. Because of the flammable, explosive, toxic, and corrosive nature of a typical chemical process, any single accident can trigger a major catastrophe that, in turn, may cause tremendous environmental, social, and economic losses. To ensure that a chemical process runs safely, various process models and algorithms have been developed to eliminate potential hazards.

Batch processes are important chemical processes used to produce high-quality and high value-added materials, which makes their monitoring and control essential. There are fundamental differences in both design and operation between batch and continuous processes.1 Compared with continuous processes, batch processes have more operation stages and larger variations in their prescribed processing trajectory. A small change in operating conditions or a misoperation during critical stages may impact the final product quality and even lead to a chemical disaster.2 Because of the nonlinearity and complexity of unsteady operating conditions in batch processes, fault diagnosis is usually much more difficult than that in continuous processes.

Multivariate statistical process modeling methods are the most commonly used tools for process monitoring and fault detection in batch chemical processes, including principal component analysis (PCA), multiway PCA (MPCA), dynamic PCA (DPCA), multiway kernel PCA (MKPCA), and adjoined PCA (AdPCA).3-7 However, most of these methods are better suited to detection than to diagnosis. Some of them diagnose the fault type using contribution plots that require manual checking, which is not suitable for online diagnosis once a new fault type is detected.
Besides multivariate statistical process modeling methods, artificial neural networks (ANNs) represent a widely accepted approach for fault detection and diagnosis of batch processes,8,9 with online training methods available. However, establishing an ANN requires a certain number of training samples, and in industrial processes fault samples are quite limited. When only a small number of training samples is available, ANNs might not work very well. Petri nets and expert systems have also been widely investigated for batch processes.10-12 However, both Petri nets and expert systems require initialization by experts. Once a new fault type is introduced or the operating conditions change, the model may no longer work. Therefore, fault diagnosis models of batch chemical processes must have a strong self-adaptive ability to retrain themselves during each batch and to learn new faults with minimal human intervention.

As Figure 1 shows, when the same fault is introduced to a batch process at different times, the fault data may differ greatly. Therefore, diagnosing the same fault at different times and scales with fewer training samples is significant for fault diagnosis of batch processes. In this paper, a new method is proposed for diagnosing batch chemical process faults, based on an artificial immune system (AIS) and the dynamic time warping (DTW) algorithm. It is capable of recognizing identical fault types that have varying occurrence times and magnitudes.

This article is structured as follows. In section 2, a brief overview of artificial immune systems is provided. In section 3, the DTW-based AIS algorithm for fault diagnosis is proposed. In section 4, we illustrate the proposed approach using a simulated penicillin fermentation process. Finally, in section 5, the main conclusions of the paper are given.

Received: July 9, 2010
Revised: February 15, 2011
Accepted: February 16, 2011
Published: March 08, 2011
© 2011 American Chemical Society

dx.doi.org/10.1021/ie101465b | Ind. Eng. Chem. Res. 2011, 50, 4534–4544

Industrial & Engineering Chemistry Research


Figure 1. Samples of the same fault introduced at different times.

Figure 2. An antibody of the artificial immune system.

2. ARTIFICIAL IMMUNE SYSTEM (AIS)

2.1. Immune System. The immune system is a defense system that has evolved to protect its host from pathogens (harmful micro-organisms such as bacteria and viruses).13 It comprises the organs and processes of the body that provide resistance to infection and toxins. Immunity may be divided into two components, according to function: natural immunity and acquired immunity. Natural immunity is the immunity to disease that occurs as part of an individual’s natural biologic makeup. It is present before the onset of infection and constitutes a set of disease-resistance mechanisms that are not specific to a particular pathogen, providing the first line of defense against infection. Acquired immunity is the immunity to a particular disease that is not innate but has been acquired during life. It does not come into play until there is an antigenic challenge to the organism. Acquired immunity responds to a challenge with a high degree of specificity, as well as the remarkable property of “memory.” Exposure to the same antigen some time in the future results in a memory response: the immune response to the second challenge occurs more quickly than the first, is stronger, and is often more effective in neutralizing and clearing the pathogen. With the organic combination of natural and acquired immunity, the immune system can protect the body from foreign substances and pathogenic organisms by an immune response.

Similar to natural immunity in the body, control systems protect chemical processes from disturbances. However, once a disturbance has caused a departure from the acceptable range of an observed variable or a calculated parameter associated with the process, a fault has been introduced.14 The principle of acquired immunity can be used for fault diagnosis of a chemical process.

2.2. Artificial Immune System.
As described by Timmis,15 the AIS is a diverse area of research that attempts to bridge the divide between immunology and engineering. It is developed through the mathematical and computational modeling of immunology, the abstraction of those models into algorithm (and system) design, and their implementation in the context of engineering. In 1990, Ishida16 used the immune system principle for the first time in an engineering application to address the problem of fault diagnosis of sensor networks. Forrest17 later used an immune system for computer security and virus detection. Since then, AISs have been widely used in diverse fields. Most of this research is covered in reviews of AIS.18-20

Recently, researchers have used the artificial immune system for fault diagnosis of process systems. Aguilar21 proposed an AIS approach that applied a pattern recognition model to perform fault detection. Zhao and her team22,23 have done much work to apply AIS to the fault diagnosis of the Tennessee Eastman (TE) process, which is the benchmark of a typical chemical process. However, until now, there have been few effective studies on using AIS for fault diagnosis of batch chemical processes.

Negative selection and clonal selection are the two most popular immune algorithms and have been widely used. The first negative selection algorithm was proposed by Forrest et al.17 to detect data manipulation caused by a virus in a computer system. In this algorithm, a set of self-strings (S) is first produced that defines the normal state of the system. Subsequently, a set of detectors (D) is generated that only bind the complement of S. These detectors can then be applied to new data to classify them as being self or nonself. In a clonal selection algorithm, once the B-cells that recognize the antigen proliferate, they are treated as parent cells.24 New cells (antibodies) are then generated as clones of their parents. During the cloning phase, antibodies mutate, creating differences between the new antibodies and their parents. All of the new antibodies are capable of recognizing an antigen that triggers an immune response. In addition to the negative selection and clonal selection algorithms, two other important immune algorithms are the immune network25 and danger theory.26

3. DYNAMIC TIME WARPING (DTW)-BASED ARTIFICIAL IMMUNE SYSTEM

In most traditional artificial immune system algorithms, each antigen and each antibody is represented by a vector of data from a sample at a certain time. However, in a batch chemical process, the data are not steady, so a sample at a single point in time cannot accurately reflect the dynamic working conditions of the process. Furthermore, once a fault is introduced into the process, data changes will be reflected in trends, and antigens and antibodies that are simply represented by vectors of data do not contain trend information. This is not conducive to the diagnosis of fault type or to early fault detection. Therefore, in this paper, a new fault diagnosis approach is proposed for batch chemical processes, in which antibodies are generated using historical data, antigens are generated from online real-time data, and the system detects and diagnoses faults by calculating the affinity of the antigen-antibody binding.


Both antigens and antibodies are represented by matrices of time-sampled data, instead of vectors of data. Figure 2 shows an antibody for one type of fault. The antibody in Figure 2 contains a set of time-sampled data with 6 variables and 20 samples. In the traditional AIS, antigens and antibodies are represented by vectors and the differences between them are always calculated using the Euclidean distance. However, in our new approach, the Euclidean distance is not suitable for calculating the difference between two matrices containing trend information. To solve this problem, dynamic time warping (DTW) is introduced.

3.1. Dynamic Time Warping (DTW). DTW is a flexible method for comparing two dynamic patterns that may not be perfectly aligned and are characterized by similar, but possibly expanded or contracted, temporal correlations. It was first used in the area of speech recognition for the identification of isolated and connected words. In 1998, Kassidas27 used it for fault detection and diagnosis of the Tennessee Eastman process. This was the first time that DTW was used for a chemical process and it was used only for off-line signal comparison. Recently, Srinivasan and Qian28 proposed that dynamic locus analysis (DLA) based on DTW can be used directly for multivariate temporal signals and has the computational efficiency needed for real-time applications.

In DTW, R and T denote two matrices with dimensions r × n and t × n, where n is the number of variables and r and t are the lengths of the data for the variables in each matrix, respectively. As Figure 3 shows, DTW will find a minimum sequence F of K points on a t × r grid:27

$$F = \{c(1), c(2), \ldots, c(k), \ldots, c(K)\} \tag{1}$$

$$c(k) = [i(k), j(k)] \tag{2}$$

Figure 3. Principle of dynamic time warping (DTW).27

The main steps of DTW are as follows. First, a matrix, d, is constructed that represents the Euclidean distance between R and T. i and j denote the sample times in T and R, respectively, c denotes the variables of R and T, and ωc denotes the nonnegative weight of variable c:

$$d(i,j) = \sum_{c=1}^{n} \omega_c \,\lvert R(j,c) - T(i,c) \rvert \tag{3}$$

Another matrix, D, representing the dissimilarity distance between the two subsequences, is then obtained, using the following formula:

$$D(i,j) = \min \begin{cases} D(i-1,j) + d(i,j) \ \text{or}\ [\infty, \ \text{if condition (A)}] \\ D(i-1,j-1) + d(i,j) \\ D(i-1,j-2) + d(i,j) \end{cases} \tag{4}$$

Condition (A) indicates that the predecessor of point (i - 1, j) is the point (i - 2, j). The initial condition is

$$D(1,1) = d(1,1) \tag{5}$$
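The recursion of eq 4, with the initial condition of eq 5, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `local_distance` and `accumulate`, the zero-based indexing, the use of plain lists, and the tie-breaking rule are all assumptions made here. Condition (A) is enforced by remembering, for each grid point, whether it was reached from (i - 1, j).

```python
import math

def local_distance(R, T, weights):
    """Eq 3: d[i][j] = sum_c w_c * |R[j][c] - T[i][c]| (i indexes T, j indexes R)."""
    return [[sum(w * abs(r_row[c] - t_row[c]) for c, w in enumerate(weights))
             for r_row in R]
            for t_row in T]

def accumulate(d):
    """Eq 4 with D(1,1) = d(1,1) (eq 5); unreachable grid points stay at infinity."""
    t, r = len(d), len(d[0])
    D = [[math.inf] * r for _ in range(t)]
    # stalled[i][j] is True when (i, j) was reached from (i - 1, j).
    stalled = [[False] * r for _ in range(t)]
    D[0][0] = d[0][0]
    for i in range(1, t):
        for j in range(r):
            candidates = []
            # Condition (A): the step from (i - 1, j) is forbidden (infinite cost)
            # when (i - 1, j) was itself reached from (i - 2, j).
            if not stalled[i - 1][j]:
                candidates.append((D[i - 1][j], True))
            if j >= 1:
                candidates.append((D[i - 1][j - 1], False))
            if j >= 2:
                candidates.append((D[i - 1][j - 2], False))
            if not candidates:
                continue
            # Assumed tie-break: the first candidate with the lowest cost wins.
            best, from_same_j = min(candidates, key=lambda item: item[0])
            if best < math.inf:
                D[i][j] = best + d[i][j]
                stalled[i][j] = from_same_j
    return D

# Hypothetical one-variable example with two identical sequences.
R = [[0.0], [1.0], [2.0]]
T = [[0.0], [1.0], [2.0]]
D = accumulate(local_distance(R, T, weights=[1.0]))
print(D[-1][-1])
```

Storing one boolean per grid point is enough here because condition (A) only looks one step back along the i direction.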

However, in most cases of online fault diagnosis, the first sample of the data is not always the first sample of the historical data. Therefore, in the proposed method, the start point is not taken as in eq 5, but is found as follows. Let t_ag denote the sequence number of the first sample in the antigen and t_win denote the size of the sample window for searching the start point D(1,1), which is usually the same as the sample number m of an antigen. Find the minimum point j(1) for the start point of the normal antibody:

$$j(1) = \arg\min_{j} \{ d(1,j) \}, \quad j \in [\max(1, t_{ag} - t_{win}), \ \min(m, t_{ag} + t_{win})] \tag{6}$$

Then, by tracking the dynamic locus of T in R, a sequence F that matches R and T is determined by

$$F = \{(1, j(1)), (2, j(2)), \ldots, (m, j(m))\} \tag{7}$$

By warping the original subsequence, the optimal path that reflects the most similar information is found. A difference matrix φ can be calculated by

$$\varphi(i,c) = R(j(i), c) - T(i,c) \tag{8}$$

Finally, the normalized difference η between R and T is calculated as

$$\eta(T,R) = \frac{\sum_{i=1}^{m} \lvert \varphi(i) \rvert}{m} \tag{9}$$
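The start-point search of eq 6 and the normalized difference of eqs 8 and 9 can be sketched as follows. Several details are assumptions made only for this illustration, because the text does not spell them out: the locus is tracked greedily (at each new sample of T the index in R stays, advances by one, or advances by two), |φ(i)| is read as the sum of absolute values over the variables, and the upper search bound is taken as the length of R. The function names are hypothetical.

```python
def weighted_distance(r_row, t_row, weights):
    """Eq 3 for one pair of samples."""
    return sum(w * abs(r - t) for w, r, t in zip(weights, r_row, t_row))

def track_locus(R, T, weights, t_ag, t_win):
    """Return zero-based indices j(i) into R for every row i of T (eqs 6 and 7)."""
    r = len(R)
    lo = max(1, t_ag - t_win)   # eq 6 search bounds, 1-based as in the text
    hi = min(r, t_ag + t_win)
    j = min(range(lo - 1, hi),
            key=lambda k: weighted_distance(R[k], T[0], weights))
    path = [j]
    for t_row in T[1:]:
        # Assumed greedy step: stay, advance one sample, or advance two samples in R.
        options = [k for k in (j, j + 1, j + 2) if k < r]
        j = min(options, key=lambda k: weighted_distance(R[k], t_row, weights))
        path.append(j)
    return path

def normalized_difference(R, T, path):
    """Eqs 8 and 9, reading |phi(i)| as a sum of absolute values over variables."""
    total = 0.0
    for t_row, j in zip(T, path):
        total += sum(abs(r - t) for r, t in zip(R[j], t_row))
    return total / len(T)

# Hypothetical one-variable data: T is a short window offset by 0.5 from R.
R = [[float(k)] for k in range(10)]
T = [[3.5], [4.5], [5.5]]
path = track_locus(R, T, weights=[1.0], t_ag=4, t_win=3)
eta = normalized_difference(R, T, path)
print(path, eta)
```

A greedy tracker keeps the cost per new sample constant, which is in the spirit of the online use described in the text, but it is only one possible reading of "tracking the dynamic locus".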

In this paper, the DTW algorithm was improved to calculate the difference between the antibody and the antigen. It also aids the mutation algorithm during the antibody cloning process, which is described in section 3.2.

3.2. System Initialization. Before fault detection and diagnosis, there is a system initialization phase. Because the antigens and antibodies are represented by matrices, it is difficult to generate a set of detectors that bind the complement of self-strings. Therefore, the algorithm used in this new approach is more similar to a clonal selection algorithm. All the data from a historical normal batch compose a normal antibody, while the data from a historical fault batch compose a fault antibody, as follows:

$$Ab_{normal}^{m \times n} = [Ab_{normal}(1), Ab_{normal}(2), \ldots, Ab_{normal}(k), \ldots, Ab_{normal}(n)] \tag{10}$$

where Ab_normal(k) indicates the data of variable k and m is the number of all samples in one normal batch. The original fault antibody is composed of the historical data after the time that the fault was introduced:

$$Ab_{fault}^{m \times n} = [Ab_{fault}(1), Ab_{fault}(2), \ldots, Ab_{fault}(k), \ldots, Ab_{fault}(n)] \tag{11}$$

where Ab_fault(k) indicates the data of variable k and m represents the sample number of fault antibodies, which is usually taken to be 15-20 in this paper.

All of the original normal and fault antibodies of different types have to be cloned and mutated to create fault antibody libraries, which enhances the capability of diagnosing faults that vary in both time and space scales. In the following, a new cloning algorithm is proposed that integrates the DTW principle.

If there is only one original antibody X0 of one type, let φ0 denote the difference matrix between X0 and a set of historical data of any normal batch calculated using eq 8, X* denote the antibody cloned from X0, Xn denote a section randomly cut from the historical normal data with the same size as X0, and a denote a random number between 0.5 and 2 in this paper. The new antibody X* can be cloned with mutation by

$$X^{*} = X_n + a\varphi_0 \tag{12}$$

If there are two or more original antibodies in one type, take two antibodies randomly from the same type library. Let X1 and X2 denote the two original antibodies, then find two minimum sequences and calculate the two difference matrices φ1 and φ2 between the two antibodies and a set of historical data of any normal batch, respectively, using eq 8. Let φ* represent the matrix cloned with mutation from φ1 and φ2 using eq 13, where a is a random number between 0.5 and 2 and b is a random number between -1 and 1 in this paper:

$$\varphi^{*} = a\varphi_1 + b[\varphi_1 - \varphi_2] \tag{13}$$

Let X* denote an antibody cloned from X1 and X2 with mutation. It can be generated by

$$X^{*} = X_n + \varphi^{*} \tag{14}$$

For example, take a one-variable system to illustrate how antibody cloning is performed. Suppose there is one fault antibody library of the system containing only two original fault antibodies X1 and X2. As Figure 4a shows, trends A and B represent historical data of the same fault that occurred at different times, while C is a historical normal data trend curve. Spots denoted by an asterisk (*) indicate antibody X1 generated from the historical fault data A, and spots denoted by a plus sign (+) indicate antibody X2 generated from the historical data B of the same fault. The DTW principle described in section 3.1 is used to find the minimum sequences F1 and F2, as well as to calculate the difference matrices φ1 and φ2. As Figure 4b shows, φ* is the newly cloned difference matrix, obtained using eq 13. In Figure 4c, Xn is the section cut from the historical normal curve C with the same size as X1. The newly cloned fault antibody X* is obtained using eq 14 and is then stored in the fault antibody library of X1 and X2. This is the end of generating one clone. Multiple clones can be easily generated by repeating the above procedure to build up the fault antibody library.

Figure 4. An example of the cloning phase.

Figure 5. Flowchart of artificial immune fault diagnosis.

Figure 6. Flowsheet of the penicillin fermentation process.29

Table 1. Set of Parameters for Normal Conditions

| control parameter | value |
| --- | --- |
| substrate concentration | 15.0 ± 0.5 g L⁻¹ |
| dissolved oxygen concentration | 1.16 ± 0.05 mmol L⁻¹ |
| biomass concentration | 0.1 g L⁻¹ |
| penicillin concentration | 0 g L⁻¹ |
| culture volume | 100.0 L |
| carbon dioxide concentration | 0.5 ± 0.1 g L⁻¹ |
| pH | 5.0 |
| fermenter temperature | 298 |

The choice of the number of clones is a tradeoff. The larger the number of clones, the heavier the calculation load during fault detection and diagnosis. However, if the number of clones is too small, the fault diagnosis capability deteriorates. For each known fault type, its fault antibody library can be constructed in the same way.

After the antibody libraries are built up, a threshold for each library is calculated as follows. A matrix η is constructed from the difference between every two antibodies of the same type using DTW. Let η(i, j) indicate the difference between antibody i and antibody j, and η(i, i) = ∞. Hence, the threshold of this library will be

$$\delta = \max_{i=1}^{n} \left( \min_{j=1}^{n} \eta(i,j) \right) \tag{15}$$

When the antibody libraries have been constructed and the thresholds of the libraries have all been calculated, the system initialization is completed.
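The cloning rules of eqs 12-14 and the library threshold of eq 15 can be sketched as follows. This is a minimal illustration under stated assumptions: the random numbers a and b are drawn from uniform distributions (the text only gives their ranges), matrices are plain nested lists, the function names are hypothetical, and a simple mean absolute difference stands in for the DTW difference of eq 9 so that the block stays short.

```python
import random

def clone_single(x_n, phi0, rng):
    """Eq 12: X* = Xn + a * phi0, with a assumed uniform on [0.5, 2]."""
    a = rng.uniform(0.5, 2.0)
    return [[x + a * p for x, p in zip(x_row, p_row)]
            for x_row, p_row in zip(x_n, phi0)]

def clone_pair(x_n, phi1, phi2, rng):
    """Eqs 13 and 14: phi* = a*phi1 + b*(phi1 - phi2), then X* = Xn + phi*."""
    a = rng.uniform(0.5, 2.0)
    b = rng.uniform(-1.0, 1.0)
    return [[x + a * p1 + b * (p1 - p2) for x, p1, p2 in zip(x_row, row1, row2)]
            for x_row, row1, row2 in zip(x_n, phi1, phi2)]

def library_threshold(antibodies, difference):
    """Eq 15: delta = max over i of the smallest eta(i, j) with j != i.

    Skipping j == i has the same effect as setting eta(i, i) to infinity.
    """
    nearest = []
    for i, ab_i in enumerate(antibodies):
        nearest.append(min(difference(ab_i, ab_j)
                           for j, ab_j in enumerate(antibodies) if j != i))
    return max(nearest)

def mean_abs_difference(x, y):
    """Stand-in for the DTW difference of eq 9 (an assumption of this sketch)."""
    cells = [abs(a - b) for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y)]
    return sum(cells) / len(cells)

# Hypothetical two-sample, two-variable data.
rng = random.Random(0)
x_n = [[1.0, 1.0], [1.0, 1.0]]
phi1 = [[2.0, 0.0], [2.0, 0.0]]
phi2 = [[1.0, 0.0], [3.0, 0.0]]
library = [clone_pair(x_n, phi1, phi2, rng) for _ in range(5)]
delta = library_threshold(library, mean_abs_difference)
print(len(library), delta)
```

Read this way, the threshold is the largest nearest-neighbor distance inside a library, so every antibody in the library has at least one neighbor within δ.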


Table 2. Set of Historical Fault Samples

| fault type | description | occurrence (h) |
| --- | --- | --- |
| decreasing agitator power | 32% step decrease | 180 |
| decreasing agitator power | 50% step decrease | 135 |
| increasing aeration rate | 38% step increase | 100 |
| increasing aeration rate | 55% step increase | 65 |
| increasing substrate feed rate | 30% step increase | 90 |
| increasing substrate feed rate | 48% step increase | 120 |

3.3. Fault Detection. After the initialization phase, the system has the capability for fault detection and diagnosis. As the online real-time data are introduced to the system, the last l samples before the current time compose an antigen, as

$$Ag^{l \times n} = [Ag(1), Ag(2), \ldots, Ag(k), \ldots, Ag(n)] \tag{16}$$

where Ag(k) indicates the data of variable k, and l is usually an integer taken from 10 to 15 in this paper. The differences between the antigen and all the antibodies in the normal antibody library are calculated using DTW, as described in section 3.1. The differences are denoted by η0 = [η0(1), η0(2), ..., η0(n)], where η0(i) indicates the difference between the antigen and antibody i. If any of the differences is smaller than the threshold of the normal antibody library, the process is considered to be in a safe condition and the next sample is read in. If min(η0) > δnormal, where δnormal indicates the threshold of the normal antibody library, then the system reports that a fault has been detected and initiates the fault diagnosis phase described in section 3.4.

3.4. Fault Diagnosis. In the fault diagnosis phase, the differences between the antigen and all the antibodies in all of the fault antibody libraries are calculated using the DTW algorithm described in section 3.1. The differences are denoted by ηk = [ηk(1), ηk(2), ..., ηk(n)], where ηk(i) indicates the difference between the antigen and antibody i from the fault k antibody library (k = 1, 2, ..., N, where N represents the number of known fault types). If min(ηk) < δk, where δk indicates the threshold of the fault k antibody library, then the fault type is diagnosed as k. If min(ηk) > δk for every k, the next sample is read in and a new antigen is generated to repeat the above steps. If the fault type still cannot be diagnosed after m samples have been read in, the system reports that a new type of fault might have occurred. The human expert then has to confirm whether it is a new fault type or the batch process is still running in a normal state. If the batch process is still under normal conditions, the next sample is read in. Otherwise, the expert inputs the type of the new fault to the system.

3.5. System Self-Learning.
When a batch is completed, if no fault is detected, all the data from the batch can be used to generate a new normal original antibody. If a fault of type k is diagnosed, the online data after the time that the fault was detected are used to generate a new original fault antibody, and the new antibody is stored in fault antibody library k. Equations 13 and 14 are then used to generate new clones to update fault antibody library k, and the threshold of library k is updated using eq 15. If a new fault is diagnosed by a human expert as type N + 1, fault antibody library N + 1 is created. The online data after the time that the fault was detected are used to generate a new original fault antibody, and the new antibody is stored in fault antibody library N + 1. New clones of fault antibody N + 1 are then generated using eq 12 to update fault antibody library N + 1. After the cloning phase, the threshold of library N + 1 is also updated using eq 15. When the antibody library and its threshold have been updated, the system self-learning phase is completed.

A scheme of the proposed DTW-based artificial immune system is shown in Figure 5. In the next section, a case study is described to demonstrate the effectiveness of the system.
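The decision logic of sections 3.3 and 3.4 can be summarized in a short sketch. It assumes that the minimum differences min(η0) and min(ηk) have already been computed for each incoming sample; the function name, the return labels, the treatment of the equality case, and the tie-break used when several libraries match at once are assumptions of this illustration, and the numbers in the usage example are hypothetical.

```python
def run_diagnosis(samples, delta_normal, deltas, max_wait):
    """Apply the detection and diagnosis rules to a stream of differences.

    samples: iterable of (min_eta_normal, {fault_type: min_eta_k}), one per sample.
    deltas: {fault_type: threshold delta_k}.
    max_wait: number of samples m after which a new fault type is suspected.
    """
    detected_at = None
    for step, (eta_normal, eta_faults) in enumerate(samples):
        if detected_at is None:
            if eta_normal <= delta_normal:
                continue              # section 3.3: safe condition, read next sample
            detected_at = step        # min(eta_0) > delta_normal: fault detected
        matches = [k for k, eta in eta_faults.items() if eta < deltas[k]]
        if matches:
            # Assumed tie-break when several libraries match: smallest eta / delta.
            best = min(matches, key=lambda k: eta_faults[k] / deltas[k])
            return ("known fault", best, step)
        if step - detected_at + 1 >= max_wait:
            return ("possible new fault", None, step)
    if detected_at is None:
        return ("normal", None, None)
    return ("undiagnosed", None, None)

# Hypothetical stream: normal, then detected, then matched to fault type 1.
stream = [
    (0.001, {1: 0.05, 2: 0.09}),
    (0.003, {1: 0.05, 2: 0.09}),
    (0.004, {1: 0.006, 2: 0.09}),
]
result = run_diagnosis(stream, delta_normal=0.002, deltas={1: 0.012, 2: 0.04}, max_wait=15)
print(result)
```

The "possible new fault" outcome corresponds to the point where the text hands the decision to a human expert, so the sketch deliberately stops there instead of creating a new library.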

4. APPLICATION TO THE FED-BATCH FERMENTATION OF PENICILLIN

4.1. Process Description. The effectiveness of the proposed system is illustrated using the PenSim v2.0 model. PenSim v2.0 is a penicillin fed-batch simulator developed by Birol, Ündey, and Çinar,29 based on the mathematical model of Bajpai and Reuss.30 Figure 6 shows a flowsheet of the penicillin fermentation process. Two proportional integral differential (PID) controllers are used to control the fermenter temperature and pH, which may affect the quality of the product. The model is capable of simulating concentrations of biomass, CO2, hydrogen ions, penicillin, carbon source, oxygen, and heat generation under various operating conditions. The simulated data generated by the model can be applied for multivariate statistical process monitoring and fault diagnosis.

This model has four load variables (including aeration rate, agitator power, substrate feed rate, and substrate feed temperature) and six internal state variables (including culture volume, generated heat, and concentrations of carbon dioxide, dissolved oxygen, biomass, penicillin, and substrate feed). Fermenter temperature and pH are the two controlled variables, and acid/base and heating/cooling water flow rates are the manipulated variables. According to actual needs, six variables were selected to test the proposed system: aeration rate, concentration of dissolved oxygen, concentration of carbon dioxide, culture volume, agitator power, and cooling water flow rate.

In this study, the duration of each batch was 400 h, and all the batches were simulated based on an integration step size of 0.02 h and a sampling interval of 0.1 h. Two normal batches and six fault batches were simulated to create historical data to generate the antibody libraries and initialize the system. Small variations were added to the simulation input data to create normal samples with realistic process variations.
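With a sampling interval of 0.1 h, the antigen of eq 16 at a given process time is simply a sliding window over the sampled data. The sketch below shows the index arithmetic under assumptions that the text does not state: the first sample is taken at t = 0 h (so a 400 h batch holds 4001 samples), the window includes the sample at the current time, and the function name `antigen_at` is hypothetical.

```python
def antigen_at(batch, t_hours, length=10, dt=0.1):
    """Return the last `length` samples up to time t_hours as an l x n antigen."""
    k = round(t_hours / dt)   # index of the current sample, assuming t = 0 at index 0
    if k >= len(batch) or k + 1 < length:
        raise ValueError("not enough samples for an antigen at this time")
    return batch[k - length + 1 : k + 1]

# Hypothetical batch: 4001 samples of six variables (time stored in column 0 for checking).
batch = [[k * 0.1, 0.0, 0.0, 0.0, 0.0, 0.0] for k in range(4001)]
antigen = antigen_at(batch, t_hours=92.4, length=10)
print(len(antigen), len(antigen[0]))
```

Rounding t/dt before indexing avoids off-by-one errors caused by floating-point division of times such as 92.4 h.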
The set of parameters for normal conditions is shown in Table 1. The six fault batches covered three fault types (decreasing agitator power, increasing aeration rate, and increasing substrate feed rate), with different introduction times and fault magnitudes. Table 2 shows the details of the historical fault samples. In the system, the length of each fault antibody was taken to be 15. The system was initialized as described in section 3.2.

Twelve fault batches were simulated to test the performance of the proposed DTW-based artificial immune system, three of which ran under a fault type that was not included in the historical fault samples. Table 3 shows the details of the testing fault samples. In the system initialization step, 100 antibodies were generated for each fault antibody library by eqs 13 and 14. In addition, the threshold of each library was calculated using eq 15:
• The threshold of normal antibodies is δnormal = 0.0021.
• The threshold of fault antibodies of agitator power decreasing is δfault1 = 0.0121.
• The threshold of fault antibodies of aeration rate increasing is δfault2 = 0.0418.
• The threshold of fault antibodies of substrate feed rate increasing is δfault3 = 0.0436.

Table 3. Set of Testing Fault Samples

| number | category | fault type | magnitude | occurrence (h) |
| --- | --- | --- | --- | --- |
| 1 | known fault | agitator power decreasing | 23% | 92 |
| 2 | known fault | agitator power decreasing | 15% | 100 |
| 3 | known fault | agitator power decreasing | 38% | 68 |
| 4 | known fault | aeration rate increasing | 80% | 110 |
| 5 | known fault | aeration rate increasing | 25% | 200 |
| 6 | known fault | aeration rate increasing | 30% | 175 |
| 7 | known fault | substrate feed rate increasing | 66% | 90 |
| 8 | known fault | substrate feed rate increasing | 43% | 180 |
| 9 | known fault | substrate feed rate increasing | 70% | 65 |
| 10 | unknown fault | aeration rate decreasing | 25% | 68 |
| 11 | unknown fault | aeration rate decreasing | 30% | 90 |
| 12 | unknown fault | aeration rate decreasing | 45% | 130 |

Figure 7. Online data of test batch 1.

4.2. Known Fault Diagnosis. Take test batch 1 as an example to illustrate how the proposed DTW-based artificial immune system diagnoses a known fault. Test batch 1 is a fault sample with a 23% step decrease in agitator power at 92 h. Figure 7 shows the normalized variables selected for fault diagnosis of batch 1, and Figure 8 shows the normalized historical normal data and agitator power decreasing data. At time point t, the 10 samples before t were taken to produce the antigen. The difference η0(i) between the antigen and the normal antibody i was calculated using eqs 3-9. Let ηt = min(η0), and compare ηt with δnormal to detect the fault. When t < 92 h, as Figure 9 shows, ηt was smaller than the threshold δnormal. Therefore, the system identified that the process was under the safe condition and the next sample was read in. When t = 92 h, a fault was introduced but the difference was still smaller than the threshold value. Thus, the system did not trigger an alarm. After four samples, the difference was η92.4 = 0.0022, which was larger than the threshold value δnormal. Thus, the fault was detected and the diagnosis phase was initiated. The difference ηk(i) between the antigen and antibody i from fault antibody library k was then calculated using eqs 3-9. Let ηk = min(ηk); as Figure 10 shows, η2 and η3 were always larger than the thresholds δ2 and δ3. When t < 92.7 h, η1 > δ1; at t = 92.7 h, η1 = 0.006, which was smaller than δ1. Therefore, the fault type was diagnosed as type 1: agitator power decreasing. Once the fault was diagnosed, the online data from t = 92.4 h were used to generate a new original fault antibody, and the new antibody was stored in the fault antibody library of agitator power decreasing.
The antibody library of agitator power decreasing was updated through cloning and mutation, using eqs 13 and 14. The threshold of this library was also updated, using eq 15.

Figure 8. Historical normal data and agitator power decreasing data.

4.3. Unknown Fault Diagnosis. Because the historical fault samples did not include the fault type of aeration rate decreasing, this constitutes an unknown fault. Take test batch 10 as an example to illustrate how the proposed DTW-based artificial immune system diagnoses an unknown fault. At first, as for a known fault, the 10 samples before time point t were taken to produce the antigen, and the minimum value of the differences between the antigen and the normal antibodies was calculated as ηt. ηt and δnormal were compared to detect the fault. When t = 68.4 h, the difference η68.4 = 0.0025 was larger than the threshold value δnormal. Thus, the fault was detected and the diagnosis phase was initiated. The difference ηk(i) between the antigen and antibody i from fault antibody library k was then calculated using eqs 3-9.

Figure 9. Difference between antigen of test batch 1 and normal antibodies.

Figure 10. Difference between antigen of test batch 1 and fault antibodies.


Let ηk = min(ηk) and compare ηk with δk. After 15 samples, none of the ηk were smaller than δk. Therefore, the system identified that there might be a new fault. As determined by human experts, it was named aeration rate decreasing. Hence, a new library of aeration rate decreasing was created, an original fault antibody was generated from the online data after t = 68.4 h, and more antibodies were generated through cloning using eq 12. The threshold of this library was calculated using eq 15. The system was updated and aeration rate decreasing became a known fault. When test batch 11 was introduced to the system, using the steps described in section 4.2, a fault was detected at 90.5 h and the type was then diagnosed at 90.8 h as a known fault: aeration rate decreasing.

4.4. Comparison. For comparison, dynamic locus analysis (DLA) and ANN were also used for fault diagnosis. In the DLA method, the small data window used for the antigen was compared with historical normal samples. Once the difference between the data window and the historical normal samples became large, the fault was detected. The differences between the data window and the historical fault samples were then calculated. Because the data before the fault was introduced were similar to the data of normal samples, only the samples after the introduction time were used in this phase. If the difference for one historical batch is the minimum of all the differences, the test data are deemed to be of the same fault type as that historical batch. Because the normal data generated using the PenSim model differ little between batches, DLA can detect the fault as quickly as the proposed AIS method. However, in the fault diagnosis phase, the fault introduction times of the historical fault batches differ greatly, which resulted in some mistakes. Take test batch 1 as an example to illustrate how the DLA method works. The fault was detected by DLA at 92.4 h.
For test batch 1, the differences between the online data window and the historical fault batches were then calculated. As Figure 11 shows, because the fault in this test batch was introduced at 92 h, close to the 90 h introduction time of the historical substrate feed rate increasing batch, the DLA method diagnosed the fault as substrate feed rate increasing. The other diagnosis results obtained using DLA are shown in Table 4.

An RBF neural network was also used for comparison in this case study. The data of the historical normal batches and the data of the historical fault batches after the fault introduction time were used to train the net. A vector of output indicates the type of fault diagnosis

Figure 11. Differences between online data of test batch 1 and historical data.

dx.doi.org/10.1021/ie101465b |Ind. Eng. Chem. Res. 2011, 50, 4534–4544


Table 4. Fault Diagnosis Result

| test batch | fault description | AIS diagnosis result | DLA diagnosis result | ANN diagnosis result |
|---|---|---|---|---|
| 1 | 23% step agitator power decreasing in 92 h (known fault type) | agitator power decreasing | substrate feed rate increasing | agitator power decreasing |
| 2 | 15% step agitator power decreasing in 100 h (known fault type) | agitator power decreasing | agitator power decreasing | agitator power decreasing |
| 3 | 38% step agitator power decreasing in 68 h (known fault type) | agitator power decreasing | aeration rate increasing | agitator power decreasing |
| 4 | 80% step aeration rate increasing in 110 h (known fault type) | aeration rate increasing | aeration rate increasing | new fault (aeration rate increasing, as determined by human experts) |
| 5 | 25% step aeration rate increasing in 200 h (known fault type) | aeration rate increasing | agitator power decreasing | new fault (aeration rate increasing, as determined by human experts) |
| 6 | 30% step aeration rate increasing in 175 h (known fault type) | aeration rate increasing | agitator power decreasing | aeration rate increasing |
| 7 | 66% step substrate feed rate increasing in 90 h (known fault type) | substrate feed rate increasing | substrate feed rate increasing | substrate feed rate increasing |
| 8 | 43% step substrate feed rate increasing in 180 h (known fault type) | substrate feed rate increasing | substrate feed rate increasing | new fault (substrate feed rate increasing, as determined by human experts) |
| 9 | 70% step substrate feed rate increasing in 65 h (known fault type) | substrate feed rate increasing | substrate feed rate increasing | substrate feed rate increasing |
| 10 | 25% step aeration rate decreasing in 68 h (unknown fault type) | new fault (aeration rate decreasing, as determined by human experts) | aeration rate increasing | new fault (aeration rate decreasing, as determined by human experts) |
| 11 | 30% step aeration rate decreasing in 90 h (unknown fault type) | aeration rate decreasing | substrate feed rate increasing | aeration rate increasing |
| 12 | 45% step aeration rate decreasing in 130 h (unknown fault type) | aeration rate decreasing | agitator power decreasing | aeration rate decreasing |

Figure 12. Fault diagnosis result of test batch using ANN.

result:

$$
Y = \begin{cases}
[0\ 0\ 0] & \text{normal} \\
[1\ 0\ 0] & \text{agitator power decreasing} \\
[0\ 1\ 0] & \text{aeration rate increasing} \\
[0\ 0\ 1] & \text{substrate feed rate increasing} \\
\text{others} & \text{new fault}
\end{cases}
\tag{17}
$$

After each batch, the online data recorded after the fault detection time are used to retrain the net. If a new fault is detected, the dimension of the output vector increases and the net is retrained. Take test batch 1 as an example. As Figure 12 shows, the output changed after the point at which the fault was introduced. At t = 92.6 h, Y = [0.1481 -0.0001 0.0005]; because Y1 > 0.1, the fault was detected. Several samples later, at t = 93.5 h, Y = [0.2436 0.0001 -0.0003]; because Y1 > 0.2, the fault type was diagnosed as type 1: agitator power decreasing. The data of test batch 1 after t = 92.6 h could then be added to retrain the net.
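How the network output is turned into a decision can be sketched as below. This is only an illustration under stated assumptions: the levels 0.1 (detection) and 0.2 (diagnosis) are the ones quoted for test batch 1, but applying them to the largest component of every output vector is an assumption, and the function name and returned strings are hypothetical. The handling of the "others" case in eq 17 (new fault) is not specified in the text and is not modeled.

```python
# Label order follows the one-hot targets of eq 17.
FAULT_LABELS = (
    "agitator power decreasing",
    "aeration rate increasing",
    "substrate feed rate increasing",
)

def decode_output(y, detect_level=0.1, diagnose_level=0.2):
    """Map a network output vector to a coarse decision (assumed rule)."""
    peak = max(y)
    if peak <= detect_level:
        return "normal"
    if peak <= diagnose_level:
        return "fault detected, type not yet diagnosed"
    # The index of the largest component selects the fault type.
    return FAULT_LABELS[list(y).index(peak)]

# Output vectors quoted in the text for test batch 1.
print(decode_output([0.1481, -0.0001, 0.0005]))   # t = 92.6 h
print(decode_output([0.2436, 0.0001, -0.0003]))   # t = 93.5 h
```

With the two vectors quoted for test batch 1, the sketch reports a detected but undiagnosed fault at 92.6 h and agitator power decreasing at 93.5 h, matching the sequence described above.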


Table 5. Fault Diagnosis Time

| test batch | fault description | AIS detected time (h) | AIS diagnosed time (h) | ANN detected time (h) | ANN diagnosed time (h) |
|---|---|---|---|---|---|
| 1 | 23% step agitator power decreasing in 92 h | 92.4 | 92.6 | 92.6 | 93.5 |
| 2 | 15% step agitator power decreasing in 100 h | 100.3 | 100.5 | 100.5 | 101.2 |
| 3 | 38% step agitator power decreasing in 68 h | 68.3 | 68.6 | 68.6 | 69.4 |
| 4 | 80% step aeration rate increasing in 110 h | 110.3 | 110.6 | 110.2 | |
| 5 | 25% step aeration rate increasing in 200 h | 200.5 | 200.7 | 200.2 | |
| 6 | 30% step aeration rate increasing in 175 h | 175.4 | 175.6 | 175.2 | 175.4 |
| 7 | 66% step substrate feed rate increasing in 90 h | 90.3 | 90.5 | 90.2 | 90.3 |
| 8 | 43% step substrate feed rate increasing in 180 h | 180.5 | 180.6 | 180.2 | |
| 9 | 70% step substrate feed rate increasing in 65 h | 65.3 | 65.5 | 65.2 | 65.4 |
| 10 | 25% step aeration rate decreasing in 68 h | 68.4 | | 68.2 | 68.3 |
| 11 | 30% step aeration rate decreasing in 90 h | 90.5 | 90.8 | 90.2 | |
| 12 | 45% step aeration rate decreasing in 130 h | 130.4 | 130.7 | 130.2 | 130.4 |

Consider another example, test batch 4. As Figure 12 shows, the fault was detected at 110.2 h (see Table 5). However, the fault type could not be diagnosed automatically, although human experts might diagnose it as fault type 2. According to the results in Table 4, the ANN method did not work efficiently when the introduction time or fault magnitude of the test sample differed greatly from those of the training samples of fault types 2-4. It was concluded that the ANN places more stringent requirements on the training samples than the proposed AIS method does.
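The diagnosis delays implied by Table 5 can be worked out with a few lines of arithmetic. The sketch below is an illustration only: it covers just the test batches for which Table 5 lists both an AIS and an ANN diagnosed time without ambiguity (batches 1, 2, 3, 6, 7, and 9), and the variable and function names are hypothetical.

```python
# (fault introduction time, AIS diagnosed time, ANN diagnosed time) in hours,
# copied from Table 5 for the batches where both diagnosed times are listed.
table5 = {
    1: (92, 92.6, 93.5),
    2: (100, 100.5, 101.2),
    3: (68, 68.6, 69.4),
    6: (175, 175.6, 175.4),
    7: (90, 90.5, 90.3),
    9: (65, 65.5, 65.4),
}

def diagnosis_delays(rows):
    """Hours between fault introduction and diagnosis, as (AIS, ANN) per batch."""
    return {batch: (round(ais - t0, 1), round(ann - t0, 1))
            for batch, (t0, ais, ann) in rows.items()}

delays = diagnosis_delays(table5)
for batch, (ais, ann) in delays.items():
    print(f"test batch {batch}: AIS {ais} h, ANN {ann} h")
```

For these batches the AIS delay stays between 0.5 and 0.6 h, while the ANN delay ranges from 0.3 to 1.5 h.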

5. CONCLUSION
This paper presents a new artificial immune system for fault diagnosis of batch chemical processes. In this system, antigens are generated from online data and antibodies are generated from historical data. Both antigens and antibodies are represented by matrices of time-sampled data. The differences between antigens and antibodies are calculated using the dynamic time warping (DTW) algorithm. Normal antibodies are used for fault detection, and fault antibodies are used for fault diagnosis. A novel cloning algorithm with mutation is proposed for building up fault antibody libraries. Finally, the performance of the proposed system was tested on a fed-batch penicillin fermentation. The system detected and diagnosed process faults correctly and showed a stronger capability to recognize an unknown fault and to diagnose it immediately upon its next occurrence. Compared with dynamic locus analysis (DLA) and an artificial neural network (ANN), this approach can handle faults whose magnitudes and introduction times differ greatly while requiring fewer training samples, which makes it more suitable for actual industrial applications. The DTW-based artificial immune system is effective for fault diagnosis but requires further study to resolve problems such as how best to select the size of the antigen and antibodies. With further development, the system will be generic and applicable to fault diagnosis in any batch chemical process.

AUTHOR INFORMATION
Corresponding Author
*Tel.: +86 10 62783109. Fax: +86 10 62770304. E-mail: [email protected].

ACKNOWLEDGMENT
The authors gratefully acknowledge financial support from the 863 High Tech Research and Development Plan of China (Grant No. 2006AA04Z176) and the National Natural Science Foundation of China (Grant No. 20776010).

NOTATION
Acronyms
AdPCA = adjoined principal component analysis
AIS = artificial immune system
DTW = dynamic time warping
DLA = dynamic locus analysis
DPCA = dynamic principal component analysis
MPCA = multiway principal component analysis
MKPCA = multiway kernel principal component analysis
PCA = principal component analysis
PID = proportional integral differential

Notations
Ab_normal = the matrix of a normal antibody
Ab_fault = the matrix of a fault antibody
Ag = the matrix of an antigen
X_i = an original antibody
X* = an antibody cloned from original antibodies

Greek Letters
η = the difference between two matrices
φ = the difference matrix between two matrices
δ = the threshold of an antibody library

REFERENCES
(1) Rippin, D. W. T. Batch process systems engineering: A retrospective and prospective review. Comput. Chem. Eng. 1993, 17, S1–S13.
(2) Chen, J.; Liu, K. C. On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem. Eng. Sci. 2002, 57, 63–75.
(3) Jackson, J. E. A User’s Guide to Principal Components; Wiley: New York, 1991.
(4) Nomikos, P.; MacGregor, J. F. Monitoring batch processes using multiway principal component analysis. AIChE J. 1994, 40, 1361–1375.
(5) Ku, W.; Storer, R. H.; Georgakis, C. Disturbance rejection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179–196.
(6) Lee, J. M.; Yoo, C. K.; Lee, I. B. Fault detection of batch processes using multiway kernel principal component analysis. Comput. Chem. Eng. 2004, 28, 1837–1847.
(7) Ng, Y. S.; Srinivasan, R. An adjoined multi-model approach for monitoring batch and transient operations. Comput. Chem. Eng. 2009, 33, 887–902.
(8) Ruiz, D.; Nougues, J. M.; Calderon, Z.; Espuña, L.; Puigjaner, L. Neural network based framework for fault diagnosis in batch chemical plants. Comput. Chem. Eng. 2000, 24, 777–784.
(9) Zhang, J. Improved on-line process fault diagnosis through information fusion in multiple neural networks. Comput. Chem. Eng. 2006, 30, 558–571.
(10) Power, Y.; Bahri, P. A. A two-step supervisory fault diagnosis framework. Comput. Chem. Eng. 2004, 28, 2131–2140.
(11) Mehranbod, N.; Soroush, M.; Panjapornpon, C. A method of sensor fault detection and identification. J. Process Control 2005, 15, 321–339.
(12) Hashizume, S.; Yajima, T.; Kuwashita, Y.; Onogi, K. Integration of fault analysis and interlock controller synthesis for batch processes. Chin. J. Chem. Eng. 2008, 16, 57–61.
(13) Goldsby, R. A.; Kindt, T. J.; Osborne, B. A.; Kuby, J. Immunology, Fifth ed.; W. H. Freeman and Company: San Francisco, 2003.
(14) Himmelblau, D. M. Fault Detection and Diagnosis in Chemical and Petrochemical Processes; Elsevier Press: Amsterdam, 1978.
(15) Timmis, J.; Andrews, P.; Owens, N.; Clark, E. An interdisciplinary perspective on artificial immune systems. Evol. Intell. 2008, 1, 5–26.
(16) Ishida, Y. Fully distributed diagnosis by PDP learning algorithm: Towards immune network PDP model. In Proceedings of International Joint Conference on Neural Networks, San Diego, 1990; pp 777–782.
(17) Forrest, S.; Perelson, A. S.; Allen, L.; Cherukuri, R. Self-nonself discrimination in a computer. In Proceedings of the 1994 IEEE Symposium on Research in Security and Privacy, Oakland, CA, 1994; pp 202–212.
(18) Dasgupta, D. Advances in artificial immune systems. IEEE Comput. Intell. Mag. 2006, 1, 40–49.
(19) de Castro, L. N. Fundamentals of natural computing: An overview. Phys. Life Rev. 2007, 4, 1–2.
(20) Timmis, J. I. Artificial immune systems: Today and tomorrow. Nat. Comput. 2007, 6, 1–18.
(21) Aguilar, J. An artificial immune system for fault detection. In Proceedings of the 17th International Conference on Innovations in Applied Artificial Intelligence, Ottawa, Canada, 2004; pp 219–229.
(22) Xiong, C.; Zhao, Y.; Liu, W. Fault detection method based on artificial immune system for complicated process. In Proceedings of International Conference on Intelligent Computing, Kunming, PRC, 2006; pp 625–630.
(23) Wang, C.; Zhao, Y. A new fault detection method based on artificial immune systems. Asia-Pac. J. Chem. Eng. 2008, 3, 706–711.
(24) de Castro, L. N.; Timmis, J. I. Artificial Immune Systems: A Novel Paradigm to Pattern Recognition. In Artificial Neural Networks in Pattern Recognition; Corchado, J. M., Alonso, L., Fyfe, C., Eds.; University of Paisley, U.K., 2002; pp 67–84.
(25) Coutinho, A. The network theory: 21 years later. Scand. J. Immunol. 1995, 42, 3–8.
(26) Swimmer, M. Using the danger model of immune systems for distributed defense in modern data networks. Comput. Networks 2007, 51, 1315–1333.
(27) Kassidas, A.; Taylor, P.; MacGregor, J. F. Off-line diagnosis of deterministic faults in continuous dynamic multivariable processes using speech recognition methods. J. Process Control 1998, 8, 381–393.
(28) Srinivasan, R.; Qian, M. S. Online fault diagnosis and state identification during process transitions using dynamic locus analysis. Chem. Eng. Sci. 2006, 61, 6109–6132.
(29) Birol, G.; Ündey, C.; Çinar, A. A modular simulation package for fed-batch fermentation: Penicillin production. Comput. Chem. Eng. 2002, 26, 1553–1565.
(30) Bajpai, R.; Reuss, M. A mechanistic model for penicillin production. J. Chem. Technol. Biotechnol. 1980, 30, 330–344.
