Ind. Eng. Chem. Res. 2000, 39, 2378-2383
Discovery of Operational Spaces from Process Data for Production of Multiple Grades of Products
F. Z. Chen and X. Z. Wang*
Department of Chemical Engineering, The University of Leeds, Leeds LS2 9JT, U.K.
An industrial case study is presented which uses principal component analysis (PCA) to identify operational spaces and develop operational strategies for manufacturing desired products. Analysis of a historical database of 303 data cases from a refinery fluid catalytic cracking process showed that the data project onto four operational zones in the reduced two-dimensional plane. Three zones were found to correspond to three different product grades; the fourth has a high probability of product off-specification and is very likely caused by product changeover. Variable contribution analysis was also conducted to identify the variables most responsible for the observed operational spaces, and strategies were consequently developed for monitoring and operating the process so that the operation can be moved from producing one product grade to another with minimum time delay.

Introduction

The modern chemical manufacturing industry produces a wide variety of products serving many other sectors as well as the consumer market. Not only is it subject to inevitable changes in feedstock properties, but it is also often required to meet market or downstream demand for continual changes in product quality. To be able to respond to these changes, it is necessary to devise an operational strategy which can move the plant rapidly to new operating conditions and minimize the loss of product during the product changeover.

Product changeover is complicated by the multivariate nature of process variables. Traditionally, operators have used their knowledge and experience to find, through trial and error, a new operating condition that produces a desired product. During this process, off-specification product may be produced, which causes economic loss. Jaeckle and MacGregor1 defined the task as product design and indicated that the historical operational data of a process can be a useful resource.
However, as they noted, little work has been published on this topic, and the few publications did not give details.

Jaeckle and MacGregor1 developed an approach using multivariate statistical analysis to analyze historical operational data and to find operational spaces leading to varied products or grades of a product. The case study they used was a high-pressure tubular reactor producing low-density polyethylene. The process has six operational variables, and the data were generated from computer simulation. While the case study was very useful for illustrating the procedures, it is clearly more interesting to apply the approach to industrial processes. To the best of our knowledge, there have been few reports on the application of multivariate statistical analysis to product design through analysis of data directly collected from industrial operations. In this contribution, we report such an industrial case study, applying principal component analysis (PCA) to a database of 303 cases collected from a refinery fluid catalytic cracking (FCC) process.

The rest of the paper is organized as follows. In the next section, the process and the data are described. Then, in the third section, the method used is briefly introduced. The results and discussion are presented in the fourth and fifth sections, which are followed by the final remarks.

* To whom correspondence should be addressed. Telephone: +44 113 233 2427. Fax: +44 113 233 2405. E-mail: [email protected].

Figure 1. Main fractionator of the FCC process.

FCC Main Fractionator and Product Quality

The FCC process of the refinery converts a mixture of heavy oils into more valuable products. The relevant section of the process is shown in Figure 1, where the oil gas mixture leaving the reactor goes into the main fractionator to be separated into various products. The individual side-draw products are further processed by downstream units before being sent to blending units. One of the products is light diesel, whose quality is typically characterized by the temperature of condensation. Traditionally the temperature of condensation has been analyzed off-line in the laboratory, which causes time delays because the interval between two samples is between 4 and 6 h. In an earlier study,2 we designed a software sensor for predicting the condensation point, using 303 data patterns spanning nearly a year and 14 process variables that were identified by plant engineers and can be measured on-line. The software sensor was developed using a feedforward neural network (FFNN), as shown in Figure 2.
10.1021/ie9904899 CCC: $19.00 © 2000 American Chemical Society Published on Web 06/13/2000
Ind. Eng. Chem. Res., Vol. 39, No. 7, 2000 2379
Figure 2. Software sensor developed using a feed-forward neural network.2
An interesting problem with the process is that it is required to produce three product grades according to season and market demand, namely, -10#, 0#, and 5#, defined by ranges of condensation temperature. Because there is more than one process variable to monitor, the operators usually rely on their knowledge and experience, through trial and error, to adjust the variables in order to move the operation from producing one product grade to another. There is a clear need to minimize the time of product changeover because off-specification product can be produced during the transition.

Although the software sensor shown in Figure 2 is able to predict the quality on-line, which unquestionably helps operators monitor the product quality in a timely manner while searching for new operational conditions, it provides no clues as to which process variables are more important than others for a desired product quality. In the literature, methods have been studied that use neural networks as a tool for sensitivity studies to find the relative importance of inputs to the outputs.3 There are also reports on making use of the FFNN weights to directly calculate the impact of an input on a particular output.4-7 These approaches are promising but still require further study. An obvious problem in using them is that the FFNN inputs may not be completely independent, so they should be used with care. In this study we used principal component analysis for the purpose of discovering knowledge for operating the process. It needs to be pointed out that NN and PCA should be used together: the NN software sensor gives a real-time indication of product quality, while PCA tells operators which variables are most important to operate and monitor for a specific product changeover task.
Though PCA is a well-established approach and has been introduced in many textbooks and recent publications, it is still necessary to briefly introduce the method in this paper in order to highlight the sort of problem PCA is designed to solve.

Principal Component Analysis and Its Application to Process Data Analysis

The method of principal component analysis was originally developed in the 1900s8,9 and has now reemerged as an important technique in data analysis. The central idea is to reduce the dimensionality of a data set consisting of a large number of interrelated variables, while retaining as much as possible of the variation present in the data set. Multiple regression and discriminant analysis use variable selection procedures to reduce the dimension but result in the loss of one or more important dimensions. The PCA approach uses all of the original variables to obtain a smaller set of new variables (principal components, PCs) that can be used to approximate the original variables. The greater the degree of correlation between the original variables, the fewer the number of new variables required. PCs are uncorrelated and are ordered so that the first few retain most of the variation present in the original set.

Given a data matrix X representing n observations of each of p variables, x1, x2, ..., xp, the purpose of principal component analysis is to determine a new variable y1 that can be used to account for the variation in the p variables. The first principal component is given by a linear combination of the p variables as
$$y_1 = w_{11}x_1 + w_{12}x_2 + \cdots + w_{1p}x_p \tag{1}$$
where the coefficients (also called weights) w11, w12, ..., w1p, conveniently written as a vector w1, are chosen so that the sample variance of y1 is the greatest. The coefficients have to satisfy the constraint that their sum of squares, i.e., w′1w1, is unity. The second principal component, y2, is given by the linear combination of the p variables in the form
$$y_2 = w_{21}x_1 + w_{22}x_2 + \cdots + w_{2p}x_p \tag{2}$$

or $y_2 = w_2' x$, which has the greatest variance subject to the two conditions

$$w_2' w_2 = 1, \qquad w_2' w_1 = 0 \tag{3}$$

(so that y1 and y2 are uncorrelated). Similarly, the jth principal component is a linear combination

$$y_j = w_j' x \tag{4}$$

which has the greatest variance subject to

$$w_j' w_j = 1 \quad \text{and} \quad w_j' w_i = 0 \quad (i < j)$$
To find the coefficients defining the first principal component, the elements of w1 should be chosen so as to maximize the variance of y1 subject to the constraint, w′1w1 ) 1. The variance of y1 is then given by
$$\mathrm{Var}(y_1) = \mathrm{Var}(w_1' x) = w_1' S w_1 \tag{5}$$
where S is the variance-covariance matrix of the original variables. The solution w1 = (w11, w12, ..., w1p) that maximizes the variance of y1 is the eigenvector of S corresponding to the largest eigenvalue. The eigenvalues of S are the roots of the equation
$$|S - \lambda I| = 0 \tag{6}$$
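The step from eq 5 to the eigenvalue problem in eq 6 is not spelled out above. A standard way to fill it in, sketched here with a Lagrange multiplier λ (the function L is introduced only for this illustration and does not appear in the original derivation), is:

$$L(w_1, \lambda) = w_1' S w_1 - \lambda\,(w_1' w_1 - 1)$$

$$\frac{\partial L}{\partial w_1} = 2 S w_1 - 2 \lambda w_1 = 0 \;\Longrightarrow\; S w_1 = \lambda w_1$$

$$\mathrm{Var}(y_1) = w_1' S w_1 = \lambda\, w_1' w_1 = \lambda$$

A nonzero w1 satisfying S w1 = λ w1 exists only if |S - λI| = 0, and because Var(y1) equals λ, the variance is maximized by taking the largest eigenvalue and its eigenvector.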
If the eigenvalues are λ1, λ2, ..., λp, then they can be arranged from the largest to the smallest. The eigenvectors associated with the first few eigenvalues define the principal components that capture most of the variance of the original data, while the remaining PCs mainly represent noise in the data.

PCA is scale dependent, so the data must be scaled in some meaningful way before PCA. The most usual way is to scale each variable to unit variance.

Some researchers10,11 have pointed out that PCA is a linear operation and have therefore proposed nonlinear PCA models, which use a small number of hidden neurons in a neural network to represent the original variables. While the dimension is reduced, the dependency between the latent variables is not as clear as in linear PCA. In linear PCA, the PCs are known to be linearly independent.

PCA has attracted much attention recently in analyzing process operational data.12-23 The focus of these studies has been on developing multivariate statistical monitoring charts, e.g., T² and SPE charts.12 However, to our knowledge, the work of Jaeckle and MacGregor1 is the only publication on applying the approach to identifying operational zones leading to multiple grades of products.

The problem described in the second section is a multidimensional problem. It is well-known that humans are not able to simultaneously analyze problems involving more than three variables very effectively, and this becomes more difficult when the variables are not independent. In this work, PCA is used as a projection tool. It projects the data from high-dimensional spaces to low-dimensional spaces and retains most of the variance present in the original data. In the next section we will demonstrate that useful knowledge can be discovered on the basis of analysis of the data in the lower dimensional spaces.

Table 1. The Fourteen Process Variables
variable   description
T1-11      temp on tray 22 where the light diesel is withdrawn
T1-12      temp on tray 20 where the light diesel is withdrawn
T1-33      temp on tray 19
T1-42      temp on tray 16, i.e., the initial temp of the pump around
T1-20      return temp of the pump around
F215       flow rate of the pump around
T1-09      column top temp
T1-00      reacn temp
F205       fresh feed flow rate to the reactor
F204       flow rate of the recycle oil
F101       steam flow rate
FR-1       steam flow rate
FIQ22      flow rate of the over-heated steam
F207       flow rate of the rich-absorbent oil
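The procedure described in this section (scale each variable to unit variance, form the variance-covariance matrix S, take its eigenvalues and eigenvectors, and project the data onto the leading PCs) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `pca_project` is hypothetical, and the 303 × 14 array is synthetic random data standing in for the plant database, which is not reproduced in the paper.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Scale columns to unit variance, eigen-decompose S, project onto leading PCs."""
    X = np.asarray(X, dtype=float)
    # Center each variable and scale it to unit sample variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Variance-covariance matrix S of the scaled variables (p x p).
    S = np.cov(Z, rowvar=False)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]          # reorder largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :n_components]              # columns are w_1, w_2, ...
    scores = Z @ W                             # y_j = w_j' x for every observation
    return eigvals, W, scores

# Synthetic stand-in for the 303 x 14 database (assumption, not plant data):
# two hidden factors mixed into 14 correlated variables plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(303, 2))
mixing = rng.normal(size=(2, 14))
X_demo = latent @ mixing + 0.1 * rng.normal(size=(303, 14))

eigvals, W, scores = pca_project(X_demo, n_components=2)
print(scores.shape)                 # one (PC1, PC2) point per data pattern
print(eigvals[:2] / eigvals.sum())  # share of variance carried by PC1 and PC2
```

Plotting the two columns of `scores` against each other gives the kind of two-dimensional PC plane used in the rest of the paper. Because the variables are scaled to unit variance, the eigenvalues sum to the number of variables.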
Table 2. Summary of the PCs
PC    eigenvalue   % of eigenvalues   cum %
1     5.0729       36.2352            36.2352
2     2.3663       16.9017            53.1369
3     1.9453       13.8953            67.0322
4     1.1271        8.0506            75.0828
5     1.0382        7.4155            82.4983
6     0.7757        5.5406            88.0389
7     0.5845        4.1751            92.2140
8     0.4485        3.2032            95.4173
9     0.3034        2.1670            97.5843
10    0.1350        0.9641            98.5484
11    0.0795        0.5679            99.1163
12    0.0682        0.4873            99.6036
13    0.0385        0.2750            99.8786
14    0.0170        0.1214           100.0000
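The percentage and cumulative columns of Table 2 follow directly from the eigenvalues, and the arithmetic can be checked with a short sketch. The helper name `components_needed` is hypothetical, and the last eigenvalue is taken here as 0.0170, the value implied by its 0.1214% share of the total, which is an assumption about the printed table.

```python
import numpy as np

# Eigenvalues from Table 2, largest to smallest. The 14th value is assumed
# to be 0.0170, consistent with its listed 0.1214% share.
eigenvalues = np.array([5.0729, 2.3663, 1.9453, 1.1271, 1.0382, 0.7757, 0.5845,
                        0.4485, 0.3034, 0.1350, 0.0795, 0.0682, 0.0385, 0.0170])

# Share of total variance carried by each PC, and the running total.
percent = 100.0 * eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(percent)

def components_needed(cumulative_percent, target):
    """Smallest number of leading PCs whose cumulative share reaches target (%)."""
    return int(np.searchsorted(cumulative_percent, target) + 1)

print(round(cumulative[1], 1))            # first two PCs, about 53.1
print(round(cumulative[6], 1))            # first seven PCs, about 92.2
print(components_needed(cumulative, 90))  # PCs needed to reach 90% of the variance
```

The small differences from the printed table in the third and fourth decimal places come from rounding of the listed eigenvalues.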
Figure 3. PC1 and PC2 two-dimensional plot.
Discovery of Operational Zones Leading to Different Products

The difficulty of the present problem comes from the fact that there are 14 process variables to consider. Application of PCA to the database of size 303 × 14 (number of data patterns × number of process variables) found that the first seven principal components account for about 92% of the variance (Table 2). The PC1 and PC2 two-dimensional plot is shown in Figure 3, from which it was found that the 303 data patterns are grouped into four clusters. Three clusters correspond to the three products -10#, 5#, and 0#, and the cluster at the bottom-right corner has a high probability of product off-specification. Before we analyze how this can be used to develop operational strategies, it is necessary to validate the clustering result because, as seen from Table 2, the first two PCs account for only 53% of the variance. For the purpose of validation, the first three PCs are plotted in a three-dimensional diagram (Figure 4). It is found that the cluster at the center of Figure 3 is further divided into two clusters.

Figure 4. PC1, PC2, and PC3 three-dimensional plot.

Adaptive resonance theory (ART2) was also applied to the problem in order to make a comparison. ART2 was developed by Carpenter and Grossberg24 as an unsupervised neural network model for clustering. Whiteley and Davis25 have used ART2
to fault identification, and we have used it for classification of lubricating base oils.26 Since the algorithm can be found in the above literature, it is not repeated here. Using the first seven PCs, ART2 gives a result similar to that shown in Figure 4: it also divides the cluster at the center of Figure 3 into two clusters, as indicated by the dotted curve in Figure 3. This demonstrates that, for problems having large dimensions, clusters may overlap in a two-dimensional PC display. Nevertheless, for the current problem, it is found that the two clusters at the center of Figure 4 both correspond to product 0#. As a result, in the following discussion, we still use the result of Figure 3.

Table 3. Correlation Coefficients for the First Two Principal Components
variable   PC1       PC2
T1-11       0.3985   -0.0245
T1-12       0.4349    0.0224
T1-33       0.3449   -0.0928
T1-42       0.4302   -0.0111
T1-20       0.3235    0.1352
F215       -0.3850   -0.0613
T1-09      -0.1937   -0.2853
T0-00       0.0864   -0.0786
F205       -0.0570   -0.4408
F204        0.0249   -0.4411
F101        0.1501    0.0196
FR-1       -0.0598    0.5767
FIQ22      -0.1412    0.3996
F207        0.0159   -0.1808

Therefore the strategy for operation should be to operate the process in the bottom-left region if the desired product is -10#, in the region at the top if the desired product is 5#, and in the middle region if the desired product is 0#. Another point is that, in order to move from producing -10# to 0#, adjusting PC1 is more important than changing PC2, whereas to switch from producing 0# to 5#, PC2 is more important than PC1. Both PC1 and PC2 are important in avoiding the region at the bottom-right corner, which produces off-specification product. It is important to notice that the operational region at the bottom-right corner, which should be avoided in operation, was not anticipated before this analysis.

Close examination gives a more interesting discovery regarding the region at the bottom-right corner of Figure 3: it is very likely caused by operators during product changeover. For example, data patterns 117-124 at the bottom-right corner were due to the transition from the -10# region (1-116) to the 0# region (125-191). Other data cases in the bottom-right corner can be explained similarly.
Data patterns 211-212 were due to the transition from the 5# region (192-210) to the -10# region (213-242); 243-244, to the transition from the -10# region (213-242) to the 0# region (245-271); and 278-288, to the transition from the 5# region (272-277) to the 0# region (289-303). This shows that some transitions took a longer time. If the knowledge discovered had been available, the transition time could have been reduced.

Identification of Variables Responsible for Observed Operational Zones

Because PC1 and PC2 are latent variables, it is necessary to link PC1 and PC2 to the original variables in order to provide guidance to operators for monitoring and adjustment during product changeover. For this purpose, variable contribution plots16 are used in this study. The coefficients defining the first two principal components of the data are given in Table 3. The contribution plot for PC1 is the plot of the PC1 coefficients for the 14 original variables and is shown in Figure 5, from which it is found that the most important variables are TI-12 (the temperature on tray 20 where the product is withdrawn) and TI-42 (the temperature on tray 16 close to the flashing zone). This result is consistent with the analysis based on fundamental principles that the temperatures on the tray where the product is withdrawn and at the flashing zone have the most important impact on the side-draw product. Some other variables, such as FR-1, are found not to be important.

Figure 5. Contributing plot of PC1.
Figure 6. Changing profile of TI-12 over the 303 data patterns.
Figure 7. Changing profile of TI-42 over the 303 data patterns.

The above discovery is confirmed by looking at the change of TI-12 over the 303 data patterns (Figure 6). It clearly shows that the value of TI-12 can distinguish product -10# from 0# and 5#, but not 0# from 5#. When the temperature TI-12 is below 230 °C, the product is -10#. When TI-12 is between 230 and 250 °C, the product is either 0# or 5#. If the temperature is above 250 °C, off-specification product is very likely. The changing profile of TI-42 over the 303 data patterns (Figure 7) gives a similar picture. Figure 5 also indicates that some other variables are much less important to PC1, which implies that they are not influential in moving the operation from producing product -10# to 0#. The steam flow rate to the riser tube reactor (FR-1) is an example. From its changing profile (Figure 8), it can be seen that FR-1 does
not reflect the different conditions leading to products -10# and 0#.

Figure 8. Changing profile of FR-1 over the 303 data patterns.
Figure 9. Contributing plot of PC2.

The contribution plot of PC2 is shown in Figure 9, which indicates that FR-1 is the most influential variable for PC2. The FR-1 profile over the 303 data patterns shown in Figure 8 clearly shows that FR-1 can distinguish product 5# from 0# and -10#, but not 0# from -10#. When the value of FR-1 is above 8.5, the product is 5#. When FR-1 is between 5 and 8.5, the product is either 0# or -10#. Though not very clear, it can still be seen that when FR-1 is below 6.5, the operation goes into the region producing the off-specification product. The discovery that FR-1 is the most important variable for PC2, and so in distinguishing product 5# from 0# and -10#, was not anticipated. In fact, because of the multivariate and nonlinear nature of the problem, some variables which are important to product quality in one operational region may become less important in a different region. This type of knowledge is unique to a specific process unit and was traditionally acquired by operators while operating the process. Discovering operational zones and the relative importance of variables using PCA provides a useful computer tool for extracting such knowledge.

The above analysis of the contribution of individual variables to PC1 and PC2, and so to the operational states, is also consistent with the observation in Figure 3 that both PC1 and PC2 are influential for the zone at the bottom-right corner. From Figures 6-8 we can see that TI-12, TI-42, and FR-1 can all have significant influence on this zone. Therefore if the intention is to change from producing -10# to 5#, we should first increase TI-12 and TI-42 and then increase FR-1. To avoid off-specification product, we should carefully monitor TI-12, TI-42, and FR-1 to avoid the region at the bottom-right corner. Of course, fine tuning of all the variables is still necessary, but the guidance can help operators move the process rapidly from producing one product to another. It needs to be pointed out that the knowledge obtained from contribution plots is based on statistical analysis of historical data and is therefore not necessarily causal. It is then up to the engineer, with his/her knowledge of first principles, to interpret the results.

Figure 10. PCA monitoring plane.

The PC1 and PC2 two-dimensional plane can also be used by operators as a monitoring display, as illustrated by Figure 10. As indicated by Kuespert and McAvoy,27 in many cases, process operation is beyond the capabilities of modern control strategies. Operators' experience and knowledge play an essential role in such cases. Therefore, it is very useful to develop technologies that can learn from operators. Expert systems have been regarded as a useful technique for recording experts' expertise. However, studies have shown that human experts are often not able to express their expertise explicitly. Dynamics and interactions of multivariate process variables complicate the problem. An alternative way is to analyze and learn from the past records of operation. PCA is able to reduce the problem dimension and eliminate variable dependencies. As a result we can visually gain insight into the problem in a lower dimension, as demonstrated in the case study.

Final Remarks

Many activities in process operation are beyond the capability of modern process control strategies and require experienced operators. Learning from the past by mining historical operational records provides a useful means for solving this problem.21 This paper has described an application of PCA to a database of 303 cases collected from a refinery fluid catalytic cracking process.
Four operational zones were discovered, with three of them corresponding to three product grades and the fourth to a region that has a high probability of product off-specification and is very likely caused by product changeover. Variable contribution plots were used to identify the most influential variables and develop operational strategies for producing desired products.

Literature Cited

(1) Jaeckle, C. M.; MacGregor, J. F. Product design through multivariate statistical analysis of process data. AIChE J. 1998, 44, 1105-1118.
(2) Chen, F. Z.; Wang, X. Z. Software sensor design using Bayesian automatic classification and back-propagation neural networks. Ind. Eng. Chem. Res. 1998, 37, 3985-3991.
(3) Gillespie, E. S.; Wilson, R. N. Application of sensitivity analysis to neural network determination of financial variable relationships. Appl. Stochastic Models Data Anal. 1998, 13, 409-414.
(4) Garson, G. D. Interpreting neural-network connection weights. AI Expert 1991 (Apr), 47-51.
(5) Milne, L. K.; Gedeon, T. D.; Skidmore, A. K. Classifying dry sclerophyll forest from augmented satellite data: Comparing neural networks, decision tree & maximum likelihood. Proc. Australian Conf. Neural Networks (Sydney) 1995, 160-163.
(6) Wong, P. M.; Gedeon, T. D.; Taggart, I. J. An improved technique in porosity prediction: a neural network approach. IEEE Trans. Geosci. Remote Sens. 1995, 33, 971-980.
(7) Gedeon, T. D. Data mining of inputs: Analysing magnitude and functional measures. Int. J. Neural Syst. 1997, 8, 209-218.
(8) Pearson, K. On lines and planes of closest fit to systems of points in space. Philos. Mag. 1901, 2, 559-572.
(9) Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933, 24, 417-441, 498-520.
(10) Kramer, M. A. Nonlinear principal component analysis using autoassociative neural networks. AIChE J. 1991, 37, 233-243.
(11) Dong, D.; McAvoy, T. J. Nonlinear principal component analysis-based on principal curves and neural networks. Comput. Chem. Eng. 1996, 20, 65-78.
(12) MacGregor, J. F.; Kourti, T. Statistical process control of multivariable processes. Control Eng. Pract. 1995, 3, 403-414.
(13) MacGregor, J. F.; Jaeckle, C.; Kiparissides, C.; Koutoudi, M. Process monitoring and diagnosis by multiblock PLS methods. AIChE J. 1994, 40, 826-838.
(14) Nomikos, P.; MacGregor, J. F. Monitoring batch processes using multiway principal component analysis. AIChE J. 1994, 40, 1361-1375.
(15) Martin, E. B.; Morris, J.; Papazoglou, M. C. Confidence bounds for multivariate process performance monitoring charts. Preprints of the IFAC workshop on on-line fault detection and supervision in the chemical process industries, Newcastle, U.K., June 1995; 33-42.
(16) Neogi, D.; Schlags, C. E. Multivariate statistical analysis of an emulsion batch process. Ind. Eng. Chem. Res. 1998, 37, 3971-3979.
(17) Kresta, J. V.; Marlin, T. E.; MacGregor, J. F. Development of inferential process models using PLS. Comput. Chem. Eng. 1994, 18, 597-611.
(18) Santen, A.; Koot, G. L. M.; Zullo, L. C. Statistical data analysis of a chemical plant. Comput. Chem. Eng. 1997, 21 (Suppl.), s1123-s1129.
(19) Zhang, J.; Martin, E. B.; Morris, A. J. Fault detection and diagnosis using multivariate statistical techniques. Chem. Eng. Res. Des. 1996, 74A, 89-96.
(20) Dunia, R.; Qin, S. J.; Edgar, T. F.; McAvoy, T. J. Identification of faulty sensors using principal component analysis. AIChE J. 1996, 42, 2797-2812.
(21) Wang, X. Z. Data mining and knowledge discovery for process monitoring and control; Springer: London, 1999.
(22) Chen, J. G.; Bandoni, J. A.; Romagnoli, J. A. Robust PCA and normal region in multivariate statistical process monitoring. AIChE J. 1996, 42, 3563-3566.
(23) Negiz, A.; Cinar, A. Statistical monitoring of multivariable dynamic processes with state space models. AIChE J. 1997, 43, 2002-2020.
(24) Carpenter, G. A.; Grossberg, S. ART2: Self-organisation of stable category recognition codes for analogue input patterns. Appl. Opt. 1987, 26, 4919-4930.
(25) Whiteley, J. R.; Davis, J. F. Knowledge-based interpretation of sensor patterns. Comput. Chem. Eng. 1992, 16, 329-346.
(26) Wang, X. Z.; Chen, B. H. Clustering of infrared spectra of lubricating base oils using adaptive resonance theory. J. Chem. Inf. Comput. Sci. 1998, 38, 457-462.
(27) Kuespert, D. R.; McAvoy, T. J. Knowledge extraction in chemical process control. Chem. Eng. Commun. 1994, 130, 251-264.
Received for review July 8, 1999 Revised manuscript received April 10, 2000 Accepted April 17, 2000 IE9904899