Anal. Chem. 2000, 72, 5322-5330

Spectroscopic Monitoring of Batch Reactions for On-Line Fault Detection and Diagnosis

Johan A. Westerhuis, Stephen P. Gurden, and Age K. Smilde*

Process Analysis & Chemometrics, Department of Chemical Engineering, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands

This paper presents a general methodology for using spectroscopic measurements directly for on-line process monitoring and for the detection and diagnosis of disturbances. An application of the on-line monitoring of a chemical batch reaction using UV-visible spectroscopy is discussed in detail. Successful historical batches are used to build a statistical model of the batch reaction. The model uses external information such as the pure spectra of the compounds and their concentration profiles to improve the interpretability. Control charts are developed for on-line monitoring of new batches. It is shown that this model is capable of detecting erroneous batches. In combination with contribution plots, the actual cause of the disturbances can be diagnosed.

Chemical batch processes play an important role in the production of high-quality products such as polymers, pharmaceuticals, and biochemicals. In general, batch processes exhibit some common-cause batch-to-batch variation that is considered acceptable. However, sometimes batches also show special variation that is different from the common-cause variation, arising from differences in the feedstock, such as impurities, or from deviations from optimal operating conditions such as temperature, pressure, and pH. Such abnormal conditions can lead to the production of batches with poor quality. Multivariate statistical process control (MSPC) of batch processes, to ensure that the batch progresses to a high-quality product or to detect and diagnose faults, was introduced by Nomikos and MacGregor¹⁻³ and further pursued by others.⁴⁻⁸ All these different methods of applying MSPC to batch processes make use of standard engineering process variables such as temperatures, pressures, and flow rates that are measured on a frequent basis. In this paper, the use of spectroscopic techniques for on-line monitoring of batch reactions for detection and diagnosis of process faults will be introduced.

On-line spectroscopy is particularly useful for statistical process control of chemical batch reactions. Spectroscopic techniques that can be implemented on-line, such as UV-visible, NIR, and Raman, provide a rich source of information about conditions within a chemical system. They have the advantage of being fast, robust, and nondestructive and are therefore an interesting alternative to classical process analytical methods such as chromatographic systems. Furthermore, chemical information such as the concentration of spectroscopically active chemical species present in the reaction, changes in solvent conditions, and the presence of spectroscopically active impurities is made available and can be used to follow whether the reaction is proceeding correctly. This is a major advantage compared to standard process measurements of physical conditions within a chemical process such as temperatures, pH, pressures, and flow rates, where this information may be hard to obtain. These "engineering" variables relate only indirectly to the actual chemistry occurring in the process, the idea being that if the physical conditions are in control, the chemical process will also remain in control. Therefore, fast on-line spectroscopic techniques are of increasing interest for use in the on-line monitoring of processes for detection and diagnosis of process faults.

A large number of applications using spectroscopic techniques to monitor chemical batch reactions have been presented in the literature. On-line UV-visible spectroscopy was used for monitoring batch reactions.⁹,¹⁰ On-line NIR spectroscopy was used to monitor reaction completion¹¹ and powder blending.¹² Raman spectroscopy was applied for process and synthesis monitoring.¹³,¹⁴ An extensive list of on-line spectroscopic monitoring applications can be found in a review by Workman et al.¹⁵ Most of these applications deal with the spectral data using a calibration model for calculation of the concentration of some compounds in the reaction or for other properties that are hard to measure such as conversion, viscosity, etc., which can then be used subsequently in a monitoring scheme.

In the present paper, the spectral data will be used much more extensively. A chemical batch reaction will be monitored using spectroscopic measurements for fault detection and diagnosis. A case study of a two-step conversion reaction will be used to show the ideas of the analysis of historical data, model building, postbatch analysis, on-line monitoring, and diagnosis of process faults. A special type of process model will be used that incorporates external knowledge of the chemical reaction at hand. The model also captures differences in the reaction rate between the separate batches. It will be shown that even small disturbances are detected directly by the on-line monitoring system. Furthermore, it will be shown that fault diagnosis is improved by using a combination of spectroscopic measurements and contribution plots.

The philosophy of monitoring chemical batch reactions with spectroscopy for fault detection and diagnosis is very much that of traditional SPC methods, where the behavior of new batches is compared against a reference distribution based on historical data from previous successful batches. The central idea is that the systematic variation in the successful batches is modeled. New batches should fit the model, and their residuals should be of the same order of magnitude as the residuals of the historical successful batches. For ease of understanding, the theory of on-line monitoring of chemical batch processes with spectroscopy will be described using a specific example introduced in the Experimental Section.

* Corresponding author: (phone) +31-20-5255062; (fax) +31-20-5255604; (email) [email protected].
(1) Nomikos, P.; MacGregor, J. F. AIChE J. 1994, 40, 1361-75.
(2) Nomikos, P.; MacGregor, J. F. Technometrics 1995, 37, 41-59.
(3) Nomikos, P.; MacGregor, J. F. Chemom. Intell. Lab. Syst. 1995, 30, 97-108.
(4) Rannar, S.; MacGregor, J. F.; Wold, S. Chemom. Intell. Lab. Syst. 1998, 41, 73-81.
(5) Wold, S.; Kettaneh-Wold, N.; Friden, H.; Holmberg, A. Chemom. Intell. Lab. Syst. 1998, 44, 331-40.
(6) Boqué, R.; Smilde, A. K. AIChE J. 1999, 45, 1504-20.
(7) Louwerse, D. J.; Smilde, A. K. Chem. Eng. Sci. 2000, 55, 1225-35.
(8) Westerhuis, J. A.; Kourti, T.; MacGregor, J. F. J. Chemom. 1999, 13, 397-413.
(9) Bijlsma, S.; Louwerse, D. J.; Smilde, A. K. J. Chemom. 1999, 13, 311-29.
(10) Quinn, A. C.; Gemperline, P. J.; Baker, B.; Zhu, M.; Walker, D. S. Chemom. Intell. Lab. Syst. 1999, 45, 199-214.
(11) Maesschalck de, R.; Cuesta Sanchez, F.; Massart, D. L.; Doherty, P.; Hailey, P. Appl. Spectrosc. 1998, 52, 725-31.
(12) Ward, H. W., II; Sekulic, S. S.; Wheeler, M. J.; Taber, G.; Urbanski, F. J.; Sistare, F. E.; Norris, T.; Aldridge, P. K. Appl. Spectrosc. 1998, 52, 17-21.
(13) Svensson, O.; Josefson, M.; Langkilde, F. W. Chemom. Intell. Lab. Syst. 1999, 49, 49-66.
(14) Staden van, J. F.; Makhafola, M. A.; Waal de, D. Appl. Spectrosc. 1996, 50, 991-94.
(15) Workman, J., Jr.; Veltkamp, D. F.; Doherty, S.; Anderson, B. B.; Creasy, K. E.; Koch, M.; Tatera, J. F.; Robinson, A. L.; Bond, L.; Burgess, L. W.; Bokerman, G. N.; Ullman, A. H.; Darsey, G. P.; Mozayeni, F.; Bamberger, J. A.; Stautberg Greenwood, M. Anal. Chem. 1999, 71, 121R-180R.

5322 Analytical Chemistry, Vol. 72, No. 21, November 1, 2000
10.1021/ac000532y CCC: $19.00
© 2000 American Chemical Society. Published on Web 09/23/2000

EXPERIMENTAL SECTION

The chemical batch reaction used in this paper to introduce the concept of on-line monitoring using spectroscopic techniques has been described previously in the literature.⁹ A two-step consecutive reaction of 3-chlorophenylhydrazonopropane dinitrile (A) with 2-mercaptoethanol (B) forms an intermediate adduct (C). This adduct is hydrolyzed to give the main product 3-chlorophenylhydrazonocyanoacetamide (D) and ethylene sulfide (E):

$$\mathrm{A} + \mathrm{B} \xrightarrow{k_1} \mathrm{C} \xrightarrow{k_2} \mathrm{D} + \mathrm{E} \tag{1}$$
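As an illustration of eq 1, the following minimal sketch computes concentration-time profiles for A, C, and D, assuming pseudo-first-order kinetics for both steps (B in large excess) and the standard closed-form solution for a consecutive reaction. The function name and the rate constants are hypothetical and chosen only for illustration; the time grid and initial concentration follow the experimental description.

```python
import numpy as np

def consecutive_first_order(t, k1, k2, a0):
    """Profiles of A, C, D for A -> C -> D with first-order steps (requires k1 != k2)."""
    t = np.asarray(t, dtype=float)
    a = a0 * np.exp(-k1 * t)                                        # reactant decay
    c = a0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))   # intermediate
    d = a0 - a - c                                                  # product from mass balance
    return a, c, d

# One spectrum every 20 s over a 2700 s run (t = 0 included)
t = np.arange(0.0, 2700.0 + 20.0, 20.0)

# k1 and k2 are hypothetical values in s^-1, not estimates from the paper
a, c, d = consecutive_first_order(t, k1=5e-3, k2=1e-3, a0=54e-6)
```

Profiles of this kind correspond to the curves in the bottom plot of Figure 1, although the actual rate constants there are estimated from the measured spectra.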

In this case, B is present in large excess (276:1) with respect to A, and therefore, pseudo-first-order kinetics can be assumed. Only compounds A, C, and D are spectroscopically active in the UV-visible window considered, and the reaction is pH dependent. Figure 1 (top plot) shows the UV-visible spectra of the pure reactant (A), intermediate (C), and product (D).

The reaction took place in a quartz cuvette using a reactant volume of 2.5 mL with an initial concentration of reactant A of 54 × 10⁻⁶ mol·L⁻¹. A water bath and thermocouple were used to maintain a temperature of 25 °C. A Hewlett-Packard 8453 UV-visible spectrophotometer with diode-array detection was used to record spectra with a path length of 1.00 cm. Spectra within the range 200-600 nm were recorded every 20 s in a total run time of 2700 s. Only the wavelength range 300-500 nm is analyzed here, and the first six spectra measured at the start of the reaction were discarded, as they were not found to be reproducible.

Figure 1. (Top) UV-visible spectra of reactant, intermediate, and product of the chemical reaction. (Bottom) Concentration profiles of reactant, intermediate, and product of the chemical reaction.

It has been shown previously⁹ that applying a consecutive first-order kinetic reaction model to the measured spectra yields
estimates of the reaction rate constants ($k_1$ and $k_2$ in eq 1) for each batch run. From these, the concentration-time profiles for the reactant, intermediate, and product can be calculated. The averages of these estimated concentration profiles taken over all batches are given in Figure 1 (bottom plot).

A set of 37 batches, both successful and unsuccessful, was available for development of the monitoring system. In addition to the 37 batches, 9 extra experiments were run in which pH disturbances were introduced. Halfway through the reaction, 10 µL of NaOH with varying concentrations between 0.0251 and 0.4000 mol·L⁻¹ was added to the reaction mixture. The addition of NaOH led to a maximum increase of 0.2 pH unit. These disturbed batch runs will be monitored on-line to see how quickly the disturbance can be detected. Figure 2 shows the data of a good batch and of a disturbed batch. At the end of the Results section these batches are discussed in detail.

Figure 2. Data of a bad batch and a good batch.

THEORY

In the Theory section, the different steps of the on-line monitoring will be introduced, and in the Results and Discussion section these steps will be applied to the data described above. It is important to remember that statistical process control schemes for on-line monitoring can only be applied for a specified recipe and when the process is used in the same setup. If the process is changed, e.g., after an optimization step, it might be necessary to update the monitoring schemes for the new setup of the system.

Spectroscopic Data from a Chemical Batch Reaction. The spectroscopic data of each of the batches described in the Experimental Section can be arranged as a matrix of size number of wavelengths (201) × number of time points (130). Each of these matrices contains the data for one batch as shown in Figure 2. The UV-visible spectra of the reaction are plotted for 45 min. The first spectrum mainly equals the spectrum of the pure reactant. After a few minutes, the spectrum of the intermediate becomes visible and slowly disappears again. For the modeling and monitoring of the chemical batch reaction, the variation between the batch runs is of main interest
and this variation will be modeled. All batches are arranged in a three-way array X (I × J × K), where I is the number of batches, J is the number of wavelengths, and K is the number of time points.

Postanalysis of Historical Batches. The first important step in statistical process control of batch reactions is the analysis of historical data to select a set of batches that will be used for the development of the process model. Such a set is considered to capture the common-cause variation that can also be expected for a new batch run. This set of batches is considered to be obtained under normal operating conditions and hereafter will be referred to as the NOC set. If a new batch run is not significantly different from the common-cause variation in the NOC set, the new batch run is considered to be good. If the variation of a new batch run is significantly different from the NOC variation, the new batch is out of control. The main goal of a monitoring system for a chemical reaction is to detect such bad batch runs and find the cause of the disturbance. This cause may be removed for future batch runs, thus improving the process. The detection of bad batches in the historical data can be performed by building a model of the data and checking the control charts to see whether any batches fall outside of the control limits. The ideas of process modeling and multivariate control charts will be discussed below.

Model of NOC Data. The focus of this paper is to consider the data as batch process data and to build a model for the on-line monitoring of the process for fault detection and diagnosis. This means that the main interest is in modeling the differences between the I batch runs caused by slight variations in experimental conditions, such as temperature, pH, and instrumental drift. This is different from other applications using spectroscopic data, such as curve resolution to find the identity and concentration of compounds in complex mixtures or the calculation of kinetic parameters for the chemical reaction.⁹ Therefore, the techniques to deal with the data will also be different. For convenience, in the remainder of the paper, the three-way array X will always be considered matricized (unfolded) to a matrix X (I × JK) where the batch direction is maintained. Now the first J columns of X contain the spectrum obtained at time point $k = 1$ for all I batches.
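The unfolding described above can be sketched with NumPy as follows. This is a minimal illustration on a toy array, assuming the three-way data are stored as `X3[i, j, k]` (batch, wavelength, time); the variable names are hypothetical. The wavelength index is made to run fastest so that the first J columns hold the spectra at the first time point.

```python
import numpy as np

I, J, K = 4, 3, 5   # toy sizes; the data in the paper have J = 201 and K = 130
X3 = np.arange(I * J * K, dtype=float).reshape(I, J, K)   # X3[i, j, k]

# Move time ahead of wavelength, then flatten each batch into one row.
# With 0-based indices, column c = k * J + j holds wavelength j at time k.
X = X3.transpose(0, 2, 1).reshape(I, K * J)

first_spectra = X[:, :J]   # spectra at the first time point for all batches
```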

The first step in modeling X is mean centering. Each column of X is centered to mean zero,

$$x^{c}_{ijk} = x_{ijk} - \bar{x}_{jk} \tag{2}$$

where $x_{ijk}$ is the absorbance of batch i at wavelength j and time period k and $\bar{x}_{jk}$ is the mean absorbance of all batches for wavelength j at time period k. By mean centering X, for each time period k, the mean spectrum of all I batches for that specific time period is removed from the data. In the remainder of the text, X is always assumed to be mean-centered. By mean centering in the batch direction, the nonlinear trajectory of the whole spectrum in time is removed from the data. This only leaves the deviation from the mean trajectory for all batches. This remaining variation is considered to be the common-cause variation of the process that can be expected for new batches.

To describe the variation in the NOC set of batch runs, the total variation (after centering) is divided into systematic variation, which is described in a specific model of the process, and residual variation that cannot be modeled. The process model is what makes the monitoring multivariate because it models the correlation between the variables and time points in the data. A general empirical model of the data can be described as follows:

$$X = AP^{T} + E \tag{3}$$

In this general model, X (I × JK) contains the spectral data, $AP^{T}$ describes the systematic variation within the NOC data, and the residuals E (I × JK) contain the part of X not described by the model. P (JK × R) describes the nature of the systematic variation and A (I × R) describes its magnitude for each specific batch. Any structure may be applied to P, and examples of PARAFAC, Tucker3, or Tucker1 structures can be found in the literature.²,⁷,⁸,¹⁶⁻¹⁹ The dimension of the model, R, is usually much smaller than I and JK.

(16) Dahl, K. S.; Piovoso, M. J.; Kosanovich, K. A. Chemom. Intell. Lab. Syst. 1999, 46, 161-180.
(17) Bro, R. University of Amsterdam (NL) & Royal Veterinary and Agricultural University (DK), 1998.
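To make eqs 2 and 3 concrete, the sketch below centers an unfolded data matrix and fits an unconstrained bilinear model by truncated SVD, which corresponds to a Tucker1 (unfold-PCA) structure for P. This is only one of the possible structures mentioned above and is not the gray model applied later in the paper. The function name and the random test data are assumptions made for illustration.

```python
import numpy as np

def fit_bilinear_model(X, R):
    """Center columns (eq 2) and fit X = A P^T + E (eq 3) with R components via SVD."""
    mean = X.mean(axis=0)            # mean trajectory over batches
    Xc = X - mean                    # deviations from the mean trajectory
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:R].T                     # loadings (JK x R), orthonormal columns
    A = Xc @ P                       # scores (I x R)
    E = Xc - A @ P.T                 # residuals (I x JK)
    return mean, A, P, E

# Synthetic stand-in for 27 NOC batches with JK = 60 unfolded variables
rng = np.random.default_rng(0)
X = rng.normal(size=(27, 60))
mean, A, P, E = fit_bilinear_model(X, R=3)

# Fraction of the centered variation described by the model
explained = 1.0 - (E ** 2).sum() / ((X - mean) ** 2).sum()
```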

Control Charts. Control charts are used in monitoring chemical batch processes to detect whether the variation in the data of a new batch is significantly different from the variation in the set of NOC batches. They are generally based on the residuals statistic (Q-chart) and on Hotelling's statistic (D-chart). For both these charts, control limits are calculated using the NOC batch runs. For monitoring new batches, the process data of the new batch $x_{\mathrm{new}}$ (JK × 1) is, after centering, projected on the model.

$$a_{\mathrm{new}}^{T} = x_{\mathrm{new}}^{T} P (P^{T}P)^{-1}, \qquad e_{\mathrm{new}}^{T} = x_{\mathrm{new}}^{T} - a_{\mathrm{new}}^{T} P^{T} \tag{4}$$
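The projection in eq 4 is an ordinary least-squares fit of the centered new batch onto the columns of P. A minimal sketch is given below; the function name `project` and the synthetic data are assumptions made for illustration, and P is deliberately not orthonormal so that the $(P^{T}P)^{-1}$ term matters.

```python
import numpy as np

def project(x_new, mean, P):
    """Scores and residuals of a new batch according to eq 4."""
    xc = x_new - mean                            # center with the NOC mean trajectory
    a, *_ = np.linalg.lstsq(P, xc, rcond=None)   # a = (P^T P)^-1 P^T xc
    e = xc - P @ a                               # part of xc not described by the model
    return a, e

# Synthetic example: JK = 12 unfolded variables, R = 2 components
rng = np.random.default_rng(1)
P = rng.normal(size=(12, 2))
mean = rng.normal(size=12)
x_new = mean + P @ np.array([0.5, -1.0]) + 0.01 * rng.normal(size=12)
a_new, e_new = project(x_new, mean, P)
```

Solving the least-squares problem directly avoids forming the inverse of $P^{T}P$ explicitly, which is usually the more stable choice numerically.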

The new scores $a_{\mathrm{new}}$ (R × 1) and the new residuals $e_{\mathrm{new}}$ (JK × 1) are used to calculate a new D-statistic value and a new Q-statistic value for off-line monitoring of the new batch. If one or both of these statistics are above the control limits, the batch is out of control. This means that the new batch is significantly different from the ones in the NOC set, which are considered to capture the common-cause variation.

Q-Chart. The residuals statistic or Q-statistic equals the sum of the squared residuals of a new batch $x_{\mathrm{new}}$. After projection of $x_{\mathrm{new}}$ onto the model, residuals $e_{\mathrm{new}}$ are obtained (see eq 4).

$$Q_{\mathrm{new}} = \sum_{j=1}^{J} \sum_{k=1}^{K} e_{\mathrm{new},jk}^{2} \sim g\chi^{2}_{h} \tag{5}$$
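Eq 5 treats Q as a scaled χ² variable. One common way to obtain the weight g and the degrees of freedom h, assumed here for illustration, is to match the mean and variance of the Q values of the NOC batches. The sketch below does this and then computes a control limit; the χ² quantile is approximated with the Wilson-Hilferty formula so that only NumPy and the standard library are needed. The function names and the five Q values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def q_statistic(e):
    """Sum of squared residuals of one batch (eq 5)."""
    return float(np.sum(np.square(e)))

def chi2_quantile(p, h):
    """Wilson-Hilferty approximation to the p-quantile of chi-square with h d.o.f."""
    z = NormalDist().inv_cdf(p)
    return h * (1.0 - 2.0 / (9.0 * h) + z * np.sqrt(2.0 / (9.0 * h))) ** 3

def q_limit(q_noc, p=0.95):
    """Control limit g * chi2_h(p), with g and h matched to the NOC Q values."""
    m = np.mean(q_noc)
    v = np.var(q_noc, ddof=1)
    g = v / (2.0 * m)          # from E[Q] = g*h and Var[Q] = 2*g^2*h
    h = 2.0 * m ** 2 / v
    return g * chi2_quantile(p, h), g, h

# Hypothetical Q values of five NOC batches
q_noc = np.array([8.0, 12.0, 10.0, 9.0, 11.0])
limit95, g, h = q_limit(q_noc, 0.95)
limit99, _, _ = q_limit(q_noc, 0.99)
```

With so few batches the estimates of g and h are very uncertain; the example only shows the mechanics of the calculation.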

Here $e_{\mathrm{new},jk}$ is the typical element of $e_{\mathrm{new}}$, the residuals of the new batch. The summation of all squared residuals, $Q_{\mathrm{new}}$, follows a χ² distribution with h degrees of freedom and a weight g to correct for the magnitude of $Q_{\mathrm{new}}$. The weight g and degrees of freedom h can be estimated from the residuals E of the NOC batches.²,²⁰ Control limits of 95 and 99% for the Q-chart can be calculated using the χ² distribution at the specific significance level. In this paper, the standardized Q-chart is used, which is a more sensitive version of the Q-statistic that relates the size of the residual to its expected value.²¹ If the Q-statistic is outside its limits, it means that the data do not fit the model and that a new type of variation is present in the data that was not present in the NOC data. In that case, the model is not valid and the D-statistic should not be used. If the Q-statistic is inside its limits, the model is valid, and the D-chart should be examined.

D-Chart. The Hotelling statistic is a Mahalanobis distance in the reduced model space with dimension R between the center of that space and the projection of the new batch on that space.

$$D_{\mathrm{new}} = a_{\mathrm{new}}^{T} S^{-1} a_{\mathrm{new}} \sim \frac{R(I^{2}-1)}{I(I-R)}\, F(R, I-R) \tag{6}$$

Here, $D_{\mathrm{new}}$ is the D-statistic for the new batch, $a_{\mathrm{new}}$ is the score for this batch, and $S^{-1}$ is the inverse of the covariance matrix of the scores of the set of I NOC batches. $F(R, I-R)$ represents the F-distribution with R and I − R degrees of freedom.²,²² The D-chart considers the systematic variation in the data. If, for a new batch, only a high D-statistic is obtained, the model is still valid, but the batch is outside of the range of the model. Control limits of 95 and 99% for the D-chart can be obtained from the F-distribution at the specific significance levels.

On-Line Spectroscopic Monitoring. With on-line spectroscopic monitoring, the chemical batch reaction can be followed during its operation. This can be advantageous for a number of reasons. The first reason for on-line monitoring is safety. If a process is going out of control, it can be stopped before any damage is done to the analyzers or to the reaction vessel. Furthermore, in the case where a bad batch is stopped, a new batch can be started earlier, which leads to a higher production rate. For some batches, the total process time is not constant and may depend on the quality of the feed. On-line monitoring can be used to predict the end time of a batch while it is still running. This can be advantageous with respect to scheduling. Another reason for on-line monitoring is the on-line prediction of process quality. In many cases, the product of high-quality batches is blended with batches that have a lower product quality in order to have the total mixture within specification limits. The early prediction of quality while the batch is still running may also help in scheduling the blending strategy. However, in this paper no models for prediction are developed.

A major problem in on-line monitoring is the fact that not all of the data of the complete batch run are available. At time k, the size of the new data $x_{\mathrm{new}}$ equals (Jk × 1), and therefore, $x_{\mathrm{new}}$ cannot be projected on the model P (JK × R) to give the scores $a_{\mathrm{new}}$ that are used for the calculation of $D_{\mathrm{new}}$ and the residuals $e_{\mathrm{new}}$. Several approaches to deal with this problem can be found in the literature.²,⁴,⁵,⁷ In this paper, the approach of filling in future measurements (i.e., for the remainder of the batch run) with current deviations is used.² This approach assumes that if, for a new batch $x_{\mathrm{new}}$, data are available up to time k, the difference between the spectrum of the new batch and the average spectrum of the I NOC batch runs at time k remains constant for the rest of the batch duration. Thus, at each time point k, $x_{\mathrm{new}}$ (partly measured and partly filled in) is available. Then, as in eq 4, for each time point k, $a_{\mathrm{new},k}$ and $e_{\mathrm{new},k}$ can be calculated, and from these $D_k$ can be obtained (using a time-dependent $S_k$ in eq 6). However, the sum of the squared residuals over all time points, $Q_{\mathrm{new}}$, is not a good indicator of a process disturbance at a specific time point. It is also affected by the errors associated with the filling in of future measurements. Therefore, for on-line monitoring, the squared prediction error ($\mathrm{SPE}_k$) is used

$$\mathrm{SPE}_k = \sum_{c=(k-1)J+1}^{kJ} e_{\mathrm{new},c}^{2} \tag{7}$$

The $\mathrm{SPE}_k$ is the sum of all squared residuals at time k, summed over all J wavelengths. The $\mathrm{SPE}_k$ is able to track the particular instant that new variation, not present in the NOC set, appears in the new batch.

(18) Bro, R. Chemom. Intell. Lab. Syst. 1998, 46, 133-47.
(19) Wise, B. M.; Gallagher, N. B.; Watts Butler, S.; White, D. D.; Barna, G. G. J. Chemom. 1999, 13, 379-96.
(20) Box, G. E. P. Ann. Statistics 1954, 25, 290-302.
(21) Westerhuis, J. A.; Gurden, S. P.; Smilde, A. K. J. Chemom. 2000, 14, 335-49.
(22) Tracy, N. D.; Young, J. C.; Mason, R. L. J. Qual. Technol. 1992, 24, 88-95.
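The on-line procedure described above (fill in future measurements with the current deviation, project as in eq 4, then evaluate $D_k$ and $\mathrm{SPE}_k$) can be sketched as follows. This is a minimal illustration under stated assumptions: the data are unfolded so that each block of J consecutive entries is one spectrum, k is counted from 1, and the caller supplies the inverse of the time-dependent score covariance. The function name and all test data are hypothetical.

```python
import numpy as np

def online_statistics(x_partial, k, mean, P, S_k_inv, J):
    """D_k and SPE_k after k spectra, filling the future with the current deviation.

    x_partial holds the k spectra measured so far (length k*J);
    mean is the NOC mean trajectory (length J*K); P is JK x R.
    """
    K = mean.size // J
    xc = np.empty(mean.size)
    xc[:k * J] = x_partial - mean[:k * J]        # measured part, centered
    current = xc[(k - 1) * J:k * J]              # deviation spectrum at time k
    xc[k * J:] = np.tile(current, K - k)         # assume it persists to the end
    a, *_ = np.linalg.lstsq(P, xc, rcond=None)   # scores as in eq 4
    e = xc - P @ a                               # residuals as in eq 4
    D_k = float(a @ S_k_inv @ a)                 # eq 6 with time-dependent S_k
    SPE_k = float(np.sum(e[(k - 1) * J:k * J] ** 2))   # residuals of time k only
    return D_k, SPE_k

# Synthetic check with J = 4 wavelengths, K = 6 time points, R = 2 components
J, K, R = 4, 6, 2
rng = np.random.default_rng(2)
P = np.linalg.qr(rng.normal(size=(J * K, R)))[0]
mean = rng.normal(size=J * K)
S_inv = np.eye(R)

x_full = mean + P @ np.array([1.0, -2.0])        # batch lying exactly in the model
D_end, SPE_end = online_statistics(x_full, K, mean, P, S_inv, J)
D_avg, SPE_avg = online_statistics(mean[:2 * J], 2, mean, P, S_inv, J)
```

Restricting the sum to the residuals of the current time point keeps the filled-in future values out of $\mathrm{SPE}_k$, which is the motivation given in the text for preferring it over $Q_{\mathrm{new}}$ on-line.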


Figure 3. Parameters estimated for the gray model.

Fault Diagnosis. If the on-line $D_k$ and $\mathrm{SPE}_k$ charts indicate that a new batch is out of control, then contribution plots can be used to diagnose the cause of the disturbance.²³⁻²⁵ Contribution plots show how much each part of the spectrum contributes to the statistic that is out of control. If the $\mathrm{SPE}_k$ caused an alarm at time k, all residuals for that time point can be plotted. The spectrum observed is the difference between the spectrum of the new batch and the mean of all NOC spectra at time k. This difference spectrum can be interpreted to find differences in concentration of the compounds or to identify new compounds in the reaction. The limits in these plots are obtained from the NOC data.²⁵

If the D-statistic is outside the control limits, the contributions to the D-statistic should be examined. These contributions show the part of the spectrum that was responsible for the high D-statistic. This information can be used for diagnosing the actual cause of the disturbance. Control limits in the contribution plot are obtained from the NOC data.²⁵

In many process models, the scores A of the NOC data are orthogonal. In that case, they can be interpreted separately. This means that if only one of the scores is outside its control limits, then only this score needs to be considered. This usually simplifies the diagnosis of the fault. The control limits for the separate scores can be obtained from the NOC data.²

(23) Nomikos, P. ISA Trans. 1996, 35, 259-66.
(24) Miller, P.; Swanson, R. E.; Heckler, Ch. E. Appl. Math. Comput. Sci. 1998, 8, 775-92.
(25) Westerhuis, J. A.; Gurden, S. P.; Smilde, A. K. Chemom. Intell. Lab. Syst. 2000, 51, 93-114.
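A contribution plot for $\mathrm{SPE}_k$ can be sketched as the residual spectrum of the new batch at time k, compared wavelength by wavelength with the spread of the NOC residuals at the same time point. The limits used below (NOC mean ± 3 standard deviations) are a hypothetical choice made only for illustration; the text defers the actual limit calculation to ref 25. Function names and data are likewise assumptions.

```python
import numpy as np

def spe_contributions(e_new, k, J):
    """Residual of each wavelength at time k (k counted from 1)."""
    return e_new[(k - 1) * J:k * J]

def noc_residual_limits(E_noc, k, J, n_sigma=3.0):
    """Hypothetical per-wavelength limits from NOC residuals: mean +/- n_sigma * std."""
    block = E_noc[:, (k - 1) * J:k * J]
    centre = block.mean(axis=0)
    spread = block.std(axis=0, ddof=1)
    return centre - n_sigma * spread, centre + n_sigma * spread

# Toy data: 27 NOC residual vectors with J = 5 wavelengths and K = 3 time points
J, K, k = 5, 3, 2
E_noc = np.tile(np.linspace(-1.0, 1.0, 27)[:, None], (1, J * K))
e_new = np.zeros(J * K)
e_new[(k - 1) * J + 2] = 10.0      # large residual at the third wavelength, time k

contrib = spe_contributions(e_new, k, J)
lower, upper = noc_residual_limits(E_noc, k, J)
flagged = np.flatnonzero((contrib < lower) | (contrib > upper))
```

Keeping the sign of each residual, rather than squaring it, is what allows the plot to be read as a difference spectrum.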


In the model presented in this application, the scores are only slightly correlated, and therefore, the separate scores can also be used for fault diagnosis. Thus, if for a new batch one of the score values is higher or lower than usual, the feature corresponding to that score may be the reason that the new batch differs from the NOC batches.

RESULTS AND DISCUSSION

Postanalysis of Historical Batches. During development of the process model it became clear that some of the historical batches did not fit the model. This may have had a number of causes, such as a reaction rate constant that was different from the other batches due to pH or temperature differences or an abnormal baseline drift of the spectrometer. All batches that did not fit the model were removed from the data. In the monitoring charts presented below, the removed batches are clearly identified as different from the NOC batches. The final set of NOC batches consists of 30 batches. However, from these 30 batches, 3 batches were kept aside for validation of the process model, which leaves 27 NOC batches for building the process model.

(26) Gurden, S. P.; Westerhuis, J. A.; Bijlsma, S.; Smilde, A. K. J. Chemom., in press.

Figure 4. Squared residuals in the batch mode, wavelength mode, and time mode.

Model of the NOC Data. The 27 NOC batches are used to construct the empirical process model. In this application, a gray model is used as described in previous work.²⁶ The gray model was used to incorporate external information from the process into the model. The white or hard part of the model uses the
spectra of the pure reactant, intermediate, and product and the concentration profiles of these compounds. The black or empirical part of the model describes systematic variation that did not correspond to known sources of variation. The white part describes the main part of the systematic variation in the data. This means that most of the variation in the data is well understood. The gray model describes 99.77% of the variation of the 27 NOC batches. This high percentage of systematic variation explained by the model is due to the very low noise level of the data. This is very different from the percentages often seen for industrial process data of temperatures and pressures, where usually 30-50% of the variation can be described.

Figure 3 shows the estimated parameters of the gray model. Scores $a_1, ..., a_3$ show the differences between the NOC batches, $b_1, ..., b_5$ are loadings in the wavelength mode, and $c_1, ..., c_5$ are loadings in the time mode. Loadings $b_1, ..., b_3$ are the loadings of the white part of the model for the wavelength mode and are fixed to be equal to the spectra of the reactant, intermediate, and product. The corresponding loadings $c_1, ..., c_3$ are restricted to equal the concentration trajectories of the reactant, intermediate, and product. Scores $a_1$ show the differences between the NOC batches for this hard information. The black part of the model, scores $a_2$ and $a_3$ and loadings $b_4$, $b_5$ and $c_4$, $c_5$, is more difficult to interpret. The second black component ($a_3$, $b_5$, $c_5$) is only active in the beginning of the reaction (see $c_5$). In the wavelength mode ($b_5$), a difference spectrum between the spectra of the reactant and the intermediate is observed. The concentration of the reactant is too high and the concentration of the intermediate is too low. This is caused by a lower $k_1$ value than usual. This component thus shows the variation in $k_1$ values between the batches. The first black component ($a_2$, $b_4$, $c_4$) shows a cyclic effect
in the batch mode ($a_2$). It probably shows a baseline drift of the spectrometer due to temperature fluctuations in the reactor that are not perfectly controlled by the water bath. A detailed description of this gray model was presented by Gurden et al.²⁶ The loadings in the wavelength mode, B = [$b_1, ..., b_5$], and in the time mode, C = [$c_1, ..., c_5$], are combined to give the general model loadings P, and the scores [$a_1, ..., a_3$] together form A, the general model scores (see eq 3).

Although the gray model describes 99.77% of the NOC data, there still are some residuals left that cannot be modeled. Figure 4 shows the residuals for the different modes. In the top left plot, the residuals in the batch mode are shown. None of the batches has significantly higher residuals compared to the other batches. The residuals in the wavelength mode (top right) show two spikes in the data, which are artifacts of the instrument that could not be modeled. The bottom plot in Figure 4 shows the residuals in the time mode. The beginning of the batch is hard to model due to mixing effects and therefore has the highest residuals. This is often seen in the modeling of chemical batch reactions.

Figure 5. Off-line (top) D- and (bottom) standardized Q control charts with 95% (dashed line) and 99% (solid line) control limits. Batches 1-27 are the NOC batches represented by small dots, batches 28-30 are the good validation batches, batches 31-39 are the pH disturbed batches, and batches 40-46 are the outliers removed from the historical data.

Control Charts. Using the gray model presented in Figure 3, control charts were developed to monitor the batches off-line, i.e., after the batch run has finished completely. Figure 5 shows the D-chart (top plot) and the standardized Q-chart (bottom plot) calculated from the gray model of the NOC data. In both plots, the dashed line and the solid line represent the 95 and 99% control limits, respectively. The first 27 batches are the NOC batches, represented by small dots, which all fall well within the control limits. Batches 28-30 are the three successful batches that were kept aside for validation of the model. It is clear that these three batches are well within the control limits and thus are classified as good batches. Batches 31-39 are the batches that were
intentionally disturbed by adding 10 µL of NaOH after ∼20 min. All of these batches are outside the 95% control limit and seven out of nine are outside the 99% control limit in the standardized Q-chart. Furthermore, batches 33 and 38 are also far outside the 99% control limit in the D-statistic. This means that all of the disturbed batches are classified as erroneous batches. The batches are detected in the Q-chart because a new feature, a raise in pH halfway the reaction, is present in these batches that is not present in the NOC batches and, therefore, also not present in the model. Batches that are outside in both control charts are, besides the pH disturbance described above, also outside the range of the model. The results show that the gray model is capable of distinguishing between good and bad batches. Batches 40-46 are the ones that were detected as outliers in the analysis of the historical data. Batches 43, 45, and 46 are clearly out in the D-statistic and batch 40 is outside the 95% control limit of the D-chart. This means that these batches are outside of the range of the model. All batches except batch 40 are also out in the Q-chart. This means that the residuals are higher than usual and that new variation is present in these batches that is not in the NOC batches. These results show that these batches are clearly different from the NOC batches and this justifies their rejection as outliers. On-Line Spectroscopic Monitoring and Fault Diagnosis of New Batches. In this section, some of the batches that were not used in the model building will be monitored on-line. The on-line monitoring charts will be used to follow whether the batch is still in control. These plots represent the on-line monitoring of the batch. Every 20 s a new measurement is obtained, the future 5328 Analytical Chemistry, Vol. 72, No. 21, November 1, 2000

Figure 6. On-line monitoring (top) D- and (bottom) SPE chart for batch 37. In both plots, the 95% control limit and the 99% control limit are represented by the dashed and solid lines, respectively. It is clearly observed that the batch goes outside the control limits in the SPE chart after 23 min.
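A minimal sketch of how on-line D and SPE values such as those charted in Figure 6 might be computed with the current deviations fill-in. It does not reproduce the gray model of this paper: a generic PCA-type model on batch-wise unfolded, mean-centered data stands in for it, and the function name `online_d_spe`, the array layout, and the synthetic data are assumptions made only for illustration. The D value is the plain quadratic form of the scores without any F-distribution scaling.

```python
import numpy as np


def online_d_spe(x_obs, mean_traj, P, score_cov, k):
    """On-line D and SPE at time index k (hypothetical helper).

    x_obs     : (k+1, J) spectra measured so far for the new batch
    mean_traj : (K, J) mean NOC trajectory
    P         : (K*J, A) orthonormal loadings of the unfolded model (assumed)
    score_cov : (A, A) covariance of the NOC scores
    """
    K, J = mean_traj.shape
    dev = np.empty((K, J))
    dev[: k + 1] = x_obs - mean_traj[: k + 1]
    # Current deviations approach: the deviation observed at time k is
    # assumed to persist for all future time points of the batch.
    dev[k + 1:] = dev[k]
    x = dev.ravel()                        # unfold to one row, time-major
    t = P.T @ x                            # scores of the completed batch
    resid = (x - P @ t).reshape(K, J)      # residuals per time and wavelength
    D = float(t @ np.linalg.solve(score_cov, t))
    spe_k = float(np.sum(resid[k] ** 2))   # SPE of the current time point only
    return t, D, spe_k


# Synthetic stand-in for NOC data (not the data of the paper).
rng = np.random.default_rng(0)
K, J, A, I = 30, 20, 3, 27
mean_traj = rng.random((K, J))
X = rng.normal(scale=0.05, size=(I, K * J))
X -= X.mean(axis=0)                        # centered NOC deviations
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:A].T
score_cov = np.cov(X @ P, rowvar=False)

k = 10
# A batch that follows the mean trajectory exactly gives zero D and SPE.
t, D, spe = online_d_spe(mean_traj[: k + 1], mean_traj, P, score_cov, k)
```

Calling the function once per new measurement and plotting `D` and `spe_k` against time would give charts of the kind shown in Figure 6, under the stated assumptions.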

measurements are filled in using the current deviations approach, and the scores a1, ..., a3 and the residuals are calculated. From these, Dk and SPEk are calculated and plotted in the monitoring charts. Figure 6 shows the on-line monitoring charts for batch 37. This batch was intentionally disturbed by adding a small amount of NaOH halfway through the reaction to increase the pH of the reaction. In the top plot, the on-line D-statistic is plotted, and the

Figure 7. Contribution plot to SPE for batch 37 after 25 min. The unmodeled residuals plotted here represent the difference between this batch and the mean of the NOC batches. The solid lines represent 99% confidence limits obtained from the NOC batches.
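As a rough illustration of a contribution plot like the one in Figure 7, the sketch below compares the signed residuals of one batch at a single time point with limits derived from NOC residuals at the same time point. How the 99% limits in the paper were computed is not stated; the normal-theory limits used here (mean ± 2.576 standard deviations), the function name `spe_contributions`, and all numbers are assumptions for illustration only.

```python
import numpy as np


def spe_contributions(resid_new, resid_noc, z=2.576):
    """Per-wavelength contributions to SPE with approximate 99% limits.

    resid_new : (J,) signed residuals of the monitored batch at one time point
    resid_noc : (I, J) residuals of the NOC batches at the same time point
    z         : normal quantile for two-sided 99% limits (assumed)
    """
    resid_new = np.asarray(resid_new, dtype=float)
    resid_noc = np.asarray(resid_noc, dtype=float)
    center = resid_noc.mean(axis=0)
    spread = resid_noc.std(axis=0, ddof=1)
    lower = center - z * spread
    upper = center + z * spread
    # Wavelengths whose residual falls outside the NOC-based limits.
    flagged = (resid_new < lower) | (resid_new > upper)
    spe = float(np.sum(resid_new ** 2))
    return spe, lower, upper, flagged


# Synthetic example (not the data of the paper).
rng = np.random.default_rng(1)
wavelengths = np.arange(300, 501, 10)                  # nm
resid_noc = rng.normal(scale=0.01, size=(27, wavelengths.size))

high = (wavelengths >= 380) & (wavelengths <= 450)     # higher absorbance
low = (wavelengths >= 310) & (wavelengths <= 370)      # lower absorbance
resid_new = np.zeros(wavelengths.size)
resid_new[high] = 0.1
resid_new[low] = -0.1

spe, lower, upper, flagged = spe_contributions(resid_new, resid_noc)
```

Keeping the residuals signed, rather than squaring them, preserves the direction of the deviation, which is what allows a region to be read as higher or lower absorbance than usual.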

logarithm of the SPE is shown in the bottom plot. The logarithm is used because the control limits become very low around 30 min, which makes the untransformed plot hard to examine. In both plots, the dashed line and the solid line again represent the 95 and 99% control limits, respectively. The problem in this batch starts after 20 min. A small shift in the D-statistic is observed, but the limits are not exceeded. However, at the same time point, the SPE starts to rise and goes outside the control limit at time 23 min. The SPE keeps increasing until the end of

the batch run. After 23 min, the charts show that something is wrong with this batch, and the problem is persistent. This shows that a small pH disturbance is detected rapidly and easily. To find the actual cause of this disturbance, a contribution plot is used. The contribution of each wavelength to the SPE at 25 min (that is, just after the problem was detected) is shown in Figure 7. The contribution plot shows a difference spectrum between this specific batch and the average of all NOC batches after 25 min in this batch run. The solid lines represent 99% control limits based on the residuals of the NOC data. However, these limits should only be used as a guideline to see whether a contribution is large compared to contributions obtained from the NOC batch runs. The contribution plot in Figure 7 shows that in the wavelength region between 380 and 450 nm, where only the intermediate absorbs, the absorbance is higher than usual. This means that the concentration of the intermediate after 25 min is higher than usual for this specific batch. Furthermore, in the wavelength region 310-370 nm, the absorbance is lower than usual. In this region, only the reactant and the product absorb. However, after 25 min, the reactant has almost completely disappeared, meaning that the decreased absorbance can only be caused by a lower than usual concentration of the product. An increased concentration of the intermediate and a decreased concentration of the product are probably caused by a lower than usual reaction rate constant k2, resulting from the increase in pH that was intentionally applied to this batch after ∼20 min.

The second batch that is monitored on-line is batch 40. This batch was one of the historical batches detected as an outlier. No information on the cause of the disturbance was available. The batch is just outside the off-line control limits in the D-chart, but not in the Q-chart. It is therefore on the borderline of what can

Figure 8. On-line monitoring (top) D- and (bottom) SPE chart for batch 40. In both plots, the 95% control limit and the 99% control limit are represented by the dashed and solid lines, respectively.


be detected and diagnosed using the gray model. The data fit the model but are outside the range of the model. The scores calculated are higher than the scores of the NOC batches. Figure 8 shows the on-line monitoring charts for batch 40. The SPE is outside its 95% limit for a short while, but the residuals did not show any specific pattern. After 5 min, the D-statistic exceeds the 95% control limit in the D-chart. After 8 min, the highest value of the D-statistic for this batch is reached. It is not far outside the limits, as was already clear from the off-line D-chart. Further investigation shows that score a2 differs from NOC behavior in the beginning of the batch. The scores a1, ..., a3 are nearly orthogonal and can be interpreted separately. This means that batch 40 appears to have some baseline problems. Closer examination of the wavelength range 300-350 nm showed that compound B had some minor absorbance in this region. A possible cause for this disturbance may therefore be that the concentration of compound B for this batch was somewhat higher than for the NOC batches. However, the actual cause of the problem is still not completely understood.

Besides the bad batches, the good validation batches that were left out of the NOC set for validation purposes were also monitored. For all of these batches, no violation of the control limits was observed. All of these batches are therefore classified as good batches. In Figure 2, batch 28, a good validation batch, was plotted on the left and batch 37, with the pH disturbance, was plotted on the right.

CONCLUSIONS

In this paper, the on-line statistical monitoring of a chemical reaction using spectroscopic measurements is presented. A gray


model was used to describe the data. This model is able to incorporate external information about the reaction. Using this model, some batches that were intentionally disturbed by adding a small amount of NaOH to slightly increase the pH were easily detected as bad. On-line monitoring of these batches detected the problem very quickly, and using contribution plots, the cause of the problem could be identified. On-line monitoring of chemical reactions using spectroscopic measurements for detection and diagnosis of process faults has great potential. In the application presented in this paper, it was shown that process faults are rapidly detected and that contribution plots can be used as diagnostic tools to identify the cause of the disturbance. More applications need to be studied, but the ideas presented are promising.

ACKNOWLEDGMENT

These investigations were supported by the Council for Chemical Sciences of The Netherlands Organization for Scientific Research (NWO-CW) with financial aid from The Netherlands Technology Foundation (STW).

SUPPORTING INFORMATION AVAILABLE

Plot of on-line scores a1, ..., a3 for batch 40 and plot of on-line monitoring D- and SPE chart for validation batch 28. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review May 11, 2000. Accepted August 14, 2000. AC000532Y