Adaptive Kalman Filtering
Sarah C. Rutan
Department of Chemistry, Box 2006, Virginia Commonwealth University, Richmond, VA 23284
ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991
0003-2700/91/0363-1103A/$02.50/0 © 1991 American Chemical Society

Computer methods have become increasingly important as analytical chemists continue their attempts to understand the nature of chemical systems. To interpret the results obtained from analytical experiments, data are fit to some sort of model describing the chemical system. By examining the results from such studies, researchers can explore new models for chemical behavior and identify and quantify the components in complex mixtures. When data obtained experimentally are consistent with the proposed model, the fitting procedure provides valuable information.

Several models commonly used in analytical chemistry include simple straight lines, multiple linear models, and nonlinear models. These models can be used to fit data from calibration experiments, multicomponent spectroscopic measurements, and kinetic experiments. These standard models assume that all data points are consistent with the model selected, within the noise of the experiment. This assumption, however, may not be valid. To verify that the model is representative of all data, the residuals of the fit are examined. For simple models, such as straight lines, examination of residuals can sometimes lend insight into the nature of the model's inadequacy. For example, a single large residual indicates an outlier, whereas an observed trend in the residuals might indicate nonlinearity of the data. For complex models, it may not be possible to draw conclusions from the residuals about model adequacy and the nature of the lack of fit. The adaptive Kalman filter is most useful for such cases.

The examples described above correspond to situations in which the model selected is at least partially correct. It would be convenient to interpret the modeling results for the correct portion of the model and still gain some insight into the nature of the model inadequacy for the incorrect portion. For example, a signal obtained from an analytical instrument gives information about the analyte of interest yet also contains contributions from interferences such as background signals. Subtraction of a measured background signal from an analyte response is a standard way to correct this; however, if the background response is not precisely characterized in terms of either amplitude or shape, the background subtraction step will fail to completely remove the interference from the analytical signal. Adaptive Kalman filters have been useful in these instances (1, 2). Another example of the use of the adaptive Kalman filter is the spectroscopic characterization of chemical species that cannot be physically isolated from related species with similar spectral characteristics because of equilibrium considerations. In this case, the spectral characteristics of the interfering species may be known and the adaptive Kalman filter can be used to determine the spectral characteristics of the target species.

The Kalman filter was developed in 1960 by R. E. Kalman (3) for processing data for problems in orbital mechanics. Several reviews on the applications of Kalman filters to analytical problems have appeared recently (4-7). This A/C INTERFACE will focus on the adaptive modification of the Kalman filter that is used to fit chemical data when the model is incomplete or inaccurate. A brief summary of the regular Kalman filter will be given before expanding on the adaptive modification of the algorithm. (Throughout this discussion, scalars are denoted by lower case italic characters, vectors by lower case bold characters, and matrices by upper case bold characters. A superscript T denotes a transposed vector.)

Kalman filter algorithm
In its simplest form, the Kalman filter is no different from the recursive least-squares fitting approach originally suggested by Gauss and discussed by Young (8). A recursive procedure processes the data points one at a time. The previous best estimate for the parameters (e.g., mean, slope, intercept) is used in computing the updated estimate of the parameter for each successive data point. To calculate a mean value, the recursive estimation procedure takes the form shown at the top of Figure 1, where $k$ is the number of values processed, $x_k$ is the most recent measurement, and $\bar{x}_k$ and $\bar{x}_{k-1}$ are the means of $k$ and $k-1$ responses, respectively. The Kalman filter update equation (the central equation of the algorithm), shown at the bottom of Figure 1, takes a similar form: $\mathbf{x}_k$ is a vector containing all parameters to be estimated from the fit after $k$ responses have been measured and filtered. The vector $\mathbf{g}_k$ is the Kalman gain, and $\mathbf{h}_k^T$ is the measurement function vector, which describes the relationship between the $k$th measurement, $z_k$, and the most recent best estimates for the model parameters contained in the vector $\mathbf{x}_{k-1}$.

Figure 1. Recursive parameter estimation.

For the examples described in this article, the Kalman filter model equation for the measurement process is

$$z_k = \mathbf{h}_k^T \mathbf{x}_k + v_k \tag{1}$$

Equation 1 illustrates a linear model, where the measurement function is described by a row vector, $\mathbf{h}_k^T$. The scalar value $v_k$ is the noise contribution to the measurement, which has a variance of $r_k$. A simple straight line calibration model takes the form

$$z_k = (m\,c_k) + b + v_k \tag{2}$$

where $c_k$ is the concentration of the $k$th standard solution. In this case, the values for the standard concentrations $c_k$ are known and $m$ and $b$ are the parameters to be determined. Equations 1 and 2 are equivalent when the $\mathbf{h}_k^T$ and $\mathbf{x}_k$ vectors are specified for the straight line model, as shown in Figure 2.

Figure 2. Kalman filter model information for straight line and multicomponent Beer's law expressions. Straight line model: $\mathbf{h}_k^T = [c_k \;\; 1]$, $\mathbf{x} = [m \;\; b]^T$. Multicomponent Beer's law model: $\mathbf{h}_k^T = [\varepsilon_{A,k} \;\; \varepsilon_{B,k}]$, $\mathbf{x} = [c_A \;\; c_B]^T$.

Another important model in analytical chemistry is represented by the multicomponent Beer's law expression
$$z_k = (\varepsilon_{A,k}\,c_A) + (\varepsilon_{B,k}\,c_B) + v_k \tag{3}$$
where $\varepsilon_{A,k}$ and $\varepsilon_{B,k}$ are the molar absorptivities of species A and B, respectively, at wavelength $\lambda_k$ (the cell path length is assumed to be 1 cm). In this instance, the values for the molar absorptivities are known, and the concentrations, $c_A$ and $c_B$, are the parameters to be determined from the filtering procedure. The $\mathbf{h}_k^T$ and $\mathbf{x}_k$ vectors required for the solution to this problem are shown in Figure 2. This model allows for the resolution of overlapped spectra so that the concentrations of each individual species can be determined, provided that a pure component spectrum for each of the chemical species has been measured. Figure 3 shows a two-component overlapped spectral response fit to the model expressed by Equation 3. The spectral response shapes of the contributing components are known and are used to generate the $\mathbf{h}_k^T$ vector for each wavelength $\lambda_k$.

To use the Kalman filter for these problems, a weighting factor called the Kalman gain, $\mathbf{g}_k$, must be calculated for each data point. The most important factor affecting the gain calculation is the variance of the measurement noise, $r_k$. The gain varies inversely with the variance of the measurement noise. This means that relatively noise-free data will be weighted heavily in the update calculation, whereas noisy data will be given correspondingly less weight (see Figure 1). Because the data are processed point by point, it is fairly easy to change the variance of the noise (and hence the weighting factors), yielding a convenient implementation of weighted least-squares fitting. For the regular Kalman filter, value(s) for $r_k$ must be determined before beginning the fitting process. The covariance matrix, $\mathbf{P}_k$, is also computed during the Kalman fitting routine and describes the error in the parameter estimates contained in the vector $\mathbf{x}_k$ after $k$ measurements have been processed by the filter. Once the fitting process is complete, the square roots of the diagonal elements of $\mathbf{P}_k$ give the standard deviations of the parameter estimates.

When the Kalman filter is used as described above, the results obtained are identical to those obtained using standard linear least squares. However, because the calculations are done recursively, an additional diagnostic of the fit quality is available. The innovations sequence, $\nu_k$, is defined as the difference between the predicted and the actual response and is given by the bracketed term in the update equation shown in Figure 1. These values are also known as on-line residuals. They are simply residuals of the fit, but they are calculated during the fitting process instead of being computed after the fitting process is complete. Once the first few data points have been processed, this innovations sequence should resemble a zero-mean, white-noise process (provided that the original data are affected only by zero-mean, white-noise processes). Characteristics of the innovations sequence under these conditions are shown at the bottom right of Figure 3.

When the chosen model is incomplete or inaccurate, the normal Kalman filter fitting process will be altered by the presence of data points inconsistent with that model. The values for the on-line residuals will be large and will directly affect the computation of the updated estimates for the parameters. In turn, the parameter estimates will be adjusted by large amounts, which usually results in final parameter estimates that are inaccurate. Adaptive filters are designed to overcome this limitation of the normal Kalman filter algorithm.

Adaptive Kalman filter algorithm
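As a baseline for the adaptive modification, the regular filter summarized in the preceding section can be sketched in a few lines. The following is a minimal illustration for the straight line model, not the author's implementation. The gain and covariance expressions are the standard textbook forms for a scalar measurement; the function names, the calibration data, the noise variance `r`, and the large initial covariance `p0` are all assumptions chosen for the example.

```python
import math

def kalman_update(x, P, h, z, r):
    """Process one scalar measurement z = h.x + noise (noise variance r)."""
    n = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * Ph[i] for i in range(n)) + r      # predicted variance of z
    g = [Ph[i] / s for i in range(n)]                # Kalman gain
    innov = z - sum(h[i] * x[i] for i in range(n))   # innovation (on-line residual)
    x_new = [x[i] + g[i] * innov for i in range(n)]
    # P_new = (I - g h^T) P; P is symmetric, so h^T P equals (P h)^T.
    P_new = [[P[i][j] - g[i] * Ph[j] for j in range(n)] for i in range(n)]
    return x_new, P_new, innov

def fit_line(conc, resp, r=0.01, p0=1e4):
    """Recursive straight-line fit with x = [m, b] and h = [c_k, 1]."""
    x = [0.0, 0.0]                    # initial guesses for slope and intercept
    P = [[p0, 0.0], [0.0, p0]]        # large covariance: the guesses carry little weight
    innovations = []
    for c, z in zip(conc, resp):
        x, P, innov = kalman_update(x, P, [c, 1.0], z, r)
        innovations.append(innov)
    std = [math.sqrt(P[0][0]), math.sqrt(P[1][1])]   # std. deviations of m and b
    return x, std, innovations

# Hypothetical calibration data (concentration, response).
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
resp = [3.1, 4.9, 7.2, 8.8, 11.0]
(m, b), (sd_m, sd_b), innovations = fit_line(conc, resp)
print(f"slope = {m:.3f} +/- {sd_m:.3f}, intercept = {b:.3f} +/- {sd_b:.3f}")
```

For these five points the recursive estimates agree with an ordinary batch least-squares fit (slope 1.97, intercept 1.09) to within the negligible influence of the initial guesses, which illustrates the equivalence noted above. The first innovations are large only because the filter starts from arbitrary guesses.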
For adaptive Kalman filters, the measurement variance, $r_k$, plays an important role. The value for the measurement variance can be adjusted to compensate for the presence of model errors or outlying data points. The idea is to attribute data points that are inconsistent with the model to random noise by artificially increasing the measurement variance so that the data points are not used to corrupt the parameter estimates. The variance of the measurement noise is recalculated as
(4)
Figure 3. Kalman filter (KF) fit (top right) and innovations sequence (bottom right) for fitting noisy data to an accurate two-component Beer's law model (left).
where $q$ is the number of points corresponding to a predetermined smoothing window and $j$ is the index. Equation 4 effectively allows the filter to disregard any data points associated with large values for the innovations sequence, thereby giving accurate estimates for the parameters. In addition, examination of the resulting fit residuals can help to determine the nature of the modeling errors. For most adaptive filters used for analytical applications, the method for "turning off" the filter to "bad" data points is based on the computation of Equation 4 (9).

In contrast to the regular Kalman filter, computation of the value(s) for $r_k$ for the adaptive filter is based on the progress of the fitting process. When the innovations sequence is large, the calculated value for $r_k$ takes a large value, a result that affects the calculation of the Kalman gain. The gain becomes very small, effectively turning off the filter to incoming measurements, and the parameter estimates are not significantly altered by the update calculation shown in Figure 1. If the innovations sequence values decrease to a range consistent with the known measurement errors, the filter can "open up" to subsequent responses. This approach depends on the model errors, or outlying data points, following a significant trend within the $q$ data point window used to average the innovations sequence in Equation 4.

Multicomponent Beer's law model

The principles of the adaptive filter can be demonstrated by using a multicomponent Beer's law model. Figure 4 shows the results obtained for filtering a hypothetical spectrum of a two-component mixture, where one of the components is "left out" of the filter model. Here, spectral characterization of an unknown species in the presence of a known, spectrally similar species is desired. When the adaptive algorithm begins to filter the region of data where the missing component contributes, the filter is turned off and the concentration estimate for the known component is not affected by the presence of the second, contaminating component. In addition, the innovations sequence of the fit gives a more accurate estimate for the shape of the unmodeled component, compared with the innovations sequence obtained for the regular Kalman filter algorithm (Figure 4).

Figure 4. Kalman filter and adaptive Kalman filter innovations sequences for inaccurate multicomponent model. The second component is not included as part of the model information.

When the regular Kalman filter is used, the contribution of the known component to the total signal is overestimated, as shown by the fit in the upper right of Figure 5. The adaptive filter algorithm can be used to avoid significant overestimation of the concentration for this component (also shown in Figure 5, bottom right). The main limitation of this approach is that the data must be consistent with the model for some portion of the observed, nonzero response. For example, if the unmodeled component is completely overlapped by components included in the model, the concentrations of the known components may be overestimated by the adaptive filter. For the system shown in Figure 5, the concentration of the component included in the model is slightly overestimated by the adaptive filter for this reason. When the above approach is used, the first few points processed by the filter should be modeled accurately.
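The behavior described above can be illustrated with a small synthetic example. The sketch below is not the author's code. It assumes a windowed variance of the form $r_k = \frac{1}{q}\sum_{j=1}^{q}\nu_{k-j}^2 - \mathbf{h}_k^T\mathbf{P}_{k-1}\mathbf{h}_k$, which is consistent with the description of Equation 4 as an average of the innovations over a $q$-point window; the subtraction of the predicted parameter variance and the floor at the nominal variance `r0` are assumptions of this sketch. The Gaussian band shapes, the concentrations, and the names `gaussian` and `fit_one_component` are likewise hypothetical, and the data are noise-free so that the result is deterministic.

```python
import math

def gaussian(k, center, width):
    """Hypothetical spectral band shape evaluated at wavelength index k."""
    return math.exp(-((k - center) ** 2) / (2.0 * width ** 2))

def fit_one_component(z, a, r0, q=None, p0=1e4):
    """Scalar Kalman fit of z_k = a_k * c + noise for a single modeled component.

    q=None gives the regular filter (fixed variance r0). With an integer q,
    the variance is re-estimated from the previous q innovations (assumed form).
    Returns the concentration estimate and the innovations sequence.
    """
    c, p = 0.0, p0
    innovations = []
    for zk, ak in zip(z, a):
        r = r0
        if q is not None and innovations:
            window = innovations[-q:]
            mean_sq = sum(v * v for v in window) / len(window)
            r = max(r0, mean_sq - ak * p * ak)   # never below the nominal variance
        innov = zk - ak * c                      # on-line residual
        g = p * ak / (ak * p * ak + r)           # gain shrinks as r grows
        c += g * innov
        p -= g * ak * p
        innovations.append(innov)
    return c, innovations

wavelengths = range(60)
a = [gaussian(k, 20, 6) for k in wavelengths]    # modeled component A
b = [gaussian(k, 40, 6) for k in wavelengths]    # unmodeled component B
z = [1.0 * ak + 0.8 * bk for ak, bk in zip(a, b)]

c_regular, nu_regular = fit_one_component(z, a, r0=1e-4)
c_adaptive, nu_adaptive = fit_one_component(z, a, r0=1e-4, q=3)
print(f"true 1.000, regular {c_regular:.4f}, adaptive {c_adaptive:.4f}")
```

With these assumed band shapes the regular filter overestimates the concentration of the modeled component because of the overlap, whereas the adaptive estimate stays much closer to the true value of 1.0. A small positive bias remains because the first overlapped points are processed before the window average has grown, which mirrors the slight overestimation noted for Figure 5.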
If the first few points cannot be modeled accurately, a simplex-optimized adaptive filter is useful (10). This algorithm is based on repetitive passes through the data, where the diagonal elements of the covariance matrix, $\mathbf{P}_k$, obtained after the last available data point has been processed, are minimized. This step is equivalent to fitting the maximum amount of data consistent with the model selected. Other methods for optimization of the adaptive filter have been investigated, and minimization of the area under the innovations sequence has given accurate parameter estimates with reduced computation times (11).

Zero-lag adaptive Kalman filter
Another difficulty occurs if the data and the model are inconsistent for a single point rather than for a sequence of several data points. In this case, Equation 4 can still be used to calculate corrected $r_k$ values when the smoothing window, $q$, is set to 1. The filter processes the data twice. The first pass is used to establish the values for $r_k$, which will be used during the second pass of the filter. This method is called a zero-lag adaptive filter because the corrected $r_k$ values are used for computing the updated estimates based on the same data point. This approach has been used successfully to reject one or more outlying data points from calibration data sets fit to the straight line model described above (12). The main restriction is that some correct data points must be processed initially by the filter. For problems of this type the data can be processed in any order desired. Therefore, the data points can be ranked in approximate order of reliability, and accurate results from the adaptive filter should be obtained. One convenient way of processing data points in a different order is to filter them in reverse. If this processing is not successful, the data points must be rearranged in a different order so that the first few points processed are accurately modeled.

Modeling of gas-liquid partition coefficients
To illustrate the application of the zero-lag filter, consider linear free energy models for examining the factors contributing to gas-liquid partition coefficients, $K_i$, given by

$$K_i = [C_i]_g / [C_i]_s \tag{5}$$

where $[C_i]_g$ is the concentration of a solute $i$ in the gas phase and $[C_i]_s$ is the concentration of the solute in solvent $s$. A better understanding of the chemical and physical contributions to these equilibrium constants is important in many areas, such as chromatography, because devices such as Snyder's solvent triangle are based on gas-liquid partition coefficient data (13). These equilibrium constants are normalized by the partition coefficient of a similar-sized alkane solute. The logarithms of these normalized partition coefficients for selected probe solutes are fit to a multiple linear model of dipolarity/polarizability and hydrogen bonding parameters for several common solvents (14). The fit obtained for toluene as a solute is shown in Figure 6a. These fit results show evidence of errors in the model; however, no conclusions can be drawn from the pattern of the residuals. In addition, the coefficient values obtained, describing the hydrogen bonding interactions between the solute and solvent, are not physically logical.

When the zero-lag adaptive filter is used to fit the data, substantially different fit results are obtained, as shown in Figure 6b. A clear trend is observed in the residuals for the alcoholic solvents in the data set, in contrast to the results obtained using standard regression procedures such as the regular Kalman filter.

Figure 5. Kalman filter and adaptive Kalman filter fit results for inaccurate multicomponent model. The second component is not included as part of the model information.

Figure 6. Predicted versus actual values of the logarithm of the gas-liquid partition coefficient for toluene in 44 solvents using (a) normal Kalman filter algorithm and (b) zero-lag adaptive Kalman filter algorithm. Solid squares represent the alcoholic solvents included in the data set. Open squares represent all other solvents.
These results allowed an improved model to be postulated, based on the ability of alcoholic solvents to self-associate through hydrogen bonding interactions. In this case, examining the residuals from a traditional multiple linear regression procedure did not help to elucidate the nature of the model error. In addition, the adaptive filter fit yielded parameter estimates that were more consistent with an intuitive evaluation of the chemical system (14).
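The partition coefficient data themselves are not reproduced here, but the outlier-rejecting behavior of a zero-lag style variance correction can be illustrated on a small straight-line calibration set. The sketch below is a simplified single-sweep variant and not the two-pass procedure of ref 12: for each point, the variance is set from that point's own innovation ($q = 1$) before the update is applied. The assumed variance expression, $r_k = \max(r_0,\ \nu_k^2 - \mathbf{h}_k^T\mathbf{P}_{k-1}\mathbf{h}_k)$, the nominal variance `r0`, the data, and the function name `fit_line` are all assumptions made for illustration.

```python
def fit_line(conc, resp, r0=0.01, p0=1e4, zero_lag=False):
    """Recursive straight-line fit; optionally inflate r for inconsistent points."""
    m, b = 0.0, 0.0
    P = [[p0, 0.0], [0.0, p0]]
    for c, z in zip(conc, resp):
        h = (c, 1.0)
        Ph = (P[0][0] * h[0] + P[0][1] * h[1],
              P[1][0] * h[0] + P[1][1] * h[1])
        hPh = h[0] * Ph[0] + h[1] * Ph[1]       # parameter contribution to variance
        innov = z - (m * c + b)                  # innovation for this point
        r = r0
        if zero_lag:
            # Assumed q = 1 form: variance taken from the same point's innovation.
            r = max(r0, innov * innov - hPh)
        s = hPh + r
        g = (Ph[0] / s, Ph[1] / s)               # gain is small when r is large
        m += g[0] * innov
        b += g[1] * innov
        P = [[P[i][j] - g[i] * Ph[j] for j in range(2)] for i in range(2)]
    return m, b

# Hypothetical noise-free calibration line z = 2c + 1 with one gross outlier.
conc = [float(c) for c in range(1, 9)]
resp = [2.0 * c + 1.0 for c in conc]
resp[5] += 5.0                                   # outlier at c = 6

m_reg, b_reg = fit_line(conc, resp)
m_zl, b_zl = fit_line(conc, resp, zero_lag=True)
print(f"regular:  m = {m_reg:.3f}, b = {b_reg:.3f}")
print(f"zero-lag: m = {m_zl:.3f}, b = {b_zl:.3f}")
```

In this example the regular filter is pulled away from the true slope and intercept by the single outlier, while the variance-corrected filter stays close to 2 and 1. As the text notes, this works only because the first points processed are consistent with the model; while the covariance is still large, the correction leaves the nominal variance in place.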
There is nothing magic about using the adaptive Kalman filter. The results are obtained from fitting only those portions of the data that can be accurately modeled. However, this method does not require the analyst to decide which data points to use and which to omit from the fitting procedure. The use of this algorithm can aid in the development of chemically valid models, a result that cannot always be obtained by examining the residuals of simple multiple linear least-squares fits. In addition, more accurate parameter estimates can be obtained, despite the presence of errors in the model.
The author acknowledges support from the U.S. Department of Energy, Grant No. DE-FG0588ER.13833.
References
(1) Gerow, D. D.; Rutan, S. C. Anal. Chim. Acta 1986, 184, 53.
(2) Gerow, D. D.; Rutan, S. C. Anal. Chem. 1988, 60, 847.
(3) Kalman, R. E. J. Basic Eng. 1960, 82, 34.
(4) Brown, S. D. Anal. Chim. Acta 1986, 181, 1.
(5) Rutan, S. C. J. Chemometrics 1987, 1, 7.
(6) Rutan, S. C. Chemometrics and Intelligent Laboratory Systems 1989, 6, 191.
(7) Rutan, S. C. J. Chemometrics 1990, 4, 103.
(8) Young, P. Recursive Estimation and Time-Series Analysis; Springer-Verlag: New York, 1984.
(9) Rutan, S. C.; Brown, S. D. Anal. Chim. Acta 1984, 160, 99.
(10) Rutan, S. C.; Brown, S. D. Anal. Chim. Acta 1985, 167, 39.
(11) Wilk, H. R.; Brown, S. D. Anal. Chim. Acta 1989, 225, 37.
(12) Rutan, S. C.; Carr, P. W. Anal. Chim. Acta 1988, 215, 131.
(13) Snyder, L. R. J. Chromatogr. Sci. 1978, 16, 223.
(14) Rutan, S. C.; Carr, P. W.; Taft, R. W. J. Phys. Chem. 1989, 93, 4292.
Sarah C. Rutan is associate professor of chemistry at Virginia Commonwealth University. She received her B.S. degree in chemistry in 1980 from Bates College (Lewiston, ME) and her Ph.D. in analytical chemistry in 1984 from Washington State University. Her research interests include chemometrics applied to solving problems in chromatographic and spectroscopic analyses.