Adaptive Kalman Filtering Sarah C. Rutan Department of Chemistry
Box 2006 Virginia Commonwealth University Richmond, VA 23284
Computer methods have become increasingly important as analytical chemists continue their attempts to understand the nature of chemical systems. To interpret the results obtained from analytical experiments, data are fit to some sort of model describing the chemical system. By examining the results from such studies, r e s e a r c h e r s can explore new models for chemical behavior and identify and quantify the components in complex mixtures. When d a t a obtained experimentally are consistent with the proposed model, the fitting procedure provides valuable information. Several models commonly used in analytical c h e m i s t r y i n c l u d e simple straight lines, multiple linear models, and n o n l i n e a r models. These models can be used to fit data from calibration experiments, multicomponent spectroscopic measurements, and kinetic experiments. These s t a n d a r d models a s s u m e that all data points are consistent with the model selected, within the noise of the experiment. This assumption, however, may not be valid. To verify that the model is representative of all data, the residuals of the fit are examined. For simple models, such as straight lines, examination of residuals can sometimes lend insight into the nature of the model's inadequacy. For example, a single large residual indicates an outlier, whereas an observed trend in the residuals might indicate nonlinearity of the data. For complex models, it may not be possible to draw conclusions from the residuals about model adequacy and the nature of the lack of fit. The adaptive Kalman filter is most useful for such cases. The examples described above correspond to situations in which the 0003 - 2700/91/0363 -1103A/$02.50/0 © 1991 American Chemical Society
A/C INTERFACE ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991 · 1103 A
A/C
INTERFACE
model selected is at least partially correct. It would be convenient to in terpret the modeling results for the correct portion of the model and still gain some insight into the nature of the model inadequacy for the incor rect portion. For example, a signal obtained from an analytical instru ment gives information about the an alyte of interest yet also contains c o n t r i b u t i o n s from i n t e r f e r e n c e s such as background signals. Subtrac tion of a measured background signal from an analyte response is a stan dard way to correct this; however, if the background response is not pre cisely characterized in terms of ei ther amplitude or shape, the back ground subtraction step will fail to completely remove the interference from the analytical signal. Adaptive Kalman filters have been useful in these instances (2, 2). Another example of the use of the adaptive Kalman filter is the spec troscopic characterization of chemi cal species that cannot be physically isolated from related species with similar spectral characteristics be cause of equilibrium considerations. In this case, the spectral characteris tics of the interfering species may be known and the adaptive Kalman fil ter can be used to determine the spectral characteristics of the target species. The Kalman filter was developed in 1960 by R. E. Kalman (3) for pro cessing data for problems in orbital mechanics. Several reviews on the applications of Kalman filters to an alytical problems have appeared re cently (4-7). This A/C INTERFACE will focus on the adaptive modifica tion of the Kalman filter that is used to fit chemical data when the model is incomplete or inaccurate. A brief summary of the regular Kalman fil ter will be given before expanding on the adaptive modification of the algo rithm. (Throughout this discussion, scalars are denoted by lower case italic characters, vectors by lower case bold characters, and matrices by upper case bold characters. A su perscript Τ denotes a transposed vec tor.) Kalman filter algorithm In its simplest form, the Kalman fil ter is no different from the recursive least-squares fitting approach origi nally suggested by Gauss and dis cussed by Young (8). A recursive pro cedure processes the data points one at a time. The previous best estimate for the parameters (e.g., mean, slope, intercept) is used in computing the updated estimate of the parameter
Recursive mean
Xk
=
New parameter
Kalman filter update
x
k
Χ*.,
+
Old parameter
"
*k-%
+
(1A)
(**
Weight factor
Actual response
9*
-
l*H -
%ι)
Predicted response
< h I- X *-1>J
Innovations sequence, ν λ
Figure 1. Recursive parameter estimation.
scalar value vk is the noise contribu tion to the measurement, which has a variance of rk. A simple straight line calibration model takes the form
Straight line model
**m[£\
hj - lck 1]
zk = (m · ck) + b + vk
Multicomponent Beer's law model
"* • [sj]
Κ " Ι£ΑΛ eB,/rl
Figure 2. Kalman filter model information for straight line and multi component Beer's law expressions.
for each successive data point. To calculate a mean value, the re cursive estimation procedure takes the form shown at the top of Fig ure 1, where k is the number of val ues processed, xk is the most recent measurement, and xk and χk_1 are the means of k and k-1 responses, re spectively. The Kalman filter update equation (the central equation of the algorithm), shown at the bottom of Figure 1, takes a similar form: xA is a vector containing all parameters to be estimated from the fit after k re sponses have been measured and fil tered. The vector gk is the Kalman gain, and h j is the m e a s u r e m e n t function vector, which describes the relationship between the Ath mea surement, zk, and the most recent best estimates for the model parame ters contained in the vector χ^_χ. For the examples described in this article, the Kalman filter model equa tion for the measurement process is ** = (!>* * * * ) + **
(1)
Equation 1 illustrates a linear model, where the measurement function is described by a row vector, h j . The
1104 A · ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991
(2)
where ck is the concentration of the Mh standard solution. In this case, the values for the standard concen trations ck are known and m and b are the parameters to be determined. Equations 1 and 2 are equivalent when the h j a n d x^ vectors are spec ified for the straight line model, as shown in Figure 2. Another important model in ana lytical chemistry is represented by the multicomponent Beer's law ex pression % = (eA,A ' CA) + (e B < ! · CB) + vk (3)
where eAk and e B k are the molar absorptivities of species A and B, re spectively, at wavelength Xk (the cell pathlength is assumed to be 1 cm). In this instance, the values for the mo lar absorptivities are known, and the concentrations, cA and c B , are the parameters to be determined from the filtering procedure. The h j a n d χΑ vectors required for the solution to this problem are shown in Figure 2. This model allows for the resolution of overlapped spectra so that the con centrations of each individual species can be determined, provided that a pure component spectrum for each of the chemical species has been mea sured. Figure 3 shows a two-compo nent overlapped spectral response fit to the model expressed by E q u a tion 3. The spectral response shapes of the contributing components are known and are used to generate the h j vector for each wavelength λ^.. To use the Kalman filter for these problems, a weighting factor called
the Kalman gain, gk, must be calcu lated for each data point. The most important factor affecting the gain calculation is the variance of the measurement noise, rk. The gain is inversely proportional to the vari ance of the measurement noise. This means t h a t relatively noise-free data will be weighted heavily in the up date calculation, whereas noisy data will be given correspondingly less weight (see Figure 1). Because the data are processed point by point, it is fairly easy to change the variance of the noise (and hence the weighting factors), yielding a convenient imple mentation of weighted least-squares fitting. For the regular Kalman fil ter, value(s) for rk must be deter mined before beginning the fitting process. The covariance matrix, Vk, is also computed during the Kalman fitting routine and describes the error in the parameter estimates contained in the vector xk after k measurements have been processed by the filter. Once the f i t t i n g p r o c e s s is c o m p l e t e , t h e square roots of the diagonal elements of Pk give the standard deviations of the parameter estimates. When the
Kalman filter is used as described above, the results obtained are iden tical to those obtained using stan dard linear least squares. However, because the calculations are done re cursively, an additional diagnostic of the fit quality is available. The innovations sequence, vk, is defined as the difference between the predicted and the actual response and is given by the bracketed term in the update equation shown in Fig ure 1. These values are also known as on-line residuals. They are simply residuals of the fit, but they are cal culated during the fitting process in stead of being computed after the fit ting process is complete. Once the first few data points have been pro cessed, this innovations sequence should resemble a zero-mean, whitenoise process (provided that the orig inal data are affected only by zeromean, white-noise processes). Char acteristics of the innovations sequence under these conditions are shown at the bottom right of Figure 3. When the chosen model is incom plete or inaccurate, the normal Kal man filter fitting process will be al tered by the presence of data points
inconsistent with t h a t model. The values for the on-line residuals will be large and will directly affect the computation of the u p d a t e d esti mates for the parameters. In turn, these values will be adjusted by large amounts and will usually result in fi nal parameter estimates that are in accurate. Adaptive filters are de signed to overcome this limitation of the normal Kalman filter algorithm. Adaptive Kalman filter algorithm For adaptive K a l m a n filters, t h e measurement variance, rk, plays an important role. The value for the m e a s u r e m e n t variance can be ad justed to compensate for the presence of model e r r o r s or outlying d a t a points. The idea is to attribute data points that are inconsistent with the model to random noise by artificially increasing the m e a s u r e m e n t v a r i ance so that the data points are not used to corrupt the parameter esti mates. The variance of the measure ment noise is recalculated as
r*-(l/«)(I ν · ν . ) i.i
"-'
"-'
hJ.P^h,
(4)
KF Wavelength (nm)
Wavelength (nm)
Wavelength (nm)
Figure 3. Kalman filter (KF) fit (top right) and innovations sequence (bottom right) for fitting noisy data to an accurate two-component Beer's law model (left). ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991 · 1105 A
A/C
INTERFACE
where g is the number of points corr e s p o n d i n g to a p r e d e t e r m i n e d smoothing window and / is the index. Equation 4 effectively allows the filter to disregard any data points associated with large values for the innov a t i o n s sequence, t h e r e b y giving accurate estimates for the parameters. In addition, examination of the resulting fit residuals can help to determine the nature of the modeling errors. For most adaptive filters used for analytical applications, the method for "turning off" the filter to "bad" data points is based on the computation of Equation 4 (9). In contrast to the regular Kalman filter, computation of the value(s) for rk for the a d a p t e d f i l t e r is b a s e d on t h e progress of the fitting process. When the innovations sequence is large, the calculated value for rk takes a large value, a result t h a t affects the calculation of the Kalman gain. The gain becomes very small, effectively turning off the filter to incoming measurements, and the parameter estimates are not significantly altered by the update calculation shown in Figure 1. If the innovations sequence
values decrease to a range consistent with the known measurement errors, the filter can "open u p " to subsequent responses. This approach depends on the model errors, or outlyi n g d a t a p o i n t s , to follow a significant trend within the q data point window used to average the innovations sequence in Equation 4. Multicomponent Beer's law model The principles of the adaptive filter can be demonstrated by using a multicomponent Beer's law model. Figure 4 shows the results obtained for filtering a hypothetical spectrum of a two-component mixture, where one of the components is "left out" of the filter model. Here, spectral characterization of an unknown species in the presence of a known, spectrally similar species is desired. When the adaptive algorithm begins to filter the region of data where the missing component contributes, the filter is turned off and the concentration estimate for the known component is not affected by the presence of the second, contaminating component. In addition, the innovations sequence of the fit gives a more accurate esti-
mate for the shape of the unmodeled component, compared with the innovations sequence obtained for the regular Kalman filter algorithm (Figure 4). When the regular Kalman filter is used, the contribution of the known component to the total signal is overestimated, as shown by the fit in the upper right of Figure 5. The adaptive filter algorithm can be used to avoid significant overestimation of the concentration for this component (also shown in Figure 5, bottom right). The main limitation of this approach is that the data must be consistent with the model for some portion of the observed, nonzero response. For example, if the unmodeled component is completely overlapped by components included in the model, the concentrations of the known components may be overestimated by the adaptive filter. For the system shown in Figure 5, the concentration of the component included in the model is slightly overestimated by the adaptive filter for this reason. When the above approach is used, the first few points processed by the filter should be modeled accurately.
Unmodeled component
Unmodeled component Modeled component
Wavelength (nm)
Unmodeled component Wavelength (n m)
Wavelength (nm) Figure 4. Kalman filter and adaptive Kalman filter (AKF) innovations sequence for inaccurate multicomponent model. The second component is not included as part of the model information.
1106 A · ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991
If this is not the case, a simplexoptimized adaptive filter is useful (JO). This algorithm is based on repetitive passes t h r o u g h the d a t a , where the diagonal elements of the covariance m a t r i x , P^, a r e m i n i mized, after the last available data point has been processed. This step is equivalent to fitting the maximum amount of data consistent with the model selected. Other methods for optimization of the adaptive filter have been investigated, and minimization of the area under the innovations sequence has given accurate parameter estimates with reduced computation times (11). Zero-lag adaptive Kalman filter Another difficulty occurs if the data and the model are inconsistent for a single point r a t h e r t h a n for a sequence of several data points. In this case, Equation 4 can still be used to calculate corrected rk values when the smoothing window, q, is set to 1. The filter processes the data twice. The first pass is used to establish the values for rk, which will be used during the second pass of the filter. This method is called a zero-lag adaptive
filter because the corrected rk values are used for computing the updated estimates based on the same data point. This approach has been used successfully to reject one or more outlying data points from calibration data sets fit to the straight line model described above (12). The main restriction is t h a t some correct data points must be processed initially by the filter. For problems of this type the data can be processed in any order desired. Therefore, the data points can be ranked in approximate order of reliability, and accurate results from the adaptive filter should be obtained. One convenient way of processing data points in a different order is to filter them in reverse. If this processing is not successful, the data points must be rearranged in a different order so t h a t the first few points processed are accurately modeled. Modeling of g a s - l i q u i d partition coefficients
To illustrate the application of the zero-lag filter, consider linear free energy models for examining the fac-
tors contributing to gas-liquid partition coefficients, Kit given by A| = [C i ] | ./[C 1 ].
(5)
and where [Q] g is the concentration of a solute i in the gas phase and [CJa is the concentration of the solute in solvent s. A better understanding of the chemical and physical contributions to these equilibrium constants is important in many areas, such as c h r o m a t o g r a p h y , because devices such as Snyder's solvent triangle are based on gas-liquid partition coefficient data (13). These equilibrium constants are normalized by the partition coefficient of a similar-sized alkane solute, giving K" values. The logarithms of t h e s e normalized p a r t i t i o n coefficients for selected probe solutes are fit to a multiple linear model of dipolarity/polarizability and hydrogen bonding parameters for several common solvents (14). The fit obtained for toluene as a solute is shown in Figure 6a. These fit results show evidence of errors in the model; however, no conclusions can be drawn from the pattern of the residuals. In addition, the coefficient
Modeled component
Unmodeled component
Modeled component
Wavelength (nm)
Modeled component
Wavelength (nm)
Wavelength (nm)
Figure 5. Kalman filter and adaptive Kalman filter fit results for inaccurate multicomponent model. The second component is not included as part of the model information.
ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991 · 1107 A
A/C
INTERFACE
LEGAL NOTICE U.S. POSTAL SERVICE STATEMENT OF OWNERSHIP, MANAGEMENT AND CIRCULATION
(a)
(Required by 39 U.S.C. 3685) 1 A. Title of publication: Analytical Chemistry 1B. Publication no.: 00032700 2. Date of filing: Oct 1,1991 3. Frequency of issue: Semimonthly 3A. No. of issues published annually: 23 3B. Annual subscription price: $31.00 4. Complete mailing address of known office of publication: American Chemical Society 1155 16th Street, N.W. Washington, D.C. 20036 5. Complete mailing address of headquarters of general business offices of the publisher American Chemical Society 1155 16th Street, N.W. Washington, D.C. 20036 6. Names and addresses of publisher, editor and managing editors: Royce W. Murray Department of Chemistry University of North Carolina Chapel Hill, NC 27514 7. Owner: American Chemical Society 115516th Street, N.W. Washington, D.C. 20036 8. Not applicable 9. The purpose, function, and nonprofit status of this organization and the exempt status for Federal income tax purposes has not changed during the preceding 12 months. 10. Extent and nature of circulation Average No. Copies Each Issue During Preceding 12 Months
A. Total no. copies printed B. Paid Circulation 1.Sales through dealers and carriers, street vendors and counter sales 2.Mail subscriptions C. Total paid circulation D. Free distribution by mail, carrier or outer means, samples complimentary, and other free copies E. Total distribution F. Copies not distributed 1.Office, use, leftover, unaccounted, spoiled after printing 2.Return from news agents G. Total
Log K" (actual) (b)
Actual No. Copies of Single Issue Published Nearest to Rung Oate
25,231
24,136
0
0
21,275 21,275
21,345 21,345
2,116 23,391
1,565 22,190
Log K" (actual)
1,840
1,946
0
0
25,231
24,136
11. I certify that the statements made by me above are correct and complete Director, Publications Division-ACS Robert H. Marks
Figure 6. Predicted versus actual values of the logarithm of the gas-liquid partition coefficient for toluene in 44 solvents using (a) normal Kalman filter algorithm and (b) zero-lag adaptive Kalman filter algorithm. Solid squares represent the alcoholic solvents included in the data set. Open squares represent all other solvents.
values obtained, describing the hydrogen bonding interactions between the solute and solvent, are not physically logical. When the zero-lag adaptive filter is used to fit the data, substantially different fit results are obtained, as shown in Figure 6b. A clear trend is observed in the residuals for the alcoholic solvents in the data set, in cont r a s t to the results obtained using standard regression procedures such as the regular Kalman filter.
1108 A · ANALYTICAL CHEMISTRY, VOL. 63, NO. 22, NOVEMBER 15, 1991
These results allowed an improved model to be postulated, based on the ability of alcoholic solvents to selfassociate through hydrogen bonding interactions. In this case, examining the residuals from a traditional multiple linear regression procedure did not help to elucidate the nature of the model error. In addition, the adaptive filter fit yielded parameter estimates that were more consistent with an intuitive evaluation of the chemical system (14).
There is nothing magic about using the adaptive Kalman filter. The results are obtained from fitting only those portions of the data that can be accurately modeled. However, this method does not require the analyst to decide which data points to use and which to omit from the fitting procedure. The use of this algorithm can aid in the development of chemically valid models, a result that cannot always be obtained by examining the residuals of simple multiple linear least-squares fits. In addition, more accurate parameter estimates can be obtained, despite the presence of errors in the model.
KONTES precision shrunk, ground and polished high resolution tubes are geometrically true to our specifications. They are guaranteed to auto-lock and are especially suited to high frequency systems (greater than 200 MHz). Each and every one of our tubes is checked for camber and concentricity using NEW technology developed by KONTES. Variations in your readings will reflect sample differences rather than tube inconsistencies.
The author acknowledges support from the U.S. Department of Energy, Grant No. DE-FG0588ER13833.
Add to this our competitive pricing and quick delivery; there's no reason to buy your high performance tubes elsewhere. Ask for details.
References (1) Gerow, D. D.; Rutan, S. C. Anal. Chim. Acta 1986, 184, 53. (2) Gerow, D. D.; Rutan, S. C. Anal. Chem. 1988, 60, 847. (3) Kalman, R. E. / Basic Eng. 1960, 82, 34. (4) Brown, S. D. Anal. Chim. Acta 1986, 181, 1. (5) Rutan, S. C. / Chemometrics 1987, 1, 7. (6) Rutan, S. C. Chemometrics and Intelligent Laboratory Systems 1989, 6, 191. (7) Rutan, S. C. / Chemometrics 1990, 4, 103. (8) Young, P. Recursive Estimation and Time-Series Analysis; Springer-Verlag: New York, 1984. (9) Rutan, S. C ; Brown, S. D. Anal. Chim. Acta 1984, 160, 99. (10) R u t a n , S. C.; Brown, S. D. Anal. Chim. Acta 1985, 167, 39. (11) Wilk, H. R.; Brown, S. D. Anal. Chim. Acta 1989, 225, 37. (12) Rutan, S. C ; Carr, P. W. Anal. Chim. Acta 1988, 215, 131. (13) Snyder, L. R. /. Chromatogr. Sci. 1978 16 223 (14) RutaA, S. C.; Carr, P. W.; Taft, R. W. J. Phys. Chem. 1989, 93, 4292.
most complete KIMBLE Your source for laboratory glassware products. KONTES Also available through major scientific distributors. Call Toll-Free 1-800-223-7150. See us at Pittcon Booth 4838 . CIRCLE 78 ON READER SERVICE CARD
NIST
UNITED STATES DEPARTMENT OF COMMERCE NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY
The National Institute of Standards and Technology has developed a series of SRM's to serve as calibrants, test mixtures, and standardization materials for Quality Control of analytical instrumentation and methodology.
NBS
ISA N