Anal. Chem. 1992, 64, 2610-2614


Predictive Mode of Kinetic Analysis and Transient Detection Responses. Evaluation of a Recursive Algorithm

Israel Schechter

Max-Planck-Institut für Quantenoptik, D-8046 Garching, Germany

The extrapolation problems involved in predictive kinetic analysis or in determinations using slow ion-selective potentiometric electrodes have been studied. A simple recursive algorithm which performs the extrapolation, in cases where the mathematical function is not known, has been suggested and evaluated. The algorithm is best for extrapolating rational functions; however, it produces reasonable results with nonrational functions as well. This method can be applied to the extrapolation of any instrumental response used in analytical chemistry. The expected error in the determinations has been studied as a function of the experimental noise level, the deviation from rational generating functions, the time domain of the measurements, and other parameters.

INTRODUCTION

Analytical chemists routinely face extrapolation problems, for example, in applying ion-selective potentiometric electrodes, where too long a time is needed to reach the steady-state response, or in kinetic determinations, where a final response must be predicted using a tabulated set of data. In addition to these examples, there are numerous other applications where some instrumental response is measured as a function of time and one wants to predict its value at another time, usually at infinity. There is no problem when the mathematical function that describes the time response of the instrument is known. Unfortunately, this is usually not the case. Except for some cases where the kinetic schemes are known, the analytical chemist needs to extrapolate the measured data without knowledge of the mathematical function. This paper suggests a simple method to handle such problems and evaluates the possibilities and the quality of its results. The method takes into account that the experimental points always include some noise.

Several different approaches to this problem have been tried. Of course, there is the trivial "intuitive extrapolation" that is often carried out (sometimes quite successfully), but it will not be discussed here. In the field of analysis by ion-selective electrodes, several functions have been suggested to fit the experimental data. A linearized hyperbolic function has been introduced by Müller.¹ Modified hyperbolic models have been suggested by Massart² and Buck,³ and simple first-order models (exponential) were analyzed by Pardue.⁴ Recently, an extensive comparative study has evaluated the various functions for the transient responses of ammonia-selective potentiometric electrodes.⁵ Many other studies have been carried out as well. In the field of kinetic analysis, multivariable functions have been introduced to cover general-order reactions⁶⁻⁸ and other generalized kinetic schemes.⁹⁻¹² In these cases it is supposed that at least the general form of the function is known.

In the previously mentioned approaches the data are fitted to parametric functions, and the parameters that provide the best fit are found. Then, the extrapolation is carried out using this function and the optimized parameters. This method is excellent when it is known that the function that describes the experimental response is of the general form of the fitting function; otherwise the success of the extrapolation is not guaranteed. The reason is that the best fit in a certain range has nothing to do with the extrapolation problem. One can achieve a perfect fit in the measured time domain just because the function has enough free parameters, or by coincidence, and the predicted value can still be far from the real one. With regard to the problem of the slow responses of ion-selective electrodes, even using the best fit in the whole time range for several measurements does not guarantee success in the extrapolation of the next measurement. In this case the parameters depend on the "history" of the detector, and one can never be certain whether or not the good fit is due to too many free parameters in the model. The conclusion from these and other applications is that an extrapolation method that does not assume a specific model is required.

(1) Müller, R. H. Anal. Chem. 1969, 41, 113A.
(2) Martens, J.; Winkel, P. V.; Massart, D. L. Anal. Chem. 1976, 48, 272.
(3) Buck, R. P. Ion-Sel. Electrode Rev. 1982, 4, 3.
(4) Pardue, H. L.; McNulty, P. J. Anal. Chem. 1988, 60, 1351. Pardue, H. L.; Pagan, G. Anal. Chem. 1992, 64, 1269.
(5) Love, M. D.;
(6) Larsson, J. A.; Pardue, H. L. Anal. Chim. Acta 1989, 224, 289.
(7) Larsson, J. A.; Pardue, H. L. Anal. Chem. 1989, 61, 1949.
(8) Schechter, I. Anal. Chem. 1992, 64, 729.
(9) Mieling, G. E.; Pardue, H. L. Anal. Chem. 1978, 50, 1161.
(10) Hamilton, S. D.; Pardue, H. L. Clin. Chem. (Winston-Salem, NC) 1982, 28, 2359.
(11) Schechter, I. Anal. Chem. 1991, 63, 1303.
(12) Schechter, I.; Schröder, H. Anal. Chem. 1992, 64, 326.

0003-2700/92/0364-2610$03.00/0 © 1992 American Chemical Society
ANALYTICAL CHEMISTRY, VOL. 64, NO. 21, NOVEMBER 1, 1992

Since the previously mentioned unsafe methods have been used several times, we shall emphasize this point by an example. Let us assume that the real response function is described by the function suggested by Müller for ion-selective electrodes (eq 1), where $S_0$ is the initial response, $S_\infty$ is its steady-state value, and $\tau$ is the half-life time. We are provided with the measurements of $S(t)$ in the time range from $t = 0$ to $t = 200$ s, and we wish to know the steady-state value, $S_\infty$. Since we do not know the real function, we may try to fit either of two functions to the data: eq 2, with $k_1$ and $S_\infty$ as free parameters, or eq 3, with the free parameters $S_\infty$, $k_1$, $k_2$, and $k_3$.

Figure 1. The best fit does not provide the best extrapolation (signal versus time). The solid line is the function of eq 1, with $S_\infty = 100$, $S_0 = 0$, $\tau = 150$. The circles are some of the synthetic data, produced with a Gaussian noise of 0.5%. The dotted line is the best fit of eq 2 and the dashed line is the best fit of eq 3. The stars are a series of extrapolations obtained by the proposed algorithm.

The results are shown in Figure 1. The solid line describes the real function and the circles describe some of the experimental data. (The real function approaches 100 at infinity.) The dotted line is the best fit of eq 2 to the experimental points. It has been obtained by the Levenberg-Marquardt nonlinear least-squares algorithm.¹³,¹⁴ It provides a good fit to the experimental points; however, it approaches the value of 67 at infinity, which leads to a 33% error. The dashed line is the best fit of eq 3 to the experimental points. Also in this case, the fit is excellent in the domain of the experimental measurements; however, this function approaches approximately 200 at infinity, thus introducing a 100% error. The stars that perfectly follow the real function have been produced by the method proposed here. These points have been calculated without knowledge of the real function. The calculated points continue to fit the real function until the steady state has been reached (not shown in the figure). In the following, this method is described and evaluated.
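The pitfall illustrated in Figure 1 can be reproduced with a small noise-free sketch. Two assumptions are made here that are not taken from the equations of this paper: the true response is taken as the hyperbola $S_0 + (S_\infty - S_0)\,t/(t + \tau)$, and the fitted model is a first-order exponential $S_\infty[1 - \exp(-k_1 t)]$, which has the same two free parameters as eq 2 but may not be identical to it. A plain grid search over $k_1$ stands in for the Levenberg-Marquardt algorithm, and all function names are hypothetical.

```python
import math


def hyperbolic(t, s0=0.0, s_inf=100.0, tau=150.0):
    """Assumed 'true' response: hyperbolic approach to s_inf with half-time tau."""
    return s0 + (s_inf - s0) * t / (t + tau)


def fit_first_order(times, signals):
    """Least-squares fit of amp * (1 - exp(-k t)) by a grid search over k.

    For each trial k the optimal amplitude follows from linear least squares.
    Returns (amp, k, sum_of_squared_residuals) for the best grid point.
    """
    best = None
    for step in range(1, 1001):                 # k from 1e-4 to 0.1
        k = step * 1e-4
        basis = [1.0 - math.exp(-k * t) for t in times]
        amp = (sum(s * b for s, b in zip(signals, basis))
               / sum(b * b for b in basis))
        sse = sum((s - amp * b) ** 2 for s, b in zip(signals, basis))
        if best is None or sse < best[0]:
            best = (sse, amp, k)
    return best[1], best[2], best[0]


# Measurements between t = 5 and t = 200, as in the example above.
times = [5.0 * i for i in range(1, 41)]
signals = [hyperbolic(t) for t in times]

s_inf_fit, k_fit, sse = fit_first_order(times, signals)
rms = math.sqrt(sse / len(times))

print(f"fitted plateau: {s_inf_fit:.1f}, rms residual: {rms:.2f}")
print(f"true plateau:   {hyperbolic(1.0e9):.1f}")
```

Under these assumptions the residuals inside the measured range are small, yet the fitted plateau lies far below the true steady-state value, which is the behavior described for the dotted line in Figure 1.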

THE ALGORITHM

The proposed method is based on the simple recursive Stoer-Bulirsch algorithm.¹⁵ The Stoer-Bulirsch algorithm was originally developed for the interpolations involved in numerical integration of differential equations. The original algorithm guarantees perfect extrapolation, without knowing the function that generated the experimental points, when two severe requirements are fulfilled. The first is that the unknown function that generated the experimental points is rational (a quotient of polynomials). The second is that the experimental data are free of noise.

It is not obvious how restrictive the first requirement is, since many nonrational functions may be approximated by rational functions. In most cases a polynomial approximation is adequate. However, functions with poles may be approximated as well by rational functions. The second requirement cannot be fulfilled by any experiment, since some level of noise is always present. Another trivial requirement is that there are enough measured points. Of course, one cannot expect a successful extrapolation from only 2-3 points. This issue has been studied, and we have found that any reasonable number of measurements (as regularly obtained in analytical chemistry) is adequate for the extrapolation.

The task of this paper is to evaluate the performance of the algorithm for real analytical chemistry problems, where noise is present and the function may be nonrational. In the following, we shall show that the algorithm may also be applied in realistic cases, and the expected errors of the extrapolation introduced by deviations from the requirements of the original algorithm are analyzed.

It is emphasized that the result of the algorithm is an extrapolation point at some future time. The extrapolation may be carried out at very long times (effectively infinity) or at shorter times. The difficulty of the calculation does not depend on the extrapolation range. One can even calculate the extrapolation at a series of time points, thus providing the whole function; however, the explicit rational function cannot be found by this method.

The Stoer-Bulirsch algorithm was originally developed for other purposes, where the extrapolation has to be based on a very limited number of points, with practically no noise. In our applications, we usually have many readings, with some internal noise.

Let us assume that we have $m + 1$ data points, $S_i$, measured at $m + 1$ times, $t_i$ ($i = 0, ..., m$). The time intervals may be irregular. We are looking for a rational function of $t$ passing through all $m + 1$ points. The aim is to calculate $S_E$, the value of this function at the extrapolation time $t_E$. We shall define a "Neville's tableau" that will lead by recursion relations to the desired value. First, let $R_0^0 = S_0$, $R_1^0 = S_1$, ..., $R_m^0 = S_m$. (The superscript counts the columns in the tableau and the subscript counts the data points.) Now, let $R_0^1$ be the value at $t_E$ of the rational function passing through both $(t_0, S_0)$ and $(t_1, S_1)$. Likewise, $R_1^1$, $R_2^1$, ..., $R_{m-1}^1$. This procedure is to be continued until the value of $R_0^m$ is calculated. This value is the approximation to $S_E$, the desired extrapolation. By the way it has been constructed, this is the value at time $t_E$ of the rational function that passes through all data points. The generated rational functions are so-called diagonal, with the degrees of numerator and denominator equal (or with the degree of the denominator larger by 1, if $m$ is odd).

Figure 2. The tableau used for the extrapolation algorithm. Only the differences, $C$ and $D$, are calculated, and the final value of $S_E$ is obtained from any initial point $S_i$ and the set of the differences.

The recursive tableau formed by the various $R$ values is shown in Figure 2. The recursive formula for creating the various $R$'s can be found in the mathematical literature.¹⁶,¹⁷ However, for our purpose, only the differences between neighboring $R$'s are of importance. Once we know these differences, we can easily calculate the desired value of $S_E$ (at the right end of the tableau) by choosing a route starting at any initial point (at the left) and moving to the right. These differences are indicated as $C$ and $D$ in Figure 2. They are calculated by the recursion formulas of eqs 4 and 5.
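For orientation, the recursion for the differences can be written in the form used by the RATINT routine of ref 18. This is a sketch in that reference's convention, which is an assumption here: the index $k$ counts the tableau columns, $i$ counts the data points, and $R_{i \ldots i+k}$ denotes the value at $t_E$ of the rational function through points $i$ to $i+k$. The symbols $\rho_{k,i}$ and the index layout are introduced only for this illustration and may be arranged differently in eqs 4 and 5.

```latex
\begin{align}
C_{k,i} &\equiv R_{i \ldots i+k} - R_{i \ldots i+k-1}, \qquad
D_{k,i} \equiv R_{i \ldots i+k} - R_{i+1 \ldots i+k}, \qquad
C_{0,i} = D_{0,i} = S_i, \\
\rho_{k,i} &= \frac{t_E - t_i}{t_E - t_{i+k+1}}, \\
D_{k+1,i} &= \frac{C_{k,i+1}\,\bigl(C_{k,i+1} - D_{k,i}\bigr)}
                  {\rho_{k,i}\,D_{k,i} - C_{k,i+1}}, \\
C_{k+1,i} &= \frac{\rho_{k,i}\,D_{k,i}\,\bigl(C_{k,i+1} - D_{k,i}\bigr)}
                  {\rho_{k,i}\,D_{k,i} - C_{k,i+1}}.
\end{align}
```

In this convention one possible route through the tableau starts at the last data point and adds one $D$ per column, $S_E = S_m + \sum_{k=1}^{m} D_{k,\,m-k}$; a route starting at the first data point adds $C_{k,0}$ instead.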

(13) Levenberg, K. Q. Appl. Math. 1944, 2, 164.
(14) Marquardt, D. SIAM J. Appl. Math. 1963, 11, 431.
(15) Stoer, J.; Bulirsch, R. Introduction to Numerical Analysis; Springer Verlag: New York, 1980.
(16) Hall, G.; Watt, J. M., Eds. Modern Numerical Methods for Ordinary Differential Equations; Clarendon Press: Oxford, 1976.
(17) Acton, F. S. Numerical Methods that Work; Harper and Row: New York, 1970.

Coding these formulas to calculate $S_E$, the objective of the extrapolation, needs only ca. 20-30 lines. An example of such a computer program is the subroutine RATINT in ref 18, which can be easily applied to our purpose.

EXPERIMENTAL SECTION

The principle of the algorithm evaluation has been to produce synthetic data and then to use the algorithm for the extrapolation problem, with no information on the generating function. The data have been produced by rational and nonrational functions, and Gaussian noise has been added. The Gaussian noise has been generated by the Box-Muller algorithm. In order to prevent sporadic results, many replicate experiments with the same noise level have been carried out, and usually only the mean values are reported.

Each single extrapolation accepts as input the experimental points, $(t, S(t))$, and a time $t_E$. The output is the extrapolated signal, $S(t_E)$, at time $t_E$. The value of $S_\infty$ is obtained by extrapolating to a time which is practically infinite. (In case of doubt, the value of $t_E$ can be doubled, and the new extrapolated signal can be compared to the old one.) In this way, the value of $\Delta S_\infty = S_\infty - S_0$ (which is of importance in kinetic analysis) can be calculated.

The computer codes for producing the data, for the extrapolation, and for finding the least-squares fits have been written in FORTRAN 77. An IBM 3090 computer has been used, with the VS-FORTRAN Version 2 compiler. The performance of the algorithm may depend on many variables; thus its evaluation is a multidimensional problem. Numerous experiments have been carried out in order to systematically evaluate the algorithm. Some of the interesting results are presented in the following section.
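The extrapolation step described above can be sketched in a few lines of Python. This is a minimal illustration modeled on the RATINT recursion of ref 18, not the FORTRAN code used in this work. The function name `rational_extrapolate`, the `tiny` guard against a 0/0 condition, the choice to start the route at the last data point, and the hyperbolic test response are all assumptions made for the example.

```python
def rational_extrapolate(times, signals, t_e, tiny=1e-25):
    """Value at t_e of the diagonal rational function through all points.

    Only the tableau differences C and D are stored; the result is built by
    starting at the last data point and adding one D per tableau column.
    """
    n = len(times)
    for t_i, s_i in zip(times, signals):
        if t_i == t_e:                     # t_e coincides with a measurement
            return s_i
    c = list(signals)                      # C differences, column 0
    d = [s + tiny for s in signals]        # D differences; tiny avoids 0/0
    value = signals[-1]                    # route starts at the last point
    for k in range(1, n):                  # tableau columns
        for i in range(n - k):
            w = c[i + 1] - d[i]
            rho_d = (times[i] - t_e) * d[i] / (times[i + k] - t_e)
            denom = rho_d - c[i + 1]
            if denom == 0.0:
                raise ZeroDivisionError("rational interpolant has a pole at t_e")
            ratio = w / denom
            d[i] = c[i + 1] * ratio
            c[i] = rho_d * ratio
        value += d[n - k - 1]              # step one column to the right
    return value


def hyperbola(t, s_inf=100.0, tau=150.0):
    """Hypothetical noise-free rational response used only for this check."""
    return s_inf * t / (t + tau)


times = [50.0, 100.0, 200.0]
signals = [hyperbola(t) for t in times]

s_far = rational_extrapolate(times, signals, 1.0e6)
s_farther = rational_extrapolate(times, signals, 2.0e6)   # doubled t_E

print(s_far, s_farther)
```

Doubling $t_E$ and comparing the two results mirrors the check suggested above for deciding whether the extrapolation time is practically infinite. With noisy data the same routine applies unchanged, although an intermediate pole may then occur and is reported here as an exception.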

RESULTS AND DISCUSSION

Many of the experiments have been carried out by using eq 6 to generate the synthetic data. With $a = 1$, this formula is the function suggested by Müller for fitting the responses of ion-selective electrodes and has been frequently used in electrochemistry. The parameter $a$ has been introduced to enable deviations from rational functions, since $S(t)$ is rational only for integer $a$ values. The formula is written such that the original meaning of $\tau$ (the time required to reach $S(t) = (S_\infty + S_0)/2$) is maintained for all $a$ values. Figure 3 shows this function for various $a$ values and noise levels.

The Effect of Deviation from Rational Generating Function. The difficulty of the extrapolation problem depends on the way the function approaches its steady-state limit. As can be seen from Figure 3, this characteristic depends on the value of $a$. Functions with higher $a$ values approach the steady-state values faster. Therefore, studying the effect of the value of $a$ on the extrapolation success is not a simple task. In order to compensate for this problem and make a comparison possible, the time range of the experimental points has been chosen to be from 0 to $t_{99\%}/3$, where $t_{99\%}$ is the time required to reach 99% of $S_\infty$, the steady-state value.
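A data generator consistent with this description can be sketched as follows. The explicit form $S(t) = S_0 + (S_\infty - S_0)\,t^a/(t^a + \tau^a)$ is an assumption inferred from the stated properties (it reduces to a hyperbola for $a = 1$, is rational only for integer $a$, and reaches the midpoint at $t = \tau$ for every $a$); eq 6 may be written differently. Treating the noise level as a standard deviation in signal units, and taking 99% of the total change as the definition of $t_{99\%}$, are further assumptions, as are all function names.

```python
import math
import random


def response(t, a=1.0, s0=0.0, s_inf=100.0, tau=150.0):
    """Assumed generating function: s0 + (s_inf - s0) * t^a / (t^a + tau^a)."""
    ta = t ** a
    return s0 + (s_inf - s0) * ta / (ta + tau ** a)


def time_to_fraction(fraction, a=1.0, tau=150.0):
    """Time at which the assumed response has completed `fraction` of its change."""
    return tau * (fraction / (1.0 - fraction)) ** (1.0 / a)


def box_muller(rng):
    """One standard normal deviate from two uniform deviates (Box-Muller)."""
    u1 = 1.0 - rng.random()          # maps [0, 1) to (0, 1] so log(u1) is finite
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def synthetic_run(a, n_points=40, noise_sd=0.5, seed=0, tau=150.0):
    """Noisy readings on [0, t99/3]; noise_sd is in signal units (an assumption)."""
    rng = random.Random(seed)
    t_max = time_to_fraction(0.99, a, tau) / 3.0
    times = [t_max * i / (n_points - 1) for i in range(n_points)]
    signals = [response(t, a, tau=tau) + noise_sd * box_muller(rng) for t in times]
    return times, signals


times, signals = synthetic_run(a=1.2)
print(f"t_max = {times[-1]:.1f}, last reading = {signals[-1]:.2f}")
```

Because $t_{99\%}$ shrinks as $a$ grows in this form, the generated time window becomes shorter for higher $a$ values, in line with the scaling described above.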

(18) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes; Cambridge University Press: Cambridge, 1989.

Figure 3. The function of eq 6 for $a$ = 0.2, 0.6, 1.0, 1.4, and 1.8. $S_\infty = 100$, $S_0 = 0$, and $\tau = 150$. The standard deviation of the noise level has been 1% and 0.5%.

Figure 4. The average error in the extrapolation of eq 6 as a function of $a$. $S_0 = 1$, $S_\infty = 100$, $\tau = 150$, $t_{min} = 0$, $t_{max} = t_{99\%}/3$, and the number of data points = 40. (○) Noise level of 0.5%; (×) noise level of 0.01%. Note the remarkable decrease in the error for integer values of $a$, which form rational functions.

This means that for high $a$ values, shorter time domains of experimental data collection are used. The average error of the extrapolations obtained by the proposed algorithm, as a function of $a$, is shown in Figure 4. The data have been simulated by eq 6, and noise levels of 0.5% and 0.01% have been added. The error is the deviation of the extrapolation result from the exact value. The extrapolation has been carried out at the time where $S(t)$ reaches 99% of $S_\infty$. The results show that the algorithm performs better for rational functions (integer values of $a$); however, it provides reasonable results with nonrational functions. The overall increase in the error with the value of $a$ is not of importance. This effect is due to the specific way the experiments are compared. Other time-measurement programs produce different results; however, the characteristic drop at integer values always remains. It is also shown that a high noise level leads to higher errors. A more quantitative study of the noise level is presented in the following.

The Effect of the Noise Level and of the Extrapolation Ratio. For a given generating function, the results of the extrapolation depend mainly on the noise level of the experimental readings and on the time program of the measurements. The measurement program consists of $t_{min}$, the starting time, and of $t_{max}$, the time corresponding to the final measurement. In addition to these variables, the results depend on the noise level. Since the error of the extrapolation can be presented as a function of only two variables at a time, the problem has been separated. The particular chosen values of $t_{min}$ and $t_{max}$ have no general meaning unless they are scaled to some intrinsic time of relevance to the problem. In our case, a significant variable is the extrapolation ratio. This is the ratio of the time where the value of $S(t)$ is calculated by extrapolation to $t_{max}$, the time of the last measurement. This value indicates the relative extrapolation range.

Figure 5 shows the average error in the extrapolation as a function of the extrapolation ratio and the standard deviation of the signal noise. For a rational function (Figure 5a), the error is low and depends mainly on the noise level; however, a slight increase is observed for high extrapolation ratios. For a nonrational function, the error depends significantly on both the extrapolation ratio and the noise level. This effect has been demonstrated in parts b and c of Figure 5 for two different nonrational functions. It should be emphasized that the high extrapolation ratios in the figures correspond to very difficult extrapolation problems, where the measurements are stopped at about 60% of the process. Viewing the plots in Figure 3 shows that it is difficult to perform this extrapolation "intuitively". Taking into account that the algorithm knows nothing about the generating function, the results can be considered successful.

Figure 5. The average error of the extrapolation as a function of the extrapolation ratio and the standard deviation of the data noise: (a) eq 6 with $a = 1$, (b) eq 6 with $a = 1.2$, (c) eq 3 with $k_1 = 0.001$, $k_2 = 0.002$, $k_3 = 0.003$.

The Effect of the Time Domain of the Measurements. The issue studied in this section is whether there is a preferred time domain for the measurements. In this study, the time interval ($t_{max} - t_{min}$) has been kept constant, and the initial time, $t_{min}$, has been varied. The results are shown in Figure 6. The calculations at several noise levels possess the same characteristics: there is an increase in the error with $t_{min}$ until a maximum value is reached, and then the error decreases again. The initial increase in the error demonstrates the importance of the points at the beginning of the process. The points at this stage are more informative; thus they lead to a better extrapolation. This is due to the fact that the signal changes significantly in this domain and to the characteristics of the algorithm: the polynomial ratio is much influenced by the arguments in the vicinity of zero. The improvement of the results (decrease in the error) at high $t_{min}$ is due to the decrease of the extrapolation ratio. Higher $t_{min}$ means that the measurements are carried out closer to the desired extrapolation point. The overall effect of these two opposing influences is a curve, as shown in Figure 6.

Figure 6. The average error of the extrapolation as a function of $t_{min}$ and the standard deviation of the data noise. The time range $t_{max} - t_{min}$ has been kept constant (= 400). The maximal shown noise level is 0.75%. The extrapolation has been calculated at $t = 3000$.

Number of Data Points. As previously mentioned, the algorithm finds the rational function that passes through all the $m + 1$ experimental points. The degree of the denominator is equal to the degree of the numerator (if $m$ is even) or is larger by one (if $m$ is odd). The sum of the degrees of the numerator and denominator is equal to $m$. Therefore, as the number of points grows, the polynomials that make up the rational function are of higher degree. This high degree may be needed to properly approximate some nonrational functions. Therefore, better results are expected from numerous data readings, especially for nonrational functions. This effect has been tested by varying the number of points from 3 to 600 for various noise levels. A slight decrease in the error as a function of the number of points has been observed in the range of 3-100 points. Almost no change has been observed over 100 points. At low noise levels, 15-30 points provide acceptable results. At moderate noise, ca. 50 points are needed, and at high noise levels more than 100 points might be needed.


An additional effect is the parity of the number of data points used in the calculation. The parity is of importance since it determines the difference between the degrees of the numerator and the denominator. Considerably better results have been obtained for an even number of points, for all studied cases. This may stem from the nature of the functions that we are interested in, which are characterized by possessing no poles and by asymptotically approaching a definite value.

Smoothing the Data. Experimental data contain some noise, and the algorithm provides a function that passes through all the points. Therefore, smoothing the data prior to the application of the algorithm has been considered. Several smoothing and noise-reducing algorithms have been tested. Cubic spline approximations have been successful, especially when correct estimates of the noise level are taken into account. If this information is not available, the least-squares smoothing algorithm of Savitzky and Golay¹⁹,²⁰ provides better results. A proper smoothing of the data improves the average results by a factor of 2. The Savitzky and Golay algorithm provides better stability of the results and is less risky in this sense. However, the result of all smoothing processes must be carefully examined before applying the extrapolation algorithm. Improper smoothing introduces huge errors. All smoothing programs have some parameters that must be adjusted to the specific data points. The adjustment can be carried out by varying the parameters and examining the results; however, we could not automate this process. All attempts to automate the smoothing and to apply it without a careful examination resulted in a sporadic increase of the errors. Therefore, pretreating the data by smoothing algorithms cannot be generally recommended, but it may improve the results for those familiar with this art.

The Effect of a Time Mismatch. In regular experimental conditions it is difficult to determine precisely the starting time of the processes. In the case of kinetic determinations this effect may be caused by the time needed to mix the reagents. In the case of electrochemical determinations, erratic responses are often observed during the first 10 s after immersion of the electrodes in the solution. The result of these effects may be a mismatch between the internal timing of the system and the externally measured time. Therefore, the sensitivity of kinetic algorithms to this effect is of considerable importance. The algorithm suggested here has been tested under conditions of various time mismatches. The tests have been carried out by introducing an artificial gap between the time points used to generate the data and the time points used for the extrapolation process. No sensitivity has been found. The extrapolation success has not been reduced even by huge time mismatches. The reason is that an argument-shifted rational function is a rational function as well. This is a considerable advantage compared to the function-fitting algorithms, which are known to be sensitive, to some extent, to time mismatches.
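The insensitivity to a time offset can be illustrated with noise-free data. The sketch below is an illustration only: `rational_extrapolate` is a hypothetical helper modeled on the RATINT recursion of ref 18, the hyperbolic response with $S_\infty = 100$ and $\tau = 150$ is an assumed test function, and the 30-unit offset is an arbitrary choice.

```python
def rational_extrapolate(times, signals, t_e, tiny=1e-25):
    """Value at t_e of the diagonal rational function through all points."""
    n = len(times)
    for t_i, s_i in zip(times, signals):
        if t_i == t_e:
            return s_i
    c = list(signals)                      # C differences
    d = [s + tiny for s in signals]        # D differences; tiny avoids 0/0
    value = signals[-1]                    # route starts at the last point
    for k in range(1, n):
        for i in range(n - k):
            w = c[i + 1] - d[i]
            rho_d = (times[i] - t_e) * d[i] / (times[i + k] - t_e)
            denom = rho_d - c[i + 1]
            if denom == 0.0:
                raise ZeroDivisionError("rational interpolant has a pole at t_e")
            ratio = w / denom
            d[i] = c[i + 1] * ratio
            c[i] = rho_d * ratio
        value += d[n - k - 1]
    return value


def hyperbola(t, s_inf=100.0, tau=150.0):
    """Assumed rational test response."""
    return s_inf * t / (t + tau)


true_times = [50.0, 100.0, 200.0]          # times that generated the data
signals = [hyperbola(t) for t in true_times]

offset = 30.0                              # artificial clock mismatch
clock_times = [t + offset for t in true_times]

t_far = 1.0e7                              # practically infinite
plateau_true_clock = rational_extrapolate(true_times, signals, t_far)
plateau_shifted_clock = rational_extrapolate(clock_times, signals, t_far)

print(plateau_true_clock, plateau_shifted_clock)
```

Both calls recover essentially the same plateau, because the shifted data still lie on a rational function of the same degree. A model with a fixed time origin, by contrast, would have to absorb the offset into its fitted parameters.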

CONCLUDING REMARKS

The proposed algorithm, which was originally developed as a predictor in numerical integration of differential equations, has proved to be useful also in analytical chemistry. Its main characteristic of being a stable predictor, which is responsible for its success in numerical integrations, enables its use as an extrapolator in the cases of interest in analytical chemistry. The algorithm is suitable when the data are noisy and no information on the generating function is available, or in cases where the starting instant of the process is not well-defined. It works best when the generating function happens to be rational. However, extrapolation may be carried out with nonrational functions as well. When the function is not rational, the algorithm is more sensitive to the noise level, which means that in such cases a low experimental noise level is essential.

(19) Savitzky, A.; Golay, M. J. E. Anal. Chem. 1964, 36, 1627.
(20) Steinier, J.; Termonia, Y.; Deltour, J. Anal. Chem. 1972, 44, 1909.

It has been shown that the extrapolation success depends on the extrapolation ratio, i.e., how far the final measurement is from the desired point. For rational functions very high extrapolation ratios may be used: the error of the extrapolation depends mainly on the noise level and only to a small extent on the extrapolation ratio. For nonrational functions, the error has a significant dependence on the extrapolation ratio. In all the examined cases, the extrapolation results have been perfect in the limit of almost no noise in the data. In this limit rational functions may be extrapolated to any extent with no error, and nonrational functions give only insignificant errors. This means that a safe extrapolation may always be carried out when good experimental data are handled.

The main disadvantage of the proposed algorithm is the risk of convergence to a completely wrong value. This may happen when the data are very noisy (especially with nonrational functions). The reason is that a specific arrangement of the experimental points may just fit better to another rational function that happens to have a different behavior at infinity. However, this risk can be minimized in several simple ways. The best way is to use the algorithm to extrapolate to a series of points at various future times (and not just at infinity). Thus, one can observe the behavior of the function by following the extrapolated points. Another alternative is to extrapolate to several points around the desired value and to take the average. In this way, the risk of a wrong result is minimized. Another approach to performing the extrapolation in the case of very noisy data is to use several subsets of the data. However, it should be emphasized that the above-mentioned techniques may be needed only in special cases, and usually, when good data are available, a simple extrapolation is adequate.

The original development of the algorithm provides an easy way to estimate the error of the extrapolation. This is just the last value of $C$ or $D$ used (eqs 4 and 5). However, it has been found that it provides no useful information in the cases of noisy data, in which we are interested. Since this algorithm is intended to be used when the generating function is not known, it is difficult to predict the accuracy of the results. However, there is a way to gain some information on the expected accuracy. It is based on the fact that not too many experimental points are needed in this algorithm, and one can easily obtain more data than needed (see the section on the effect of the number of data points). Different subsets of the data may be used for the extrapolation, and the deviations can be noted, thus indicating the expected error.

The previously mentioned technique can be used in order to improve the results in the case of very noisy data. Moreover, in such cases one should vary the starting point of the algorithm. The algorithm calculates a series of differences that have to be added to a data point in order to get the extrapolation value; therefore, the final value includes the noise of this particular point. When processing noisy data the algorithm may be applied starting from various points, thus averaging out the noise of the particular points.
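The subset technique mentioned above can be sketched as follows. The interleaved choice of subsets, the use of the spread (maximum minus minimum) as an error indicator, and the names `subset_estimate` and `rational_extrapolate` are assumptions made for this illustration; the helper is modeled on the RATINT recursion of ref 18, and the noise-free hyperbolic data serve only to check the mechanics.

```python
def rational_extrapolate(times, signals, t_e, tiny=1e-25):
    """Value at t_e of the diagonal rational function through all points."""
    n = len(times)
    for t_i, s_i in zip(times, signals):
        if t_i == t_e:
            return s_i
    c = list(signals)                      # C differences
    d = [s + tiny for s in signals]        # D differences; tiny avoids 0/0
    value = signals[-1]                    # route starts at the last point
    for k in range(1, n):
        for i in range(n - k):
            w = c[i + 1] - d[i]
            rho_d = (times[i] - t_e) * d[i] / (times[i + k] - t_e)
            denom = rho_d - c[i + 1]
            if denom == 0.0:
                raise ZeroDivisionError("rational interpolant has a pole at t_e")
            ratio = w / denom
            d[i] = c[i + 1] * ratio
            c[i] = rho_d * ratio
        value += d[n - k - 1]
    return value


def subset_estimate(times, signals, t_e, n_subsets):
    """Extrapolate from interleaved subsets; return (mean, spread, values)."""
    values = [
        rational_extrapolate(times[k::n_subsets], signals[k::n_subsets], t_e)
        for k in range(n_subsets)
    ]
    mean = sum(values) / len(values)
    spread = max(values) - min(values)     # crude indicator of the expected error
    return mean, spread, values


def hyperbola(t, s_inf=100.0, tau=150.0):
    """Assumed rational test response."""
    return s_inf * t / (t + tau)


times = [20.0 * i for i in range(1, 10)]   # 20, 40, ..., 180
signals = [hyperbola(t) for t in times]

mean, spread, values = subset_estimate(times, signals, 1.0e6, n_subsets=3)
print(mean, spread)
```

For exact rational data the three subsets agree and the spread is negligible. With noisy or nonrational data the spread grows, which is the kind of deviation suggested above as an indication of the expected error, and the mean of the subsets serves as the averaged estimate.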

RECEIVED for review May 20, 1992. Accepted July 29, 1992.