ANALYTICAL CHEMISTRY, VOL. 51, NO. 7, JUNE 1979
Characterization of Heterogeneous Kinetic Parameters from Voltammetric Data by Computerized Pattern Recognition
R. A. DePalma¹ and S. P. Perone*
Department of Chemistry, Purdue University, West Lafayette, Indiana
A method for the estimation of heterogeneous kinetic parameters of an electrochemical reaction has been developed. The parameters are extracted from a single stationary electrode polarogram by comparing the real polarogram with a reference file of theoretical polarograms. The Fourier transform was used to represent the voltammogram, and the Euclidean distance in 15-dimensional Fourier space was used as the similarity measure. The problems of noisy real data were minimized by using the natural weighting of lower frequency Fourier coefficients. Five chemical systems of known behavior were used to demonstrate the accuracy of this technique. The on-line computer procedure allows rapid (less than 20 s) prediction of three kinetic parameters, ψ, α, and n, from a single voltammogram. In addition, the procedure provides important diagnostics which can be used to evaluate the confidence of the prediction. These diagnostics were demonstrated in the analysis of the Zn(II) data and the severely overlapped doublet data.
The shape of a stationary electrode polarogram is affected by at least three factors: the reversibility of the electron transfer, the symmetry factor, and the number of electrons involved in the transfer. The reversibility of an electron transfer is dictated by the standard heterogeneous rate constant for the electron transfer, k_s, and the measure of the symmetry factor is α. Many methods have been developed to measure these parameters for electrochemical reactions. Gaur and Goswami (1) plot current functions as a function of dc potential from classical polarograms and extract the parameters from the graph. Faradaic rectification (2), which applies a sinusoidal signal at constant dc potential to the electrodes and measures the current as a function of frequency, can be applied to obtain the kinetic parameters. The associated technique of ac polarography (3, 4) also provides a measure of the kinetic parameters. Cyclic voltammetry ascertains k_s from the difference in potential of the cathodic and anodic peaks, and can also be used to determine α (5). The method presented here is sensitive to the curve shape dependence on these kinetic parameters and uses these effects to predict the parameters. In this method, the voltammogram is treated as a generalized waveform and is represented by its Fourier transform coefficients. This method of representation is more compact and appropriate, and has been used previously for electrochemical data (6, 7). In order to predict the kinetic parameters from a voltammogram, the voltammogram is matched with a reference set of theoretical voltammograms. The prediction procedure used here is a variation of the k-nearest neighbor (kNN) pattern classification algorithm (8). Each voltammetric waveform with known kinetic parameters is represented in Fourier feature space, with the assumption that waveforms derived from similar kinetic parameters will cluster in the same region of space.
The kinetic parameters for an unknown system are determined from the averages of those of its k nearest neighbors in Fourier feature space.
47907
Table I. Synthesis Parameters for Voltammetric Training Set
ψ values: 20.0, 10.0, 5.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01
α values: 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8
n values: 0.8-1.1, 1.8-2.2, 2.7-3.3
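As a rough check on the size of the training set, the grid in Table I can be enumerated directly. The sketch below assumes ten curves per (ψ, α, n range) combination, as stated in the Experimental section, and further assumes that n is drawn uniformly within each range (the text does not specify the distribution). The names `training_grid`, `PSI_VALUES`, `ALPHA_VALUES`, and `N_RANGES` are illustrative only.

```python
import itertools
import random

# Parameter levels taken from Table I.
PSI_VALUES = [20.0, 10.0, 5.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01]
ALPHA_VALUES = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
N_RANGES = [(0.8, 1.1), (1.8, 2.2), (2.7, 3.3)]


def training_grid(curves_per_cell=10, seed=0):
    """Return a list of (psi, alpha, n) tuples, one per synthetic curve."""
    rng = random.Random(seed)
    grid = []
    for psi, alpha, (n_lo, n_hi) in itertools.product(
        PSI_VALUES, ALPHA_VALUES, N_RANGES
    ):
        for _ in range(curves_per_cell):
            # Assumption: n varies uniformly within its range.
            grid.append((psi, alpha, rng.uniform(n_lo, n_hi)))
    return grid


print(len(training_grid()))  # 11 * 7 * 3 * 10 = 2310
```

The count agrees with the 2310 synthetic curves reported for the training set.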
The prediction procedure was evaluated by using five systems of known electrochemical behavior. Their kinetic parameters were predicted by an on-line computer system in less than 20 s from a single voltammogram. The procedure is applicable as a rapid, simultaneous estimate of the kinetic parameters. The procedure also includes built-in checks on the validity of each prediction.
EXPERIMENTAL
Electrochemical. Stationary electrode polarograms for five metal ions with known heterogeneous kinetic parameters were obtained with the computer-controlled instrument described earlier (6, 9). The electrochemical cell is of a standard three-electrode design with a dropping mercury working electrode, a platinum counter electrode, and a saturated calomel reference electrode. The ensemble averaging experiments were performed so that each voltammogram was acquired near the end of the life of a 5-s mercury drop. All solutions were prepared with reagent grade chemicals and distilled, deionized water. Each solution was thoroughly deoxygenated prior to analysis by purging with purified, solvent-saturated nitrogen (10) for 20 min. The 250-point data curves were obtained for each electroactive species at 1.00 V/s and a data resolution of 2.0 mV/point. Voltammograms which contained obvious distortions such as excessive noise, discontinuities, or ADC overflows were discarded.
Computer Systems. The instrumentation computer used was a Hewlett-Packard 2115A with 8K words of core memory. Peripherals include a paper tape reader and punch, a Tektronix 601 storage display monitor, and a Teletype. Data acquisition and hardware control subroutines were written in HP assembly language and called by main programs written in BASIC. The pattern recognition processor was a Hewlett-Packard 2100S minicomputer with 32K words of core memory. Peripherals include a 6-Mbyte moving head disc drive (HP-7900), paper tape reader and punch, a Tektronix 603 storage display monitor and a 4012 graphics display terminal, a Centronics 306 serial printer, a Calcomp 565 digital plotter, and a Teletype. All pattern recognition programs were written in FORTRAN IV and operated under the HP DOS-M executive. Data are transmitted from the laboratory computer to the pattern recognition computer by a bidirectional, high speed, 16-bit parallel interface.
The interface consists of two Hewlett-Packard 12930 universal interface cards and cable. Data are transmitted at between 10 and 20 kHz under program control.
Pattern Recognition Procedures. The training set consisted of 2310 synthetic curves which were generated with the numerical solution (11) of Nicholson. Synthetic curves were obtained for 11 ψ values, 7 α values, and 3 n ranges as given in Table I. ψ is calculated from Equations 1, 2, and 3,
¹Present address: The Procter & Gamble Company, Winton Hill Technical Center, 6071 Center Hill Rd., Cincinnati, Ohio 45224.
0003-2700/79/0351-0829$01.00/0 © 1979 American Chemical Society
Figure 1. Effect of n on shape of stationary electrode polarogram. ψ = 10.0, α = 0.5.
Figure 2. Effect of ψ on shape of stationary electrode polarogram. n = 2, α = 0.5.
Figure 3. Effect of α on shape of stationary electrode polarogram. ψ = 0.10, n = 2.
Table II. Average Percent Deviations of Predicted Kinetic Parameters from Synthetic Data
where v is the scan rate and the other symbols have their usual significance. For each combination of ψ, α, and n, ten curves were generated, with a random number generator used to vary n within the range. For the prediction set, each voltammogram received identical numerical treatment so that only the shape of the voltammogram would influence the classification, not the magnitude or peak potential. The synthetic data were processed by first scaling the peak height of each voltammogram to 1.0; selecting 96 points before and including the peak, and 32 points after the peak; pseudo-rotating (12) this selected portion of the voltammogram; and then taking the Fourier transform of the rotated curve with SUBROUTINE FORT (13). This process of feature extraction eliminates peak magnitude and peak position information. Real data voltammograms were processed by an identical procedure except that (1) the curves were background-corrected by blank subtraction, and (2) after the voltammogram was scaled to 1.0, it was smoothed by application of a Fourier transform smooth (14).
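The feature extraction steps just described can be sketched in a few lines. This is only an illustration in Python with NumPy, not the FORTRAN routine cited in the text. Two details are assumptions: the pseudo-rotation is taken to mean subtracting the straight line through the first and last points of the window, and the coefficients are laid out as Re(0), Re(1), Im(1), Re(2), Im(2), and so on. The function name `extract_features` and its parameters are hypothetical.

```python
import numpy as np


def extract_features(current, n_before=96, n_after=32, n_coeff=15):
    """Shape-only Fourier features for one voltammogram (illustrative sketch)."""
    y = np.asarray(current, dtype=float)
    peak = int(np.argmax(np.abs(y)))  # index of the peak current
    start, stop = peak - n_before + 1, peak + n_after + 1
    if start < 0 or stop > y.size:
        raise ValueError("peak is too close to the edge of the record")
    # Scale the peak height to 1.0 and keep 96 points up to and including
    # the peak plus 32 points after it (128 points in total).
    window = y[start:stop] / y[peak]
    # Assumed form of the pseudo-rotation: remove the line joining the end
    # points so that both ends of the window sit at zero.
    ramp = np.linspace(window[0], window[-1], window.size)
    rotated = window - ramp
    spectrum = np.fft.rfft(rotated)
    # Assumed coefficient layout, ordered by increasing frequency.
    pairs = np.column_stack((spectrum.real[1:], spectrum.imag[1:])).ravel()
    return np.concatenate(([spectrum.real[0]], pairs))[:n_coeff]
```

Because the window is positioned relative to the peak and divided by the peak current, a curve that is shifted along the potential axis or rescaled in magnitude yields the same feature vector, which is the property the procedure relies on.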
Real data whose kinetic parameters were predicted by this method were obtained under digital control with the laboratory computer system. The data were displayed on the storage display monitor and, if acceptable, were transferred via the computer-computer interface to the pattern recognition computer, which transforms the voltammogram to the Fourier domain.
RESULTS AND DISCUSSION
The effects of each of the heterogeneous kinetic parameters on the shape of stationary electrode polarograms are illustrated in Figures 1, 2, and 3. Since the prediction procedure eliminates peak height information by scaling all curves, and eliminates peak position information by selecting the region of the voltammogram relative to the peak, only curve shape effects are important. In Figure 1, the effect of an increased number of transferred electrons is to sharpen the voltammogram. The effect of decreasing ψ in Figure 2 is to broaden the voltammogram. The influence of α as shown in Figure 3 is only seen under quasi-reversible and irreversible conditions
Table II entries (average percent deviation):
row   data treatment                                                  ψ      α      n
1     Autoscaled data. No weighting of nearest neighbors.             35.4   19.3   2.6
2     Autoscaled data. Reciprocal relative distance weighting.        28.5   16.2   3.1
3     Non-autoscaled data. Reciprocal relative distance weighting.    85.6   17.2   5.2
(i.e., for ψ < 7). An increased α value sharpens the voltammogram, and this effect is more pronounced under more irreversible conditions. Since the prediction procedure will extract all three kinetic parameters from a single voltammogram, the best choice of conditions is where all three factors influence the wave shape. If the electrochemical process is reversible, α cannot be determined accurately since it will not affect the wave shape. In order to implement the prediction procedure, a more compact representation of wave shape information from the voltammogram is desired. Such a representation is the Fourier transform as suggested by Foley (15). The synthetic data curves were represented by their Fourier transform coefficients according to the procedure in the Experimental section, and the data were autoscaled (16). The curves were stored in a random order in the disc file. The potential for success of this prediction procedure was demonstrated by dividing the training set into two subsets and using one of these as a synthetic prediction set. The kinetic parameters were predicted for the first 50 members of the training set with the use of the remaining training set for comparison. (Because of the large size and random order of the data set, this approach is comparable to a leave-one-out procedure.) The first 20 Fourier coefficients were used in the distance calculation and the four nearest neighbors in the prediction calculation. The predicted kinetic parameters were an average of the parameters of the four closest neighbors. The results in Table II (row 1) are the average percent deviations of the 50 predictions. These percentages are good considering the quantization in the training set and the large range of conditions for the electrochemical behavior. The desire for a more continuous parameter value in the training set is offset by the enormous number of patterns required when the range of the kinetic parameters is considered.
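The autoscaling step and the error measure reported in Table II can be illustrated as follows. The sketch assumes that autoscaling means scaling each Fourier coefficient to zero mean and unit variance over the training set, which is the usual meaning of the term in pattern recognition work; the exact convention of reference 16 is not reproduced here. The function names `autoscale` and `average_percent_deviation` are hypothetical.

```python
import numpy as np


def autoscale(train_features):
    """Scale each feature column to zero mean and unit variance.

    Returns the scaled array plus the column means and standard deviations,
    so that the same transformation can be applied to unknown patterns.
    """
    train_features = np.asarray(train_features, dtype=float)
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0, ddof=1)
    std = np.where(std == 0.0, 1.0, std)  # guard against constant columns
    return (train_features - mean) / std, mean, std


def average_percent_deviation(predicted, true):
    """Mean absolute deviation of predictions, as a percentage of true values."""
    predicted = np.asarray(predicted, dtype=float)
    true = np.asarray(true, dtype=float)
    return 100.0 * np.mean(np.abs(predicted - true) / np.abs(true), axis=0)
```

With a randomly ordered file, holding out the first 50 patterns amounts to slicing `features[:50]` as the prediction set and `features[50:]` as the reference set, with the scaling statistics taken from the reference set only.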
Even with these errors, the prediction procedure clearly distinguishes between quantized parameter levels. An improvement in prediction accuracy is achieved if a weighted prediction calculation is used. The reciprocal of the relative distance from the pattern in question to each of its nearest neighbors in the training set was used to calculate a weighted average for prediction, and these results are given in row 2 of Table II.
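A minimal sketch of the nearest-neighbor prediction, with and without reciprocal-distance weighting, is given below. It assumes that "relative distance" means each neighbor's distance divided by the sum of the k neighbor distances; once the weights are normalized, any such common scale factor cancels. The handling of an exact match (zero distance) is also an assumption, and the function name `knn_predict` is hypothetical.

```python
import numpy as np


def knn_predict(query, train_features, train_params, k=4, weighted=True):
    """Predict kinetic parameters as a (weighted) mean over the k nearest patterns."""
    train_features = np.asarray(train_features, dtype=float)
    train_params = np.asarray(train_params, dtype=float)
    # Euclidean distance from the unknown pattern to every reference pattern.
    distances = np.linalg.norm(train_features - np.asarray(query, float), axis=1)
    nearest = np.argsort(distances)[:k]
    d = distances[nearest]
    if not weighted:
        weights = np.full(d.size, 1.0 / d.size)
    elif d[0] == 0.0:
        # Assumption: an exact match takes all of the weight.
        weights = (d == 0.0).astype(float)
        weights /= weights.sum()
    else:
        weights = 1.0 / (d / d.sum())  # reciprocal of the relative distance
        weights /= weights.sum()
    return weights @ train_params[nearest], d
```

The returned distances are kept alongside the prediction because the distance to the nearest neighbors is itself used later as a diagnostic of whether a voltammogram is acceptable.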
Table III. Literature Values of Heterogeneous Kinetic Parameters for Electrochemical Systems
system                               k_s, cm/s            α               D_O × 10^6, cm²/s   D_R × 10^6, cm²/s
Cd²⁺ in 1.0 M Na₂SO₄                 0.15 (17, 18)        0.30 (17, 18)   6.00 (4)            16.1 (19)
Tl⁺ in 1.0 M KNO₃                    0.3 (20)             0.8 (20)        17.5 (21)           9.9 (22)
Mn²⁺ in 1.0 M KCl                    5 × 10^[?] (23)      0.35 (23)       7.2 (23)            9.0 (23)
Pb²⁺ in 1.0 M HClO₄                  0.9 (24)             0.5 (24)        10.0 (24)           13.6 (25)
Pb²⁺ in 0.9 M NaClO₄ + 0.1 M NaOH    6.0 × 10^-[?] (24)   0.51 (24)       10.0 (24)           13.6 (25)

Table IV. Fourier Coefficients for a Typical Voltammogram (a)
coefficient no.   value
1                 3.19
3                 7.91
5                 -11.3
7                 -0.219
9                 4.02
11                -1.41
13                -1.15
15                0.744
17                0.0955
19                -0.327
21                0.0261
23                0.0677
25                -0.0524
(a) Coefficients ordered as increasing frequency.

The accuracy of this prediction procedure was tested with real electrochemical systems of known behavior. The major prerequisite of this procedure of kinetic parameter evaluation is that the electrochemical behavior of the chemical system in question must match the assumptions under which the theoretical voltammograms were generated. The theory which was used here assumes an uncomplicated electrode process characterized only by heterogeneous electron transfer kinetics and diffusion mass transfer. Coupled chemical reactions or adsorption on the electrode surface will alter the shape of the voltammogram so that it will not be represented by a theoretical curve in the data file. The chemical systems listed in Table III were chosen because their kinetics are well characterized and they are assumed to be uncomplicated electrode processes. In order to calculate ψ as used by Nicholson (11), the heterogeneous rate constant, the transfer coefficient, and the diffusion coefficients must be known. For each chemical system studied here, these values are listed in Table III. To demonstrate the application of this procedure to real chemical systems, we have conducted voltammetric experiments as nearly as possible under the same conditions as other
workers. The first system chosen for the prediction procedure was cadmium(II), and the results did not closely match the results of other workers. The distances from the real pattern to its nearest synthetic neighbors were much larger than the corresponding distances for a synthetic pattern of the same assumed kinetic parameters, and replicate voltammograms did not give reproducible predictions. These problems were attributed to noise on the real voltammograms. The Fourier transform can be used to separate the noise components from the primary signal. The major signal components should be concentrated at low frequencies in the Fourier domain since the signal changes relatively slowly with time as compared to the noise. Therefore, low frequency components should be more reliable than high frequency components for prediction based on wave shape. Table IV lists values of the Fourier coefficients for a typical voltammogram. The coefficients corresponding to higher frequencies clearly have lower absolute magnitudes. Thus, to minimize the influence of noise, which would contribute more to the higher frequency values, the data were not autoscaled, taking advantage of the natural weighting of the coefficients. The theoretical prediction accuracy for this method of data representation is given in row 3 of Table II. This method is less accurate than the autoscaled representation, but more suitable for real data. With this method, the use of more coefficients in the distance calculation has little effect because of their decreased magnitude. For a data set with higher amplitude noise components, it might be more appropriate to use autoscaled features with weights assigned to diminish the influence of higher frequency components. The chemical systems listed in Table III were used to test this prediction procedure. The results are given in Table V. All blanks were the result of 50 ensemble averages and the voltammograms were ensemble averaged 20 times.
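The literature ψ values used for comparison follow from the Table III constants. The sketch below uses the form of Nicholson's dimensionless parameter that is commonly quoted, ψ = γ^α k_s / (π a D_O)^(1/2) with γ = (D_O/D_R)^(1/2) and a = nFv/RT. The paper's own Equations 1 to 3 are not reproduced in this copy, so this form, the temperature of 25 °C, and the function name `nicholson_psi` should be read as assumptions.

```python
import math

F = 96485.0  # Faraday constant, C/mol
R = 8.314    # gas constant, J/(mol K)


def nicholson_psi(k_s, alpha, d_ox, d_red, n, scan_rate, temperature=298.15):
    """Dimensionless kinetic parameter psi (assumed Nicholson form).

    k_s in cm/s, diffusion coefficients in cm^2/s, scan_rate in V/s.
    """
    a = n * F * scan_rate / (R * temperature)  # s^-1
    gamma = math.sqrt(d_ox / d_red)
    return gamma ** alpha * k_s / math.sqrt(math.pi * a * d_ox)


# Table III constants for Cd(II) and Tl(I) at the 1.00 V/s scan rate.
psi_cd = nicholson_psi(0.15, 0.30, 6.00e-6, 16.1e-6, n=2, scan_rate=1.00)
psi_tl = nicholson_psi(0.3, 0.8, 17.5e-6, 9.9e-6, n=1, scan_rate=1.00)
print(round(psi_cd, 1), round(psi_tl, 1))  # 3.4 8.1
```

Under these assumptions the computed values agree with the literature ψ entries of 3.4 and 8.1 listed for cadmium and thallium in Table V. Since ψ varies as the inverse square root of the scan rate, a fourfold increase in scan rate halves ψ.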
This was done to further reduce the influence of noise on the prediction. The signal-to-noise ratio was 1200 for the thallium reduction. In this case the noise was measured as the standard deviation of the first 20 data points at the background current level. Concentrations of electroactive species were adjusted to give nearly equal peak currents for each chemical system. The numbers reported in Table V are the average of five replicate
Table V. On-Line Kinetic Parameter Prediction Results
system                                            ψ pred          ψ lit   α pred          α lit   n pred          n lit
6 × 10^[?] M Cd²⁺ in 1.0 M Na₂SO₄                 4.0 ± 0.2       3.4     0.35 ± 0.2      0.30    2.04 ± 0.008    2
1 × 10^[?] M Tl⁺ in 1.0 M KNO₃                    4.4 ± 0.5       8.1     0.68 ± 0.02     0.80    1.01 ± 0.01     1
7 × 10^[?] M Mn²⁺ in 1.0 M KCl                    0.70 ± 0.005    1.1     0.26 ± 0.001    0.35    1.97 ± 0.[?]    2
5 × 10^[?] M Pb²⁺ in 1.0 M HClO₄                  5.0 ± 0.[?]     17.0    0.20 ± 0.[?]    0.50    2.08 ± 0.004    2
7 × 10^[?] M Pb²⁺ in 0.9 M NaClO₄ + 0.1 M NaOH    1.2 ± 0.1       1.1     0.38 ± 0.03     0.51    2.13 ± 0.1      2
Figure 4. Feature space plot of real and theoretical data. s = theoretical data. Real data: 1 × 10^[?] M Zn(II) in 1.0 M KCl and 1.0 mM HCl. (1) = 1.00 V/s, (2) = 2.00 V/s, (4) = 4.00 V/s. Axes are Fourier transform coefficients 4 and 5.
predictions and the uncertainty is the standard deviation of this average. For these predictions, 15 coefficients were used in the distance calculation and three nearest neighbors in the prediction calculation. The predicted values match the literature values very well except where the electron transfer is reversible (ψ > 7), because changing ψ or α has little effect on the wave shape under these conditions. The data could be retaken at increased scan rate to observe the change in ψ. However, even an increase to 4.00 V/s would reduce ψ by only one-half. With the uncertainties involved and the quantization inherent in the training set, the changes would be difficult to demonstrate. The present instrumentation prevents scan rates higher than 4.00 V/s from being easily obtained. This prediction procedure has several diagnostics which indicate when the voltammograms may not be acceptable for prediction. The first diagnostic is the distance measure. The theoretical data provide an indication of an acceptable distance measure for a given set of kinetic parameters. If the calculated distance greatly exceeds the expected distance, an abnormality in the data can be assumed. Also, if replicate voltammograms do not produce similar predictions, the data can be suspect. These abnormalities can be caused by excessively noisy data or, more seriously, by data whose chemical behavior does not agree with the assumptions of the theoretical data. An evaluation of agreement between real and theoretical data can be made by a two-dimensional plot of coefficients obtained from the Fourier transform. Such a graph of Zn(II) data in 1.0 M KCl and 1.0 mM HCl is shown in Figure 4. The real data do not lie with the synthetic data, which is indicative of a complicated electron transfer. In this case the Zn(II) adsorbs on the mercury electrode prior to electron transfer (26). Therefore, the kinetic parameters for the Zn(II) system cannot
be predicted with this procedure. These Zn(II) data suggest that more information is present in the data than is currently being used. An alternative similarity measure to the Euclidean distance was investigated. This method simply normalized all current-voltage curves to 1.0, selected a specified number of points before and after the peak, and calculated the sum of the absolute values of the residuals between the voltammogram in question and each curve in the training set. The most similar curve has the smallest residual value. This alternative similarity measure performs as well as the method based on Euclidean distance in Fourier space, but it is not sensitive to unacceptable voltammograms. For example, both methods were used to predict the kinetic parameters of a set of voltammograms which actually consisted of severely overlapped, two-component systems. The overlapped curves were formed by taking different linear combinations of two Cd(II) voltammograms displaced