Anal. Chem. 1983, 55, 1713-1718


Computerized Method for Mechanistic Classification of One-Electron Potentiostatic Current-Potential Curves

James F. Rusling

Department of Chemistry, U-60, University of Connecticut, Storrs, Connecticut 06268

A computerized method for mechanistic classification of one-electron, potentiostatic current-potential curves has been developed. The procedure is guided by a hierarchical tree, the branch points of which represent binary decisions based predominantly on deviation-pattern recognition. Characteristics of standard mechanisms are stored in binary form within the program, and confirmation of the final result is provided by nonlinear regression analysis. An acceptable classification is reached when regression of the data onto the appropriate equation gives a random deviation plot and reasonable parameters. Tests with simulated polarograms showed that for systems exhibiting a single wave, 100% correct classification resulted when random noise was smaller than about 0.85% of the limiting current.

Dc polarography, normal-pulse polarography (NPP), normal-pulse voltammetry (NPV), and rotating-disk-electrode voltammetry (RDV) are considered potentiostatic methods since the scan rate is generally small enough so that current is measured at an essentially constant potential. The shapes of the current-potential (i-E) curves depend on the mechanism of the electrode reaction. Classically, analysis of the shapes of curves obtained with these techniques has taken the form of “log plots” of E vs. a logarithmic function of i (1, 2). Identification of the most likely mechanism consisted in finding the “best fit” of i-E data to straight lines with different logarithmic forms representing different mechanisms. There are serious inherent disadvantages to this approach, however. The first is that some reactions, notably quasi-reversible electron transfers, yield nonlinear log plots. Second, differences in wave shape for closely related mechanisms are small, and linearization makes these shapes harder to distinguish by propagating error in the logarithmic term. Nonlinear regression avoids some of the problems of classical logarithmic analysis by removing the linearization step. Analysis of dc polarograms by nonlinear regression onto equations of the form i = f(E) has been used to differentiate between single and overlapped waves (3) and between waves controlled by diffusion or by reversible electron transfer followed by dimerization (4). Differentiation between mechanisms in the above cases was facilitated with plots of Δi = [i(measd) − i(calcd)]/SD vs. E, known as deviation plots (3, 5). Here, SD is the standard deviation of the regression, and i(measd) and i(calcd) are the measured and calculated currents at potential E. A random deviation plot indicates a good fit of the regression equation to the data; a nonrandom plot indicates that the equation does not describe the experimental data.
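As an illustration of the deviation plot defined above, the following minimal Python sketch scales the residuals by the standard deviation of the regression. The function name `deviation_plot` is hypothetical, and the formula for SD (residual sum of squares divided by the number of points minus the number of fitted parameters) is an assumption, since the text does not spell it out.

```python
import math


def deviation_plot(i_measured, i_calculated, n_params):
    """Return (delta_i, sd) for a fitted i-E curve.

    delta_i[k] = (i_measured[k] - i_calculated[k]) / sd, as in the text.
    ASSUMPTION: sd = sqrt(SS / (n - p)), the usual regression estimate.
    """
    residuals = [m - c for m, c in zip(i_measured, i_calculated)]
    dof = len(residuals) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    sd = math.sqrt(sum(r * r for r in residuals) / dof)
    return [r / sd for r in residuals], sd


# Illustrative numbers only (not data from the paper).
delta_i, sd = deviation_plot([0.10, 0.52, 0.88, 1.01],
                             [0.12, 0.50, 0.90, 1.00],
                             n_params=3)
```

Plotting `delta_i` against potential gives the deviation plot; a random scatter about zero would then indicate an adequate model.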
In many cases, nonrandom deviation plots have characteristic shapes which can lead to a choice of the correct model to describe the data, using a procedure known as deviation-pattern recognition (5).

Recently, the advantages of automating methods for computerized analysis of electrochemical data have been realized (6-10). Meites and Shia developed a method for automatic classification of charge vs. time curves from controlled-potential coulometry using deviation-pattern recognition (6, 7). Reilley et al. (8) automated the analysis of data from double-potential step chronoamperometry, chronocoulometry, and chronoabsorptimetry by simplex fitting of results to a library of stored working curves representing various electrochemical mechanisms. Harrison (9) described an instrumental system combining five electrochemical methods with the capability for fitting data to a library of mechanisms. Schachterle and Perone (10) used pattern recognition to classify Fourier-transformed cyclic voltammograms using k-nearest neighbor methods and obtained correct mechanisms for 93% of experimental data.

The aim of the work discussed in this paper is to develop a method for automated, computerized mechanistic analysis using potentiostatic current-potential curves. Such a system should include a classification scheme for rapidly locating the most likely mechanism, low requirements for storage space, and a final confirmation of the mechanism by fitting the data and computing relevant physical parameters. Of the techniques discussed above, only that of Meites and Shia (6, 7) fulfills these criteria. The methods of Reilley et al. (8) and of Harrison (9) require either fitting of the data to all the mechanisms in the library, a formidable computational task, or an arbitrary choice by the chemist of the “best” members of the library to employ. Pattern recognition yields a potentially faster classification (10), but does not provide a final confirmation of the mechanism. Several (8, 10) of the above methods require storage of large data files. Thus, we have based our technique for potentiostatic i-E curves on deviation-pattern recognition, employing a hierarchical tree for rapid location of class. Characteristics of standard mechanisms are stored in binary form within the program, and confirmation of the final mechanism is provided by nonlinear regression analysis.
The computations were designed to be used with a 48K microcomputer but could be used with any computer with sufficient memory. In this paper, development and operation of the method, as well as results obtained with noisy simulated polarograms, are described. Applications to experimental data from dc polarography, NPP, NPV, and RDV will be discussed in a subsequent report (11).

THEORY

The mechanisms considered in this study have been limited to one-electron reductions or oxidations. It is assumed for reductions that only the oxidized form of the redox couple (or the reduced form for oxidations) is initially present in solution and that the limiting current indicates a one-electron process by comparison with standard diffusion-controlled reactions. The latter restriction rules out electrode processes controlled by the rate of a preceding or catalytic homogeneous reaction, the limiting currents of which are easily recognized as smaller or larger than, respectively, the diffusion-controlled limiting current. Mechanistic classes considered, equations describing i-E curves, and parameters for regression analyses are listed in Table I. All mechanisms give sigmoid i-E curves, with slightly different shapes for each class. First-order CE and EC mechanisms have potentiostatic i-E curves identical with diffusion-controlled waves when the chemical steps are

0003-2700/83/0355-1713$01.50/0 © 1983 American Chemical Society


ANALYTICAL CHEMISTRY, VOL. 55, NO. 11, SEPTEMBER 1983

Table I. Mechanistic Classes Considered for Automated Analysis of Potentiostatic i-E Curves

class        reaction scheme               equation(b)                                       parameters                   ref
1. EC2       O ⇌ R; 2R → D                 i + [term lost] − i_l B = 0                       E1/2, S, i_l                 1, 4, 12, 13
2. DIFF-R    O ⇌ R                         i = i_l B/(1 + e), S < 0.029                      E1/2, S, i_l                 1-3
             Z ⇌ O; O ⇌ R
             O ⇌ R; R → P (EC)(a)
3. DIFF-I    O → R                         i = i_l B/(1 + e), S > 0.029                      E1/2, S, i_l                 1, 2
4. EC2A      O ⇌ R_ads; 2R_ads → D         i + [term lost] − i_l B = 0                       E1/2, S, i_l                 13
5. D-M       O ⇌ 2R                        i + e i^2 − i_l B = 0                             E1/2, S, i_l                 14-16
6. QR(c)     O ⇌ R                         i = i_l B'/(1 + e + k'e')                         E°', Sα^-1, k'               1, 17
7. W2(d)     empirical model for ads.,     i = i_l B'[{f/(1 + e_1)} + {(1 − f)/(1 + e_2)}]   f, E1/2,1, S_1, E1/2,2, S_2  3, 4
             maxima
8. NOMECH

(a) Cannot be distinguished from diffusion-controlled reactions when chemical step is fast.
(b) e = exp[(E − E1/2)/S]; S = RT/F; B = [1 − b(E − E1/2)]; b = (plateau slope − base line slope).
(c) Quasi-reversible reaction: e' = exp[(E − E°')/Sα^-1], k' = 1.13 D_O^(1/2)/(k°sh t^(1/2)) for dc polarography, where k°sh is the standard heterogeneous rate constant, t is drop time, and D_O is the diffusion coefficient of O. For RDV, k' = m_O/k°sh, where m_O is the mass transfer coefficient for O.
(d) B' = [1 − b(E − E_M)]; e_j = exp[(E − E1/2,j)/S_j], j = 1, 2; f = i_l,1/i_l; E_M = effective E1/2 for total wave.
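To make the simplest entry of Table I concrete, the sketch below evaluates the diffusion-controlled equation i = i_l B/(1 + e) and generates a noisy simulated polarogram spanning 5 to 95% of the limiting current, with equally spaced potentials as described in the text. The function names, the default half-wave potential, and the use of Gaussian noise scaled to the limiting current are illustrative assumptions, not details taken from the paper's program.

```python
import math
import random


def diff_current(E, E_half, S, i_l, b=0.0):
    """Diffusion-controlled wave: i = i_l * B / (1 + e).

    e = exp[(E - E_half) / S] and B = 1 - b * (E - E_half), following the
    footnote definitions of Table I.
    """
    e = math.exp((E - E_half) / S)
    B = 1.0 - b * (E - E_half)
    return i_l * B / (1.0 + e)


def simulate_polarogram(n_points=30, E_half=-0.500, S=0.0257, i_l=1.0,
                        noise_fraction=0.0085, seed=0):
    """Return (potentials, currents) covering 5-95% of i_l.

    ASSUMPTIONS: b = 0 and normally distributed noise whose standard
    deviation is noise_fraction * i_l (0.85% by default).
    """
    rng = random.Random(seed)
    span = S * math.log(19.0)          # 1/(1 + 19) = 0.05, 1/(1 + 1/19) = 0.95
    E_first, E_last = E_half + span, E_half - span
    step = (E_last - E_first) / (n_points - 1)
    potentials = [E_first + k * step for k in range(n_points)]
    currents = [diff_current(E, E_half, S, i_l)
                + rng.gauss(0.0, noise_fraction * i_l)
                for E in potentials]
    return potentials, currents


potentials, currents = simulate_polarogram()
```

With S = 0.0257 V the curve corresponds to the DIFF-R class (S < 0.029); a larger S would mimic DIFF-I.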

Figure 1. Type A and B theoretical deviation patterns.

fast. Thus, three mechanisms comprise each diffusion-controlled class (DIFF-R, DIFF-I, and QR). The overlapping wave (W2) class has been included because of its ability to serve as an empirical model describing polarograms influenced by weak adsorption (3) or weak maxima (4).

The automated classification scheme employs binary decisions, most of which are based on the shapes of deviation plots obtained after nonlinear regression analysis of i-E data. If the equation used to fit the data is correct, the deviation plot consists of a random scatter of points around a residual (Δi) of zero. If the equation is incorrect, systematic errors of interpretation cause the deviation plot to consist of points randomly scattered about a smooth curve with a characteristic shape. Comparison of this shape with "deviation patterns" obtained from theoretical data representing different possible types of electrochemical behavior is then used to help choose a more correct equation for additional regression analyses (3, 5-7).

Theoretical deviation patterns were obtained by regression of data simulated with the equations in Table I onto each of the equations of classes 1 to 5. Three main types of deviation patterns (Figure 1) were found: type A, having two maxima and one minimum along the E axis in the order max-min-max; type B, having two minima and one maximum along the E axis in the order min-max-min; and type C, with random scatter around a residual of zero. The shapes of the deviation plots can be used to differentiate the mechanisms into groups. Plots obtained after regression of data representing each

Figure 2. Hierarchical tree for classification of mechanisms.

mechanism onto the EC2 equation, for example, fall into three groups: a random deviation plot confirms the EC2 mechanism; a type A plot indicates that the data arose from DIFF-R, DIFF-I, D-M, or W2 mechanisms; a type B plot suggests the EC2A mechanism. Deviation plots for the QR mechanism fall into all of the three classes, depending on the value of the standard heterogeneous rate constant. Deviation plots for the other mechanisms were very well defined, however, since their parameter values are closely fixed by theory. This was true even for the DIFF-I and W2 mechanisms, where some variation of the theoretical parameters is allowed.

The hierarchical tree (Figure 2), which guides the classification process, was developed from the clustering of mechanisms obtained by observing the shapes of deviation patterns. For example, regression of simulated polarograms onto an arbitrarily chosen equation from each multimember group obtained after the EC2 regressions provided further differentiation. This process was continued until each group contained a single mechanistic class. Each branch point in the completed tree represents a binary decision (Table II) based on the results of a prior regression analysis.

In a typical classification, data are acquired as a series of currents between 5 and 95% of the limiting current. Points are equally spaced along the E axis. Initial input required by the program consists of estimated limiting current, standard


Table II. Decisions in Automated Classification Scheme

no.   follows:(a)   decision
1     EC2           if DP random → 2; if DP nonrandom → 3
2     EC2/1         if S > 0.029 → 5; if S < 0.022 → NO MECH
2'    EC2/2         if 0.022 < S < 0.029 → EC2
3     EC2/1         if DP type A → 6; if DP type B → 4
4     EC2A/3        if DP random and S < 0.029 → EC2A; if DP nonrandom or S > 0.029 → 5
5     QR/2          if DP random → QR; if DP nonrandom → NO MECH
6     DIFF/3        if DP random → 8; if DP nonrandom → 7
7     DIFF/6        if DP type A → 9; if DP type B → 10
8     DIFF/6        if S > 0.029 → DIFF-I; if S < 0.029 → DIFF-R
9     D-M/7         if DP random → D-M; if DP nonrandom → NO MECH
10    QR/7          if DP random → QR; if DP nonrandom → 11
11    W2/10         if DP random → W2; if DP nonrandom → NO MECH

(a) Immediately previous curve fit/immediately previous decision number. DP = deviation plot. Number following arrow indicates next decision to be made. If mechanistic class follows arrow, classification ends there.

Figure 3. Deviation plot from regression of simulated data from a DIFF-R polarogram containing 0.85% noise onto the EC2 equation.
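The routing in Table II can be sketched as a small state machine. This is a hedged illustration, not the original program: `decide`, `classify`, and the user-supplied `fit` callable are hypothetical names, decisions 2 and 2' are merged into one step, and two behaviors the table leaves open are assumptions (a deviation plot that is neither type A nor type B at decisions 3 and 7 is sent to NO MECH, and S exactly equal to a boundary value falls into the last branch tested).

```python
# Equation that must have been fitted before each decision ("follows" column).
FIT_FOR_DECISION = {1: "EC2", 2: "EC2", 3: "EC2", 4: "EC2A", 5: "QR", 6: "DIFF",
                    7: "DIFF", 8: "DIFF", 9: "D-M", 10: "QR", 11: "W2"}


def decide(number, dp, S):
    """One row of Table II. dp is 'random', 'A', or 'B'; S is the fitted slope (V).

    Returns the next decision number (int) or a final class name (str).
    """
    is_random = dp == "random"
    if number == 1:
        return 2 if is_random else 3
    if number == 2:                      # decisions 2 and 2' combined
        if S > 0.029:
            return 5
        if S < 0.022:
            return "NO MECH"
        return "EC2"
    if number == 3:
        return {"A": 6, "B": 4}.get(dp, "NO MECH")   # fallback is an assumption
    if number == 4:
        return "EC2A" if (is_random and S < 0.029) else 5
    if number == 5:
        return "QR" if is_random else "NO MECH"
    if number == 6:
        return 8 if is_random else 7
    if number == 7:
        return {"A": 9, "B": 10}.get(dp, "NO MECH")  # fallback is an assumption
    if number == 8:
        return "DIFF-I" if S > 0.029 else "DIFF-R"
    if number == 9:
        return "D-M" if is_random else "NO MECH"
    if number == 10:
        return "QR" if is_random else 11
    if number == 11:
        return "W2" if is_random else "NO MECH"
    raise ValueError(f"unknown decision number: {number}")


def classify(fit):
    """Walk the tree. fit(equation_name) must return (dp, S) for that regression."""
    number, cache = 1, {}
    while True:
        equation = FIT_FOR_DECISION[number]
        if equation not in cache:        # each regression is run at most once
            cache[equation] = fit(equation)
        outcome = decide(number, *cache[equation])
        if isinstance(outcome, str):
            return outcome
        number = outcome


# Stand-in for real regressions: a type A plot after EC2, random after DIFF.
fake_fits = {"EC2": ("A", 0.020), "DIFF": ("random", 0.0257)}
result = classify(lambda equation: fake_fits[equation])   # -> "DIFF-R"
```

Caching each regression result reflects the fact that several consecutive decisions follow the same curve fit.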

error of the current measurements, the potential of the first current value, and the difference (b) between the slopes, in μA/V, of the plateau and the base line (cf. Table I).

The first nonlinear regression analysis is onto the EC2 equation. Initial parameters (Table I) are E1/2, calculated from the i-E data, S = 0.0257 V, and the estimated limiting current. The nonlinear regression subroutine (18) returns least-squares estimates of the best values of the parameters and constructs the deviation plot. Decision number 1 (Figure 2 and Table II) must now be made, which results in routing to decisions 2 or 3, and so on. Decisions may require an additional regression analysis, testing of parameters, or acceptance of a mechanistic hypothesis and termination of the program. If additional regression analyses are required, the new initial value of S corresponds to its theoretical value in the new equation. Initial values of i_l and E1/2 are taken from the previous fit. If a fit to the QR equation is needed, initial values of parameters are determined from empirical equations developed from simulated data and based on the value of E3/4 calculated from the i-E data. Five-parameter fits to the W2 equation are carried out with a separate regression program. These are discussed in more detail in the following paper (11).

A fundamental part of the classification procedure is a computer routine which determines the shapes of deviation plots. Because of the low S/N inherent in the deviation plots (Figure 3), the routine for their analysis must extract features (i.e., peaks) which may be nearly obscured by noise (5-7). Following a suggestion by Meites (6), the deviation plot was smoothed by calculating D_N from the residuals Δi_j, where

D_N = \sum_{j=N}^{N+2} \Delta i_j    (1)

and N goes from 1 to N_t, the total number of data points. The smoothed deviation plot (Figure 4) reveals features which are more easily identified by the computer. These considerations become more important as S/N decreases. The smoothed deviation plot is analyzed for the presence or absence of peaks by successively testing absolute values of D_N to see if they exceed 3.3 (SD). Assuming only normally distributed error in i, the probability that any value of Δi

would exceed 1.1 (SD) is 0.136. The probability that D_N would exceed 3.3 (SD) is (0.136)^3 = 2.5 × 10^-3. Thus, 3.3 (SD) corresponds to a threshold definition for peaks in the smoothed deviation plot. If no value of D_N exceeds this threshold, the deviation plot is considered random. If values of D_N do exceed 3.3 (SD), the signs and N values of the largest positive and negative peaks are stored. Subsequently, several values of D_N on either side of each peak are eliminated and the remaining D_N are tested for values which exceed 3.0 (SD). Since theoretical deviation patterns exhibited at least three peaks each, the threshold has been lowered in this second test to increase the probability of locating the secondary peaks. Signs and N values of any secondary peaks are also stored.

To facilitate a decision about randomness of the deviation plot, peak information is condensed into the binary number F4 F3 F2 F1 with initial value 0000, from which its base 10 representation FT is calculated. If at least one positive peak has been found, F1 = 1; if there is one negative peak, F2 = 1. If two positive peaks are found, F1 = F3 = 1; for two negative peaks, F2 = F4 = 1. If only one peak of any sign has been found, FT ≤ 2 and the deviation plot is considered random. This definition has been chosen because of the finite probability (2.5 × 10^-3) that a single peak represents an outlier.

If FT > 2, the smoothed deviation plot is tested for the type A and type B shapes. For this purpose, the plot is divided along the E axis into four zones of approximately equal width (cf. Figure 4). These zones are defined by the regions in which peaks were found in theoretical deviation patterns (Table III). The peak information is stored in the binary numbers J1 J2 J3 J4 and K1 K2 K3 K4, both with initial values of 0000, from which the base ten representations JP and KN are computed.

Figure 4. Smoothed deviation plot obtained from data in Figure 3. Bold numbers at top denote zones for deviation-pattern recognition.
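The smoothing, two-threshold peak search, and FT code described above can be sketched in Python as follows. Several details are assumptions rather than facts from the paper: the function names, the truncation of the three-point window at the end of the data, and the number of neighboring points eliminated around each peak (`guard=3`, since the text says only "several"). The residuals are taken to be already divided by SD, so the thresholds are simply 3.3 and 3.0.

```python
def smooth(delta_i):
    """D_N = sum of three consecutive residuals starting at N (eq 1).

    ASSUMPTION: windows running past the last point are truncated.
    """
    return [sum(delta_i[n:n + 3]) for n in range(len(delta_i))]


def find_peaks(D, primary=3.3, secondary=3.0, guard=3):
    """Return [(index, sign), ...] for primary and secondary peaks.

    Pass 0 looks for the largest positive and negative D beyond `primary`;
    if none is found the plot is random and [] is returned. Points within
    `guard` of each peak are then excluded and pass 1 repeats the search
    with the lower `secondary` threshold.
    """
    masked = [False] * len(D)
    peaks = []

    def largest(sign, threshold):
        best = None
        for n, d in enumerate(D):
            if masked[n] or sign * d <= threshold:
                continue
            if best is None or sign * d > sign * D[best]:
                best = n
        return best

    for pass_no, threshold in enumerate((primary, secondary)):
        found = [(largest(sign, threshold), sign) for sign in (+1, -1)]
        found = [(n, sign) for n, sign in found if n is not None]
        if pass_no == 0 and not found:
            return []
        for n, sign in found:
            peaks.append((n, sign))
            for k in range(max(0, n - guard), min(len(D), n + guard + 1)):
                masked[k] = True
    return peaks


def ft_code(peaks):
    """Base-10 value of the binary number F4 F3 F2 F1."""
    n_pos = sum(1 for _, sign in peaks if sign > 0)
    n_neg = sum(1 for _, sign in peaks if sign < 0)
    F1, F2, F3, F4 = n_pos >= 1, n_neg >= 1, n_pos >= 2, n_neg >= 2
    return F1 + 2 * F2 + 4 * F3 + 8 * F4


# Idealized max-min-max residuals (type A shape), for illustration only.
delta = [0.0] * 30
for k in (2, 3, 4, 24, 25, 26):
    delta[k] = 2.0
for k in (13, 14, 15):
    delta[k] = -2.0
FT = ft_code(find_peaks(smooth(delta)))   # FT > 2, so the plot is nonrandom
```

A plot with a single isolated outlier yields FT of 1 or 2 and is therefore treated as random, which is the behavior the text motivates.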


Table III. Peak Position Ranges for Theoretical Deviation Patterns Obtained after Nonlinear Regression onto the EC2 Equation

                    ranges for peaks, N(a)
class     N_t(b)    1       2       3       4       DP type
EC2       26                                        C
D-M       26        0-7     9-17    18-23   24-26   A
DIFF-R    35        0-8     9-19    20-29   30-35   A
W2        32        0-7     8-18    19-27   28-32   A
DIFF-I    34        0-7     8-18    20-28   29-34   A
EC2A      30        0-7     8-19    19-24   25-30   B

(a) Ranges given in data point number, N. (b) Total number of data. See text.

Table IV. Characteristics for Analysis of Nonrandom Deviation Plots first peak found

characteristics

max in zone 1 min in zone 2 max in zone 3

~Ae 8 < JP < 10; 0 Q KN Q 5 O Q J P i 2;4