Computerized Learning Machine Applied to Qualitative Analysis of Mixtures by Stationary Electrode Polarography

L. B. Sybrandt and S. P. Perone
Department of Chemistry, Purdue University, Lafayette, Ind. 47907

A computerized learning machine approach has been evaluated for qualitative analysis of mixtures by stationary electrode polarography. The limiting effects of concentration ratios, degree of peak overlap, and peak potential variation have been investigated. The results were evaluated not only for overall statistical accuracy; the specific nature of the errors made was also investigated. This procedure allowed for the specification of optimum categorization criteria. Results indicate that resolution is limited by the precision of the experimental data and the concentration ratios. Ideally precise overlapping polarograms could be identified with 2-mV peak separation over a 20-fold range of concentration ratios. For data obtained under ordinary experimental conditions, resolution was limited to peak separations of 35-40 mV for about a 10-fold range of concentration ratios. Peak identification could be performed reliably under conditions where peak overlap precluded visual resolution.

CHEMICAL MEASUREMENTS have often been plagued with the problem of overlapping signals from two or more species in a mixture. This overlap sometimes results in difficult qualitative analysis of the sample. Mass spectral analysis, ultracentrifuge schlieren pattern analysis, and kinetic analysis are examples where this problem may exist when one must deal with mixtures. These authors are particularly concerned with this problem in electroanalytical methods. The work presented here demonstrates the applicability of binary pattern classifiers to the qualitative analysis of mixtures by stationary electrode polarography. Binary pattern classifiers have recently been used for molecular formula determinations from low resolution mass spectra (1, 2). They have also been applied to the classification of infrared spectrometry data according to chemical class (3), and to the interpretation of combined mass spectra, infrared spectra, and melting and boiling points for the determination of certain chemical characteristics, such as double bond presence (4). Multicategory pattern classification by least squares has been used for molecular formula and molecular weight determination from mass spectra (5). In each of the above applications, the data patterns resulted from measurements on a pure chemical species. Wangen and Isenhour (6) have recently used a learning machine approach for semiquantitative analysis of mixed gamma-ray spectra. The work to be described here demonstrates a different learning machine approach to the qualitative analysis of mixtures. This approach included consideration of experimental deviations in real analytical data. Qualitative results were not evaluated solely on a statistical basis, but the nature of specific errors made was considered. Finally, attempts were made to optimize decision criteria by methods not previously reported.

(1) P. C. Jurs, B. R. Kowalski, and T. L. Isenhour, ANAL. CHEM., 41, 21 (1969).
(2) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, ibid., p 690.
(3) B. R. Kowalski, P. C. Jurs, T. L. Isenhour, and C. N. Reilley, ibid., p 1945.
(4) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, ibid., p 1949.
(5) B. R. Kowalski, P. C. Jurs, T. L. Isenhour, and C. N. Reilley, ibid., p 695.
(6) L. E. Wangen and T. L. Isenhour, ibid., 42, 737 (1970).

382 • ANALYTICAL CHEMISTRY, VOL. 43, NO. 3, MARCH 1971

The theory of learning machines can be found in texts by Nilsson (7) and Fu (8). The mathematical approach discussed below is essentially that of Jurs, Kowalski, Isenhour, and Reilley (2). A two-category pattern classifier can be defined by a discriminant function which is a scalar, single-valued function of the pattern. If the patterns to be classified are assumed to be linearly separable, then the discriminant function takes the form

$$
s = \sum_{i=1}^{d+1} w_i x_i \tag{1}
$$

where $x_i$ is the $i$th component of a pattern having $d$ points, $x_{d+1}$ equals one, $w_i$ is the weight corresponding to the $i$th component, and $s$ is the scalar result. The category in which a given pattern is placed is determined by the sign of $s$. The convention used here is

$$
\begin{aligned}
s &> 0 \quad \text{species present} \\
s &< 0 \quad \text{species absent}
\end{aligned} \tag{2}
$$

The set of weights, $w_1, w_2, \ldots, w_{d+1}$, represents a $(d + 1)$-dimensional weight vector. This weight vector is determined by an error-correction procedure. The vector is initialized arbitrarily to all ones. A set of representative patterns, the training set, is presented sequentially to the classifier. When a pattern is incorrectly categorized according to Equations 2, the weights of Equation 1 are immediately adjusted in a manner to correct that error. The adjusted weight, $w_i'$, is defined as

$$
w_i' = w_i + c x_i \tag{3}
$$

for $i = 1, 2, \ldots, d + 1$, where $c$ is the correction increment. The correction increment is determined such that the new scalar will equal the original scalar in magnitude, but have the opposite sign. Thus, $c$, as derived by Jurs et al. (2), is simply

$$
c = \frac{-2s}{\sum_{i=1}^{d+1} x_i^2} \tag{4}
$$

If the training set is linearly separable, this procedure will converge to a single weight vector which can correctly classify all the patterns. Training is terminated when the vector is perfectly trained, or when a preset number of iterations through the training set is reached. The above training procedure is repeated for each desired component of the mixture. Thus, there is a weight vector corresponding to each species to be identified.

(7) N. J. Nilsson, “Learning Machines,” McGraw-Hill Book Co., New York, N. Y., 1965.
(8) K. S. Fu, “Sequential Methods in Pattern Recognition and Machine Learning,” Academic Press, New York, N. Y., 1968.
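The error-correction procedure of Equations 1-4 can be illustrated with a minimal Python sketch. This is not the authors' FORTRAN or BASIC program; the function names, the list-based data layout, the default of 25 iterations, and the handling of the case s = 0 (which the text does not discuss) are assumptions made for illustration.

```python
def discriminant(weights, pattern):
    """Equation 1: s is the sum of w_i * x_i over the augmented pattern."""
    return sum(w * x for w, x in zip(weights, pattern))


def train_weight_vector(patterns, present, max_iterations=25):
    """Error-correction training following Equations 1-4.

    patterns: list of equal-length lists of floats (digitized polarograms)
    present:  list of bools, True where the species is in the mixture
    Returns (weights, converged).
    """
    # Augment each pattern with the constant component x_{d+1} = 1.
    augmented = [list(p) + [1.0] for p in patterns]
    weights = [1.0] * len(augmented[0])  # initialized to all ones
    for _ in range(max_iterations):
        corrections = 0
        for x, is_present in zip(augmented, present):
            s = discriminant(weights, x)
            if (s > 0) == is_present:
                continue  # Equations 2 satisfied; weights unchanged
            if s == 0:
                # Assumption (not in the paper): the reflection is undefined
                # at s = 0, so step toward the "present" side instead.
                c = 1.0
            else:
                c = -2.0 * s / sum(xi * xi for xi in x)  # Equation 4
            weights = [w + c * xi for w, xi in zip(weights, x)]  # Equation 3
            corrections += 1
        if corrections == 0:
            return weights, True  # one full pass without any correction
    return weights, False
```

After a correction the same pattern gives a discriminant of equal magnitude and opposite sign, which is the property Equation 4 was derived to provide. One such weight vector would be trained per species.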

A large number of patterns are usually required to train the weight vectors. For many analytical techniques, it is impractical to make measurements on actual mixtures to obtain this large training set. It is therefore necessary to have a means by which synthetic mixtures can be generated. For stationary electrode polarography (SEP), the current contributed by each species in a mixture is usually additive over the whole curve. This relationship is valid for mixtures of Cd(II), In(III), and Sb(III) (9). Here, a set of standard curves for single-component solutions was obtained experimentally at one concentration. These standards were multiplied by random numbers to represent concentration variations and then summed to generate synthetic mixtures for training.
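Because the currents are additive, a synthetic mixture is simply a weighted sum of the standard curves. The sketch below shows that arithmetic in Python; the function name, the dictionary layout, and the three-point toy curves are hypothetical and stand in for the digitized standards.

```python
def synthetic_mixture(standards, concentrations):
    """Sum standard curves scaled by relative concentrations.

    standards:      dict of species name -> list of currents, all sampled
                    at the same potentials (one standard curve per species)
    concentrations: dict of species name -> multiplier, where 1.0 means the
                    concentration at which the standard was recorded
    """
    n_points = len(next(iter(standards.values())))
    mixture = [0.0] * n_points
    for species, curve in standards.items():
        factor = concentrations[species]
        for k in range(n_points):
            mixture[k] += factor * curve[k]
    return mixture


# Toy curves only; real standards would hold one current per 2-mV step.
toy_standards = {"Cd": [0.0, 1.0, 0.0], "In": [0.0, 2.0, 1.0]}
toy_mixture = synthetic_mixture(toy_standards, {"Cd": 2.0, "In": 0.5})
```

A multiplier of 0.0 removes a species from the mixture, which is how "absent" training examples arise in this scheme.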

Table I. Distribution of Random Numbers Representing Concentrations for the Generation of 200 SEP Patterns of Cd(II), In(III), and Sb(III) in the 0.1 to 2.0 Concentration Range

Training set Prediction set Concentration Cd In Sb Cd In Sb 27 32 31 36 33 31 0.000-0.000 10 11 9 8 8 10 0.100-0.199 8 5 15 13 8 0.200-0.299 10 8 3 5 11 5 0.300-0.399 10 11 11 11 12 19 18 0.400-0.499 110 95 97 98 102 80 0.500-1.599 12 9 4 11 8 2 1.600-1.699 6 8 4 4 12 9 1.700-1.799 9 12 9 13 10 1.800-1.899 10 6 8 8 11 6 8 1.900-2.000

EXPERIMENTAL

Digitized stationary electrode polarograms were obtained for single-component solutions of Cd(II), In(III), and Sb(III), each at 5 × 10⁻⁵ M in 1.0 M HCl. The computer-controlled dual-cell apparatus is described elsewhere (9). The potential sweep was from -0.100 V to -0.800 V vs. SCE. A sweep rate of 1.00 V/sec was used. Data were digitized at 2-mV intervals. Data were obtained under conditions approximating as closely as possible those under which routine electroanalytical data would be obtained. That is, procedures were careful, but not rigorous. Cell temperatures were ambient; instrument recalibration was not carried out before each run; solutions were not deaerated between each run; etc. All chemicals used were reagent grade. The distilled water was further purified by passage through a mixed bed cation-anion exchange resin. Three separate, presumably identical solutions were prepared for each species. Each sample solution was placed in the cell, deaerated with high purity nitrogen, and 4 runs were obtained, with an ensemble average of 8 voltage scans per run. The averaged data for each sample were punched on paper tape for later use. These 36 polarograms are termed the standard curves. All learning machine experiments to be discussed were performed on a Hewlett-Packard 2116A 16-bit digital computer with an 8K-word core memory. Peripheral devices included an ASR/35 teletype, high speed paper tape punch, high speed photoreader, Calcomp plotter, and a Hewlett-Packard Model 2020 magnetic tape unit. The external, rapid-access magnetic tape storage device was required because of the limited core memory size, since a large amount of data must be processed to train the weight vectors. All programs were written in Hewlett-Packard FORTRAN or BASIC. (These programs are available from the authors upon request.)

LEARNING MACHINE PROCEDURE

Three separate programs were used to implement the pattern classifier. The first program constructed synthetic mixtures from the standard SEP curves. The standard curves were loaded into core memory via the high speed photoreader. A set of random numbers, previously generated on paper tape, was then loaded via the reader in groups of three, each corresponding to a component in the synthetic mixture. The mixture was calculated and stored on magnetic tape, followed by the random numbers representing the concentrations in that mixture. Additional synthetic mixtures were generated with subsequent groups of random numbers until the desired number of mixtures was obtained for the training set, usually 200. A training program was then used to calculate weight vectors for qualitative analysis using the training set data stored on magnetic tape. These vectors were trained sequentially because of core limitations. Each trained weight vector was punched on paper tape before proceeding with the next vector. A third program utilized the trained weight vectors for recognition and prediction. For this work, recognition will designate the qualitative analysis of mixtures generated by the same standard curves as used in training, and prediction will refer to the qualitative analysis of synthetic mixtures which are generated from other standard curves.

(9) W. F. Gutknecht and S. P. Perone, ANAL. CHEM., 42, 906 (1970).

RESULTS AND DISCUSSION

The random numbers used to calculate concentrations were varied from 0 to 2.0, with all numbers less than 0.1 set to 0. Since the standard polarograms were obtained at 5 × 10⁻⁵ M, this represents a concentration range of 5 × 10⁻⁶ M to 1 × 10⁻⁴ M. A zero concentration is an important consideration for training, so the random number generator was further biased to provide an additional 10% probability of this occurrence. Thus, the total probability of a zero concentration for a given component is about 15%. The random number distributions for the training and prediction sets of 200 synthetic mixtures each are shown in Table I. Weight vectors trained on this concentration range are applicable to any other 20-fold range, provided a conversion factor is applied to the sufficiently precise data. For the sake of simplicity, concentrations will be referred to by the random numbers used to generate the synthetic mixtures. Peak locations for the 36 standard polarograms are shown in Table II. Because of the manner in which these curves were obtained, the peak locations varied considerably more between samples than between runs of a sample. Maximum peak height variations were about 3, 4, and 5%, respectively, for Cd(II), In(III), and Sb(III). A mixture of Cd(II), In(III), and Sb(III) results in a polarogram with a well-resolved Sb(III) peak, and considerable overlap of the Cd(II) and In(III) signals. A typical stationary electrode polarogram for an approximately equimolar mixture of Cd(II), In(III), and Sb(III) is shown in Figure 1. This figure demonstrates the reasons for selecting this particular system for study here. The well-separated Sb peak provides a relatively uncomplicated identification problem, whereas the In-Cd peaks overlap so strongly as to create a very challenging analysis problem. Weight vectors can be perfectly trained at the 0.1-2.0 concentration range if the data fluctuation is kept to a minimum.
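One way to reproduce the biased concentration distribution described above is sketched here in Python. How the original generator applied the additional 10% bias is not stated, so the two-stage draw below is an assumption, as are the function and parameter names. Under this reading the zero probability is 0.10 + 0.90 × 0.05 = 0.145, consistent with "about 15%".

```python
import random


def random_concentration(rng, extra_zero_probability=0.10):
    """Draw one relative concentration: either 0 or a value in 0.1 to 2.0."""
    if rng.random() < extra_zero_probability:
        return 0.0  # additional bias toward an absent species (assumed form)
    value = rng.uniform(0.0, 2.0)
    # Numbers below 0.1 are set to zero, as described in the text.
    return 0.0 if value < 0.1 else value


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
draws = [random_concentration(rng) for _ in range(20000)]
zero_fraction = sum(1 for d in draws if d == 0.0) / len(draws)
```

Multiplying a draw by 5 × 10⁻⁵ M converts it to a molar concentration for the standards used here.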
A set of 100 mixtures randomly generated from a single run for each component was correctly dichotomized

Table II. Peak Locations of Standard SEP Curves of Cd(II), In(III), and Sb(III) vs. SCE (mV)

Species  Sample No.  Run No. 1  Run No. 2  Run No. 3  Run No. 4  Relative location (a)
Cd       1           -628       -628       -628       -626       H
Cd       2           -622       -620       -620       -620       L
Cd       3           -624       -624       -624       -624       M
In       1           -588       -588       -588       -588       H
In       2           -582       -582       -580       -580       M
In       3           -580       -580       -580       -580       L
Sb       1           -154       -154       -154       -154       L
Sb       2           -156       -160       -160       -160       H
Sb       3           -156       -156       -156       -156       M

(a) H = High, M = Middle, L = Low.

Table III. Results of Training Weight Vectors to Qualitatively Determine Cd(II), In(III), and Sb(III) in the 0.1-2.0 Concentration Range. Training Set = 200 Patterns

Species  Iteration  Error corrections
Cd       25         4
In       25         5
Sb       9          0

Training set: Cd Samples No. 1, 2, Runs 1-4. In Samples No. 1, 3, Runs 1-4. Sb Samples No. 1, 2, Runs 1-4.

Figure 1. Stationary electrode polarogram of Sb(III), In(III), and Cd(II) in 1.0 M HCl

after 12 iterations for Sb(III), and 5 iterations each for Cd(II) and In(III). Perfect training also resulted with training sets comprised of all 4 runs of a given sample for each component. In these cases, the training sets consisted of 200 mixtures, 50 mixtures each of runs No. 1-4. Training was no more difficult than when a single run of each species was used, even though these sets incorporated the data fluctuation between runs. In all training situations above, recognition was perfect when the same data and different random numbers were used to generate the synthetic mixtures. However, prediction was often very poor using data other than those used in training. For example, percentage error in prediction was 15.0% for Cd, 5.0% for In, and 3.5% for Sb when trained on all 4 runs of a given sample. More critically, however, Cd(II), In(III), and Sb(III) were incorrectly classified in 100, 30, and 22%, respectively, of the mixtures where these species were absent (C = 0).

Independent Effects of Peak Overlap and Uncertainty in Peak Location. A combination of peak overlap and data fluctuation adversely affects training, but either condition alone does not. A single Sb(III) run was artificially shifted 1 data point cathodic and called a new species, X. As a mixture, this represents two identically shaped peaks separated by only 2 mV. The concentrations of Sb(III) and X were randomly varied as before. It was possible to qualitatively identify either component in 100 binary mixtures after 4 iterations per weight vector. Conversely, the peak location of Sb in an Sb, In, Cd mixture was allowed to vary as much as 8 mV, using 8 different runs in the training set. The Sb weight vector trained to perfection for mixtures in the 0.1-2.0 concentration range, regardless of the training difficulty observed for Cd(II) and In(III).

Generation of a Representative Training Set. The training set should contain patterns which are representative of the patterns which might occur in a prediction set. For the electrochemical data considered here, variations in peak locations must be represented in the training set. If signal variations are primarily instrumental in nature, a peak shift of a given magnitude and direction for one species will correspond to an equivalent shift for all other species in that mixture. By generating synthetic mixtures with constant relative peak locations, training is simplified since overlapping peaks will be no closer for a given mixture than would be observed experimentally. To achieve a more representative training set from the polarograms shown in Table II, the samples representing maximum (High) and minimum (Low) peak locations were chosen. However, consistent with the preceding discussion, each mixture was generated with either all high or all low curves for each component, using the random numbers of Table I. All runs of these samples were included in the training set. The results of training three weight vectors to qualitatively determine Cd(II), In(III), and Sb(III) in ternary mixtures are shown in Table III. The right-most column represents the number of weight vector corrections necessary during the iteration designated. Only the Sb weight vector trained perfectly. A different weight vector was obtained after each error correction. These vectors differed in their ability to perform qualitative analysis. The four Cd and five In weight vectors obtained in the training process summarized by Table III are identified by the letters A through E. Table IV depicts the different recognition abilities of these vectors. Analysis is based on the sign of s, as expressed in Equations 2. The Cd weight vectors, C and D, and the In weight vectors, A, B, and C, incorrectly classified a large percentage of mixtures where these species were absent (C = 0), and are not very useful for qualitative analysis.
However, the Cd weight vectors, A and B, and In weight vectors, D and E, usually determine the absence of a component, and have a high probability of identifying the component when it is present. It is significant that the errors occur only at the lower limits of concentration, and only when the interfering peak is present at a 5:1 or larger concentration ratio. The Cd weight vector B correctly classified 4 mixtures which had Cd:In ratios between 0.10 and 0.16. In was correctly identified by In weight vector D in 5 mixtures with Cd:In ratios between 6.6 and 13.6. The Sb weight vector correctly classified all mixtures in the training set.

Table IV. Recognition Ability of Weight Vectors Obtained after Each Error Correction of Table III for Cd, In, and Sb

                                                       C > 0 errors
Weight vector  Total   Error,  C = 0 errors/  Concn       Min/max
symbol         errors  %       total C = 0    range       Cd:In ratio
a. Cd
A              7       3.5     0/31           0.12-0.44   0.10/0.22
B              5       2.5     0/31           0.12-0.27   0.10/0.17
C              19      9.5     19/31          ...         ...
D              12      6.0     12/31          ...         ...
b. In
A              24      12.0    24/33          ...         ...
B              10      5.0     10/33          ...         ...
C              7       3.5     7/33           ...         ...
D              5       2.5     2/33           0.15-0.26   6.18/9.22
E              5       2.5     0/33           0.15-0.26   4.66/9.24
c. Sb
               0       0       0/31           ...         ...

The prediction ability of these weight vectors is shown in Table V. This prediction set was formed from Cd sample No. 3, In sample No. 2, and Sb sample No. 3, runs No. 1-4, using the concentration distribution shown in Table I. These samples have peaks at different locations than those used in training. The Cd weight vector B could not identify Cd in 8 of the 14 patterns which had Cd:In ratios of 0.25 or smaller. The In weight vector D erred once, on a mixture with a 9.85 Cd:In ratio. The Sb weight vector correctly predicted all 200 patterns.

Table V. Prediction Ability of Best Weight Vectors of Table IV

                                        C > 0 errors
Total   Error,  C = 0 errors/  Concn       Min/max
errors  %       total C = 0    range       Cd:In ratio
a. Cd-weight vector B
8       4.0     0/36           0.10-0.31   0.06/0.25
b. In-weight vector D
1       0.5     0/27           0.13-0.13   9.85/9.85
   In-weight vector E
4       2.0     0/27           0.11-0.18   2.99/9.85
c. Sb-Perfectly trained weight vector
0       0       0/32           ...         ...

Comparison of Recognition and Prediction Capability Using Fixed Set of Concentration Ratios. The results presented in Tables IV and V depend significantly on the number of difficult mixtures which must be classified, and this will vary for different random number sets. A valid comparison of the recognition and prediction ability for various weight vectors can be made if the same set of concentrations, all known, is used for both cases. Here, training sets and prediction sets would be generated from different groups of standard curves, but identical sets of concentration ratios would be used. To achieve this end, a set of known mixtures was generated which have concentration ratios of varying degrees of difficulty. The complete set of mixtures is shown in Table VI for the 0.1-2.0 concentration range. This set was used for further work reported here.
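The bookkeeping behind Tables IV and V, which separates false positives on absent species (C = 0 errors) from misses on present species (C > 0 errors), might be tallied as in the following Python sketch. The function name, the dictionary keys, and the six-pattern example are hypothetical.

```python
def tally_errors(decisions, concentrations):
    """Split misclassifications into C = 0 and C > 0 cases.

    decisions:      list of bools, True if the classifier reported "present"
    concentrations: list of floats, true relative concentration of the species
    """
    zero_total = sum(1 for c in concentrations if c == 0)
    zero_errors = sum(
        1 for d, c in zip(decisions, concentrations) if c == 0 and d
    )
    # Concentrations of the species in mixtures where it was missed.
    missed = [c for d, c in zip(decisions, concentrations) if c > 0 and not d]
    total = zero_errors + len(missed)
    return {
        "total_errors": total,
        "percent_error": 100.0 * total / len(concentrations),
        "zero_errors": zero_errors,
        "zero_total": zero_total,
        "missed_range": (min(missed), max(missed)) if missed else None,
    }


example = tally_errors(
    [True, False, False, True, False, True],
    [0.0, 0.0, 0.12, 0.5, 0.3, 1.0],
)
```

Keeping the two error types apart matters because a vector with a low overall error rate can still report an absent species in most C = 0 mixtures, as several vectors in Table IV do.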

A recognition set of curves was generated from the compositions of Table VI and the same standard curves used in training the weight vectors evaluated in Tables III and IV. The best weight vectors in Table IV were then used to classify this recognition set. The results showed that Cd can be correctly identified at a 1:10 or 1:4 Cd:In ratio, depending on the sample tested. Likewise, In presence was recognized at a 4:1 or 10:1 Cd:In ratio. All mixtures in which Cd or In were absent were correctly categorized. Sb classification errors involved only zero or 0.1 concentration levels and the per cent error was 1.1 averaged over all samples. The prediction ability of the weight vectors was evaluated using the set of compositions given in Table VI and the standard curves used for Table V. The results were comparable to those for the recognition set. Cd could be identified correctly at a 1:4 Cd:In ratio, and In was identified correctly at a Cd:In ratio of 7:1 or 5:1, depending on the particular standard curve used to synthesize the mixture. No errors in classifying Sb were observed.

Table VI. Concentrations of Cd, In, and Sb for Mixtures (a) Used to Test Weight Vectors

Mix No. Species Concn Mix No. Concn Mix No. Concn Mix No. Concn Mix No. Concn Mix No. Concn 1 Cd 0.0 18 0.4 35 1.0 52 1.0 69 2.0 86 1.0 In 1.0 0.0 0.5 2.0 0.5 1.0 1.0 Sb 0.0 0.5 1.0 0.2 0.8 2 Cd 2.0 19 0.5 36 2.0 53 1.0 70 2.0 87 1.0 2.0 In 1.0 0.0 2.0 0.6 1.0 2.0 Sb 1.0 2.0 1.0 0.3 0.9 Cd 0.6 2.0 54 3 0.0 20 37 1.0 71 2.0 88 1.0 In 2.0 2.0 0.1 0.7 1.0 1.0 0.4 2.0 1.0 Sb 1.0 2.0 1.0 Cd 0.1 21 0.7 38 2.0 55 1.0 72 89 4 2.0 0.5 In 2.0 2.0 0.2 0.8 0.5 1.0 2.0 1.0 Sb 1.0 2.0 0.5 0.0 39 2.0 56 5 Cd 0.2 22 0.8 1.0 73 2.0 90 0.4 In 2.0 0.9 2.0 0.4 1.0 0.3 2.0 1.0 2.0 1.0 0.6 Sb 0.0 57 91 Cd 0.3 23 0.9 40 2.0 1.0 74 2.0 0.3 6 In 2.0 1.0 0.4 2.0 0.3 1.0 2.0 0.7 1.0 0.0 2.0 Sb 1.0 58 92 Cd 0.4 24 1.0 41 2.0 0.5 75 2.0 0.2 7 2.0 2.0 0.2 In 1.0 0.5 0.0 0.5 2.0 0.8 Sb 1.0 2.0 0.0 59 0.5 25 0.0 42 2.0 0.4 76 2.0 93 0.1 8 Cd 0.6 2.0 0.1 0.5 0.0 In 2.0 2.0 0.9 0.5 0.4 0.0 2.0 Sb 60 26 0.0 43 2.0 0.3 77 2.0 94 0.0 9 Cd 0.6 0.7 2.0 0.4 0.0 2.0 In 2.0 0.3 1.0 0.0 2.0 2.0 Sb 0.4 95 2.0 61 0.2 78 1.0 0.0 Cd 0.7 27 0.0 44 10 0.0 1.0 1.0 In 2.0 0.3 0.8 0.0 2.0 0.3 2.0 0.2 0.0 Sb 2.0 62 0.1 79 1.0 96 1.0 11 Cd 0.8 28 0.0 45 0.0 1.0 1.0 In 2.0 0.9 0.2 0.1 0.1 2.0 2.0 0.2 Sb 2.0 63 2.0 80 1.0 97 0.5 29 0.0 46 2.0 12 Cd 0.9 0.0 1.0 0.5 1.0 0.1 In 2.0 0.2 1.0 0.0 0.1 2.0 Sb 2.0 81 1.0 0.0 47 1.0 64 1.0 13 Cd 1.0 30 0.0 0.0 0.0 1.0 In 2.0 0.3 1.0 0.0 2.0 Sb 2.0 82 1.0 1.0 65 1.0 14 Cd 0.0 31 0.0 48 0.1 2.0 1.0 In 1.0 2.0 1.0 1.0 0.4 1.0 0.0 Sb 83 1.0 49 1.0 66 0.5 15 Cd 0.1 32 0.0 1.0 0.2 1.0 In 1.0 1.0 0.5 1.0 0.5 1.0 0.0 Sb 2.0 84 1.0 0.0 50 1.0 67 16 Cd 0.2 33 1.0 0.3 2.0 0.0 In 1.0 0.6 0.0 1.0 1.0 1.0 Sb 85 1.0 68 2.0 34 2.0 51 1.0 17 Cd 0.3 1.0 2.0 1.0 1.0 0.4 In 0.7 0.1 1.0 1.0 1.0 Sb

(a) Mix 24 = Mix 57 = Mix 88; Mix 31 = Mix 94; Mix 32 = Mix 95.

Effects of Modifying Criterion for Classification. The weight vectors of Table IV which were considered acceptable classifiers were those giving the fewest C = 0 classification errors, based on the sign of s as shown in Equations 2. Only two of the four possible Cd weight vectors were acceptable on this basis. However, the remaining weight vectors may become better classifiers if the decision is based on an s value greater or less than some positive number, s’, rather than the sign of s alone. The number s’ is set slightly larger than the largest value of s which occurs when a weight vector makes C = 0 classification errors in the training set. The two-category classification is then

$$
\begin{aligned}
s - s' &> 0 \quad \text{species present} \\
s - s' &< 0 \quad \text{species absent}
\end{aligned} \tag{5}
$$
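The choice of s’ for Equations 5 can be sketched in Python as follows. The `margin` parameter is a hypothetical stand-in for "slightly larger", and the `allowed_zero_errors` parameter is an assumed generalization that covers a less strict threshold tolerating a few C = 0 training errors; neither name comes from the paper.

```python
def choose_threshold(scores, concentrations, allowed_zero_errors=0, margin=1e-6):
    """Pick s' from the training-set scores of mixtures with C = 0.

    scores:         discriminant values s for the training patterns
    concentrations: true relative concentration of the species per pattern
    """
    # Scores of the C = 0 patterns that Equations 2 would misclassify.
    zero_scores = sorted(
        (s for s, c in zip(scores, concentrations) if c == 0 and s > 0),
        reverse=True,
    )
    if len(zero_scores) <= allowed_zero_errors:
        return 0.0  # few enough C = 0 errors already; keep the sign rule
    # Sit just above the largest score that must still be rejected.
    return zero_scores[allowed_zero_errors] + margin


def classify(s, threshold=0.0):
    """Equations 5: species present when s - s' > 0."""
    return s - threshold > 0


toy_scores = [5.0, 3.0, -1.0, 10.0, 0.5]
toy_concentrations = [0.0, 0.0, 0.0, 1.0, 0.2]
strict = choose_threshold(toy_scores, toy_concentrations, margin=0.5)
```

In the toy data the strict threshold rejects every C = 0 pattern but also rejects the weak genuine signal scoring 0.5, which is the trade-off that raising s’ introduces.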

Tables VII and VIII describe the ability of the weight vectors to classify the mixtures of Table VI using data of the training set and prediction set, respectively. Classification is based on the scheme shown in Equations 5. Qualitative analysis of the training set data is not so good as that based on Equations 2 and the weight vectors of Table V. The Cd weight vectors incorrectly classify C = 0 conditions where In is present at a 2.0 concentration. They also erred on mixtures 62 and 93, where In is not present or present in a minimal amount. Two of the In weight vectors gave similar results. Classification of the prediction set, Table VIII, gave better results than in Table VII. Moreover, the results were better than when using Equations 2 and the weight vectors of Table V. The Cd weight vectors again missed mixtures 62 and 93, but could classify correctly for Cd:In ratios of about 1:7 or larger. The In weight vectors B and C incorrectly categorized only 2 or 3 mixtures in the 10:1 to 20:1 Cd:In ratio range. No incorrect C = 0 decisions were made. By classifying patterns according to Equations 5, one can


Table VII. Ability of Weight Vectors to Qualitatively Classify the Mixtures of Table VI Using the Data of the Training Set. Decisions Based on Equations 5

Mixtures Incorrectly Classified
Set 1 (a)  Set 2  Set 3  Set 4
a. Cd-weight vector C, s’ = 4200: 3 3 3 4-5 15 31 62 15 93 62 62 62 93 93 93-94
   Cd-weight vector D, s’ = 3680: 3 3 4-5 4 31 15 31 15 62 62 62 62 94 93 94 93
b. In-weight vector A, s’ = 5476: 29 29 29 29 37-38 36 37-38 93 48-49 93 48-49 93 93
   In-weight vector B, s’ = 3910: 29 29 None 36 37-39 37-39 63 48-49 48-49 93 93
   In-weight vector C, s’ = 3622: 37-39 37-39 None 63 48-49 48-49 93 93

(a) Sets 1 and 2 correspond to standard curves from Runs 1 and 2 of Cd, In, Sb Samples 1 (Table II). Sets 3 and 4 correspond to standard curves from Runs 1 and 2 from Cd and Sb Samples 2 and In Sample 3 (Table II).

choose the type of results obtained by an appropriate choice of s’. A large s’ will classify C = 0 conditions correctly, while incorrectly classifying smaller peak ratios. A smaller s’ will classify larger peak ratios correctly, but miss more C = 0 conditions. Thus, a compromise might be considered. Table IX shows the results of lowering s’ from 4200 to 1724 for the Cd weight vector C, and from 3680 to 1594 for the Cd weight vector D. Both of these lower s’ values were selected to allow only one C = 0 error on the randomly generated training set. Table IX refers to mixtures generated with the concentrations listed in Table VI. Comparing Table IX, part a, with Table VII, part a, it is observed that mixtures 62 and 93 are now correctly classified at the 0.1 concentration level for sets No. 1 and No. 2, whereas mixtures 14 and 32 are now incorrect. Overall, there are a considerable number of C = 0 errors for sets No. 1 and No. 2. However, sets No. 3 and No. 4 are classified very well by using the compromise s’, as Cd weight vector C only missed the 1:20 Cd:In ratio. The prediction data, Table IX, part b, show a marked improvement in qualitative analysis by using the compromise s’ as opposed to the maximum s’ (Table VIII) or minimum (s’ = 0). Cd can be identified at about 1:10 Cd:In ratios in the prediction set. Cd weight vector C correctly classified all mixtures of Table VI for sets No. 3 and No. 4 of the prediction set. In summary, inspection of specific errors made shows that two samples resulted in good qualitative analysis. These were one of the two used in training and the sample used for prediction. Cd identification was successful at about 1:10 Cd:In ratios. No C = 0 errors were made on these

Table VIII. Ability of Weight Vectors to Qualitatively Classify the Mixtures of Table VI Using Data of the Prediction Set. Decisions Based on Equations 5

Mixtures Incorrectly Classified
Set 1 (a)  Set 2  Set 3  Set 4
a. Cd-weight vector C, s’ = 4200: 4 4 4 4-5 15 15 15 15 62 62 62 62 93 93 93 93
   Cd-weight vector D, s’ = 3680: 4 4 4 4-5 15 15 15 15 62 62 62 62 93 93 93 93
b. In-weight vector A, s’ = 5476: 29 29 29 29 37 37-38 37 37 48 48 48 48 93 93 93 93
   In-weight vector B, s’ = 3910: 37 37 37 37-38 48 48 48 48
   In-weight vector C, s’ = 3622: 37 37 37 37-38 48 48 48 48

(a) Sets 1 to 4 correspond to standard curves from Runs 1 to 4 of Cd and Sb Samples 3 and In Sample 2 (Table II).

Table IX. Qualitative Classification Ability of Cd Weight Vectors Based on Equations 5. s’ Chosen to Give Only One C = 0 Error in Training Set

Mixtures Incorrectly Classified
a. Training set data (a)
Set 1  Set 2  Set 3  Set 4
   Cd-weight vector C, s’ = 1724: 3 3 4 4 14 14 31-32 31-32 94-95 94-95
   Cd-weight vector D, s’ = 1594: 3 3 4 4 14 15 14 31-32 31-32 94-95 94-95
b. Prediction set data (b)
Set 1  Set 2  Set 3  Set 4
   Cd-weight vector C, s’ = 1724: 4 4 None None
   Cd-weight vector D, s’ = 1594: 15 4-5 4 None 4 15

(a) Sets 1 through 4 correspond to those for Table VII.
(b) Sets 1 through 4 correspond to those for Table VIII.

samples. In classification was correct at 10:1 Cd:In ratios, again with no C = 0 errors. Sb can be classified perfectly for these two samples. Sample No. 1 (sets No. 1 and No. 2, part a, Table IX) was more difficult to classify for all three species. Cd was considered present in mixtures where In was present at 1.0 or 2.0 concentration levels. In could only be detected at 5:1 Cd:In ratios or smaller. Sb was classified

Table X. Recognition Ability of Weight Vectors Obtained after Training Errors for Cd, X3, and Sb Mixtures. Decision Based on Equations 2. Not all Weight Vectors Listed

                                                       C > 0 errors
Weight vector  Total   Error,  C = 0 errors/  Concn       Min/max
symbol         errors  %       total C = 0    range       Cd:X3 ratio
a. Cd
A              9       4.5     1/31           0.10-0.45   0.10/0.22 (a)
B              9       4.5     0/31           0.10-0.45   0.10/0.22
C              5       2.5     1/31           0.12-0.25   0.10/0.14
D              7       3.5     0/31           0.12-0.45   0.10/0.22
b. X3
A              32      16.0    0/33           0.11-1.30   1.10/13.6
B              14      7.0     3/33 (b)       0.11-0.50   3.14/12.2
C              11      5.5     0/33           0.11-0.50   3.50/12.2
D              16      8.0     2/33           0.11-0.50   1.10/12.2
E              16      8.0     0/33           0.11-0.50   1.10/12.2
c. Sb
               0       0       0/31           ...         ...

(a) One X3 concentration was 0.
(b) All missed when [Cd] = 0 and [Sb] > 1.2.

as absent in a 20:20:1 Cd:In:Sb mixture, and incorrectly identified when Cd was present at a 1.0 or 2.0 concentration and In was also absent.

Effect of Increased Peak Overlap. The effect of increased peak overlap on training was investigated. Weight vectors were trained to qualitatively determine Cd(II), X3, and Sb(III), where the species X3 represents In(III) data moved 10 mV (5 data points) cathodic toward the Cd peak. The training set was otherwise identical to that of Table III. After 25 iterations, there remained 12 errors for the Cd weight vector and 13 errors for the X3 weight vector. The Sb weight vector was perfectly trained in 7 iterations. Table X describes the recognition ability of certain Cd and X3 weight vectors obtained after error corrections during the training. Comparing those results with those of Table IX, it is observed that Cd can be identified nearly as well when the peaks are separated by 30 mV as it can when the peaks are at their normal separation of 40 mV. However, when the Cd:X3 ratio is greater than about 3.5:1, X3 is incorrectly classified in the best case. This does not compare favorably to the minimum 4.7:1 Cd:In ratio at normal peak separation.

CONCLUSIONS

The work described here obviously represents only the first steps toward providing computerized interpretation of electroanalytical data by pattern recognition. The unique problems afforded by normal fluctuation of electroanalytical data and severe peak overlap have been considered. Consideration of the specific nature of classification errors made has allowed the optimization of categorization criteria. Qualitative analysis of severely overlapped peaks by the learning machine approach is about as good as second-derivative measurements in stationary electrode polarography (9). Both methods are successful to about 10:1 peak ratios for a separation of around 40 mV. However, it is encouraging to note that as one allows no experimental deviation in reduction peaks, the qualitative resolution afforded by the pattern recognition approach here was 2 mV. Further investigation of this fact seems worthwhile, as does the extension to other electroanalytical techniques, such as second-derivative SEP. In addition, the quantitative evaluation of electroanalytical data by learning machine methods should be investigated.

ACKNOWLEDGMENT

The authors thank W. F. Gutknecht for the electrochemical data used in this study.

RECEIVED for review August 24, 1970. Accepted November 13, 1970. L. B. Sybrandt received Fellowships granted by Hercules, Inc., and the Analytical Division of the American Chemical Society. This work was also supported by the National Science Foundation, Grant No. GP-21111.