
Semiquantitative Analysis of Mixed Gamma-Ray Spectra by Computerized Learning Machines

L. E. Wangen and T. L. Isenhour¹
Department of Chemistry, University of Washington, Seattle, Wash. 98105

Learning machine methods have been applied to the semiquantitative determination of seventeen light elements by resolution of 14-MeV neutron induced gamma-ray spectra. Individual spectra are combined to produce complex patterns with which weight vectors are trained by an error forcing iterative process designed to minimize computational time. Weight vectors are trained at order of magnitude intervals for each of the 17 elements and each is tested on 1000 randomly generated mixture spectra which include radioactive counting statistical variations.

Binary pattern classifiers have been used successfully to relate certain types of chemical data (e.g., infrared and mass spectra) to the compounds which produced them (1-4). In these cases the patterns were derived from a single compound, that is, the spectra were pure. A logical extension of the learning machine method is the development of decision surfaces which are able to classify patterns which are not pure but are derived from mixtures. This work describes a learning machine process whereby it is possible to determine in a semiquantitative fashion the presence or absence of an element in a mixture of up to seventeen light elements by classification of the 14-MeV neutron generated gamma-ray spectrum.

The past few years have witnessed widespread use of the relatively inexpensive 14-MeV neutron generators in activation analysis. Several studies have involved determination of the sensitivities and possible applications of 14-MeV neutron generators for elemental analysis (5-9). In particular, oxygen has been quantitatively determined by several workers quite successfully (10-13).

¹ Present address: Department of Chemistry, University of North Carolina, Chapel Hill, N. C. 27514.

(1) P. C. Jurs, B. R. Kowalski, and T. L. Isenhour, Anal. Chem., 41, 21 (1969).
(2) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, ibid., p 690.
(3) Ibid., p 1949.
(4) B. R. Kowalski, P. C. Jurs, T. L. Isenhour, and C. N. Reilley, ibid., p 1945.
(5) J. H. Hislop and R. E. Wainerdi, ibid., 39 (2), 29A (1967).
(6) J. E. Strain and N. J. Ross, ORNL-3672, 1965.
(7) J. Perdijon, Anal. Chem., 39, 448 (1967).
(8) M. Cuypers and J. Cuypers, J. Radioanal. Chem., 1, 243 (1968).
(9) I. Fujii, T. Inouye, H. Muto, K. Onodera, and A. Tani, Analyst, 94, 189 (1969).

A major problem in the application of small neutron generators is the resolution of the complex gamma-ray spectra produced by mixed samples. Ge(Li) detectors may be used, but these are expensive to operate and have greatly decreased efficiencies in the higher energy range. Computer methods have been used to resolve complex gamma-ray spectra; however, these require large computational facilities and extensive libraries of standard spectra. The learning machine method, on the other hand, produces a series of weight vectors which may be used at a later date without recourse to a computer. Furthermore, this approach has been shown (1) to contain considerable redundancy, so that a single widely deviating value in the input data does not notably decrease the probability of a correct result. In most library search routines, however, slight gain shifts or other minor errors may have major effects on the end result. Finally, the learning machine method is completely empirical, thereby having the added advantages of requiring neither such information as half lives and accurate peak energies nor such mathematical operations as corrections for decay, and detection and integration of peaks.

This work describes the application of learning machine methods to the classification of complex patterns generated from the 14-MeV neutron activated gamma-ray spectra of the components. The technique is limited by the uncertainty inherent in all radioactive decay processes, such that some elements are determined with much better confidence than others because of higher cross sections, etc. Table I shows the seventeen elements used in this study, together with the pertinent nuclear reactions and corresponding capture cross sections for 14-MeV neutrons, half lives, and prominent gamma rays, primarily from work done by Mathur and Oldham (14).

(10) O. U. Anders and D. W. Briden, Anal. Chem., 36, 287 (1964).
(11) I. Fujii, H. Muto, and K. Miyoshi, Jap. Anal., 13, 249 (1964).
(12) E. L. Steel and W. W. Meinke, Anal. Chem., 34, 185 (1962).
(13) R. W. Benjamin, K. R. Blake, and I. L. Morgan, ibid., 38, 947 (1966).
(14) S. C. Mathur and G. Oldham, Nucl. Energy, September/October, 136 (1967).


Table I. Elements and Relevant Nuclear Reactions

Element   Reaction            Cross section, millibarns   Half life   Gamma-ray energies, MeV
B         B¹¹(n,p)Be¹¹        ...                         13.7 s      2.12
N         N¹⁴(n,2n)N¹³        8                           10 m        0.51
Na        Na²³(n,p)Ne²³       31                          38 s        0.44
          Na²³(n,α)F²⁰        170                         11 s        1.63
Mg        Mg²⁵(n,p)Na²⁵       60                          60 s        0.40, others
          Mg²⁶(n,α)Ne²³       3                           38 s        0.44
Al        Al²⁷(n,p)Mg²⁷       10                          9.5 m       0.83, 1.01
          Al²⁷(n,γ)Al²⁸       ...                         2.3 m       1.78
P         P³¹(n,2n)P³⁰        10                          2.6 m       0.51
          P³¹(n,α)Al²⁸        140                         2.3 m       1.78
Cl        Cl³⁵(n,2n)Cl³⁴m     8                           32.4 m      0.51, 2.10
          Cl³⁷(n,p)S³⁷        25                          5.0 m       3.13
K         K³⁹(n,2n)K³⁸m       3                           7.7 m       0.51, 2.04
Sc        Sc⁴⁵(n,2n)Sc⁴⁴      130                         3.9 h       0.51, 1.16
Ti        Ti⁵⁰(n,p)Sc⁵⁰       30                          1.8 m       0.51, 1.11, 1.56
V         V⁵¹(n,p)Ti⁵¹        50                          5.8 m       0.32, 0.93
          V⁵¹(n,γ)V⁵²         ...                         3.8 m       1.43
Cr        Cr⁵²(n,p)V⁵²        100                         3.8 m       1.43
          Cr⁵³(n,p)V⁵³        37                          2.0 m       1.00
Mn        Mn⁵⁵(n,α)V⁵²        30                          3.8 m       1.43
Co        Co⁵⁹(n,α)Mn⁵⁶       30                          2.6 h       0.84
          Co⁵⁹(n,γ)Co⁶⁰m      ...                         10.5 m      1.33
Ni        Ni⁶¹(n,p)Co⁶¹       22                          1.6 h       0.07
          Ni⁶²(n,p)Co⁶²       22                          13.9 m      1.17
Cu        Cu⁶³(n,2n)Cu⁶²      500                         9.7 m       0.51
          Cu⁶⁵(n,2n)Cu⁶⁴      1000                        12.8 h      0.51
          Cu⁶⁵(n,p)Ni⁶⁵       25                          2.6 h       1.48
Zn        Zn⁶⁴(n,2n)Zn⁶³      100                         38.1 m      0.51
          Zn⁶⁶(n,p)Cu⁶⁶       80                          5.1 m       1.04
          Zn⁶⁸(n,p)Cu⁶⁸       25                          30 s        1.08

LEARNING MACHINE METHOD APPLIED TO MIXTURES

Previous work has demonstrated the advantages of a linear pattern classifier which attempts to develop decision methods independent of the theoretical relations between the data and the desired results. The success of such a classifier is based upon its ability to classify correctly patterns which were not members of the training set. A linear classifier operates on the assumption that the data may be separated into the desired categories by a linear operation. Each pattern of d components may be represented as a point or vector (Y) in d + 1 dimensional space. (The d + 1st component, which has the same value for each pattern, is added to establish a common coordinate origin for the patterns and the decision surface.) The linear pattern classifier is simply a set of d + 1 weights (W), which geometrically constitute a vector normal to the decision surface.

Hence

W · Y > 0 for patterns in category 2 (positive category)
W · Y < 0 for patterns in category 1 (negative category)

The weight vector is developed from a training set of patterns selected to contain a representative sampling of each category. As various members of the training set are incorrectly classified, W is adjusted by a feedback process which is guaranteed to converge completely if the training set is linearly separable. A feedback method, found to converge rapidly in many cases, amounts to generating a new weight vector (W') such that its dot product with the pattern has the same magnitude but the opposite, hence correct, sign. To accomplish this a fraction (C) of each pattern component is added to the corresponding weight component,

W' = W + CY

where C may be calculated by

C = -2(W · Y)/(Y · Y)

so that

W' · Y = W · Y + C(Y · Y) = -W · Y
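The authors' Fortran IV routine (SUBROUTINE LEARN, discussed under Experimental) is not reproduced in this article; the following is only a minimal sketch of the error-correction feedback just described, assuming patterns are stored as rows of a NumPy array, categories are coded +1 (category 2) and -1 (category 1), and each pattern is augmented with a constant d + 1 component. All function and variable names are illustrative.

```python
import numpy as np

def train_weight_vector(patterns, categories, d_plus_1=0.1, max_feedbacks=100000):
    """Error-correction training sketch for a binary linear classifier.

    patterns:   (n, d) array of normalized spectra (components summing to 1.0)
    categories: length-n array of +1 (category 2) or -1 (category 1)
    d_plus_1:   value of the constant (d + 1)st component added to every pattern
    """
    # Augment every pattern with the constant d + 1 component.
    Y = np.hstack([patterns, np.full((len(patterns), 1), float(d_plus_1))])
    W = np.zeros(Y.shape[1])              # initial weight vector (assumed start)
    feedbacks = 0
    while feedbacks < max_feedbacks:
        corrected = False
        for y, s in zip(Y, categories):
            if s * (W @ y) <= 0.0:        # wrong side of, or on, the decision surface
                denom = y @ y
                # C chosen so that W'.y = -W.y, i.e. same magnitude, corrected sign.
                C = -2.0 * (W @ y) / denom
                if C == 0.0:              # degenerate start (W.y exactly zero): nudge toward the correct side
                    C = s / denom
                W = W + C * y
                feedbacks += 1
                corrected = True
        if not corrected:                 # a full pass with no feedback: complete convergence
            break
    return W, feedbacks
```

The default d_plus_1 = 0.1 is taken from the discussion of Figure 4 later in the paper (a value of the same order as the average pattern component); it is a choice for this sketch, not a prescription from the original code.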

Linear separability implies that the two categories of patterns may be separated by a linear surface. In two-dimensional space, this decision surface is a straight line, in three-dimensional space it is a two-dimensional plane, and in (d + 1)-dimensional space it is a d-dimensional plane. While individual patterns, such as the normalized gamma-ray spectra produced from irradiation of pure elements, may be separable, complex patterns formed from mixtures are not necessarily linearly separable. In the practical problem of analyzing mixtures for a given component, the positive category includes patterns having more than a specified percentage (A+) of that component and the negative category less than another specified percentage (A-) of that component. Whether such a set is linearly separable depends on the nature of the pure patterns and the values of A+ and A-. As A+ approaches A- for any situation other than one in which there are no statistical variations and all values of each pattern component are known to an unlimited accuracy, the set will become inseparable. Hence, there can be no general solution for complete resolution of complex mixtures in many practical cases. However, the development of decision surfaces for specified sets of A+ and A- may produce useful results for those patterns outside the range of A+ and A-. Furthermore, the success of such decision makers may be directly evaluated. The approach used in this work is to create mixture patterns with various values of A+ and A- and to attempt to develop decision surfaces to properly classify them. A priori calculation of whether such a set is separable would require detailed knowledge of the characteristics of the patterns, and the statistical and experimental effects which cause variations in them. Hence, the empirical method is able to determine the practical separability, in many cases, with notably less effort. The development of such pattern sets and decision surfaces and the success of their application is discussed in detail below.

EXPERIMENTAL

A known amount of each element was irradiated separately with a copper flux monitor. The irradiation source was a Kaman-Nuclear Corp. Model A-710 neutron generator, a sealed-tube instrument with a mixed deuterium and tritium in titanium target utilizing the H³(d,n)He⁴ nuclear reaction and providing a flux of about 10⁹ 14.3-MeV neutrons cm⁻² sec⁻¹ at the sample position. A dual pneumatic transfer system was used which consisted of 1/8-inch i.d. polyethylene tubing with stainless steel fittings supplied by Kaman Nuclear. Solenoids in the system were controlled by a specially built control panel and were arranged so that element and monitor could be simultaneously irradiated and returned to the counting stations. The two samples were contained in 1/2-inch o.d. polyethylene rabbits which were propelled through the system by nitrogen gas pressure. During irradiation, the samples were rotated about two axes by a Kaman Nuclear Axis Rotator Terminator Model A-9104 to ensure a homogeneous flux and a reproducible system. Two 3-inch × 3-inch NaI(Tl) scintillators (Harshaw type 12S12/3E with RCA Model 8054 phototubes) were used with a common power supply (Power Designs Model 2K15) but with the other electronic components operating independently. The same detection system was always used for the copper monitor while the other was used for the element under investigation.

Figure 1. Irradiation and detection system (block diagram: delivery tubes from the irradiation position to the element and monitor counting stations; NaI(Tl) scintillator, photomultiplier, preamplifier, amplifier, and power supply; scaler for the monitor and multichannel analyzer for the element)

Amplifier settings were identical for all counts in each system. Figure 1 shows a block diagram of the detection systems. The electronic components include two Hamner Model NB-11 preamplifiers, one Ortec Model 410 linear amplifier, one Ortec Model 440 selective filter amplifier, one Ortec Model 420 timing single-channel analyzer, a scaler, and a Nuclear Data 512-channel analyzer. The monitor apparatus used the single-channel analyzer with a window set on the 511-keV Cu⁶³ annihilation radiation. The element sample and a copper monitor were irradiated for 60 seconds. After a 30-sec delay the sample was counted for 60 sec live time, whereas for the copper monitor there was a 60-sec delay and then a 5-minute real-time count. Longer irradiations and counts were taken on each sample; however, the shorter irradiations with 60-sec counts gave the better results when all 17 species were considered and, hence, were used in the study.

One hundred twenty-eight channel spectra were printed by an IBM typewriter and subsequently punched on computer cards. All spectra were standardized to 1 gram of element and adjusted to the same neutron flux as determined by the copper monitors. In addition, each 128-channel spectrum was condensed to a pattern of 64 components by summing successive channels. This is justified by the fact that 3500 keV is recorded in 128 channels, giving 27 keV/channel, while the scintillator has only about 7% resolution, giving 70 keV at 1 MeV. Therefore the pattern contains considerable redundancy, because increasing the number of channels beyond the resolving ability of the detector merely divides the information over more channels.

All computer programs were written in Fortran IV and executed on the University of Washington Computer Center IBM 7040/7094 direct-coupled system under the IBSYS operating system. All training programs utilized a generalized error-forcing subset learning routine (SUBROUTINE LEARN). Figure 2 is a flow diagram of LEARN, listings of which are available upon request. The main program generates a set of mixture patterns for a particular element, defines to which category each pattern belongs, and then calls on LEARN to attempt to develop a weight vector which will successfully classify each pattern in the training set. This process is carried out for as many elements in as many different training sets as desired. The general theory of learning machines is comprehensively covered in a book by Nilsson (15).

(15) N. J. Nilsson, "Learning Machines," McGraw-Hill Book Co., New York, N. Y., 1965.
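The spectrum preprocessing just described (standardization to 1 gram, flux correction via the copper monitor, and condensation from 128 to 64 channels) can be summarized in a few lines. The sketch below is illustrative only; the function name, the choice of a reference monitor count, and the final normalization to unit sum (used later for training) are assumptions, not the authors' Fortran code.

```python
import numpy as np

def spectrum_to_pattern(counts_128, sample_mass_g, monitor_counts, ref_monitor_counts):
    """Convert a raw 128-channel spectrum into a 64-component pattern.

    counts_128:         raw 128-channel gamma-ray spectrum (counts per channel)
    sample_mass_g:      mass of the irradiated element, grams
    monitor_counts:     copper-monitor counts for this irradiation
    ref_monitor_counts: copper-monitor counts chosen as the common flux reference (assumed convention)
    """
    counts = np.asarray(counts_128, dtype=float)
    # Standardize to 1 gram of element and to a common neutron flux.
    counts = counts / sample_mass_g * (ref_monitor_counts / monitor_counts)
    # Condense 128 channels to 64 components by summing successive pairs of channels.
    pattern = counts.reshape(64, 2).sum(axis=1)
    # Normalize so the components of each pattern sum to 1.0, as used for training.
    return pattern / pattern.sum()
```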


A complete description of the basic training program used in most of the work may be obtained upon request.

Training Sets. Training sets were initially produced by a random process employing a linear random number generator. Each mixture spectrum could contain from 1 to 17 components, with each component contributing between … and 100% by weight to the mixture. However, little success was realized with this procedure, and hence it was abandoned in favor of a nonrandom method. In order for the training procedure to completely converge, it is necessary that the training set of patterns be linearly separable. In the random method of training set generation, it is quite possible to obtain 1.01% of a component in one mixture and 0.99% of the same component in another. If the classification criterion were 1.00%, then the training procedure might never completely converge; in fact, the training set would probably contain inseparable members. Furthermore, many separable sets might become inseparable when statistical fluctuations were considered.

The nonrandom method used for generating the training sets from the standard spectra is as follows: (1) A classification criterion is selected (usually 0.1, 1.0, or 10.0%) which defines the categories; e.g., a classification criterion of 1.0% implies category 2 for levels equal to or greater than 1% and category 1 for levels less than 1%. (2) To obtain a realistic chance of separability, no percentages are allowed within a previously specified interval about the classification percentage. (3) A set of ten percentages is selected for the element of interest, five in each category, such that each category is adequately spanned. (4) Sixteen binary mixture spectra are generated at each of the ten prechosen percentages of the element of interest, such that each of the remaining 16 elements is represented in ten binary spectra at a percentage of one hundred minus the percent of the element of interest. In this manner a training set of 160 patterns is generated and used to train a weight vector capable of classifying the element of interest as to its presence greater than or less than the classification percentage. The percentages of a representative training set in which aluminum is the element to be determined at a classification percentage of 1.0% are illustrated in Table II for one of the other 16 elements, boron. That is, of the 160 patterns in the set, 10 contain boron and aluminum at the percentages indicated. The same would be true for each of the other elements and aluminum, hence the 160 total patterns.
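For concreteness, the nonrandom construction can be written out as a short routine. This is a minimal sketch under assumed conventions (pure-element standard patterns stored as rows of a matrix, percentages by weight, category labels coded +1/-1); the names are illustrative and do not come from the authors' programs.

```python
import numpy as np

def build_training_set(pure_patterns, target_idx, target_percents, criterion_percent):
    """Generate the 160-pattern binary-mixture training set described in the text.

    pure_patterns:     (17, 64) array, one standardized pattern per element
    target_idx:        index of the element of interest
    target_percents:   ten percentages of the target element, five per category
    criterion_percent: classification criterion (e.g., 1.0 for 1.0%)
    """
    patterns, labels = [], []
    other_elements = [i for i in range(len(pure_patterns)) if i != target_idx]
    for pct in target_percents:
        for j in other_elements:                    # 16 binary mixtures per percentage
            mix = (pct * pure_patterns[target_idx] +
                   (100.0 - pct) * pure_patterns[j]) / 100.0
            patterns.append(mix / mix.sum())        # renormalize each pattern to unit sum
            labels.append(+1 if pct >= criterion_percent else -1)
    return np.array(patterns), np.array(labels)     # 10 x 16 = 160 patterns
```

For the aluminum example at the 1.0% criterion, target_percents would be the ten values of Table II (0.00, 0.05, 0.10, 0.25, 0.50, 2.00, 10.00, 25.00, 50.00, 100.00).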




Figure 2. Flow diagram of SUBROUTINE LEARN

Table II. Percentage Composition by Weight of the Ten Boron-Aluminum Mixtures in the Aluminum Training Set

              Pattern   Percent Al   Percent B
Category 1    1         0.00         100.00
              2         0.05         99.95
              3         0.10         99.90
              4         0.25         99.75
              5         0.50         99.50
Category 2    6         2.00         98.00
              7         10.00        90.00
              8         25.00        75.00
              9         50.00        50.00
              10        100.00       0.00

RESULTS AND DISCUSSION

In order to determine the potential of the method, a representative sample of three elements was chosen and weight vectors were trained on an order of magnitude criterion. The three elements phosphorus, chlorine, and boron represent quite high, average, and low activities, respectively. Training sets for each element were developed by the method described above at three different classification percentages, and hence three weight vectors were trained for each element. Table III-a shows the percentages used for the training sets and the classification criteria. Table III-b indicates the convergence rate (represented by the number of feedbacks required for complete convergence) and predictive ability of the weight vectors developed with these training sets. As might be expected from the intensity of the spectra, phosphorus converged most readily at all three classification percentages, although the fact that the fastest convergence occurred at 1.0% implies that there is a maximum in the convergence rate within the range of classification percentages used.

Table III-a. Training Set Percentages for Order of Magnitude Training

              Classification percent
              10.0      1.0       0.1
Category 1    0.00      0.00      0.00
              0.01      0.01      0.001
              0.10      0.05      0.005
              1.00      0.10      0.01
              5.00      0.50      0.05
Category 2    15.00     5.00      0.50
              25.00     10.00     5.00
              40.00     25.00     25.00
              65.00     50.00     50.00
              95.00     95.00     95.00

Note that complete convergence was obtained for all cases except boron at 0.1%. Hence the difference between boron at 0.05% and 0.5% probably does not constitute a separable situation when in the presence of the other elements, at least in a reasonable number of feedbacks. It is important to note that the development and testing of the decision surfaces proceeds without consideration of specific difficulties such as the N¹⁶ interference produced from neutron captures by the oxygen in the irradiation capsule. Although a rigorous consideration of all such problems might, in a specific analysis, produce a better single quantitative result, the empirical method can produce more general, qualitative results while requiring notably less time and effort.

The weight vectors resulting from the above training were tested on a set of 1000 randomly generated mixture spectra. This set of 1000 mixtures, each containing an arbitrary number of components, was developed from the standards using the random method described first; that is, each spectrum could contain up to 17 components present between 0 and 100%, the lowest possible nonzero level being …. The results of these tests are shown in Table III-b. Prediction is excellent for all cases except boron at 0.1%, for which the results appear worse than random. The probability of guessing the correct answer is, of course, 0.5 for a binary decision maker. In this example it is apparent that training to incomplete convergence results in a weight vector which is useless for prediction. Generally, when complete convergence was not accomplished, an inordinate number (from 1/3 to 3/4) of those misclassified resulted from placing patterns which did not contain the element of interest at all into category 2 (the positive category). Because, in general, the probability of an element not being present in a randomly generated spectrum is about 0.5, this places a large emphasis on these members in prediction if they are incorrectly classified.
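The per-category prediction figures of Table III-b amount to tallying the sign of the dot product against the known category for each test mixture. A minimal sketch, assuming a weight vector trained as in the earlier sketch and the same +1/-1 category coding, is:

```python
import numpy as np

def prediction_summary(W, test_patterns, true_labels, d_plus_1=0.1):
    """Tally percent-correct prediction by category, as reported in Table III-b.

    W:             trained weight vector of length d + 1
    test_patterns: (n, d) array of normalized mixture spectra
    true_labels:   +1 (category 2) or -1 (category 1) for each test pattern
    """
    Y = np.hstack([test_patterns, np.full((len(test_patterns), 1), float(d_plus_1))])
    predicted = np.where(Y @ W > 0.0, 1, -1)   # the sign of the dot product decides
    correct = predicted == true_labels
    cat1 = true_labels == -1
    cat2 = true_labels == +1
    return {"category 1 %": 100.0 * correct[cat1].mean(),
            "category 2 %": 100.0 * correct[cat2].mean(),
            "total %": 100.0 * correct.mean()}
```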

Figure 3. Semiquantitative scheme: the 14-MeV neutron induced gamma-ray spectrum (Y) is presented to the chlorine classifiers trained at 0.1%, 1.0%, and 10.0%; the successive decisions are Cl present at > 10.0%; Cl present at > 1.0% but ≤ 10.0%; Cl present at > 0.1% but ≤ 1.0%; Cl not present at levels ≥ 0.1%

Another explanation lies in the possibly very close similarity between spectra containing 1.0 and 0.0% of the same species when in the presence of other species. It is possible that placing the 1.0% spectrum into category 2 also places the 0.0% spectrum into category 2, therefore misclassifying the 0.0% member which, as already mentioned, occurs so often that its misclassification would have an exaggerated effect on prediction.

Figure 3 indicates how the three weight vectors for chlorine, using the data from Table III-b, might be used in a semiquantitative scheme. Confidence limits for these decisions are also apparent from Table III-b. (Note that once the weight vectors have been obtained, the implementation of the classifiers consists of relatively trivial dot product calculations.) It should be realized that the values used from Table III-b were developed using standardized spectra and with no consideration given to the statistical deviations inherent in radioactive decay processes. The effects of statistical deviations are considered below.
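As an illustration of how little computation the trained classifiers require, the decision scheme of Figure 3 is just three dot products and a cascade of sign tests. The sketch below assumes chlorine weight vectors trained at the 10.0%, 1.0%, and 0.1% criteria; names and the unit-sum normalization convention are assumptions of this sketch.

```python
import numpy as np

def classify_chlorine(spectrum, w_10, w_1, w_01, d_plus_1=0.1):
    """Cascade the three chlorine weight vectors (10.0%, 1.0%, 0.1% criteria)
    over one 64-component spectrum, following the scheme of Figure 3."""
    y = np.append(spectrum / spectrum.sum(), d_plus_1)   # augmented, normalized pattern
    if y @ w_10 > 0.0:
        return "Cl present at > 10.0%"
    if y @ w_1 > 0.0:
        return "Cl present at > 1.0% but <= 10.0%"
    if y @ w_01 > 0.0:
        return "Cl present at > 0.1% but <= 1.0%"
    return "Cl not present at levels >= 0.1%"
```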


Table III-b. Relations between Convergence Rate, Prediction, and Classification Percentage

Classification                Feedbacks to              Prediction, % correct
percentage      Element       complete convergence      Category 1    Category 2    Total
10.0            P             685                       99.8          99.2          99.7
                Cl            3944                      99.5          97.0          99.2
                B             10805                     99.9          94.9          99.2
1.0             P             96                        100.0         85.3          97.0
                Cl            7829                      100.0         78.4          94.2
                B             21864                     100.0         78.0          94.5
0.1             P             3491                      99.5          91.6          97.2
                Cl            65859                     99.9          94.5          98.1
                B             a                         28.1          75.1          44.2

a Did not completely converge in allowed number of feedbacks.


Some specific studies were made on aluminum, one of the more favorable elements because of its isotopic purity and high cross section for 14-MeV neutrons. One of these was the determination of rate of convergence as a function of the width, Δ, of the no-decision range about the classification percentage. Table IV shows the relation between aluminum percentage and convergence rate for various values of Δ. A training set constructed from the Al percentages in the first row might be expected to converge faster than one constructed from the Al percentages in the third row. In the former case the closest Al percentages to the classification percentage (1.0%) are 0.10 and 10.0, each being a factor of 10 from the classification percentage, hence Δ = 10; in the latter case, the closest Al percentages are 0.40 and 2.5, hence Δ = 2.5. Aluminum weight vectors were trained for Δ's ranging from 1.11 through 10.0 and then used in prediction on the 1000 mixtures previously mentioned. Table IV shows that while the rate of convergence increases quite rapidly for large Δ and decreases rapidly as Δ approaches 1, prediction appears to vary slowly for large Δ but increases as Δ decreases.

Table IV. Effect of Δ on Convergence Rate and Prediction

                                                                         Feedbacks     Prediction, % correct
Δ       Al percentages used to construct training set                   to converge   Category 1   Category 2   Total
10      0.00, 0.01, 0.03, 0.06, 0.10, 10.00, 30.00, 50.00, 70.00, 90.00      16           100.0        96.1      97.1
5       0.00, 0.01, 0.05, 0.10, 0.20,  5.00, 25.00, 45.00, 70.00, 95.00      219          100.0        98.0      98.5
2.5     0.00, 0.01, 0.05, 0.10, 0.40,  2.50, 10.00, 35.00, 60.00, 90.00      4351          99.2        99.5      99.4
1.25    0.00, 0.01, 0.05, 0.10, 0.80,  1.25,  5.00, 25.00, 50.00, 90.00      1935         100.0        99.9      99.9
1.11    0.00, 0.01, 0.05, 0.10, 0.90,  1.11,  5.00, 25.00, 50.00, 90.00      10611        100.0       100.0     100.0

A further point investigated was the effect of the magnitude of the d + 1 term on the rate of convergence. As noted above, this term is necessary so that the hyperplane separating the two categories, i.e., the decision surface, will pass through the origin of the space. Figure 4 clearly shows that while the magnitude of the d + 1 term has little effect as long as it is small, the rate of convergence decreases drastically as the relative magnitude of the d + 1 term becomes excessively large. Each pattern used in this work was normalized so that the sum of the components was 1.0. Note that when the d + 1 term was set to 1.0, convergence was not obtained in the limit of 1000 feedbacks. The average value of a component of a pattern normalized in this way is 1.0/64.0, or 0.0156, for a pattern of 64 components. Even though there is considerable scatter in the results, the rate of convergence goes through a maximum in the vicinity of the d + 1 term equal to 0.1 but decreases rapidly for larger values. Hence the magnitude of the d + 1 term should be of the same order as that of the average component of the pattern.

Figure 4. Feedbacks required to converge vs. magnitude of the d + 1 term (log scale, 10⁻⁴ to 10¹). a Did not converge in 1000 feedbacks

To test the overall utility of the learning machine method in the classification of mixed patterns, weight vectors were trained for all 17 elements at classification percentages of 1.0 and 10.0%, such that two weight vectors were obtained for each element. Table V records the results. The number of feedbacks for complete convergence is a good indicator of the relative intensity of the gamma-ray spectra but, more important, serves as a measure of which elements might be detected with better sensitivity by this method. For example, as shown earlier, boron did not converge completely at a 0.1% classification criterion while chlorine did; hence, it could be reasonably assumed that any element with a convergence rate as low as or lower than that of boron at 1.0% would not converge completely at 0.1%, whereas those in the vicinity of Cl at 1.0% would converge at 0.1% in a reasonable number of feedbacks. (Note, however, that a reasonable number of feedbacks is a value judgment and depends on the available computer facilities and how the trained vector is to be used.)

To perform a practical test of the 17 weight vectors, two sets of 1000 randomly generated mixtures were developed, with the standard deviation inherent in radioactive decay statistics included in the second set. A gaussian random number generator with a mean of zero and a standard deviation of one was used. The following steps describe the generation of a spectrum:


(1) A linear random number generator produces a number from 1 to 17 inclusive, giving the number of elements, M, in the mixture. (2) M elements are chosen at random to compose the mixture, with each present at an arbitrary percentage between … and 100%. (3a) For the standard spectrum, the value in the jth channel of each element's standard pattern is multiplied by the percent for that element and these are summed to give the total counts in that channel,

Y_j = Σ_i A_i y_ij    (1)

where y_ij is the number of counts in channel j for element i and A_i is the percent by weight of element i in the mixture. (3b) For the spectrum which is to include the statistical deviation, this sum becomes

Y_j' = Σ_i [A_i y_ij + R_i (A_i y_ij)^(1/2)]    (2)

where R_i is a number generated by the gaussian random number generator. (Note that the y_ij and A_i in Equations 1 and 2 are identical; i.e., the 1000 statistically deviated spectra were obtained from the 1000 regular spectra used to test the vectors.)
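A minimal sketch of this mixture generator follows, with R_i drawn per element as in Equation 2. The lower bound of the random percentages is not legible in this reproduction and is left as a parameter with an arbitrary default, and the text does not state whether the chosen percentages are renormalized to total 100%, so no renormalization is done here; names are illustrative.

```python
import numpy as np

def random_mixture_spectrum(pure_patterns, rng, min_percent=1e-3, with_statistics=False):
    """Generate one random mixture spectrum per Equations 1 and 2.

    pure_patterns: (17, 64) array of standardized single-element spectra (counts)
    rng:           numpy random Generator, e.g. np.random.default_rng()
    min_percent:   smallest allowed nonzero percentage (assumed value; see text)
    """
    n_elements, n_channels = pure_patterns.shape
    M = rng.integers(1, n_elements + 1)                    # number of elements, 1 to 17
    chosen = rng.choice(n_elements, size=M, replace=False)
    A = np.zeros(n_elements)
    A[chosen] = rng.uniform(min_percent, 100.0, size=M)    # percent by weight of each chosen element
    contributions = A[:, None] * pure_patterns             # A_i * y_ij, one row per element (Eq. 1 terms)
    if with_statistics:
        R = rng.standard_normal(n_elements)                # one Gaussian deviate R_i per element (Eq. 2)
        contributions = contributions + R[:, None] * np.sqrt(contributions)
    return contributions.sum(axis=0)                       # Y_j (Eq. 1) or Y_j' (Eq. 2)
```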

Table V. Convergence Rate and Prediction of Weight Vectors for 17 Elements Trained at 1.0 and 10.0% Classification Criteria

                                              Prediction, % correct
           Feedbacks to converge       Standard set            Statistically deviated set
Element    1.0%        10.0%           1.0%       10.0%        1.0%       10.0%
B          21892       16300           98.1       98.9         73.6       94.1
N          3928        5854            98.5       99.1         89.8       98.7
Na         218         1311            98.9       98.4         98.4       98.3
Mg         1739        2285            98.6       98.8         89.0       98.9
Al         383         977             98.5       99.0         98.1       98.9
P          96          2095            98.9       99.0         98.8       99.1
Cl         7289        4140            99.2       99.3         84.4       98.6
K          35225       15029           99.4       98.6         65.3       93.8
Sc         20169       5094            99.1       99.0         63.9       91.8
Ti         1732        1975            99.3       98.1         77.1       97.6
V          251         562             99.2       98.6         98.8       98.4
Cr         5534        2285            99.2       99.3         89.2       98.6
Mn         26661       9656            99.1       98.7         67.9       92.6
Co         56740       26129           98.7       99.7         63.7       76.0
Ni         73620       34640           99.3       98.5         60.2       73.0
Cu         541         470             98.2       99.4         98.2       99.2
Zn         27047       1568            99.1       98.6         68.8       93.9

Table V shows the results of testing the weight vectors on these mixtures. Plainly, when the standard deviation is considered, the prediction drops considerably for some cases, i.e., the elements represented by low activities. The higher activities continue to result in very good prediction. In addition, the prediction at the 10% decision surface is in general quite a bit higher than that at 1% for the spectra including the statistical deviations. This again is attributed to better statistics. Notice that Ni and Co give the lowest prediction and correspondingly had the slowest convergence rates. Possibly the lower results for the second case could be improved by building the standard deviation into the training sets. However, it should be emphasized that the

prediction for all cases is still considerably better than random, and that the very good prediction for several of the elements illustrates the potential usefulness of the technique for analyzing gamma-ray spectra.

RECEIVED for review November 26, 1969. Accepted March 12, 1970. Research supported by the National Science Foundation.
