cluding markedly atypical spectra, amounts to a more specific requirement and should improve the predictive ability. The results of investigating this possibility are shown in Table VII. After each pass involving feedback through the training set, the resulting weight vector is allowed to classify the patterns of both the training and prediction sets without feedback being applied. The results are noted, and another iteration through the training set with feedback further trains the weight vector. The process is repeated until recognition attains 100% after the sixth iteration. It appears that the predictive ability is enhanced by increased training using the data which is more difficult to classify. Initial Weight Vector. The initial values of the components of the weight vector do not affect the possibility of solving a pattern classification problem, but varying them does force the learning machine to discover alternate training paths to acceptable final weight vectors. Table VIII shows the results of training for oxygen presence using initial unit weight vectors comprised of all positive ones and all negative ones, respectively. The rates of convergence are seen to be comparable. The ten most important weight vector components (those with the largest indicator values) are listed in descending order for the two training trials. The values of the eight weight vector components appearing in both lists are given and their ratios are calculated. If all the ratios were the same, the two weight vectors would be identical; however, the ratios vary from 0.78 to 1.41, and the two weight vectors, while obviously related, are not identical.

Table VIII. Convergence Rate and Resulting Important Weights as a Function of Different Initial Weight Vectors

                          Initial weight vector
                          +1         -1
Parameters                155        155
Spectra tested            1528       1561
Recognition               200        200

Ten most important weights (descending indicator value):
+1: 29, 39, 56, 41, 70, 55, 45, 53, 31, 42
-1: 29, 41, 27, 56, 45, 53, 42, 70, 31, 40

Mass position    Weight vector components (+1 / -1)    Ratio
29                2.00 /  2.31                         0.87
31                3.51 /  3.53                         0.99
41               -0.92 / -1.18                         0.78
42                1.14 /  1.15                         0.99
45                3.91 /  3.53                         1.10
53               -2.06 / -1.70                         1.21
56               -1.87 / -1.37                         1.37
70               -2.50 / -1.71                         1.41

ACKNOWLEDGMENT

The authors gratefully acknowledge the kind cooperation of the University of North Carolina Computation Center staff, especially Mr. E. Hoyle Anderson, Jr. RECEIVED for review October 28, 1968. Accepted February 13, 1969. Research supported by the National Science Foundation.

Computerized Learning Machines Applied to Chemical Problems: Multicategory Pattern Classification by Least Squares

B. R. Kowalski, P. C. Jurs, and T. L. Isenhour
Department of Chemistry, University of Washington, Seattle, Wash. 98105

C. N. Reilley
Department of Chemistry, University of North Carolina, Chapel Hill, N.C. 27515

A least squares pattern classifier which is capable of both recognition and prediction has been developed for multicategory pattern classification. The method operates on an approximate calculation which develops a single weight vector to simultaneously classify patterns into any number of categories. The method is evaluated with various mass spectrometry problems and compared to binary pattern classifiers for both ease of implementation and accuracy of results.

PREVIOUS WORK has demonstrated the applicability of learning machine processes to binary classification of experimental data, both for recognition of previously examined cases and prediction of new results (1, 2). The method used involved a geometric treatment of a data set as an augmented pattern vector (Y): the original pattern vector of d dimensions (X) plus an additional (d + 1)st dimension to allow manipulation in

(1) P. C. Jurs, B. R. Kowalski, and T. L. Isenhour, Anal. Chem., 41, 21 (1969).
(2) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, ibid., 41, 690 (1969).

hyperspace. Feedback methods were used to locate a weight vector (W) which formed a hyperplane dichotomizing the points in hyperspace into the desired categories. Koford and Groner and Widrow et al. have used a least-mean-square error function in which two categories are separated by using a least squares approximation to find the best weight vector (3-5). These binary methods are successfully applied to multicategory cases only by making a series of binary decisions; hence, for a classification into K possible categories, K - 1 weight vectors must be individually trained and a total of (K - 1)(d + 1) parameters must be recorded for application of the system. This paper describes an approximate scheme based on least squares which simultaneously classifies a set of patterns into K

(3) J. S. Koford, U. S. Government Research and Development Reports, 41, 50 (1966), TR No. 6201-1.
(4) J. S. Koford and G. F. Groner, IEEE Trans. on Inform. Theory, 12, 42 (1966).
(5) B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, Proc. IEEE, 55, 2143 (1967).

VOL. 41, NO. 6, MAY 1969 · 695

Table I. Incorrectly Classified Oxygen Compounds

Compound              si*    Acceptable range of si    Actual si
CH3COOCH3              2         1.48 → 2.50             1.31
CH3OCH3                1         0.54 → 1.48             1.56
CH3CH(CH3)CH3          0        -0.50 → 0.54             0.76
CH3CH2CHO              1         0.54 → 1.48             0.49
CH3CH2OCH2CH3          1         0.54 → 1.48             1.60
(CH3)2CHNO2            2         1.48 → 2.50             0.85
CH3CH2OCH2CH2OH        2         1.48 → 2.50             1.24

categories using only one weight vector and, in general, only one calculation step. The least squares method is not guaranteed to develop perfect recognition even with cases of linearly separable data; however, sufficient training to produce very useful recognition and predictive ability may be accomplished.

LEAST SQUARES PATTERN CLASSIFIER

The weight vector (W) developed in a binary pattern classifier is actually a linear combination of some or all of the patterns of the training set, developed in such a fashion that the dot product of the weight vector and the ith pattern (W·Yi) gives a scalar (si) whose sign indicates to which of two categories the pattern belongs. The principle of the multiclassification method is to develop a weight vector which produces values of si whose magnitudes, as well as their signs, place the patterns in one of several categories. The correct category (si*) for each pattern of the training set is defined by some arbitrary value. For example, in the case of developing a weight set to differentiate between compounds having zero, one, two, three, or four oxygens, the values of 0, 1, 2, 3, and 4 may be assigned to the si*'s for the corresponding compounds. A least squares procedure is employed to select that set of weights which computes values of si so as to minimize Σ(si - si*)².

Q = Σᵢ₌₁ⁿ (sᵢ − sᵢ*)² = Σᵢ₌₁ⁿ (Σⱼ₌₁ᵐ wⱼyᵢⱼ − sᵢ*)²

where m = number of mass positions, n = number of patterns in the training set, and Q = sum of squares of deviations. The normal equations to minimize Q are developed by taking the partial derivatives of Q with respect to each weight and setting these equal to zero.

∂Q/∂wₖ = 2 Σᵢ₌₁ⁿ (Σⱼ₌₁ᵐ wⱼyᵢⱼ − sᵢ*) yᵢₖ = 0,   for k = 1, 2, 3, …, m

The normal equations are solved in the conventional fashion to determine the optimum wj's and hence W by the least squares criterion. For cases where the number of categories is small compared to the number of patterns, as in the determination of oxygen number in mass spectra, each si is considered in the category of the nearest meaningful value. For example, if oxygen


numbers 0, 1, and 2 are given si* values of 0, 1, and 2, respectively, an si of 1.68 is classified as meaning 2 oxygens. However, arbitrarily selecting 1.5 as the discriminating value between 1 and 2 oxygens may not give the best results. Hence, after the least squares step, a "best line" routine is used to compute the discrimination line between the two categories which gives the best answer for the training set. In cases where large numbers of categories exist, such as for molecular weight determination, the category is no longer as important as the nearness of si to the correct answer. Also, the "best line" calculation may be quite lengthy in cases containing many categories, such as hydrogen number, and has little value for these. In actual time the least squares calculation increases as the square of the dimensionality of the problem and at least as the first power of the number of patterns. For a small number of categories it is usually much slower than a series of binary dichotomizers; however, because the calculation length is independent of the number of categories, the least squares method becomes much more practical as the number of categories increases.

DATA AND COMPUTATIONS

All data were taken from the American Petroleum Institute Project 44 tables, with only intensities equal to or greater than 1% of the maximum peak recorded. The implementation of the least squares pattern classifier was developed in two stages with different size data sets because of the capabilities of the two computer systems used. The first study dealt with methods of maximizing recognition, the ability to correctly classify previously seen data, with some emphasis on minimizing the necessary computation. The data set included 130 low resolution mass spectra in the range C1-5, O0-2, N0-2, with 79 mass positions which, along with the d + 1 term, gave an 80-dimensional system. The computations were done on the University of Washington Computer Center IBM 7040/7094 direct couple system.

The second study dealt primarily with prediction, the ability to correctly classify previously unseen data. The data set included 630 low resolution mass spectra in the range C1-10, H1-22, O0-4, N0-3, with 155 mass positions which, along with the d + 1 term, gave a 156-dimensional system. The computations were done on the University of North Carolina Computation Center IBM 360/40, teleprocessing with the Triangle Universities Computation Center IBM 360/75. All programs were written in Fortran IV.
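The training and classification steps just described can be sketched as follows; this is a minimal illustration with synthetic patterns rather than the spectral data of the paper, and `np.linalg.lstsq` stands in for explicitly solving the normal equations:

```python
import numpy as np

def train_least_squares(Y, s_star):
    """Find the single weight vector W minimizing
    Q = sum_i (W . y_i - s_i*)^2 over the training patterns.
    Y is (n, d+1) with the augmenting d+1 column set to 1;
    s_star holds the forcing factors (e.g. oxygen number 0..4)."""
    w, *_ = np.linalg.lstsq(Y, s_star, rcond=None)
    return w

def classify(w, Y, categories):
    """Dot each pattern with W and take the nearest category value."""
    s = Y @ w
    cats = np.asarray(categories, dtype=float)
    return cats[np.abs(s[:, None] - cats[None, :]).argmin(axis=1)]

# Toy demo: 30 synthetic 2-D patterns in categories 0, 1, 2; the first
# component is made to correlate with the category.
rng = np.random.default_rng(0)
Y = np.column_stack([rng.normal(size=30), rng.normal(size=30), np.ones(30)])
s_star = np.repeat([0.0, 1.0, 2.0], 10)
Y[:, 0] += 3 * s_star
w = train_least_squares(Y, s_star)
pred = classify(w, Y, [0, 1, 2])
print(f"recognition: {np.mean(pred == s_star):.0%}")
```

A single linear solve replaces the iterative feedback training of the binary classifiers, which is the trade the paper describes: one calculation step, but no guarantee of perfect recognition.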

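The "best line" routine is described only in outline above; a minimal sketch of one plausible implementation scans the midpoints between adjacent training si values and keeps the threshold that classifies the most training patterns correctly (the function name and data here are hypothetical):

```python
def best_line(s_train, cat_train, lo_cat, hi_cat):
    """Hypothetical "best line" search: try the midpoint between each
    adjacent pair of training s values and keep the threshold that
    classifies the most training patterns correctly."""
    pairs = sorted(zip(s_train, cat_train))
    best_t, best_right = None, -1
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        t = (a + b) / 2
        right = sum((s < t and c == lo_cat) or (s >= t and c == hi_cat)
                    for s, c in pairs)
        if right > best_right:
            best_t, best_right = t, right
    return best_t

# Invented s values and true oxygen numbers for the 1-vs-2 boundary;
# the best threshold here is not the naive midpoint 1.5.
s_vals = [0.9, 1.2, 1.31, 1.56, 1.7, 2.1]
cats = [1, 1, 2, 1, 2, 2]
print(best_line(s_vals, cats, 1, 2))
```

The scan is O(n) candidate thresholds per category boundary, which matches the paper's remark that the calculation grows lengthy when many categories (and hence many boundaries) exist.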

RECOGNITION

The example used for most of the study was determination of oxygen number as 0, 1, or 2. "Best line" adjustments were made after every calculation. Except where indicated, the peak intensities were adjusted to make the maximum peak 100, and each value was then raised to the one-half power, which has been shown to be very useful in previous work (1, 2). A least squares calculation as described above produced a weight vector which correctly classified 123 of the 130 spectra, a 94.6% success in the oxygen 0, 1, or 2 case. (The "best line" correction improved the calculation from 9 to 7 wrong.) The incorrectly classified compounds and their computed si's are given in Table I.

Reducing Number of Parameters. Because the binary pattern classifier previously described (2) had success using less than all the available data, the process of reducing the number of parameters by casting out masses was also investigated for the least squares method. Although any action which decreases the available data and simultaneously the number of adjustable parameters, i.e., decreasing the dimensionality of the problem, must make recognition more difficult, the calculation time is decreased by approximately the square. Hence, even decreasing the dimensionality by a factor of two will greatly decrease the computation time.

Choosing the mass positions to eliminate is a difficult problem, requiring the impractical calculation of every possible combination to guarantee the ideal choices. However, it is reasonable to expect that those which least affect the calculation are logical choices. Hence, two methods were investigated: (1) eliminating masses on the basis of smallest weight values (wj), and (2) eliminating masses on the basis of smallest cumulative effect on the decision process. In the second case the product (Rj), formed by the weight times the sum of the amplitudes of the corresponding mass position, was used.

[Figure 1. Recognition as a function of number of eliminated masses: % right vs. masses retained, in two panels based on wj and on Rj]

Figure 1 shows the effect of casting out masses in groups of fifteen and recalculating with each smaller set. The two methods seem fairly comparable, although there is some indication that the Rj criterion is better. Considering the Rj method, it is interesting that elimination of the first forty-five mass slots, leaving only 35 possible slots (of which the average spectrum has about fifteen peaks), has little effect on the recognition ability. Even eliminating all but four masses and the d + 1 term gives 70.0% right answers, which is still far better than random (33%). A further test eliminating masses in groups of 30 by the Rj criterion gave 71.5% recognition after two iterations (60 eliminated), which, compared to the four iterations each eliminating 15 masses, shows that good results may be obtained without much care in the selection of eliminated masses.
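The Rj criterion above (weight times the summed amplitude at the corresponding mass position) can be sketched as follows; the small intensity matrix and weights are invented for illustration:

```python
import numpy as np

def masses_to_keep(Y, w, n_drop):
    """Rank mass positions by R_j = w_j * (total amplitude at position j
    across the training patterns) and drop the n_drop smallest |R_j|."""
    R = w * Y.sum(axis=0)
    order = np.argsort(np.abs(R))      # least influential first
    return np.sort(order[n_drop:])

# Invented 5-pattern x 4-position intensity matrix and weights.
Y = np.array([[0., 2., 0., 5.],
              [1., 3., 0., 4.],
              [0., 2., 1., 6.],
              [0., 1., 0., 5.],
              [2., 2., 0., 4.]])
w = np.array([0.1, -1.0, 0.05, 0.8])
keep = masses_to_keep(Y, w, n_drop=2)
print(keep)  # → [1 3]
```

After each elimination the least squares fit would be recomputed on the surviving columns, as in the groups-of-fifteen experiment above.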
A modification of this method, which may prove useful with a large data set, is to use a small set to get an indication of which dimensions of the patterns to eliminate before treating the large set, thereby decreasing overall computational time.

Cross Terms. Thus far both the binary pattern classifiers (1, 2) and the least squares pattern classifier have operated on each dimension of the data independently, with no consideration of interactions between dimensions. However, in mass spectral data, and many other types used in experimental science, important interactions exist between components of the data. In mass spectrometry, for instance, certain fragments appear together, while others are mutually exclusive, etc. Deciding which interactions to consider is a difficult problem because in a pattern of M components there are M(M - 1)/2 possible binary cross terms, M(M - 1)(M - 2)/6 possible third order terms, etc. In the problem presently under consideration, where 80 mass slots are used, it would require 3160 additional dimensions to consider all binary cross terms, and would require prohibitive amounts of time to compute. Hence, a method must be derived to select important cross terms without computing all possibilities. Also, in order to compare methods of computation, the same dimensionality must be maintained, for adding dimensions would increase the number of arbitrary parameters.

The method chosen here is to consider those components with the largest |Rj|'s most important and to form cross terms among them. Simultaneously, the lowest |Rj|'s are considered least important and removed to make room for the cross terms. In order not to completely lose the data of the smallest terms, the number of cross terms (K) was chosen; then the K + 2 lowest |Rj|'s were found, these intensities were averaged in two groups, those with positive Rj's and those with negative Rj's, and the averages were placed in two of the K + 2 available dimensions. Thus room was created for K cross terms. The cross terms are formed as the square root of the product of the two pattern components.

Table II shows the results of iterative calculations where, after each iteration, the fourteen smallest |Rj|'s were combined as indicated above and the listed cross terms were formed on the basis of largest |Rj|. This, of course, allows the formation of multiple cross terms as the calculation proceeds. In the first case the recognition ability remains nearly the same for four iterations and then becomes progressively worse. In the second and third cases recognition ability improves initially and then becomes progressively worse. Because formation of the cross terms requires loss of some first order terms, it is not surprising that this eventually degrades the results. The improvement in the two cases, however, encouraged further investigation of the cross terms. In searching for a better method a programming error


Table II. Recognition for Different Selections of Cross Terms as a Function of Iteration

Cross terms used (based on decreasing order of Rj):
1. 1×2, 1×3, 1×4, 1×5, 1×6, 2×3, 2×4, 2×5, 2×6, 3×4, 3×5, 3×6
2. 1×2, 1×3, 1×4, 1×5, 1×6, 1×7, 1×8, 1×9, 1×10, 1×11, 1×12, 1×13
3. 1×2, 1×3, 1×4, 1×5, 1×6, 1×7, 1×8, 2×3, 2×4, 2×5, 2×6, 2×7

[Recognition percentages for iterations 1-7 are illegible in the scanned source.]

Table III. Comparison of Low Mass Number and High Rj for Selection of Cross Terms

Cross terms used                                          Recognition (%), iterations 1, 2, 3
1. Based on decreasing Rj's:
   1×2, 1×3, 1×4, 1×5, 1×6, 2×3, 2×4, 2×5, 2×6,
   3×4, 3×5, 3×6                                          94.6, 98.5, 82.3
2. Based on masses:
   12×13, 12×14, 12×15, 12×16, 12×17, 13×14, 13×15,
   13×16, 13×17, 14×15, 14×16, 14×17                      94.6, 98.5, 96.2
3. Based on most important masses from 1 and 2:
   12×13, 12×14, 12×26, 12×39, 12×40, 12×(d+1),
   13×14, 13×26, 13×39, 13×40, 13×(d+1), 14×(d+1)         94.6, 98.5, 96.2
4. Based on decreasing Rj's:
   1×2, 1×3, 1×4, 1×5, 1×6, 1×7, 1×8, 1×9, 1×10,
   1×11, 1×12, 1×13                                       94.6, 98.5, 98.5

Table IV. Incorrectly Classified Carbon Compounds

Compound                   si*    Acceptable range of si    Actual si
CH3NH2                      1         0.5 → 1.52              1.89
CH3CH2COCH3                 4         3.48 → 4.26             3.39
[C4 structure illegible]    4         3.48 → 4.26             4.53
H2NCH2CH2NH2                2         1.52 → 2.45             2.60
CH3CH=CHCH3 (trans)         4         3.48 → 4.26             4.29
(CH2)4 (ring)               4         3.48 → 4.26             3.35
CH3CH2CH2CHO                4         3.48 → 4.26             4.38

Table V. Recognition as a Function of Iteration for Different Forcing Factors

Forcing factors assigned to oxygen numbers 0, 1, >1:
1. 0, 1, 2
2. 0, 1, 4
3. 0, 3, 4
4. -1, 0, 1

[Recognition percentages by iteration are illegible in the scanned source.]

produced the following interesting computation. Masses 81 through 89 were dropped, averages of the last fourteen masses, based on whether their Rj's were positive or negative, were formed and stored in mass slots 81 and 82, and the following cross terms were formed and stored in mass slots 83 through 88: (d+1)×12, (d+1)×13, (d+1)×14, 12×13, 12×14, 13×14. The initial calculation gave 94.6% right as usual, but on the second iteration (the first time these cross terms were added) 100% recognition resulted. Upon investigating the Rj's of the masses used, (d+1), 12, 13, and 14, it was seen that these were 1, 4, 8, and 12 in order of decreasing absolute value. This rather mathematical progression of values, however, was not seen as the underlying cause of this accidental success; the more likely reason is the choice of low masses. Because small fragments are more likely to be related in origin than large masses (i.e., the sum of fragments from any first order process must be equal to or less than the molecular mass), one should expect cross terms to be most important among small mass values. Another interesting aspect is that the method of cross term formation with the (d+1) term simply results in taking the square root of the other term. This gives different power terms for the same mass and bears some relation to the quadric and higher order learning machines described by Nilsson (6). Further investigation of recognition using cross terms was made on carbon numbers 1, 2, 3, 4, and 5 of the same set of compounds. This is, of course, a more difficult problem because of the larger number of categories. Table III shows three iterations for each of four different combinations of cross terms. The incorrectly classified compounds and their computed si's from the first computation step are given in Table IV. From Table III it is again seen that cross terms are helpful but elusive to select properly.

It is apparent that consideration of these second order interactions, and perhaps higher order interactions, can greatly enhance the success of least squares pattern classification. Further study is under way to develop rational ways of selecting cross terms. Forcing Factors. In the least squares method forcing factors si* must be defined, and their relative values may affect the success of the method. Table V shows the results of several forcing factor combinations used with all possible cross terms formed from Rj's in decreasing order of intensity 1, 4, 8, 12, and 16. While no concrete conclusions may be drawn from Table V, it does imply that the symmetric ranges 0, 1, 2 and -1, 0, 1 are superior to the asymmetric ranges 0, 1, 4 and 0, 3, 4.
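A sketch of the dimensionality-preserving cross-term construction described above: drop the K + 2 columns with the smallest |Rj|, replace them with two averaged columns and K square-root cross terms. The pairing order among the most important components is an assumption of this sketch, since the text does not fix it:

```python
import numpy as np

def add_cross_terms(Y, R, k):
    """Drop the k+2 columns with smallest |R_j|; add two columns that
    average the dropped intensities (positive-R and negative-R groups)
    and k cross terms sqrt(y_a * y_b) among the most important columns,
    so the total dimensionality is unchanged."""
    order = np.argsort(np.abs(R))
    drop, top = order[:k + 2], order[::-1]
    pos = Y[:, [j for j in drop if R[j] >= 0]]
    neg = Y[:, [j for j in drop if R[j] < 0]]
    avg_pos = pos.mean(axis=1) if pos.shape[1] else np.zeros(len(Y))
    avg_neg = neg.mean(axis=1) if neg.shape[1] else np.zeros(len(Y))
    # Assumed pairing order among the top columns: (0,1), (0,2), (1,2), ...
    pairs = [(top[a], top[b]) for a in range(k) for b in range(a + 1, k + 1)]
    cross = np.column_stack([np.sqrt(Y[:, a] * Y[:, b])
                             for a, b in pairs[:k]])
    return np.column_stack([np.delete(Y, drop, axis=1),
                            avg_pos, avg_neg, cross])

# Invented nonnegative intensities (5 patterns x 8 positions) and R values.
rng = np.random.default_rng(1)
Y = rng.random((5, 8))
R = np.array([5.0, -4.0, 3.0, 0.2, -0.1, 2.0, 0.05, -0.3])
out = add_cross_terms(Y, R, k=2)
print(out.shape)  # → (5, 8)
```

Keeping the column count fixed mirrors the paper's requirement that the number of arbitrary parameters stay constant when comparing runs with and without cross terms.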




(6) N. J. Nilsson, "Learning Machines," McGraw-Hill Book Co., New York, N.Y., 1965.

[Figure 2. Recognition of computed molecular weight vs. actual molecular weight]

PREDICTION

The least squares method was applied to both binary and multicategory cases. In the special case of a binary decision the least squares pattern classifier has the same number of adjustable parameters and should be able to approach the same level of success as a binary pattern classifier using iterative feedback. However, because the least squares method is a single approximation, it is not expected to be comparable to the binary pattern classifier using iterative feedback in the case of difficult problems. For the same set of 300 randomly chosen spectra, the binary pattern classifier was trained to 100% recognition (300/300) and had a predictive success of 92.7% (306/330). The least squares pattern classifier (without cross terms), using forcing factors of -1 for no oxygen and +1 for oxygen, had better than 99% recognition (299/300) and 88% prediction (291/330). Thus the least squares method, as a binary pattern classifier, compares favorably to the procedure using iterative feedback.

Two processes were considered for maximizing predictive ability in multicategory classification: (1) size of training set, and (2) compression of data to decrease the number of arbitrary parameters. For each test all 630 compounds were used, some to form a randomly chosen training set and the remainder to form the prediction set. Oxygen number was categorized as 0, 1, 2, 3, or 4, and these values were also employed as forcing factors (si*). Because of the length of computation no "best line" process was used; each si was simply rounded to the nearest integer and then tested for correctness.

Training Set Size. Table VI shows the effect of training set size on recognition and prediction. As expected, recognition slowly decreases as training set size increases. On the other hand, prediction increases noticeably with increasing training set size. A predictive ability approaching 80% is quite useful for a five category problem. In order to predict a five category case using the branching tree method of binary pattern classifiers, three decisions would be required. With a good binary pattern classifier having a predictive ability of 90% (2), a five category case could be expected to have 73% predictive success.
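The 73% estimate follows from requiring all three binary decisions in the tree to be correct:

```python
# With 90% success per binary decision, a branching tree of three
# decisions (enough to separate five categories) succeeds only when
# all three decisions are right.
p = 0.90
tree_success = p ** 3
print(f"{tree_success:.2%}")  # → 72.90%
```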

[Figure 3. Prediction of computed molecular weight vs. actual molecular weight. Values from the prediction set that were not plotted, with the true values in parentheses: -119 (122), 212 (132), 215 (174), 1232 (194), 1234 (194), 1243 (194), -2339 (156), 903 (124), 518 (180)]

Compression of Data Set. In order to decrease the number of arbitrary parameters, with its attendant benefits of decreased computation time and, hopefully, increased prediction, a method of data compression was evaluated. In the earlier discussion of casting out masses the number of parameters was decreased by dropping selected mass positions. The method used here is to combine adjacent mass positions by taking the averages of their components. The effect is to retain the intensity data but in a less resolved form. After choosing a training set of 300, groups of adjacent mass slots were averaged. Table VII shows the results of this data compression. The recognition shows a general decrease, as expected, but a definite increase in prediction is seen, suggesting that removing arbitrary parameters, even at the expense of losing data resolution, can improve predictive ability.
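The averaging compression can be sketched as follows (zero-padding the tail when the spectrum length does not divide evenly is an assumption of this sketch, not stated in the paper):

```python
import numpy as np

def compress(spectrum, k):
    """Average each group of k adjacent mass positions, keeping the
    intensity data in a less resolved form."""
    spectrum = np.asarray(spectrum, dtype=float)
    pad = (-len(spectrum)) % k          # zero-pad so the length divides
    padded = np.concatenate([spectrum, np.zeros(pad)])
    return padded.reshape(-1, k).mean(axis=1)

print(compress([10, 20, 0, 4, 6, 8], 2))  # → [15.  2.  7.]
```

Applied with k = 2, 4, 8, 16 this reproduces the dimension counts in Table VII (144 positions compress to 72, 37, 19, and 9 dimensions).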

Table VI. Recognition and Prediction as a Function of Training Set Size
Oxygen number = 0, 1, 2, 3, or 4

Training set size    Recognition        Prediction
200                  99% (198/200)      69% (297/430)
300                  96% (288/300)      78% (257/330)
400                  94% (376/400)      79% (182/230)

Table VII. Recognition and Prediction as a Function of Data Compression
Oxygen number = 0, 1, 2, 3, or 4

Mass positions combined      Number of
to make each dimension       dimensions    Recognition    Prediction
1 (no compression)           144           96%            78%
2                            72            88%            80%
4                            37            84%            82%
8                            19            79%            79%
16                           9             73%            71%


Table VIII. Recognition and Prediction for Hydrogen Number (1 to 22)

Deviation    Recognition (% of total set)    Prediction (% of total set)
≥ +4         0.7                             8.2
+3           2.0                             5.2
+2           7.3                             14.2
+1           20.0                            19.4
0            41.0                            21.8
-1           20.0                            12.1
-2           7.3                             10.6
-3           1.0                             2.4
≤ -4         0.7                             6.1
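Treating the open-ended ±4 rows of the recognition column in the deviation table (Table VIII) as exactly ±4 gives a quick dispersion estimate; because those outermost rows actually contain larger deviations, this is only a lower bound on the quoted 1.7 hydrogens:

```python
import math

# Recognition deviation distribution for hydrogen number (percent of
# the total set), with the open-ended buckets treated as exactly +/-4.
dist = {4: 0.7, 3: 2.0, 2: 7.3, 1: 20.0, 0: 41.0,
        -1: 20.0, -2: 7.3, -3: 1.0, -4: 0.7}
total = sum(dist.values())
mean = sum(d * p for d, p in dist.items()) / total
var = sum(p * (d - mean) ** 2 for d, p in dist.items()) / total
print(round(math.sqrt(var), 2))  # → 1.22
```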

To test further the usefulness of least squares multicategory pattern classification, two problems having large numbers of categories were attacked. The first was hydrogen number, with categories from 1 through 22, and the second molecular weight, with values from 16 to 202. Table VIII shows recognition ability for hydrogen number from a training set of 300. The deviation in units of hydrogen number is listed with the percentage of the total set. Recognition gave 41.0% correct; however, recognition rises to 81% after the limits are expanded to ±1 hydrogen. Table VIII also shows the prediction for 330 compounds. In this case only 21.8% of the values are correct, but 53.3% still fall within ±1 hydrogen and 78.1% within ±2 hydrogens. The standard deviation for recognition is 1.7 hydrogens and for prediction is 2.7 hydrogens.

Figure 2 shows the result of an attempt to obtain molecular weight recognition for a training set of 300. In this case the number of categories is quite high. A standard deviation of 11.4 is produced, which includes some extremely wrong values that would ordinarily be discarded by the experimenter. Figure 3 shows the predictive ability for molecular weights obtained for the remaining 330 compounds. The standard deviation is 237.0, including some negative values and very large positive ones. Comparison of Figures 2 and 3 shows that while considerable recognition may be developed for molecular weight, the predictive ability inherent in this method is of little use.

CONCLUSIONS

Least squares pattern classification has been shown to have useful applications in both recognition and prediction of multicategory classification. The method has the advantages of simultaneously considering any number of categories and of producing a single weight vector which can place patterns in several categories. This method, which could be improved by a number of approaches, offers a useful complement to the binary pattern classifier using iterative feedback. RECEIVED for review October 28, 1968. Accepted February 13, 1969. Research supported by the National Science Foundation.

Q-Switched Laser Energy Absorption in the Plume of an Aluminum Alloy

E. H. Piepmeier¹ and H. V. Malmstadt
Department of Chemistry and Chemical Engineering, University of Illinois, Urbana, Ill. 61803

Time- and spatially-resolved spectrometric observations were made of single and multiple spike, Q-switched laser plumes that are to be used for analytical procedures. The observations suggest the rapid formation of an atmospheric plasma initially containing little sample material. The resulting intense continuum background radiation, which is present in time-integrated spectra, lasts for only a few tenths of a µsecond during and after each laser spike and can be separated in time from characteristic line emissions which last many µseconds. Measurements indicate that a large fraction of the energy in each spike of a laser pulse is absorbed by the plume at some distance from the surface of the target and, therefore, does not reach the target to cause direct vaporization. Observations of the craters and spectra suggest that the hot plume continues to cause sample vaporization after the laser pulse is over.

SINCE THE INVENTION of a pulsed, high-powered laser in 1960, investigators have used its ability to vaporize a material and excite the resulting plume for spectrochemical emission and atomic absorption analyses. Margoshes and Scribner (1) have recently reviewed over 40 papers by some of these investigators. Because of the laser's ability to vaporize micron-diameter


regions of virtually any target material, it offers unique possibilities for the elemental analysis of small microgram samples and preselected local regions of larger samples. Unfortunately, each new analysis is hampered by a lack of knowledge of what is taking place in the laser sampling, excitation, and cooling steps, and also by a lack of suitable standards for the new materials. Consequently, much time is spent in empirically establishing acceptable analytical procedures. A better understanding is needed of the way energy is transferred from the laser beam to the sample, the way material is ejected from the sample and excited, the processes involved when energy is transferred from the laser beam to the resulting plume, the cooling processes, and the final vapor loss processes. This should not only lead to the active control of conditions for obtaining optimum sensitivity, accuracy, and reproducibility for a particular analysis, but should also help the investi-

¹ Present address, Department of Chemistry, Oregon State University, Corvallis, Ore. 97331
(1) M. Margoshes and B. F. Scribner, Anal. Chem., 40, 223R (1968).