The determinations we performed yielded interesting metallurgical and analytical results. We showed that rhenium at the ppm level may be the major metallic impurity in pure molybdenum if the source of the metal is badly chosen. In tungsten, the concentration of rhenium is very low, in disagreement with earlier results. In this case, we found that spark source mass spectrometry has a blank of some pg of rhenium per g of tungsten, whereas the actual concentration measured by activation analysis was of some g/g. By contrast, the agreement between activation analysis and mass spectrometry is good for alloys containing a few percent of rhenium.


RECEIVED for review April 24, 1975. Accepted June 10, 1975.

Multiple Discriminant Function Analysis of Carbon-13 Nuclear Magnetic Resonance Spectra: Functional Group Identification by Pattern Recognition
Charles L. Wilkins¹ and Thomas L. Isenhour
Department of Chemistry, University of North Carolina, Chapel Hill, N.C. 27514

¹ Visiting Associate Professor, 1974-75 Academic Year, from the University of Nebraska-Lincoln.

Previous papers (1-3) have reported systematic investigations of linear learning machine analysis of carbon-13 NMR spectra in which a variety of discriminant functions, derived from various pre-processed forms of the spectral data, were developed. Those studies showed that suitably processed forms of either time or frequency domain mappings of the data could be used as the basis of weight vectors for identifying the presence or absence of several functional groups via analyses of the NMR spectra of unknown compounds. This note describes an investigation of the synergistic use of several differently derived vectors (i.e., from different pre-processed forms of the data) to yield decisions on unknown structures. Earlier research, using several data sources simultaneously, has shown that for some kinds of structural questions, improvements in learning machine predictions can be achieved (4, 5). Similar results are found with non-learning machine computer interpretation (6).

The key questions addressed in the present work are those of interest in considering practical use of the method for investigation of structural problems. First, what are the results for unknown data, not used in training? Second, what are the results when one demands, simultaneously, correct answers for a number of functional groups? Finally, is the degree of reliability such that it can be of use to a spectroscopist?

Presented here are the results of the use of a committee threshold logic unit (TLU) for making decisions on the presence or absence of seven structural features for each of 62 unknown compounds, using as inputs the predictions of carbon-13 NMR TLU's developed earlier. It was expected that this approach would yield more reliable decisions than those produced by the individual TLU's. It was also expected that higher reliability could be attained if unanimity of the committee were made a threshold condition, with a no-decision option being included.
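The committee idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tlu_vote` and `committee_decision`, the +1/-1 coding of "present"/"absent", and the zero threshold are hypothetical choices made here, not details taken from the authors' program.

```python
from typing import Optional, Sequence


def tlu_vote(weights: Sequence[float], pattern: Sequence[float]) -> int:
    """Single threshold logic unit: +1 (present) if w.x > 0, else -1 (absent).

    The zero threshold and the +1/-1 coding are assumptions for this sketch.
    """
    score = sum(w * x for w, x in zip(weights, pattern))
    return 1 if score > 0 else -1


def committee_decision(votes: Sequence[int], min_agree: int = 3) -> Optional[int]:
    """Combine +1/-1 votes from the individual TLUs.

    Returns +1 or -1 when at least `min_agree` members agree on the
    majority answer, otherwise None (the no-decision option).
    """
    present = sum(1 for v in votes if v > 0)
    absent = len(votes) - present
    if present > absent and present >= min_agree:
        return 1
    if absent > present and absent >= min_agree:
        return -1
    return None


# Five hypothetical votes, one per predictor
votes = [1, 1, 1, -1, -1]
print(committee_decision(votes))               # simple majority
print(committee_decision(votes, min_agree=5))  # unanimity required
```

With five voters, `min_agree=3` corresponds to a simple majority, `min_agree=4` to a greater majority, and `min_agree=5` to the unanimity condition, for which a split vote yields no decision.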

EXPERIMENTAL

Data Base. Linear discriminant functions were developed using a collection of 500 13C NMR spectra measured on two different instruments and in eight different solvents (7). Chemical shifts were referenced to tetramethylsilane and covered a range of about 200 ppm. Eighty of the spectra were obtained in the CW mode, the remainder in the pulsed mode. All were proton noise decoupled. Intensities were digitized manually in 1% intervals.

Threshold Logic Units. In the present study, five linear discriminant functions were used for each of the functional group predictors. These were: 1) the peak-no peak (PNP) weight vector (1); 2) the peak-no peak Hadamard transform (HAD(PNP)) weight vector (2); 3) the normalized absolute intensity (NAI) weight vector (1); 4) the positive value NAI free induction decay (FID) weight vector (3); 5) the positive value NAI-FID Hadamard transform (HAD(FID)) weight vector (3). The last of these (HAD(FID)) was developed to operate on the Hadamard transform of the real part of the 256-point Fourier transform of normalized absolute intensity data. The output of each of these five predictors served as the input to the committee TLU.

Unknowns. Unknowns (totaling 62 spectra) were obtained from five different sources. Eight of the sixty-two unknowns were selected from the collection used for developing the weight vectors (7), excluding those spectra used for training. Three of the unknowns were taken from "Organic Magnetic Resonance-Spectral Supplement" (8). Forty-three unknowns were obtained from a recent book (9) which contained only chemical shift and structural information, and omitted intensities. Three unknowns were taken from another recent book which contained both chemical shifts and intensities (10). Five unknowns were obtained by determining their spectra with an XL-100-15 spectrometer (3).

Calculations. Computations were carried out with a program written in FORTRAN IV and using a Raytheon 704 computer. Weight vectors used were calculated previously (1-3) using an IBM 360/65 computer. Each of the five types of weight vectors described above was applied to each of the unknown spectra (after applying suitable pre-processing to the unknown) for each of the seven functional group questions investigated. A majority vote (either a simple or greater majority) determined the decision regarding the presence or absence of each of the seven groups for each unknown. The program allowed the user to stipulate either a "batch" mode of operation, whereby unknown spectra could be submitted on cards and automatically processed, or, alternatively, an interactive mode. In the latter case, the user could specify any of the transform operations to be carried out on the unknown and have the results displayed on a television screen. When the user was satisfied the data had been entered correctly, the "predict" option could be selected and each of the appropriate operations then carried out on the unknown data (i.e., all of the 35 weight vectors were recovered from magnetic tape and applied to the appropriate pre-processed form of the unknown spectrum). In the prediction phase of the operation, all pre-processing and prediction was performed automatically, with the results being output to a high-speed printer (or, optionally, to the television display).

ANALYTICAL CHEMISTRY, VOL. 47, NO. 11, SEPTEMBER 1975

Table I. Summary of Functional Group Predictions for Unknown Compounds

                                         Number with all predictions correctᵃ
Compound type            N    Source    PNP   NAI   FID   HAD(PNP)   HAD(FID)   Committee
Aryl bromidesʲ           3    b
Aliphatic acids          5    c
Aliphatic diacids        5    c
Alkyl chlorides          5    d
Alkyl bromides           5    d
Aldehydes and ketones    9    e
Aliphatic alcohols      14    f
Miscellaneousʲ           8    g
Miscellaneous            3    h
Experimentalʲ            5    i
Total                   62              16    17     8      11         16         19

[Per-row counts are not recoverable from this copy apart from one unassigned row of values: 2, 2, 0, 3, 3, 2.]

ᵃ Presence or absence of each of the seven functional group categories: carboxylic acid, alkyl bromide, alkyl chloride, aldehyde-ketone, aliphatic alcohol, phenyl, and carbonyl (any). ᵇ Reference 8, spectra 448, 452, and 454. ᶜ Reference 9, pp 147-148. ᵈ Reference 9, p 133. ᵉ Reference 9, pp 145-146. ᶠ Reference 9, p 140. ᵍ Reference 7. ʰ Reference 10, pp 82, 120, and 124. ⁱ Reference 3. ʲ Intensity information was available for these spectra.

Table II. Committee Prediction Resultsᵃ

       Vote            No. of compound-    No. correct for
Present    Absent      questionsᵇ          majority vote (%)
   0          5            259                243 (93.8)
   1          4             77                 61 (85.9)
   2          3             31                 23 (74.2)
   3          2             25                 11 (44.0)
   4          1             20                 16 (80.0)
   5          0             22                 20 (90.9)

ᵃ Seven functional groups listed in footnote a, Table I. ᵇ Seven questions posed per unknown, total of 62 unknowns and 434 questions.

Table III. Correctness of Committee Decisions by Degree of Unanimityᵃ

                     Cmpd-questions    No. correct (%)    % of total questions answered
Simple majority           434            374 (86.2)              100
4 or 5 agree              378            340 (90.0)               86.9
Unanimous                 281            263 (93.6)               64.3

ᵃ Decisions for either presence or absence of each of the seven functional groups.
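The vote tallies of Table II can be aggregated by degree of agreement to cross-check the question counts of Table III. The following Python sketch does this. The names `TABLE_II`, `summarize`, and `error_rate` are illustrative only, and expressing the error rates relative to the number of predictions of each kind (rather than to all 434 questions) is an assumption of this sketch.

```python
# (present votes, absent votes) -> (compound-questions, correct by majority vote),
# transcribed from Table II
TABLE_II = {
    (0, 5): (259, 243),
    (1, 4): (77, 61),
    (2, 3): (31, 23),
    (3, 2): (25, 11),
    (4, 1): (20, 16),
    (5, 0): (22, 20),
}


def summarize(min_agree):
    """Return (questions answered, number correct) when at least
    `min_agree` of the five votes must agree before a decision is made."""
    rows = [v for (p, a), v in TABLE_II.items() if max(p, a) >= min_agree]
    answered = sum(n for n, _ in rows)
    correct = sum(c for _, c in rows)
    return answered, correct


def error_rate(predict_present, min_agree=3):
    """Fraction of 'present' (or 'absent') predictions that were wrong."""
    rows = [v for (p, a), v in TABLE_II.items()
            if (p > a) == predict_present and max(p, a) >= min_agree]
    total = sum(n for n, _ in rows)
    wrong = sum(n - c for n, c in rows)
    return wrong / total


for label, k in [("simple majority", 3), ("4 or 5 agree", 4), ("unanimous", 5)]:
    answered, correct = summarize(k)
    print(f"{label}: {correct} of {answered} correct")
print(f"false positives: {error_rate(True):.1%}, false negatives: {error_rate(False):.1%}")
```

The aggregated counts (434, 378, and 281 questions answered, with 374, 340, and 263 correct) agree with the corresponding entries of Table III.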

RESULTS AND DISCUSSION

The results of examining the 62 unknown spectra are given in Table I. None of the individual weight vectors yielded results superior to those of the committee decisions. In fact, the number of perfect answers (making no mistake on any of the questions posed) obtained by the committee decision was 19 of the 62 spectra of unknown compounds presented. Equally significant are the data in Tables II and III. Table II contains a breakdown of the results on a per question basis for either presence or absence of the functional groups considered. Since seven questions were posed for each of the sixty-two spectra analyzed, a total of 434 questions were asked. As Table III reveals, if a simple majority vote determines the predictions, about 86% of all the questions are answered correctly. Approximately 30% of the presence predictions are false positives (20 of 67 in Table II) and about 11% of the absence predictions are false negatives (40 of 367). More important, if a no-decision region is introduced by requiring unanimity of the five discriminant functions, the overall predictive ability rises to approximately 94% for the 64.3% of the questions for which this criterion is satisfied. Furthermore, the incidence of both false positive and false negative predictions is reduced to less than 10% (2 of 22 and 16 of 259, respectively). Considering that none of the unknowns were drawn from any of the training sets and that for the majority of unknowns intensities were unavailable and were therefore assigned arbitrary values (proportional to the number of carbons in each magnetically non-equivalent group), the results are encouraging. Table III provides clear evidence of the validity of the basic premise that improvement in decision reliability can be expected with use of a suitable committee TLU.

Turning to the performance of individual weight vectors, it is seen that the FID and HAD(PNP) predictors were significantly worse than the other three. This may be the result of the absence of actual peak intensity information for about two thirds of the unknowns. It seems intuitively reasonable to expect a greater effect of discrepancies in peak intensities on Fourier or Hadamard transformed frequency domain data, because these transforms result in a "spreading" of peak intensity information across the entire transformed representation. This conclusion is further supported by the observation that, for the nineteen spectra where peak intensities were available, the performance of these predictors was much better (FID 4 of 19 perfect (21%) and HAD(PNP) 7 of 19 perfect (37%)), whereas for the remaining 43 spectra each of them gave only 4 perfect (9%).

The performance of the two-layer TLU method examined in the present work further supports the earlier conclusion (2) that a flexible interpretation approach, allowing the experimenter to choose the type or types of pre- and post-processing of unknown data, is most likely to yield practically useful structural suggestions. For the particular set of questions asked here, the committee results were best. On the other hand, if the unknowns had been obtained in the form of un-transformed FID data, it is possible that the FID predictor would have proved superior.

With respect to the questions initially raised, the results of this study establish that performance of the weight vectors used is satisfactory for unknowns and that results of reasonable reliability can be obtained for all of the structural questions examined. Obviously, a practicing spectroscopist would use results of this type in conjunction with whatever other information may be available to solve actual structural problems. In that context, many more decision vectors would be required, but reliability in the range found here should be acceptable. In work under way, methods for using various types of spectral information to further enhance the confidence with which chemical structure inferences may be made by machine interpretation methods are being explored (11).
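The "spreading" of peak intensity information by a Hadamard transform, invoked above to explain the weaker FID and HAD(PNP) results, can be illustrated numerically. The sketch below is an assumption-laden toy example, not the authors' pre-processing: it uses an unnormalized fast Walsh-Hadamard transform in natural order, a hypothetical 16-point "spectrum" (the paper works with 256 points), and arbitrary peak positions and heights.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural ordering).

    The input length must be a power of two.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


# Hypothetical 16-point spectrum with three peaks
spectrum = [0.0] * 16
spectrum[2], spectrum[7], spectrum[11] = 1.0, 0.5, 0.8

# Same peak positions, but one intensity is wrong
perturbed = list(spectrum)
perturbed[7] = 0.9

delta = [p - q for p, q in zip(fwht(perturbed), fwht(spectrum))]
changed = sum(1 for d in delta if abs(d) > 1e-12)
print(f"{changed} of {len(delta)} Hadamard coefficients changed")
```

Because the transform is linear and every row of a Hadamard matrix has entries of magnitude one, an error in a single peak intensity shifts every transformed coefficient by the same magnitude, whereas in the untransformed representation it affects only one point.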

LITERATURE CITED

(1) C. L. Wilkins, R. C. Williams, T. R. Brunner, and P. J. McCombie, J. Am. Chem. Soc., 96, 4182 (1974).
(2) T. R. Brunner, R. C. Williams, C. L. Wilkins, and P. J. McCombie, Anal. Chem., 46, 1798 (1974).
(3) T. R. Brunner, C. L. Wilkins, R. C. Williams, and P. J. McCombie, Anal. Chem., 47, 662 (1975).
(4) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, Anal. Chem., 41, 690 (1969).
(5) P. C. Jurs, B. R. Kowalski, T. L. Isenhour, and C. N. Reilley, Anal. Chem., 41, 1949 (1969).
(6) W. Voelter, E. Breitmaier, G. Breitmaier, D. Gupta, G. Haas, and W. A. Konig, Chem. Z., 97, 239 (1973).
(7) L. F. Johnson and W. C. Jankowski, "Carbon-13 NMR Spectra", John Wiley and Sons, Inc., New York, N.Y., 1972.
(8) J. F. Hinton and B. Layton, Org. Magn. Reson., 4 (Spectral Suppl.), 448-452 (1972).
(9) J. B. Stothers, "Carbon-13 NMR Spectroscopy", Academic Press, New York, 1972.
(10) G. C. Levy and G. L. Nelson, "Carbon-13 Nuclear Magnetic Resonance for Organic Chemists", Wiley-Interscience, New York, 1972.
(11) J. B. Justice, T. L. Isenhour, and C. L. Wilkins, Abstracts, Twenty-sixth Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy, Cleveland, Ohio, March, 1975, Paper No. 198.


RECEIVED for review November 22, 1974. Resubmitted March 28, 1975. Accepted May 5, 1975. Support of this research through Grant GP-41515X (CLW) and Grant GP-43720X (TLI) by the National Science Foundation is gratefully acknowledged.

Qualitative Analysis of Lacquer and Similar Solvent Mixtures by Gas Chromatography
Benjamin Levadie and Stephen MacAskill
State of Vermont Division of Occupational Health, P.O. Box 607, Barre, Vt. 05641

Evaluation of workplace air to determine whether it meets the criteria for occupational safety and health requires qualitative and quantitative information about possible pollutants in the employees' environment. To ascertain the exposure to organic vapors and solvents, air samples are taken on activated charcoal. Adsorbed vapors are eluted from the charcoal using carbon disulfide and the eluate is analyzed both qualitatively and quantitatively as required (1, 2) on two or more column substrates by chromatography. Frequently, as a first step, simple qualitative information concerning materials being used is all that is needed. For the analytical laboratory such information is especially useful since it assists in the resolution of components on the chromatogram by reference to analysis of source materials, especially when paints, lacquers, varnishes, and similar systems are involved.

The common procedure for the analysis of such systems, consisting of a vehicle which is a mixture of organic solvents and a nonvolatile component, is first to separate the two either by vacuum or steam distillation. The resulting distillate is subjected to a systematic solubility analysis, using mixed inorganic acids and then dimethyl sulfate, for general characterization (3). Identification of specific components may be accomplished by gas chromatography, infrared spectrometry, or other suitable methods. Although the analytical methodology works quite well, problems are frequently encountered during the distillation operations. They are time consuming, and systems which polymerize rapidly or change on being warmed or heated are common. It became necessary for our laboratory to resolve these and other difficulties with as simple and as time saving an approach as possible.

We have developed an analytical procedure which takes advantage of the fact that the enclosed headspace of a container holding a paint, varnish, or lacquer will be composed of a distribution of the volatile components present in the solvent vehicle. The procedure is quick, simple, and requires only readily available apparatus. It utilizes the solvent stripping technique (3) described below. Figure 1 is a photograph of the chromatographic analysis obtained for headspace of a mixture of chlorinated hydrocarbons.

Bulk samples are collected in 500-ml wide-mouth bottles (Arthur H. Thomas 1712-E33) closed with aluminum lined
