Computer methods in analytical mass spectrometry. Empirical

verted to integers, either by rounding off or by truncation. As might be expected, over a range of test compounds more retrievals are achieved when the unknown spectra are rounded off rather than truncated.

MOLECULAR WEIGHT REJECTION

If the molecular weight of an unknown sample is known, and the catalog tape is arranged with the spectra sorted in order of descending molecular weight, the spectra can be skipped until the group of spectra with the correct molecular weight is reached. These can be compared and the search stopped when it is passed. To be successful this method depends on a reliable determination of the unknown molecular weight. The need for this can be partly overcome by defining a range of masses which will most probably include the parent mass. Except when an impurity is present, the parent mass is unlikely to be less than the highest recorded mass minus 9 amu. It may however be greater, e.g., for substances such as alcohols, which tend to lose H2O readily. A system was tried in which the spectra were examined if their molecular weights were at the three highest recorded masses of the unknown and these plus 15, plus 18, and plus 44 to allow for a possible loss of CH3, H2O, or CO2. It might be expected that skipping the spectra in the library tape in descending order in blocks until the desired masses are encountered would give much greater speed. Figure 3 shows the number of spectra to be examined at each mass number. However, the skip function has to read

the tape to find the file marks, so that this method was found to give an increase in scanning speed of only the same order as that using noncommon mass rejection. If a disk or addressable tape were available, a much greater increase might be possible. These two methods of rejection can be combined, giving a slight increase in speed as is shown in Table III. A typical output from the computer is given in Table IV using as unknowns two spectra published by Pettersson and Ryhage (7). In both cases shown a correct identification is made.
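The molecular weight window described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy catalog layout of (molecular weight, name) pairs, and the example masses are hypothetical and do not come from the original program.

```python
def candidate_parent_masses(recorded_masses, losses=(15, 18, 44), n_highest=3):
    """Return the molecular weights worth examining, highest first.

    The candidates are the n_highest recorded masses plus each of them
    increased by a possible neutral loss (CH3 = 15, H2O = 18, CO2 = 44).
    """
    highest = sorted(set(recorded_masses), reverse=True)[:n_highest]
    candidates = set(highest)
    for mass in highest:
        for loss in losses:
            candidates.add(mass + loss)
    return sorted(candidates, reverse=True)


def spectra_to_examine(catalog, recorded_masses):
    """Keep only catalog entries (molecular_weight, name) at a candidate mass."""
    wanted = set(candidate_parent_masses(recorded_masses))
    return [entry for entry in catalog if entry[0] in wanted]


# Hypothetical unknown whose three highest recorded masses are 70, 57 and 56.
unknown_masses = [41, 43, 56, 57, 70]
candidates = candidate_parent_masses(unknown_masses)
print(candidates)
```

Holding the candidates in a set keeps the membership test cheap, which matters only if the catalog is addressable; on a sequential tape the cost is dominated by reading, as the text notes.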

CONCLUSION

If a catalog of complete mass spectra can be stored, the problem of retrieval and identification of a given spectrum is almost a trivial one. Even when using the ASTM catalog of six strongest mass spectral peaks, a very successful search routine can be devised and, by making use of simple filtering techniques, identifications can be achieved in times commensurate with real time of the GLC-MS.

ACKNOWLEDGMENT

The authors are indebted to Caulfield Technical College for the use of its card sorting machine in this work.

RECEIVED for review December 18, 1967. Accepted April 15, 1968.

Computer Methods in Analytical Mass Spectrometry

Empirical Identification of Molecular Class

L. R. Crawford
Division of Chemical Physics, C.S.I.R.O., Chemical Research Laboratories, Melbourne, Australia

J. D. Morrison
Division of Physical Chemistry, La Trobe University, Bundoora, Victoria, 3083, Australia

The use of the computer to identify from a mass spectrum the functional groups present in an unknown molecule is discussed. Four empirical methods of carrying out this operation are described. Three of these involve the derivation of "average" mass spectra representative of various functional groups, or of atom and bond structure content. The fourth makes use of mass spectral correlations similar to those derived by McLafferty. A weighted mean is then taken of the results. It is concluded that, at least in the case of low molecular weight compounds, a successful group identification can usually be made.

The first paper of this series (1) dealt with the problem of the recognition of an unknown spectrum which is known to be a member of a catalog. It was shown there that the mass spectra were highly specific and that, even where the samples were impure or noise was present, identification in this way was in many cases a rather trivial problem.

(1) L. R. Crawford and J. D. Morrison, Anal. Chem., 40, 1464 (1968).
(2) F. W. McLafferty, "Mass Spectral Correlations," Advan. Chem. Ser. 40, Analytical Chemistry 1963.

VOL. 40, NO. 10, AUGUST 1968

It was also evident that the category of search could be reduced in various ways and the identification process thereby speeded up considerably. The present and succeeding papers will deal with the more general problem where the unknown spectrum may not be listed in a catalog and where the identification has to proceed ab initio.

Perhaps the first thing that a mass spectrometrist does on examining a mass spectrum of an unknown is to look for certain key peaks which are indicators of the presence of functional groups in a molecule (2). These peaks usually lie in the mass region between m/e = 12 and m/e = 200. If a preliminary identification of the class of a molecule can be made, the subsequent elucidation of the complete structure becomes much easier. For some analytical work a complete identification is not even required; all that is wanted is a determination of the type of molecule present and, for example, the relative amount of

Figure 1. The centre of gravity of compounds on the hypersphere used as a measure of the spread of the compounds over the surface. Distance R increases as the compounds become more scattered.

branched and straight chain material in a hydrocarbon mixture. Type analysis of this kind has been carried out for several years in oil research laboratories (3, 4). Before looking at details of specific peaks and their relation to the presence of functional groups and classes of molecules, it is of interest to examine in quite an empirical way the similarity of the mass spectra for various classes of molecules.

EMPIRICAL CORRELATIONS BETWEEN MASS SPECTRA

Any mass spectrum, consisting of a set of mass numbers and peak heights, may be considered as defining a point in an n-dimensional hyperspace, where each mass number and peak height gives a corresponding abscissa and ordinate. The set of all recorded mass spectra for pure compounds can be mapped as a set of such points on to a space of, say, 500 dimensions, corresponding to all peaks in the mass range 1-500. All of these points will occupy the positive quadrant of this space and, if the mass spectra have been normalized so that

$$\sum_{n=1}^{500} P_n^2 = 1 \tag{1}$$

will lie on the surface of a hypersphere of unit radius. A very similar approach to this has been suggested previously by V. V. Raznikov and V. L. Talroze (5). Each mass spectrum can be regarded as a star in this hypersky, and the closer any two stars are, that is, the smaller the scalar distance between them given by

$$d = \left[ \sum_{n=1}^{500} \left( P_n - P'_n \right)^2 \right]^{1/2} \tag{2}$$

where $P_n$ and $P'_n$ are the normalized peak heights of the two spectra, the greater the similarity of their spectra.

(3) L. R. Snyder, H. E. Howard, and W. C. Ferguson, Anal. Chem., 35, 1676 (1963).
(4) "Proposed Method of Test for Hydrocarbon Types in Low Olefinic Gasoline by Mass Spectrometry," ASTM Standards on Petroleum Products and Lubricants, Vol. I, 38th Ed., American Society for Testing and Materials, Philadelphia, Pa., 1961, pp 1120-1141.
(5) V. V. Raznikov and V. L. Talroze, Dokl. Akad. Nauk SSSR, 170, 379 (1966).

The various chemical functional groups, often associated with a heteroatom, usually dominate the molecular mass spectral fragmentation pattern. Considering the mass spectra of molecules containing the group -CHO attached to an aliphatic radical R, some of the key fragment ions remain the same in all members, regardless of R; other common features of the spectrum are shifted to different positions on the mass scale, depending on the size of R. Because the spectra of the


ANALYTICAL CHEMISTRY

Table I. Distances of Centre of Gravity of Each Group Inside Hypersphere
Radius of hypersphere = 100

Aromatics      14.6
Ethers         33.0
Acids          37.5
Ketones        21.4
Aldehydes      25.8
Esters         26.8
Alkenes        20.9
Alkanes        29.5
Alcohols       30.4
Cycloalkanes   24.6
Dienes         18.5
Amines         30.6

molecules in each chemical class are related in this way, the spectra of all paraffins, ketones, ethers, amines, etc., will form constellations in this hypersky. It is of interest to examine whether these constellations form well separated groups or whether they overlap to any great extent. Examples of the compounds used to compile these data for two classes are listed below:

Ketones: Acetone; 2-Butanone; 3-Methyl-2-butanone; 2-Pentanone; 3-Pentanone; 2-Hexanone; 3-Hexanone; 2-Methyl-3-pentanone; 3-Methyl-2-pentanone; 3-Heptanone; 2-Heptanone; 2,4-Dimethyl-3-pentanone; 5-Methyl-2-hexanone; 4-Heptanone; 2-Methyl-3-hexanone; 4-Heptanone; 2-Methyl-3-heptanone; 2,6-Dimethyl-3-heptanone; 2,6-Dimethyl-4-heptanone; 3-Octanone; 2-Methyl-3-octanone; 4-Octanone; 2,6-Dimethyl-4-heptanone; 7-Methyl-4-octanone; 3-Nonanone; 4-Nonanone; 2-Methyl-3-nonanone; 4-Decanone; 5-Decanone; 2-Methyl-5-decanone; 5-Undecanone

Amines: Methylamine; Dimethylamine; Ethylamine; Trimethylamine; n-Propylamine; tert-Butylamine; Diethylamine; n-Butylamine; sec-Butylamine; Isobutylamine; Triethylamine; Dimethyl-n-propylamine; 3,5,5-Trimethylhexylamine; 2-Ethylhexylamine; Cyclohexylamine; N-Ethylaniline
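The mapping of spectra onto a hypersphere and the distance between two "stars" can be sketched as follows. This is a minimal illustration, not the authors' program: the function names and the three five-channel toy spectra are assumptions, and the radius of 100 is chosen only because the tables quote distances on that scale.

```python
import math


def normalize_to_hypersphere(peaks, radius=1.0):
    """Scale peak heights so that the sum of their squares equals radius**2."""
    norm = math.sqrt(sum(p * p for p in peaks))
    if norm == 0:
        raise ValueError("spectrum has no peaks")
    return [radius * p / norm for p in peaks]


def distance(a, b):
    """Euclidean (scalar) distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy spectra with five mass channels each.
spec_a = normalize_to_hypersphere([0, 3, 4, 0, 0], radius=100)
spec_b = normalize_to_hypersphere([0, 4, 3, 0, 0], radius=100)
spec_c = normalize_to_hypersphere([5, 0, 0, 0, 0], radius=100)

print(distance(spec_a, spec_b))  # similar spectra: small distance
print(distance(spec_a, spec_c))  # no peaks in common: radius * sqrt(2)
```

Because all peak heights are non-negative, two spectra with no peaks in common are separated by the radius times the square root of 2, about 141 on the scale of the tables, which is the largest distance that can occur.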

There are a number of tests which can be applied to verify the spread and overlap of these constellations. First, the spread of these constellations can be measured in the following way. Because the points all lie on a sphere, the centre of gravity of any group, e.g., the ketones, will lie inside the sphere. For a group of very closely similar spectra, this point will be almost on the sphere (Figure 1); the greater the spread within a group, the greater the distance of their centre of gravity inside the sphere (R in Figure 1). These values are shown for a number of representative families in Table I. Second, the extent to which the constellations overlap can be determined by comparing first the distances between the

Table II. Intergroup-Centre Distances
Distances between projections on the hypersphere of centres of gravity of each class of molecule. Radius of hypersphere = 100
Columns, left to right: Aromatic, Ether, Acid, Ketone, Aldehyde, Ester, Alkene, Alkane, Alcohol, Cycloalkane, Diene, Amine

Aromatic         0  129  134  125  128  130  120  128  129  121  115  131
Ether          129    0   84   91   85   77  105   88   50  117  129  115
Acid           134   84    0  108   96   97  114  104   96  120  132  110
Ketone         125   91  108    0   56   35   85   42   81  106  124  118
Aldehyde       128   85   96   56    0   59   79   49   74   96  124  103
Ester          130   77   97   35   59    0   91   46   74  110  127  120
Alkene         120  105  114   85   79   91    0   75   86   38  102  116
Alkane         128   88  104   42   49   46   75    0   78   94  124  114
Alcohol        129   50   96   81   74   74   86   78    0  100  129  115
Cycloalkane    121  117  120  106   96  110   38   94  100    0   93  118
Diene          115  129  132  124  124  127  102  124  129   93    0  129
Amine          131  115  110  118  103  120  116  114  115  118  129    0

Table III. Distances between Points Corresponding to Individual Compounds, and Projected Centres for Each Class of Molecule
Columns, left to right: Aromatic, Ether, Acid, Ketone, Aldehyde, Ester, Alkene, Alkane, Alcohol, Cycloalkane, Diene, Amine

n-Propyl benzene                  55  134  134  130  132  130  120  129  134  127  117  135
Ethyl sec-butyl ether            139   55  112  101  101  104  118   91   74  116  127  120
n-Butanoic acid                  140   94   77  113  109  111  125  110   98  121  132  118
2,6-Dimethyl-3-heptanone         136  121  119   35   75   60  110   35   96  104  121  114
Decanal                          138  115  121   77   50   89  102   54   99   90  117  103
n-Propyl butanoate               139  112  115   47   72   64  117   39   89  107  123  117
1-Butene                         136  122  119   93   85   99   64   78  106   53   95  118
n-Heptane                        138  119  119   62   65   75  102   28  101   93  120  114
2,3-Dimethyl-3-pentyl alcohol    139   81  115   93   82   97  107   80   80   95  122  119
Cycloheptane                     136  127  122  113   94  113   72   98  118   54   86  123
1,4-Pentadiene                   135  133  129  126  125  127  103  125  134  113   54  132
Di-N-propylamine                 139  126  130  124  119  125  123  121  126  122  129   55

Table IV. Distances between Points Corresponding to Individual Compounds, and Projected Centres for Groups of Molecules Classified According to Number of Rings, Double Bonds, and Heteroatoms

Group labels, in order: No. of rings, No. of double bonds, No. of O atoms, No. of N atoms

Compounds: n-Propyl benzene; Ethyl sec-butyl ether; n-Butanoic acid; 2,6-Dimethyl-3-heptanone; Decanal; n-Propyl butanoate; 1-Butene; n-Heptane; 2,3-Dimethyl-3-pentyl alcohol; Cycloheptane; 1,4-Pentadiene; Di-N-propylamine

0 0 0 0

1 0 0 0

0 1 0 0

0 2 0 0

1 3 0 0

1 4 0 0

0 0 1 0

1 0 1 0

0 1 1 0

1 1 1 0

0 2 1 0

1 4 1 0

0 0 2 0

1 0 2 0

0 1 2 0

0 2 2 0

1 0 3 0

0 0 0 1

1 0 0 1

1 3 0 1

138 136 136 135 55 132 139 139 139 138 139 108 139 140 139 139 141 140 140 126 119 127 122 133 134 137 72 120 119 125 122 132 66 77 102 132 84 127 134 130 119 122 119 129 134 137 114 126 120 121 124 128 114 108 99 133 125 131 134 129 62 113 65 94 75 113 102 72 28 98

93 85 99 64 78

126 125 127 103 125

130 132 130 120 129

137 95 101 47 103 123 81 139 90 53 65 83 118 108 136 100 101 69 109 122 78 132 109 111 108 100 116 127 137 85 75 35 92 113 89

101 118 106 134 134 138 77 109 96 105 120 93 54 53 113 127 134 100 99 103 57 100 120 86 95 54 117 121 121 119 121 113 122 114 123 118 132 135 138 121 120 109 123 126

projections of the centres of gravity for each group on the hypersphere, Table II, and second the distances between a given member of each group and these projected group values, Table III. An alternative method of grouping the reference spectra in constellations is on the basis of the number of rings, the number of double bonds, the number of O atoms, and the number of N atoms. The projection of the centres of gravity of these constellations on the hypersphere can be calculated, and Table IV

101 102 104 117 92

102 60 124 113 83 69 107 71 126 132 119 116 102 56 113

117 122 114 137 116

127 122 128 126 125

117 111 124 132 106

127 130 127 115 125

117 84 59 85 121 110 128 128 132 125 115 130 112 89 134 127 98 122 128 128 136 127 109 139 133 134 10 127 119 120 114 131 129 54 122 125

shows the distances between the mass spectral points for a number of substances and these centres of gravity. In the present example, the identifications are not as accurate as those in Table III; however, the information obtained is frequently complementary. If the constellations were well defined, it should be possible to identify the class of a molecule by deciding to which constellation that molecule's mass spectral point belonged. It is apparent from Tables III and IV that the spread in each group varies: in some it is slight, making identification clear-cut; in others the spread is much more pronounced, and identification becomes less certain.

This mass hypersky has unusual relationships. The star corresponding to the mass spectrum of a mixture is frequently closer to the stars corresponding to the spectra of the components than it is to any others. Table V shows these relationships for several mixtures.
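The constellation measures used above (centre of gravity, its depth R inside the sphere, its projection back onto the surface, and assignment of an unknown to the nearest projected centre) can be sketched as below. The function names and the three-channel toy spectra are hypothetical, and taking the depth as the radius minus the length of the centre-of-gravity vector is an assumption about how R in Figure 1 is measured.

```python
import math

RADIUS = 100.0  # the tables quote distances for a hypersphere of radius 100


def norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centre_of_gravity(spectra):
    """Channel-by-channel mean of a group of normalized spectra."""
    count = len(spectra)
    return [sum(channel) / count for channel in zip(*spectra)]


def depth_inside_sphere(spectra):
    """Assumed reading of R: how far the centre of gravity lies below the surface."""
    return RADIUS - norm(centre_of_gravity(spectra))


def projected_centre(spectra):
    """Centre of gravity pushed radially back onto the hypersphere."""
    centre = centre_of_gravity(spectra)
    scale = RADIUS / norm(centre)
    return [scale * x for x in centre]


def nearest_class(spectrum, class_centres):
    """Name of the class whose projected centre is closest to the spectrum."""
    return min(class_centres, key=lambda name: distance(spectrum, class_centres[name]))


# Hypothetical reference groups; every spectrum already has length RADIUS.
reference = {
    "class_a": [[100.0, 0.0, 0.0], [80.0, 60.0, 0.0]],
    "class_b": [[0.0, 0.0, 100.0], [0.0, 60.0, 80.0]],
}
centres = {name: projected_centre(spectra) for name, spectra in reference.items()}
unknown = [96.0, 28.0, 0.0]
print(nearest_class(unknown, centres), depth_inside_sphere(reference["class_a"]))
```

A tight group gives a depth near zero, while a scattered one gives a larger depth, which is the behaviour Table I summarizes for the real classes.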

MASS PERIODICITY SPECTRA

It is very often the case, e.g., in compounds containing an aliphatic chain, that the prominent peaks in a mass spectrum occur at a periodic spacing on the mass scale. These periodicities can be examined by calculating a mass periodicity spectrum by the formula

$$Q(m) = \sum_{n=1}^{500} P(n) \times P(n + m) \tag{3}$$

a process similar to autocorrelation. Examples of such spectra are shown in Figure 2. Different classes of molecule give different periodic spectra; the spectra within a given class often have greater similarity to each other than they have to any other. Table VI shows the results of comparing the mass periodicity spectra for a number of substances with average periodic spectra for their class. The similarity S was calculated as

$$S = \frac{1}{\sum_{n=1} \left| Q(n)_{\text{unknown}} - Q(n)_{\text{group average}} \right|} \tag{4}$$

Figure 2. Mass periodicity spectra averaged for the members of a few classes of compound (panels: acid, alkane, aromatic)
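Equations 3 and 4 can be tried on toy data with the sketch below. It assumes that peak heights outside the recorded range count as zero, that Equation 4 is the reciprocal of the summed absolute differences with no further scaling, and that identical spectra are given infinite similarity; the helper names and peak positions are hypothetical.

```python
import math


def periodicity_spectrum(peaks, max_shift):
    """Q(m) for m = 1..max_shift, where peaks[i] is the height at mass i + 1."""
    size = len(peaks)
    return [
        sum(peaks[n] * peaks[n + m] for n in range(size - m))
        for m in range(1, max_shift + 1)
    ]


def similarity(q_unknown, q_average):
    """Reciprocal of the summed absolute differences between two Q spectra."""
    total = sum(abs(u - a) for u, a in zip(q_unknown, q_average))
    return math.inf if total == 0 else 1.0 / total


def spectrum_with_peaks(masses, size=80):
    """Hypothetical helper: unit-height peaks at the given masses."""
    peaks = [0.0] * size
    for mass in masses:
        peaks[mass - 1] = 1.0
    return peaks


# Peaks spaced 14 mass units apart, as along an aliphatic chain.
chain_like = periodicity_spectrum(spectrum_with_peaks([15, 29, 43, 57]), 30)
chain_like_2 = periodicity_spectrum(spectrum_with_peaks([29, 43, 57]), 30)
# Peaks spaced 2 mass units apart.
other = periodicity_spectrum(spectrum_with_peaks([50, 52, 54, 56]), 30)

print(chain_like.index(max(chain_like)) + 1)  # shift with the largest Q
print(similarity(chain_like, chain_like_2), similarity(chain_like, other))
```

The two chain-like spectra share the same dominant shift even though their peaks sit at different masses, which is why a periodicity spectrum can group homologues that a direct peak-by-peak comparison would separate.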

Table V. Distances between Points Corresponding to Mixture Spectra and Projected Group Centres

Spectrum 1 Spectrum 2 Spectrum 3 Spectrum 4 Spectrum 5 Spectrum 6 Spectrum 1. Spectrum 2. Spectrum 3. Spectrum 4. Spectrum 5. Spectrum 6.

Aromatic

Ether

Acid

62 120 136 137 137 131

127 105 84 82 111 87

134 115 95 62 100 103

Ketone Aldehyde 127 56 103 97 84 57

Ester

Alkene

Alkane

131 55 96 87 77 61

122 112 119 114 122 94

129 82 104 100 100 57

128 93 102 99 101 61

CycloAlcohol alkane 129 104 95 92 112 86

122 128 127 123 133 112

Diene

Amine

117 134 136 136 138 130

126 128 120 119 130 93

58% N-Ethyl aniline, 35% 1,2,3-trimethylbenzene, 7% other benzenes.
70% Isopropyl benzyl ketone, 25% methyl acetate, 5% ethyl, propyl acetates.
70% Dimethoxyethane, 30% acetic, 5% propionic acid, butyric acids.
10% Dimethoxyethane, 90% acetic, 5% propionic acid, butyric acids.
70% Methyl ester of C19 acid, 30% acetone, C4 ketone.
70% 6-Dodecanone, 30% C1, C3 aldehydes.

Table VI. Similarities (S) between Mass Periodicity Spectra for Single Compounds and Average Periodicity Spectra for Each Class of Molecule
Columns, left to right: Aromatic, Ester, Ether, Acid, Ketone, Aldehyde, Alkene, Alkane, Alcohol, Cycloalkane, Diene, Amine

Ethyl butyl benzene           62   42   42   37   43   46   38   43   44   32   39   38
Ethyl butanoate               41   73   63   57   50   49   55   53   58   50   52   58
2-Methyl-1,3-dioxolone        35   46   73   47   69   56   60   65   61   58   53   62
Styrene                       54   30   36   29   33   42   42   39   40   38   46   38
n-Butanoic acid               34   48   59   62   43   39   51   46   48   51   51   58
2,4-Dimethyl-3-pentanone      37   42   66   40   83   59   59   67   61   55   52   57
2-Methyl undecanal            59   42   39   34   47   64   41   47   55   32   44   39
1-Heptene                     45   41   48   44   43   55   58   53   57   47   64   55
n-Decane                      46   44   53   37   64   67   55   67   59   43   49   50
3-Methyl-3-hexylalcohol       39   54   57   43   58   57   49   56   70   41   48   50
1,4-Pentadiene                45   35   43   36   39   46   58   46   49   55   67   53
iso-Butylamine                40   38   60   45   49   46   65   53   52   70   67   73

Table VII. Distances between Points Corresponding to Reduced Spectra of Single Compounds, and Averages of the Reduced Spectra for Each Class

Aromatic Ester Ether Acid Ketone Aldehyde Alkene Alkane Alcohol Cycloalkane

Dimethoxymethane 49 [681 44 42 30 27 38 51 21 16 28 22 18 25 25 m-Xylene 24 13 25 29 19 27 14 29 19 iso-Butyl benzene t-Butyl benzene 29 17 29 20 26 31 20 28 Methyl-butanoate 68 67 61 44 66 71 36 60 48 36 55 70 29 2-Methyl-1,3-dioxolone 24 Pentanoic acid 9 53 46 38 43 46 41 59 35 24 71 55 3-Methyl-2-butanone 78 49 78 64 41 2,4-Dimethyl-3-pentanone 24 71 55 50 83 65 42 2-Methyl-5-decanone 23 66 56 49 43 76 59 36 2-Ethyl hexanal 22 65 48 47 82 .5 1 76 57 43 Tetra-decanal 12 49 34 38 58 60 50 57 1-Octene 21 51. 40 44 56 61 65 59 2,3-Dimethyl-2-butanol 20 63 70 61 58 48 43 53 36 2,3-Dimethyl-2-pentanol 17 67 72 66 65 54 40 59 33 Cyclohexane 30 28 23 18 31 31 43 34 23 62 1,5-Hexadiene 28 32 27 22 36 38 39 30 74 Cyclohexene 24 48 37 40 52 57 61 49 65 2-Ethyl-hexylamine 23 55 52 46 67 63 46 63 58 38 3,5,5-Trimethyl-hexylamine 62 55 51 66 66 43 62 59 35 -. __ 32 Propyl benzyl ketone 46 35 31 48 46 35 48 39 __ __ 24 26 16 32 29 25 32 21 28 N-ethyl aniline

&

&

6

Diene

Amine

20 31 29 30 34 28 23 39 39 34 40 45 43 32 30 79

41 20 24 25 51 57 34 60 61 65 62 42 47 52 52 24 29

48 34 52

fi __

31 40 __

29

30
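The reduced spectra compared in Table VII are obtained by folding the mass scale at intervals of 14 (Equation 5) after normalizing the peak heights to unit sum. A minimal sketch follows, assuming that peaks[i] holds the height at mass i + 1 and using a hypothetical toy spectrum; the function name is not from the original.

```python
def reduced_spectrum(peaks, period=14):
    """Fold a spectrum onto `period` channels: T(m) = sum of P(m + period * n).

    peaks[i] is the peak height at mass i + 1. Heights are first divided by
    their total so that the reduced spectrum sums to 1.
    """
    total = sum(peaks)
    if total == 0:
        raise ValueError("spectrum has no peaks")
    reduced = [0.0] * period
    for index, height in enumerate(peaks):
        mass = index + 1
        # Masses 1, 15, 29, ... fall in channel m = 1 (list index 0), and so on.
        reduced[(mass - 1) % period] += height / total
    return reduced


# Hypothetical spectrum: peaks at masses 15, 29 and 43 (all m = 1) and 44 (m = 2).
peaks = [0.0] * 60
peaks[15 - 1] = 2.0
peaks[29 - 1] = 1.0
peaks[43 - 1] = 1.0
peaks[44 - 1] = 4.0
reduced = reduced_spectrum(peaks)
print(reduced[:3])
```

Each reduced spectrum is a point in only 14 dimensions, so class averages and distances to them can be formed exactly as for the full spectra but at far lower storage cost.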

Table VIII. Similarity Factors Based on Correlations of Significant Peaks Enabling Group Identification

1,3-Diethyl benzene; Methyl 3,7,11,15-tetramethyl hexadecanoate; Ethyl sec-butyl ether; C19 acid; 6-Dodecanone; Tetradecanal; 1-Nonene; Tridecane; 3-Nonanol; Cycloheptane; 1,5-Hexadiene; Di-Isopropylamine; m-Methyl cyclohexanol; Phenyl methyl ketone; Cyclopentanone; Di-ethoxypropane; 2-Methyl-1,3-dioxolone; Styrene

Aromatic Ester 100.0 0 6.2 2.4 0 14.3 10.0 6.2 6.6 4.1 3.2 10.4 6.6 15.6 72.0 17.5 2.4 11.0 100.0

19.2 12.5 40.0 14.3 12.7 12.7 19.9 9.3 14.0 10.4 4.0 18.9 2.8 5.7 14.1 9.3 0

Ether 0

Acid 0

Ketone 0

Aldehyde

13.0 24.2 0 7.7 15.4 13.0 6.6 23.0 7.9 0 4.0 3.1 8.7 7.5 32.5 19.6 0

13.0 11.7 40.0 0 0 0

4.2 7.5 6.7 16.2 8.4 10.6 16.7 10.9 5.1 12.6 20.1 3.1 5.5 7.5 7.5 5.8 0

0 5.0 0 5.2 5.2 0 0 5.1 0 0 4.0 0 0 5.1 5.1 5.2 0

0 0

2.8 0 4.0 0 0 0 0

12.0 0

0

CycloAlkene Alkane Alcohol alkane Diene 0 0 0 0 0 12.7 5.0 0 5.0 10.2 19.2

12.7 7.5 0 14.3 15.2 19.2 23.3 14.4 22.9 10.4 13.3 15.6 2.8 17.5 7.5 11.0 0

0

11.9 22.9 16.7 6.6 4.1 2.8 13.4 5.1 1.7 0

2.1 19.2 6.7 4.2 6.7 2.1 20.1 9.3 0 0 0 3.1 2.8 2.4 20.8 16.1 0

10.6 0 0 3.3 8.6 10.6 6.6 5.1

17.8 16.7 6.6 29.6 0 8.3 0 0 0

0 0 0 0 0 0 0 0 0 18.9 0 0 0 5.1 0 0 0

Amine 0 6.2 5.0 6.7 15.4 7.5 6.2 0 6.8 3.2 4.0 30.7 7.1 2.8 10.0 5.1 8.5 0

Table IX. Results of Class Determination by Three Methods and the Weighted Average, for Ethyl sec-Butyl Ether
Columns, left to right: Aromatic, Ester, Ether, Acid, Ketone, Aldehyde, Alkene, Alkane, Alcohol, Cycloalkane, Diene, Amine

Sig. peaks            2.4  12.5  24.2  11.7   7.5   5.0   5.0   7.5  19.2     0     0   5.0
Condensed spectra     3.1  10.7  31.0  10.2   4.7   4.9   4.1   4.9  12.9   3.8   3.4   6.4
Hyperspheres          4.1   6.3  29.7   9.0   5.0   6.0   5.3   5.8  14.0   5.0   4.6   5.3
Weighted averages     3.3   9.6  28.6  10.2   5.6   5.3   4.8   5.9  15.1   3.2   2.9   5.6
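Combining the three per-method rows of Table IX into one list of class values can be sketched as a weighted mean. The weights 26, 34, and 32 are those quoted in the text for methods 1, 3, and 4; reading these as the hypersphere, condensed-spectra, and significant-peak methods respectively is an assumption, as are the names used here. The sketch is not expected to reproduce the printed weighted row to the last digit.

```python
CLASSES = ["Aromatic", "Ester", "Ether", "Acid", "Ketone", "Aldehyde",
           "Alkene", "Alkane", "Alcohol", "Cycloalkane", "Diene", "Amine"]

# Per-method values for ethyl sec-butyl ether, taken from Table IX.
SCORES = {
    "hyperspheres": [4.1, 6.3, 29.7, 9.0, 5.0, 6.0, 5.3, 5.8, 14.0, 5.0, 4.6, 5.3],
    "condensed spectra": [3.1, 10.7, 31.0, 10.2, 4.7, 4.9, 4.1, 4.9, 12.9, 3.8, 3.4, 6.4],
    "significant peaks": [2.4, 12.5, 24.2, 11.7, 7.5, 5.0, 5.0, 7.5, 19.2, 0.0, 0.0, 5.0],
}

# Assumed assignment of the quoted weights to the three methods.
WEIGHTS = {"hyperspheres": 26, "condensed spectra": 34, "significant peaks": 32}


def weighted_average(scores, weights):
    """Weighted mean of the per-method values for every class."""
    total_weight = sum(weights.values())
    combined = {}
    for position, name in enumerate(CLASSES):
        weighted_sum = sum(weights[m] * scores[m][position] for m in scores)
        combined[name] = weighted_sum / total_weight
    return combined


combined = weighted_average(SCORES, WEIGHTS)
best = max(combined, key=combined.get)
print(best, round(combined[best], 1))
```

Dividing by the sum of the weights keeps the combined values on the same 0 to 100 scale as the individual rows, so they can still be read as rough percentages.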

Among all the mass spectra examined, by far the most probable periodicities were 2 and 14. The first of these reflects the greater stability of even-electron over odd-electron ions; the second is caused by the easier cleavage of C-C bonds in aliphatic chains.

REDUCED MASS SPECTRA

The method of normalization used in the tests described placed all the mass spectral points on a hypersphere. The attempt to identify an unknown by locating it within a constellation in the hypersky is equivalent to viewing this sky from its centre of curvature. There may be other viewpoints from which it would be easier to distinguish to which group a given point belonged. This might be achieved by examining a suitable (500-k)-dimensional projection. The survey of periodicity suggests that one such would be that where the spectra were summed at intervals of m/e = 14 to give new reduced spectra consisting of only 14 peaks, i.e., to calculate a set of 14 peaks

$$T(m) = \sum_{n \ge 0,\; m + 14n \le 500} P(m + 14n), \qquad m = 1, \ldots, 14 \tag{5}$$

For this purpose it is preferable to use the simple normalization

$$\sum_{n=1}^{500} P(n) = 1$$

when the mass spectral points become spread throughout a volume of the hyperspace. The averages of the reduced spectra for up to 36 members of each class of molecules are shown in Figure 3. The distances between the points corresponding to the reduced spectra of several substances and the average points for each class are shown in Table VII. In some cases in Table VII where there are two specific groups or structural features present, similarities to each of these show up.

Figure 3. Reduced mass spectra, condensed at intervals of m/e = 14, and averaged for the members of the classes of compounds (surviving panel label: ALKENES; horizontal axis: condensed mass m)

SIGNIFICANT FRAGMENT PEAKS

The empirical correlations between significant peaks and structural class tabulated by McLafferty (2) can be made the basis of a simple method of identification. McLafferty (2) compiled a table of probabilities for each significant mass peak, in terms of its frequency of occurrence in distinct types of molecules. However, because the 12 classes chosen here are not likely to be represented equally in his library of compounds, the frequency method would tend to be biased for some groups. Over a small number of test substances it was found that putting the probability contributed by any one mass peak to each group equal to the reciprocal of the number of groups it represents gave better results. Only the four strongest peaks in an unknown were taken into account, and the probabilities for each class of molecule were summed. The results of doing this for several substances are shown in Table VIII.

When the four methods described are compared, it appears that method 3, where the mass spectra are summed in 14s, gives somewhat more success in identification than the others. Table IX compares the results of methods 1, 3, and 4 for a given substance. The total time required to carry out methods 1, 3, and 4 is only 0.9 second (CDC 3200), and a weighted average of the results of these three methods gives a correct group identification in almost every case (fourth row in Table IX). The weights are 26, 34, and 32 for methods 1, 3, and 4, respectively. Data storage required for methods 1, 3, and 4 is 168, 320, and 912 integer words of up to 6 decimal digits, respectively.
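The reciprocal scoring rule for significant peaks can be sketched as follows. The correlation table below is a small hypothetical stand-in, not McLafferty's tabulation, and rescaling the summed values so that they total 100 is an assumption made to match the scale of Tables VIII and IX.

```python
def class_scores(spectrum, correlations, n_peaks=4):
    """Score classes from the strongest peaks of a spectrum.

    spectrum: dict mapping mass number to peak height.
    correlations: dict mapping mass number to the classes that mass indicates.
    Each of the n_peaks strongest peaks contributes 1 / (number of classes it
    indicates) to every one of those classes; the totals are rescaled to 100.
    """
    strongest = sorted(spectrum, key=spectrum.get, reverse=True)[:n_peaks]
    scores = {}
    for mass in strongest:
        classes = correlations.get(mass, ())
        for name in classes:
            scores[name] = scores.get(name, 0.0) + 1.0 / len(classes)
    total = sum(scores.values())
    if total == 0:
        return {}
    return {name: 100.0 * value / total for name, value in scores.items()}


# Hypothetical correlation table and unknown spectrum, for illustration only.
correlations = {
    31: ("Alcohol", "Ether"),
    45: ("Ether", "Alcohol", "Acid"),
    59: ("Ether",),
    43: ("Ketone", "Alkane"),
}
unknown = {31: 100, 45: 80, 59: 60, 43: 40, 29: 10}
scores = class_scores(unknown, correlations)
print(max(scores, key=scores.get))
```

Giving every class named by a peak an equal share avoids depending on how often each class happens to occur in the reference library, which is the bias the text sets out to remove.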

CONCLUSION

By comparing a mass spectrum with tabulated values of condensed mass spectra, specific peak probabilities, and hypersphere cluster centres, a list of values which indicates the class of molecule is obtained. The data required occupy 1400 words of computer storage, small enough to be held in core. The total identification time is less than one second for most compounds. In the library of 182 compounds from which most of the data were derived, the highest, second highest, and third highest probabilities included 152, 172, and 175 correct functional group identifications, from which it can be concluded that, at least in the case of low molecular weight (up to C20) compounds, a successful identification of at least one functional group can usually be made.
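The counts quoted above translate into success rates as in the short calculation below, which reads 152, 172, and 175 as cumulative totals (correct class found within the top one, two, or three listed values). That reading is an assumption, and the variable names are illustrative.

```python
LIBRARY_SIZE = 182

# Assumed cumulative counts: correct class within the top 1, 2 and 3 values.
cumulative_correct = {1: 152, 2: 172, 3: 175}

rates = {
    rank: 100.0 * count / LIBRARY_SIZE
    for rank, count in cumulative_correct.items()
}

for rank, rate in rates.items():
    print(f"correct class within top {rank}: {rate:.1f}%")
```

On this reading about five compounds in six are classified correctly by the highest value alone, and all but seven of the 182 are covered once the three highest values are considered.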

RECEIVED for review February 2, 1968. Accepted April 15, 1968.