
Mar 1, 1975 - Computer based search and retrieval system for rapid mass spectral screening of samples. T. O. Gronneberg ... Citation data is made avai...
(15) J. M. Hayes and D. A. Schoeller, unpublished work, manuscript in preparation.
(16) R. T. Sanderson, Ind. Eng. Chem., Anal. Ed., 15, 76 (1943).
(17) W. G. Mook and P. M. Grootes, Int. J. Mass Spectrom. Ion Phys., 12, 273 (1973).
(18) T. B. Coplen, Int. J. Mass Spectrom. Ion Phys., 11, 37 (1973).

RECEIVED for review July 25, 1974. Accepted November 7, 1974. The mass spectrometer-computer system was purchased using funds provided to W. G. Meinschein and J. M. Hayes by the National Science Foundation (GB-13206, GU-2003) and the National Aeronautics and Space Administration (NAS 9-9974, 15-003-118). The present developments were principally supported by a grant (to J.M.H.) from the National Institute of General Medical Sciences of the National Institutes of Health (GM-18979).

Computer Based Search and Retrieval System for Rapid Mass Spectral Screening of Samples

T. O. Gronneberg,¹ N. A. B. Gray,² and G. Eglinton

Organic Geochemistry Unit, School of Chemistry, Bristol University, Bristol BS8 1TS, England

An on-line laboratory computer system is described that attempts to determine the chemical class of each component in a GC-MS run as data are acquired. This preliminary orientation is followed by a non-real-time file search against automatically chosen specialized reference files. Either ion-series classification or a “profile” is used for the mapping of an unknown onto the appropriate reference file for spectrum identification. Each file contains only the spectra of compounds of similar structure. The search files do not contain the common ions but rather those ions which distinguish each reference compound from other similar structures. Few comparisons have to be made, and the identifications are normally more precise because distinctive ions are used. Reference files are built up by selecting standards from constituents of the complex mixtures under study; this procedure ensures that the files are appropriate for the analytical problem.

Several methods for processing mass spectra and searching mass spectral data files have been developed (1-8). The utility of an interactive, data-conversational search facility has been demonstrated (9, 10). Much of the research (3, 5, 6, 9) has concentrated on effectively using the minimum amount of data for both reference and unknown spectra; thus, intensities may be assigned to 8, 4, or 2 levels, and only n ions from every interval of m amu may be stored. Some approaches have attempted to optimize a search by using operations unique to a particular computer design (7). Use of retention indices as an extra constraint on file search has been proposed (11). Comparison of selected spectral characters encoded as one bit has proved useful for library searching (12).

A method for preliminary classification using averaged modulo-14 spectra is useful where no absolute identification is required. An unknown is classified by computing its modulo-14 or “ion-series” spectrum and selecting the closest equivalent standard ion-series from a chosen “correlation set.” This method, developed by Smith (13), is similar to that of Pettersson and Ryhage (14) and to the classification scheme of Crawford and Morrison (15).

¹ Present address, Agricultural College of Norway, Isotope Laboratory, 1432 AS-NLH, Norway.
² Present address, King’s College Research Centre, King’s College, Cambridge CB2 1ST, England.

The use of small on-line laboratory computers dedicated to the control and processing of the data output of GC-MS systems is becoming increasingly common. The load on the computer processor due to the GC-MS is moderate, and most laboratory systems have sufficient excess processing capacity to do some of the spectrum classification simultaneously with continued data acquisition. File search methods can also be implemented on laboratory computers; however, the disk/tape storage available is normally limited, so search libraries have to be selective. A small computer system using selective files for library search is commercially available (16), and user-generated libraries specific to the user’s problem have been proposed (17). These files are requested by the user when they are thought to be appropriate.

When file searches are performed on laboratory computers, the complete original spectra of unknowns are available for checking against the data stored on reference compounds. This facility permits more selective tests for matching and fully automatic file searches with key ion restrictions. The system described herein makes use of this technique, which is not normally employed in file searches on big computers. The essential operational features are that the acquired spectra are first assigned a classification, which then automatically selects appropriate search files of reduced spectra. This procedure obviates intervention from the analyst and saves search time; it also avoids irrelevant matches. Thus, this system is more effective than others previously described (8, 9, 10, 16) for the routine GC-MS analysis of complex mixtures, provided that the numbers of chemical classes and constituent homologous series are relatively small.

DATA ACQUISITION FOR REFERENCE FILES

Most laboratories analyzing complex mixtures are able to define certain constraints on the type of compounds that they would expect to find routinely. These constraints are generated partly by the type of analysis performed (e.g., geochemical, environmental, pharmacological) and partly by the separation technique employed prior to the GC-MS run. Thus, the analyst in most cases has a reasonable idea of what kinds of compounds are likely to predominate in a fraction prepared for a GC-MS run. Reference compounds which are not of these routine classes can therefore be ignored for the first phase of an analysis, though such compounds might be appropriate for some later stage of identifying unusual artifacts or contaminants.

Creation of specialized files of reference spectra has been reported (18, 19). Another approach is to build up these files of appropriate reference spectra from the GC-MS analysis of the constituents of the complex mixtures themselves. Once a compound in a mixture has been identified, its mass spectrum can be added to the appropriate reference file and can also be used to update the ion-series spectrum of that class of compound in the correlation set (13) used for classification. The computer should then be able to classify homologs of the compound if they occur in later analyses as well as recognize the original compound when it next occurs. This approach is especially convenient for studies in environmental organic chemistry and organic geochemistry, where homologous series are often encountered.

ANALYTICAL CHEMISTRY, VOL. 47, NO. 3, MARCH 1975

DATA PROCESSING METHODS

The computer used in this C-GC-MS system is a 12K PDP-8e with disk, omnibus, DEC-tapes, and a display; this computer is on-line to a Varian CH-7 mass spectrometer. Spectra acquired during the course of a GC-MS run are stored on disk for later analysis. Much of the current analytical work is on hydrocarbon, ester, sterol, and cutin fractions extracted from recent sediments or plant material. For these compounds, the ion-series method of Smith (13) is particularly appropriate for classification. This processing can be carried out simultaneously with data acquisition, so that at the end of the GC-MS run, the analyst has a listing of probable chemical classes and suggested molecular weights for the components in the sample. The analyst will use these results when he plans the further processing of the data.

Further identification is made by file searches on the laboratory computer. This file search system utilizes a “partially inverted file.” In such a file organization, the information request (in this case the spectrum of the unknown) is mapped onto a segment of the reference data base and then this segment is searched. The main advantage of this file organization is increased search speed. Files ordered by molecular weight can be considered as inverted on the molecular weight key; this approach has been used by the Mass Spectrometry Data Centre (Aldermaston) on their large data base. The increase in file search speed is a consequence of having to search only the relevant portions of the data base as selected by the mapping function.

Two methods are being tried to effect a mapping of an unknown onto a reference file. The first method is to use the ion-series classification algorithm; this method is appropriate to the carbon-rich types of compounds, such as fatty acid esters, for which ion-series classification is quite effective.
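The ion-series mapping can be illustrated with a minimal Python sketch. It is not the authors' PDP-8e program: the normalization of the 14 bins, the sum-of-absolute-differences distance, and the toy spectra are all assumptions made for illustration, since the text does not give the closeness measure used in Smith's method (13).

```python
def ion_series(spectrum):
    """Fold a spectrum {m/e: intensity} into 14 bins by m/e modulo 14.

    The bins are normalized to sum to 1 (an assumed convention).
    """
    bins = [0.0] * 14
    for mz, intensity in spectrum.items():
        bins[mz % 14] += intensity
    total = sum(bins)
    return [b / total for b in bins] if total > 0 else bins


def classify(spectrum, correlation_set):
    """Return (class_name, distance) for the closest standard ion-series.

    correlation_set maps a class name to its 14-bin standard ion-series.
    The distance is the sum of absolute bin differences (an assumption).
    """
    unknown = ion_series(spectrum)
    best_name, best_dist = None, float("inf")
    for name, standard in correlation_set.items():
        dist = sum(abs(u - s) for u, s in zip(unknown, standard))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Toy standards, invented for illustration only.
correlation_set = {
    "ALKANE": ion_series({43: 100, 57: 90, 71: 60, 85: 40}),
    "ALKANOIC ACID (ME-ESTER)": ion_series({74: 100, 87: 60, 143: 20}),
}

unknown = {74: 80, 87: 50, 101: 10, 143: 15}
name, dist = classify(unknown, correlation_set)
print(name, round(dist, 3))
```

Because homologs differ by CH2 units (14 amu), their fragment ions fall into the same modulo-14 bins, which is why one standard ion-series can stand for a whole homologous series.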
For those types of compound for which ion-series classification is ineffective, such as pesticides, a second mapping method uses a “profile” of common features of all members of the class. This profile consists of a set of restrictions on the intensities of ions of specific m/e values and a set of restrictions on the relative intensities of pairs of ions.

If the ion-series algorithm is used to select the different files appropriate for each different component in a sample, the program has to be given a correlation set specifying the ion series and files to be used. A correlation set is formed by up to 64 ion-series classes chosen so as to cover most chemical classes within one main type of analysis. Each correlation set has 16 different named files accessible for automatic selection; each ion-series class has a link to one of these search files, and several ion-series classes may refer to the same file.
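A small sketch of how a correlation set might link ion-series classes to search files is given here. The limits of 64 classes and 16 files come from the text; the class `CorrelationSet`, its method names, and the file names `ESTERS` and `TMSI` are hypothetical and chosen only for illustration. The class names are borrowed from Table I.

```python
MAX_CLASSES = 64  # ion-series classes per correlation set (limit from the text)
MAX_FILES = 16    # named search files per correlation set (limit from the text)


class CorrelationSet:
    """Ion-series classes, each linked to one named search file."""

    def __init__(self, file_names):
        if len(file_names) > MAX_FILES:
            raise ValueError("a correlation set can name at most 16 search files")
        self.file_names = list(file_names)
        self.links = {}  # class name -> index into file_names

    def add_class(self, class_name, file_name):
        """Link a class to one of the named files; several classes may share a file."""
        if class_name not in self.links and len(self.links) >= MAX_CLASSES:
            raise ValueError("a correlation set holds at most 64 ion-series classes")
        self.links[class_name] = self.file_names.index(file_name)

    def file_for(self, class_name):
        """Return the search file linked to a class, or None if the class is unknown."""
        index = self.links.get(class_name)
        return None if index is None else self.file_names[index]


# Hypothetical file names; class names as printed in Table I.
cutin = CorrelationSet(["ESTERS", "TMSI"])
cutin.add_class("ALKANOIC ACID (ME-ESTER)", "ESTERS")
cutin.add_class("W-OH-ALKANOIC ACID (ME-ESTER, TMSI-ETHER)", "TMSI")
cutin.add_class("MU-OH-ALKANOIC ACID (ME-ESTER, TMSI-ETHER)", "TMSI")

print(cutin.file_for("MU-OH-ALKANOIC ACID (ME-ESTER, TMSI-ETHER)"))
```

Storing a small file index per class, rather than a file name, mirrors the many-to-one link described in the text and keeps each class entry compact.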

This automatic selection of search files has a great advantage in GC-MS analysis. By choosing the appropriate correlation set and profile, the analyst can allow the computer to select those spectra from a complete GC-MS run which are appropriate to a file search. The specialized search files are then chosen automatically. Thus, apart from the initial setting-up procedure, the processing of the whole batch of spectra takes place without operator interaction.

Each search file typically contains 100-200 reduced spectra. Classification with linked file search (150 spectra) takes […] second per unknown spectrum. Rejection of irrelevant spectra takes less than […] second. This file search processing on the minicomputer is relatively slow, but it is faster than listing or plotting spectra prior to examination by a mass spectrometrist using large-scale interactive file searches. The processing time is comparable with that required to transmit abbreviated spectra to a large machine using typical 1200-baud communication lines. One second is required just for the data transfer (20 ions given by mass and intensity) from the minicomputer to the mainframe computer before the file search is started on the latter.

Although centralized mass spectral libraries for large-scale interactive file searches have documented their utility (10, 20), it would be inconvenient to apply this method to all components of GC-MS runs. Rather, automatic routine identification of most components can be done on a minicomputer, and any unidentified components can subsequently be submitted to a large-scale interactive file search.

The programs permit the operator to override the automatic selection of reference files and to search a particular file for comparison with all components. This option can be useful for checking which components have been found previously in similar environments by using a file containing all compounds already encountered in that environment, irrespective of chemical class.
Two additional advantages follow from using several files. First, it makes it easy for an analyst to build up a data base of spectra specific to his research while utilizing any portions of a common data base that he requires. Second, it permits the ions used to characterize a spectrum to be more specific to the individual compounds. If a file is known to contain spectral data on methyl esters, then it is not necessary to include any of the standard ions which distinguish methyl ester spectra from those of other compounds. The ions used to characterize an individual ester can be chosen as those which best distinguish it from other esters, permitting a more effective use of the available data storage.

The most effective scoring function for matching restricted data on a reference compound against the complete spectrum of the unknown has not yet been determined. Two options are currently provided. First, the score returned can be simply the sum of % intensities from the spectrum of the unknown for the ions supposedly characterizing a given reference compound. In this case, the search is based on a ten-peak file created by the analyst, where the selected ions in many cases will be of minor intensity but nevertheless significant. This approach might prove invaluable for distinguishing between isomers. Here, the best fitting reference spectrum might not always get the highest score, but this difficulty is largely overcome by the requirement that all the ions specified as characteristic of a given reference compound must be found in the unknown spectrum. Second, if the search files were generated automatically, a matching coefficient is computed between the selected ions of the reference compound and a similar set of ions characterizing the spectrum of the unknown, using the relative intensity ranking of these ions as in the method of Kelly et al. (3).

The method of automatically selecting ions thought to characterize a spectrum is to start at highest mass and collect those ions whose intensities exceed half the average intensity of ions in the spectrum. If a group of such ions occurs differing by only 1 or 2 mass units, then the most intense ion in the group is selected. The collection of ions is continued until either all ions have been considered or at least 10 ions have been selected and at least half the sum of the total ion current has been accounted for. Finally, the ten most intense of the selected ions (i.e., the “10 significant” ions) are used.

Supporting software includes programs for defining class “profiles,” for editing ion-series data, for extracting new ion-series from sets of spectra, and for the creation of new search files or addition of new material to existing search files. The editing and updating procedures are done after the complete data processing, when all analytical information is available.

Table I. The Relevant Ion-Series Classes in the Correlation Set Used for the Analysis of Cutin Hydrolysatesᵃ

Class   Ion-series class names as given by computer printout
1       ALKANOIC ACID (ME-ESTER)
2       ALKANEDIOIC ACID (DIME-ESTER)
3       W-OH-ALKANOIC ACID (ME-ESTER, TMSI-ETHER)
4       MU-OH-ALKANOIC ACID (ME-ESTER, TMSI-ETHER)
5       MU, W-DIOH-ALKANOIC ACID (ME-ESTER, BIS TMSI-ETHER)
6       MU-OH-ALKANEDIOIC ACID (DIME-ESTER, TMSI-ETHER)
7       ALKENOIC ACID (ME-ESTER)
8       ALKADIENOIC ACID (ME-ESTER)
9       MU-OH-ALKENOIC ACID (ME-ESTER, TMSI-ETHER)
10      COUMARIC ACID (ME-ESTER, TMSI-ETHER)
11      HYDROXYBENZOIC ACID (ME-ESTER, TMSI-ETHER)
12      PHTHALIC ACID (DIALKYL ESTER) “PLASTICIZER”

ᵃ The non-systematic names of the TMSi classes are used to provide an approximate indication of the pattern of substitutions, etc.
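The automatic ion-selection procedure and the simpler of the two scoring options, both described in the text preceding Table I, can be sketched in Python as follows. This is an illustrative reading, not the original program. It assumes that a "group" is a run of qualifying ions each within 2 amu of the previous one, and that the "half the total ion current" test applies to the ions selected so far. The function names and the toy spectrum are hypothetical.

```python
def select_significant_ions(spectrum, max_ions=10):
    """Pick characteristic ions from a spectrum {m/e: intensity}."""
    if not spectrum:
        return []
    total = sum(spectrum.values())
    threshold = 0.5 * total / len(spectrum)  # half the average intensity

    selected = []  # (m/e, intensity) of ions kept so far
    group = []     # current cluster of qualifying ions 1 or 2 amu apart

    def close_group():
        # Keep only the most intense ion of the finished cluster.
        if group:
            selected.append(max(group, key=lambda ion: ion[1]))
            group.clear()

    for mz in sorted(spectrum, reverse=True):  # start at highest mass
        intensity = spectrum[mz]
        if intensity <= threshold:
            continue
        if group and group[-1][0] - mz > 2:
            close_group()
            accounted = sum(i for _, i in selected)
            if len(selected) >= max_ions and accounted >= 0.5 * total:
                break
        group.append((mz, intensity))
    close_group()  # no-op if the loop stopped right after closing a cluster

    # Finally keep the ten most intense of the selected ions.
    top = sorted(selected, key=lambda ion: ion[1], reverse=True)[:max_ions]
    return sorted(mz for mz, _ in top)


def sum_score(unknown, reference_ions):
    """Sum of intensities in the unknown at the reference's characteristic ions.

    Returns None if any characteristic ion is missing from the unknown,
    following the requirement that all specified ions must be found.
    """
    if any(unknown.get(mz, 0) <= 0 for mz in reference_ions):
        return None
    return sum(unknown[mz] for mz in reference_ions)


# Toy spectrum, invented for illustration only.
toy = {41: 30, 43: 100, 55: 40, 56: 8, 57: 90, 71: 60, 72: 5, 85: 40, 86: 3}
print(select_significant_ions(toy))
print(sum_score(toy, [43, 57]), sum_score(toy, [43, 99]))
```

In the toy spectrum, the pairs m/e 57/55 and 43/41 each collapse to their more intense member, which shows how clustering prevents neighboring peaks from using up several of the ten slots.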

APPLICATION TO THE ANALYSIS OF A CUTIN FRACTION

Plant cuticles are largely composed of inter-esterified and cross-linked long chain hydroxy acids (21). Thus, the constituent acids in the hydrolysates of most cutins are mainly mono- and dibasic hydroxy acids with from one to three hydroxy groups; both saturated and unsaturated acids occur. In addition, alcohols and phenols are liberated by the hydrolysis. A number of components of such cutin hydrolysates have already been identified (21 and references therein). The gas chromatographic-mass spectral analysis is usually done using the more volatile TMSi ether-methyl ester derivatives of the acids (22).

Representative cutin material was available from previous work (23). Reference files and ion-series classes were constructed from spectra obtained using derivatives of these standard fractions. The compounds present in these fractions were identified from their GC retention times and mass spectral fragmentation patterns. In Table I, the set of ion-series classes forming the correlation set is given; hydroxy acids with the hydroxy substituent in a position near the middle of the chain are referred to as “MU-OH ACIDS”; “OMEGA-OH ACIDS” implies a terminal hydroxy group.

The fragmentation pattern of the TMSi derivatives is dominated by loss of the TMSi group itself, which causes the TMSi derivatives of different homologs to have rather similar ion-series spectra. In order to avoid any problems due to occasional misclassification, all spectra from TMSi ethers were collected into a single file, which was also used with the alternative “profile” mapping method. Mass spectra of TMSi ethers are easily characterized by certain ions derived from the TMSi-oxy group and can consequently be picked out from the rest for a file search in the TMSi ether file.

Figure 1. GC-MS trace (total ion current) of hydroxy acids (TMSi ethers-methyl esters) in the cutin hydrolysate of Magnolia grandiflora. The peak numbers correspond to the scan numbers in Table II. Column condition: […] × 9-ft stainless steel, 3% OV-17 on Gas Chrom Q. Temperature programmed 160-280 °C at 4°/min.

Table II. Mass Spectra Satisfying the Profile Constraints for TMSi-Derivatives

Peakᵃ   Matching reference compound found in file search
3       no matching spectrum
7       10-hydroxytetradecenoic acid (methyl ester, TMSi ether)
8       10-hydroxypentadecanoic acid (methyl ester, TMSi ether)
9       no matching spectrum
10      no matching spectrum
11      no matching spectrum
12      16-hydroxyhexadecenoic acid (methyl ester, TMSi ether)
        16-hydroxyhexadecanoic acid (methyl ester, TMSi ether)
13      10,15-dihydroxypentadecanoic acid (methyl ester, bisTMSi ether)
14      6-hydroxypentadecanedioic acid (dimethyl ester, TMSi ether)
15      no matching spectrum
16      10,16-dihydroxyhexadecanoic acid (methyl ester, bisTMSi ether)
17      7-hydroxyhexadecanedioic acid (dimethyl ester, TMSi ether)
18      no matching spectrum
19      no matching spectrum
20      9,10,18-trihydroxyoctadecanoic acid (methyl ester, trisTMSi ether)
21      no matching spectrum

ᵃ The peak numbers refer to the GLC trace in Figure 1.

Figure 2. Lineprinter log from a section of ion-series classification data obtained during the course of a C-GC-MS run on Magnolia grandiflora cutins.

[…]ed at the Organic Geochemistry Unit in Bristol, and had previously been assigned to 10,15-dihydroxypentadecanoic acid (methyl ester, TMSi ether) and 6-hydroxypentadecanedioic acid (dimethyl ester, TMSi ether), respectively.

As an alternative method, a “profile” was used to select the TMSi ether-methyl ester components in the hydrolysate fraction. The following restrictions were used for the profile: m/e 73 > 50%, m/e 75 > 15% ((Me)3Si+ = m/e 73 and (Me)2SiOH+ = m/e 75), and m/e 74 < m/e 73 and 75 (the last constraint excludes n-esters). All spectra satisfying the profile constraints were searched against the reference file of TMSi derivatives. The results are summarized in Table II. The scan number refers to the numbers tagged on the peaks in the GC-MS trace in Figure 1. The results, as far as identifications go, correspond to what has been found in previous work for Magnolia grandiflora (23).
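The profile test quoted above is simple enough to express directly. The sketch below encodes the three TMSi restrictions as data and applies them to a spectrum. It assumes that the percentages are relative to the base peak, which the text does not state explicitly. The dictionary layout, function names, and toy spectra are hypothetical.

```python
# The TMSi profile from the text: absolute limits plus pairwise restrictions.
TMSI_PROFILE = {
    "minimum": {73: 50.0, 75: 15.0},    # m/e -> % intensity that must be exceeded
    "less_than": [(74, 73), (74, 75)],  # (a, b): intensity at a must be below b
}


def percent_of_base(spectrum):
    """Rescale intensities to % of the base peak (assumed meaning of '%')."""
    base = max(spectrum.values())
    return {mz: 100.0 * i / base for mz, i in spectrum.items()}


def satisfies(spectrum, profile):
    """Return True if the spectrum {m/e: intensity} meets every restriction."""
    if not spectrum:
        return False
    pct = percent_of_base(spectrum)

    def level(mz):
        return pct.get(mz, 0.0)  # an absent ion counts as zero intensity

    if any(level(mz) <= limit for mz, limit in profile["minimum"].items()):
        return False
    return all(level(a) < level(b) for a, b in profile["less_than"])


# Toy spectra, invented for illustration only.
tmsi_like = {73: 100, 74: 9, 75: 40, 103: 20, 311: 30}
n_ester_like = {73: 5, 74: 100, 75: 10, 87: 60}

print(satisfies(tmsi_like, TMSI_PROFILE), satisfies(n_ester_like, TMSI_PROFILE))
```

Keeping the restrictions as data rather than as code matches the idea in the text of storing one profile per compound class, so that a new class requires only a new table of limits.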

Figure 3. Lineprinter log of ion-series classification with file search against automatically selected reference files for scans No. 13 and 14. The log shows first the result of the classification procedure, then the identification of the matching reference spectra derived from the search file.

CONCLUSION

The application of several different overlapping approaches to the identification of spectra has been illustrated for a number of compounds from the same mixture. It is intended that “customized” analytical systems be constructed from these general approaches for the analysis of the components in particular mixtures. The search of separate files containing standards appropriately chosen for the analytical problem makes the analysis fast and particularly useful in studies of complex mixtures of natural products, where a time-consuming part of the analysis is to confirm the presence of the expected components. By choosing an appropriate correlation set, all routine compound classification/identification is handled automatically, and attention is drawn to unusual components which cannot be classified or identified in the known reference file. In more complex applications, “profile” selection and search files of manually selected ions may be used to identify specific types of compounds occurring in complex mixtures.

A complementary approach for interpretation of low resolution mass spectra has been developed for small computers. It applies various spectrum identification methods as used by a mass spectrometrist (24).

ACKNOWLEDGMENT

The authors wish to thank Andrew Caldicott for the supply of cutins and cutin hydrolysates for building up the reference files.

LITERATURE CITED

(1) B. Pettersson and R. Ryhage, Ark. Kemi, 26, 293 (1967).
(2) L. R. Crawford and J. D. Morrison, Anal. Chem., 40, 1464 (1968).
(3) B. A. Knock, I. C. Smith, D. E. Wright, R. G. Ridley, and W. Kelly, Anal. Chem., 42, 1516 (1970).
(4) S. L. Grotch, Anal. Chem., 42, 1214 (1970).
(5) S. L. Grotch, Anal. Chem., 43, 1362 (1971).
(6) S. L. Grotch, Anal. Chem., 45, 2 (1973).
(7) L. Wangen, W. S. Woodward, and T. L. Isenhour, Anal. Chem., 43, 1605 (1971).
(8) H. S. Hertz, R. A. Hites, and K. Biemann, Anal. Chem., 43, 681 (1971).
(9) S. R. Heller, Anal. Chem., 44, 1951 (1972).
(10) S. R. Heller, H. M. Fales, and G. W. A. Milne, Org. Mass Spectrom., 7, 107 (1973).
(11) H. Nau and K. Biemann, Anal. Lett., 6, 1071 (1973).
(12) F. Erni and J. T. Clerc, Helv. Chim. Acta, 55, 489 (1972).
(13) D. H. Smith, Anal. Chem., 44, 536 (1972).
(14) B. Pettersson and R. Ryhage, Anal. Chem., 39, 790 (1967).
(15) L. R. Crawford and J. D. Morrison, Anal. Chem., 40, 1469 (1968).
(16) Finnigan Corporation, Sunnyvale, Calif. 94086.
(17) F. P. Abramson, Twenty-First Annual Conference on Mass Spectrometry and Allied Topics, May 20-25, 1973, San Francisco, Calif., p 76.
(18) P. Toft, E. A. Lodge, and M. E. Simard, Can. J. Pharm. Sci., 7, 604 (1972).
(19) B. S. Finkle, D. M. Taylor, and E. J. Bonelli, J. Chromatogr. Sci., 10, 312 (1972).
(20) S. R. Heller, "Computer Techniques for Interpretation of Mass Spectral Data," in "Computer Representation and Manipulation of Chemical Information," W. T. Wipke, S. R. Heller, R. J. Feldmann, and E. Hyde, NATO Advanced Study Institute, Noordwijkerhout, 1973.
(21) P. E. Kolattukudy and T. J. Walton, Prog. Chem. Fats Other Lipids, 13, No. 3 (1973).
(22) D. H. Hunneman and G. Eglinton, Phytochem., 11, 1989 (1971).
(23) A. Caldicott, Ph.D. Thesis, Bristol, England, 1973.
(24) N. A. B. Gray and T. O. Gronneberg, Anal. Chem., 47, 419 (1975).

RECEIVED for review January 24, 1974. Accepted September 25, 1974. Financial support for these studies was provided by the Department of Education and the Royal Norwegian Council for Scientific and Industrial Research. GC-MS facilities were financed by the Natural Environment Research Council and the data acquisition system by the Nuffield Foundation. The programs described in this paper would be available from the National Research and Development Corporation, Kingsgate House, 66/74 Victoria Street, London SW1E 6SL, U.K.

Programs for Spectrum Classification and Screening of Gas Chromatographic-Mass Spectrometric Data on a Laboratory Computer

N. A. B. Gray¹ and T. O. Gronneberg²

Organic Geochemistry Unit, School of Chemistry, Bristol University, Bristol, U.K.

A system of programs is described for use in GC-MS screening studies of multicomponent mixtures. The system has two components. The first component allows the analyst to define the classes of compounds expected from some specific environment and to provide empirical rules for classification on the basis of mass spectral features. The second component is a routine which, combined with a data-acquisition system, can use the information provided by the analyst to classify the components in a fraction analyzed by GC-MS. This classification can take place simultaneously with continued data acquisition on a small laboratory computer. The rules for determining structures from mass spectra are expressed in terms of conventional intermediate concepts such as comparison of ions, series of ions, and key ions. The method does not require transformation of spectra or a large number of standard compounds to define a class, and the chemist can develop the discrimination functions with little or no knowledge of programming.

¹ Present address, King's College Research Centre, King's College, Cambridge CB2 1ST, England.
² Present address, Agricultural University of Norway, Isotope Laboratory, 1432 Aas-NLH, Norway.

Many applications of gas chromatography/mass spectrometry (GC-MS) require the screening of large numbers of multicomponent samples to detect the occurrence of particular types of molecules. A typical application might be the screening of samples derived from oil shales for the presence of steranes of some specific skeletal structure. A system of programs has been developed to assist the analyst in the interpretation of the data acquired. In such screening studies, the types of compounds to be expected in a sample, and the procedures for recognizing the spectra of such compounds, are known. This system of programs attempts to utilize this knowledge to automate most of the routine interpretation of spectra.

The main component in this system of programs is a subroutine for assigning structural features to compounds using low resolution mass spectra. When combined with a data acquisition system, this routine permits the printing of summary identification data for constituents of a sample during the course of a computerized GC-MS run. The summary data so produced can alert the analyst to the presence of contaminants in the sample and can identify those constituents of interest. This routine is implemented as an interpretive program for the application of spectrum recognition rules to the low resolution mass spectra acquired from the GC-MS instrument. Sets of spectrum recognition rules for the compounds to be expected in samples from differing environments are held in the computer file store. For a sample of known origin, the analyst can select the appropriate file of spectrum recognition rules to be used by the data acquisition/data interpretation program. Besides the interpretive routine for applying the recognition rules to spectra, the system of programs includes routines for defining, combining, and modifying files of spectrum recognition rules.

The method for assigning structural features to spectra is similar in principle to the programs of Crawford and Morrison (1) and Venkataraghavan et al. (2). Both of these programs use the familiar concepts of searching the spectrum for possible molecular ions and key fragment ions that are employed by a mass spectrometrist when examining spectra. This empirical knowledge about mass spectra was incorporated into FORTRAN subroutines that check for the spectral features characterizing functional groups. Our program system is devised for an environment where there are analysts with considerable experience in the empirical interpretation of mass spectra but who cannot express their knowledge in FORTRAN-type programming languages.

The program package avoids the necessity for the analyst to be able to program in FORTRAN by providing an interactive program which permits the spectrum recognition rules to be built up from a menu of basic operations. This menu includes checking for series of ions, for possible molecular ions in particular series, and for ions produced by the loss of neutral fragments from molecular ions. Our use of conventional intermediate concepts, such as key ions, distinguishes our method from the pattern recognition approaches currently being developed (3, 4). Like
