Computer Identification and Interpretation of Unknown Mass Spectra

Jun 1, 1975 - Computer Identification and Interpretation of Unknown Mass Spectra Utilizing a Computer Network System. R. VENKATARAGHAVAN , GAIL M. PES...
12

Computer Identification and Interpretation of Unknown Mass Spectra Utilizing a Computer Network System

R. VENKATARAGHAVAN, GAIL M. PESYNA, and F. W. McLAFFERTY
Department of Chemistry, Cornell University, Ithaca, N. Y. 14850

Downloaded by UNIV OF CALIFORNIA SANTA BARBARA on April 3, 2018 | https://pubs.acs.org Publication Date: June 1, 1975 | doi: 10.1021/bk-1975-0019.ch012

Mass spectrometry/computer systems in routine use in many research laboratories are capable of producing a complete mass spectrum (unit resolution) every few seconds (1, 2). The unique applicability of mass spectrometry to nanogram samples, and the ability to obtain spectra directly on components of complex mixtures separated by a gas (3) or liquid chromatograph (4) have tremendously increased the number of spectra taken for the purpose of compound identification. These capabilities have provided unique solutions to research and control problems in a wide variety of fields, including environmental pollution, metabolism studies, medical diagnoses, insect pheromones, forensic analyses, military detection systems, and conversion of coal or shale to liquid fuels. The importance of these applications has motivated a large increase in basic research in organic mass spectrometry, greatly expanding our knowledge of mass spectral fragmentation behavior.

Unfortunately, the quantity (and quality) of chemists trained in the interpretation of mass spectra has not increased as rapidly, and thus the interpretation process is becoming an increasingly serious bottleneck in proper utilization of this technique. The modern computer is an obvious possibility to alleviate these problems, and a number of computer systems for identification and interpretation have been proposed (1, 2, 5-13). If a reference mass spectrum of the unknown compound is in the data file, computer matching programs can be used for its retrieval.
Although such retrieval is relatively efficient, the present mass spectral reference file (14, 15) of 30,000 different compounds represents a very small fraction of the possible organic compounds, so that interpretation is required whenever a poor match is obtained. Thus both "retrieval" and "interpretive" systems are necessary to solve problems in most important areas. These systems have been reviewed in detail recently (2, 5); here we will only describe the Cornell systems which are now available over a computer networking system (TYMNET).

Lykos; Computer Networking and Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1975.


PBM. A "Probability Based Matching" system for the retrieval of unknown mass spectra has recently been described (12, 13). Research in the area of document and information retrieval has firmly established that system efficiency is increased by the proper weighting of the relative importance of items used for identifying each member of a library (16). For PBM the m/e values of the peaks are weighted according to their uniqueness in the file, and the abundance values are weighted according to a log normal distribution (17). The use of these values is based upon the "General Rule of Multiplication" of probability theory; thus if peaks with masses m_1 and m_2 having intensities i_1 and i_2 occur in mass spectra with probabilities p_1 and p_2, the probability that both occur at random in an unknown spectrum is p_1 times p_2. If this product is small, it is much more likely that the presence of peaks m_1 and m_2 is due to the identity of the unknown spectrum and the compared reference spectrum in which both occur with the intensities i_1 and i_2. The low value of this probability provides a confidence that this identification is correct, which is measured by a "confidence value, K".
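A small numerical sketch may make the multiplication rule concrete. The peak probabilities below are invented for illustration (they are not taken from the reference file), the function names are hypothetical, and the peaks are assumed to occur independently. The sketch multiplies the per-peak probabilities and also reports the sum of base-two logarithms of the inverse probabilities, which carries the same information as the product.

```python
import math

def chance_match_probability(peak_probs):
    """Probability that all listed peaks occur together at random,
    assuming independent peaks (General Rule of Multiplication)."""
    product = 1.0
    for p in peak_probs:
        product *= p
    return product

def confidence_bits(peak_probs):
    """Sum of log2(1/p) over the peaks; equals log2 of the inverse product."""
    return sum(math.log2(1.0 / p) for p in peak_probs)

# Hypothetical probabilities for two matching peaks (illustrative values only).
p1, p2 = 1 / 8, 1 / 32
joint = chance_match_probability([p1, p2])
bits = confidence_bits([p1, p2])
print(joint)  # product of the two probabilities
print(bits)   # log2(8) + log2(32)
```

The smaller the joint probability of a chance match, the larger the logarithmic score, so rare peak combinations contribute more to the confidence than common ones.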
The confidence value, as well as all the individual probabilities, is expressed as the corresponding base two logarithm for convenience of calculation; inverse probabilities are also used to simplify the calculations and to produce a final result which is a direct measure of "confidence." In this reverse search, there is computed for each reference spectrum matched against the unknown a confidence value, "K", equal to the sum of the individual K_j values calculated for each peak in the unknown whose intensity agrees within a predetermined range with that of the corresponding peak in the reference spectrum. K_j combines four terms,
    K_j = U_j + A_j + W_j - D

where U_j is the contribution to the probability of the "uniqueness" of the m/e value of the peak; A_j is the contribution to the probability of the abundance value of the peak as it appears in the reference spectrum; W_j, the "window factor", is a measure of the agreement required between the abundance of the peak in the reference and in the unknown; and D, the "dilution factor" for mixture spectra, is a measure of the overall reduction of peak intensities in the unknown due to the presence of other components (if the unknown spectrum is of a pure compound, D = 0). The system is described in detail elsewhere (13).

Extensive statistical studies have shown (13) that the precision/recall performance of the system is substantially better than others which employ no or limited weighting. The "reverse search" feature is especially valuable with mixtures, as illustrated by results from the mass spectrum of a mixture shown in Table I. The mixture contained amobarbital (5-ethyl-5-isoamylbarbituric acid), hexobarbital (5-cyclohexenyl-1,5-dimethylbarbituric acid), and nicotine. PBM has successfully retrieved a relatively large number of spectra of the correct compounds, measured under a wide variety of conditions, without retrieving a single incorrect compound of comparable K value. The mass spectrum of pentobarbital, the sec-amyl isomer of the first compound, is closely similar to that of amobarbital, so that the selectivity exhibited by PBM in this case, despite the presence of a second barbiturate, is gratifying.

Table I. PBM Results for an "Unknown" Mass Spectrum of 30% Amobarbital, 30% Hexobarbital, and 40% Nicotine

Compound (a)                                  K (b)     ΔK    % Contamination    % Component

5-cyclohexenyl-1,5-dimethylbarbituric acid    109+      10    59                 74
(hexobarbital, cyclonal)                      82*+      37    59                 55
                                              82*+      37    59                 55
                                              75+       44    72                 32
                                              60+       59    80                 37
                                              50**+     69    79                 44
                                              45**+     74    79                 44

5-ethyl-5-isoamylbarbituric acid              76+       43    62                 73
(amobarbital, amytal)                         66        53    62                 60
                                              66        53    62                 60
                                              64**      55    62                 58
                                              56        63    62                 61
                                              48*       71    62                 51

nicotine                                      77+       42    74                 99
                                              77+       42    74                 95
                                              72*+      47    74                 70
                                              61**      58    74                 70
                                              57+       62    74                 69
                                              54**      65    74                 100

(a) Compounds of highest K values selected from a data base of 35,828 spectra.
(b) Confidence value. Multiple values represent different spectra of the same compound; the asterisks indicate the number of "flagged" peaks (see text) omitted in calculating that K value, and the plus sign indicates that the molecular ion of the reference was found in the unknown and used in the K calculation.

STIRS. For the "Self-Training Interpretive and Retrieval System" (11) a number of classes of mass spectral data known to have high structural significance, such as characteristic ions, series of ions, and masses of neutrals lost, are identified; for each class the computer matches the data of the unknown mass spectrum against the corresponding data of all reference spectra. In each data class the reference compounds whose spectra have the highest "match-factor" (MF) values are examined by the chemist for any common structural features, with a high frequency of occurrence indicating a high probability that the structural feature is present in the unknown.
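The final step, looking for structural features shared by the best-matching reference compounds, can be sketched as a simple frequency count. This is only an illustration under assumed inputs: the match-factor values, compound names, feature labels, and the function name `common_features` are all hypothetical, and the actual STIRS match-factor calculation is not reproduced here.

```python
from collections import Counter

def common_features(matches, top_n):
    """Rank reference compounds by match factor (MF) and report, for each
    structural feature, the fraction of the top_n compounds that contain it."""
    best = sorted(matches, key=lambda m: m["mf"], reverse=True)[:top_n]
    counts = Counter(f for m in best for f in set(m["features"]))
    return {feature: n / len(best) for feature, n in counts.items()}

# Hypothetical results for one data class (values are illustrative only).
matches = [
    {"name": "ref A", "mf": 0.91, "features": ["steroid", "hydroxyl"]},
    {"name": "ref B", "mf": 0.88, "features": ["steroid", "ketone"]},
    {"name": "ref C", "mf": 0.85, "features": ["steroid", "hydroxyl"]},
    {"name": "ref D", "mf": 0.40, "features": ["phenyl"]},
]
freq = common_features(matches, top_n=3)
for feature, fraction in sorted(freq.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {fraction:.2f}")
```

A feature found in every one of the top-ranked compounds is a strong candidate for the unknown, while a feature seen only in a low-ranked compound never enters the count.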
In a recent modification (18) the 15 selected compounds of highest MF value are examined instead by the computer for the presence of specific substructural groups to provide a statistical evaluation of the probability of the presence of each group in the unknown compound. At present STIRS is able to predict the presence of 179 substructures at the 98 percent confidence level with an average recall of 49 percent (i.e., using criteria in which STIRS is wrong only once in 50 times, the substructure can be identified in half of the compounds in which it is actually present). Because of the nature of mass spectra we do not have STIRS attempt to predict the absence of a substructure; the influence of a particular substructure on the mass spectral behavior can be greatly reduced by the presence in the molecule of another substructure which more strongly directs the fragmentation.

Structural data for all compounds in the reference file have been coded in Wiswesser Line Notation (WLN), which is a linear representation of the compound's structure requiring a relatively small volume of computer storage. In WLN, symbols such as 1, R, M, Z, V are used to represent individual chemical units such as -CH2-, C6H5, >NH, -NH2, and >C=O, respectively; connection transfers are used to show breaks from linear representation, ring units to show the presence of rings in the structure, and ring fusion units to show ring interrelations in a multicyclic system. In comparison to other modes of structure representation, WLN has several special advantages for the interpretation of mass spectral data. In an acyclic system the linear notation is useful in identifying fragmentations resulting from simple cleavages, the chemical units of WLN often being directly related to masses and mass differences in the spectrum.
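Because each WLN symbol stands for a chemical unit of fixed composition, the masses produced by simple cleavage of an acyclic notation follow from adding up unit masses. The sketch below is an illustration only: it assumes nominal integer masses for the symbols R, M, Z, and V, treats a digit n as a chain of n CH2 units, adds one hydrogen to a chain at either end of the notation, and handles only unbranched strings. The function names and the example notation `1VR` (taken here to be acetophenone) are assumptions for the example, not part of STIRS.

```python
# Nominal masses of the WLN units named in the text (C=12, H=1, N=14, O=16).
UNIT_MASS = {"R": 77, "M": 15, "Z": 16, "V": 28}  # C6H5, NH, NH2, C=O
CH2_MASS = 14

def unit_masses(notation):
    """Nominal mass of each symbol in an unbranched, acyclic WLN string."""
    masses = []
    last = len(notation) - 1
    for i, symbol in enumerate(notation):
        if symbol.isdigit():
            # A digit n is a chain of n CH2 units; a chain at either end of
            # the notation carries one extra hydrogen (CH3 rather than CH2).
            end_hydrogens = int(i == 0) + int(i == last)
            masses.append(CH2_MASS * int(symbol) + end_hydrogens)
        else:
            masses.append(UNIT_MASS[symbol])
    return masses

def simple_cleavages(notation):
    """Pairs of fragment masses from breaking each bond between adjacent units."""
    masses = unit_masses(notation)
    return [(sum(masses[:k]), sum(masses[k:])) for k in range(1, len(masses))]

print(sum(unit_masses("1VR")))   # nominal molecular weight
print(simple_cleavages("1VR"))   # fragment mass pairs for each single cleavage
```

Under these assumptions each cleavage of the string yields two complementary masses, which is the sense in which the notation maps directly onto masses and mass differences.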


The presence of certain chemical units such as V (carbonyl) and Z (primary amino) can exert a strong influence on the fragmentation behavior of the molecule and can thus be readily used for the identification and assignment of elemental compositions to different ions. Deviations from structural linearity are designated clearly in the notation by connection transfers, and can thus be recognized by the computer system to relate points of branching and substitution in a molecule to mass spectral cleavages commonly triggered by such structural features. Complex cyclic molecules are among the most difficult types of structures for mass spectral interpretation; the ability of WLN to represent the relationships of different rings and substituents within the structure often makes it possible for the computer to recognize major fragmentations and to develop spectra-structure correlations.

As an example (11), running the spectrum of cholesterol as an "unknown", STIRS for match factor 5 selected neutral losses of 18, 0 (M+·), 33, 15, 17, and 61; the six compounds found with the highest MF5 values are the steroidal derivatives allopregnanol-3α-one-20, pregnenolone alcohol, pregnenolone, 7β-hydroxycholestanyl 3β-acetate, 16α-methylpregnenolone (a large number of other oxygenated carbocyclic compounds are present in the data base). Although the loss of these simple neutral species from cholesterol is quite consistent with present knowledge of the mass spectral behavior of such molecules, it is doubtful that these losses were known to be characteristic of such molecules.
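The neutral losses used in this data class are simply differences between the molecular ion mass and the masses of fragment peaks. The following sketch assumes a nominal molecular weight of 386 for cholesterol and uses an invented peak list whose abundances are illustrative only; `neutral_losses` and its `max_loss` cutoff are hypothetical names, not STIRS parameters.

```python
def neutral_losses(molecular_mass, peaks, max_loss=64):
    """Return (loss, abundance) pairs for peaks within max_loss of the
    molecular ion, most abundant first. A loss of 0 is the molecular ion."""
    losses = [
        (molecular_mass - mz, abundance)
        for mz, abundance in peaks
        if 0 <= molecular_mass - mz <= max_loss
    ]
    return sorted(losses, key=lambda pair: pair[1], reverse=True)

# Invented (m/e, relative abundance) pairs for illustration only.
peaks = [(386, 80), (371, 40), (368, 100), (353, 45), (275, 60)]
for loss, abundance in neutral_losses(386, peaks):
    print(loss, abundance)
```

Peaks far below the molecular ion, such as the one at m/e 275 here, fall outside the cutoff and are left to other data classes.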

The "Artificial Intelligence" (9) method uses the computer to apply human mass spectral knowledge to predict spectra of isomeric possibilities. As far as possible all of the known mass spectral fragmentation behavior is programmed into the computer, and then the mass spectra of feasible isomers (the possibilities are generated by the DENDRAL algorithm) are predicted and compared to the unknown mass spectrum. This demands that the unknown compound be in the rather narrow class for which the program is written and that its elemental composition has been established. An Artificial Intelligence program for estrogens (10) appears to be the only one that has been tested extensively on true unknowns. STIRS is complementary to the "Artificial Intelligence" technique in that it can be applied to the spectra of total unknowns (such as pollutants, insect secretions, and abnormal urinary constituents) to obtain partial structural information. For example, STIRS in general can easily identify estrogens and often some of the substituents thereon, but the Artificial Intelligence method is superior for properly placing the substituents and elucidating other structural details. STIRS utilizes directly the information of all available reference spectra without prior spectra-structure correlation, "training" itself separately for each submitted unknown spectrum; thus the only special preparation necessary to make STIRS sensitive for estrogens is to make sure that there are representative mass spectra of estrogens in the reference data base.


Chemists sometimes forget the potential complexity of the problem of determining the exact molecular structure of a new compound of a molecular weight of even a few hundred. Although a relatively small amount of information usually can greatly reduce the millions of possibilities (billions if less common elements are included), the effort required to narrow the possibilities further usually increases exponentially. Thus science can only afford the luxury of determining the exact molecular structure of a new unknown compound for special cases. Knowledge of even a few of the gross structural features often is enough to tell the scientist that the compound is not germane to his investigation: for example, that it is probably non-toxic, should not show the desired pharmacological activity, or is not a logical metabolite of the drug used.

Outside Use of STIRS and PBM. STIRS has been available free to outside users since January 1974 through a phone-line link to our laboratory PDP-11/45, and has been used at a rate of ~100 unknown spectra per month for a substantial part of this time. This indicated that STIRS was uniquely meeting a real scientific need, but also showed that a very substantial proportion of the unknowns should have been examined first by a retrieval system. This led to implementing both PBM and STIRS on the Cornell IBM 370/168 to make them available internationally over the TYMNET computer network system. This system has the further advantage that it employs a data base that currently has approximately 40,000 mass spectra of 30,000 different compounds (15) which, to our knowledge, is larger than any other available collection. The Cornell PBM/STIRS system became operational on TYMNET in May 1975 and the initial response has been encouraging. The programs appear to be used in a complementary fashion; if PBM cannot identify the unknown mass spectrum as a compound already in the reference file at a satisfactory confidence level, STIRS usually elucidates at least partial structural information concerning the unknown. Not surprisingly, especially high enthusiasm has been expressed by those laboratories with relatively little experience in mass spectral interpretation. Although highly automated gas chromatograph/mass spectrometer/computer systems are common in many service laboratories, such as for drug identification, forensic analysis, and clinical assays, the spectral retrieval systems available with these are relatively primitive, usually based on a limited number of peaks, and utilizing only small specialized data bases.
More experienced mass spectrometry laboratories seem to be using the Cornell system mainly for "difficult" spectra, finding PBM valuable because of our more comprehensive data base, and STIRS of special use for those types of compounds with which the laboratory is not particularly familiar.

The Heller/NIH "Conversational Mass Spectral Search System" (19) has been available for phone-line use since 1971, and since 1973 over the GE international computer network. The present list of over 200 users attests to the need as well as the usefulness of this system. At the time of this writing, the system utilized only approximately 13,000 reference spectra, but enlargement of the data base to 35,000 spectra is planned for the immediate future. This interactive system can be used for both retrieval and interpretive purposes, interrogating the reference file concerning the presence of particular peaks on an individual basis. Although this system is not suitable for automatic matching of complete unknown mass spectra, such as is PBM, a detailed comparison of the advantages and disadvantages of this conversational system versus the PBM/STIRS system has not been made.

Applicability of a Computer Networking System to Analysis of Unknown Spectra. Our experience to date does give us some indication of the types of problems which are best examined by such computer systems. It seems logical that the best way to implement a retrieval system is as part of the computer data acquisition and reduction system of the mass spectrometer. Here the practical limitations of computer speed, power, and reference data storage capacity will prevent such a system from being used in particular applications.
However, the application of PBM in a GC/MS system controlled by a dedicated microcomputer (12) shows that this is certainly the method of choice under the proper circumstances. Further, if such a system will produce answers quickly and accurately in a substantial proportion of cases, the remainder can then obviously be examined in the more sophisticated PBM/STIRS system available on the network.

It does appear that the ready accessibility to most laboratories which will on occasion require such a service is an important key to the success of such a centralized system for mass spectral retrieval and interpretation. It is very difficult for most laboratories to maintain a growing mass spectral data base of high accuracy. Further, systems such as PBM and STIRS are undergoing rapid research improvements and obviously there must be a large time lag in incorporation of such changes unless there is a close connection with the research laboratory involved. The proper method of funding such a centralized resource has not been resolved; at present the financial support of the Cornell PBM/STIRS system is entirely based on computer use charges, with further research and development of the systems supported through Federal grants.
A number of users have expressed strong opinions that the whole operation should be funded by the Federal Government, with at least academic and governmental users paying at most a small service charge for the networking operation.

Acknowledgment. The authors are deeply indebted to the National Institutes of Health, the National Science Foundation, and the Environmental Protection Agency for generous financial support of the research programs on PBM and STIRS. The PBM system was developed in cooperation with Dr. R. H. Hertel and R. D. Villwock of the Universal Monitor Corporation, Pasadena, California. Implementation of PBM and STIRS on the network system would not have been possible without the close cooperation of J. W. Rudan, J. Aikin, R. Cogger, and B. A. Meyer of the Office of Computer Services, Cornell University.


Literature Cited

1. Ridley, R. G., in Waller, G. R., Ed., "Biochemical Applications of Mass Spectrometry," p 177, John Wiley, New York City, 1972.
2. Fennessey, P. V., in Milne, G. W. A., Ed., "Mass Spectrometry: Techniques and Applications," p 77, John Wiley, New York City, 1971.
3. Fenselau, C., Appl. Spectr., (1974), 28, 305.
4. Arpino, P. J., Dawkins, B. G., and McLafferty, F. W., J. Chrom. Sci., (1974), 12, 574.
5. Pesyna, G. M., and McLafferty, F. W., "Determination of Organic Structures by Physical Methods," Volume 6, Nachod, F. C., Zuckerman, J. J., and Randall, E. W., Academic Press, New York City, 1975.
6. Crawford, L. R., and Morrison, J. D., Anal. Chem., (1968), 40, 1464.
7. Jurs, P. C., Kowalski, B. R., and Isenhour, T. L., Anal. Chem., (1969), 41, 21.
8. Isenhour, T. L., and Jurs, P. C., Anal. Chem., (1971), 43, 21A.
9. Buchs, A., Delfino, A. B., Duffield, A. M., Djerassi, C., Buchanan, B. G., Feigenbaum, E. A., and Lederberg, J., Helv. Chim. Acta, (1970), 53, 1394.
10. Smith, D. H., Buchanan, B. G., Engelmore, R. S., Duffield, A. M., Yeo, A., Feigenbaum, E. A., Lederberg, J., and Djerassi, C., J. Am. Chem. Soc., (1972), 94, 5962.
11. Kwok, K.-S., Venkataraghavan, R., and McLafferty, F. W., J. Am. Chem. Soc., (1973), 95, 4185.
12. McLafferty, F. W., Hertel, R. H., and Villwock, R. D., Org. Mass Spectrom., (1974), 9, 690.
13. Pesyna, G. M., McLafferty, F. W., Venkataraghavan, R., and Dayringer, H. E., Anal. Chem., submitted.
14. Stenhagen, E., Abrahamsson, S., and McLafferty, F. W., "Registry of Mass Spectral Data," Wiley-Interscience, New York City, 1974.
15. Mass Spectral Data Collection, Cornell University and the Environmental Protection Agency.
16. Salton, G., "Automatic Information, Organization, and Retrieval," McGraw-Hill, New York City, 1968.
17. Pesyna, G. M., McLafferty, F. W., and Venkataraghavan, R., Anal. Chem., June, (1975).


18. Dayringer, H. E., Pesyna, G. M., Venkataraghavan, R., and McLafferty, F. W., Anal. Chem., submitted.
19. Heller, S. R., Koniver, D. A., Fales, H. M., and Milne, G. W. A., Anal. Chem., (1974), 46, 947.
