18 Structure Elucidation Based on Computer Analysis of High and Low Resolution Mass Spectral Data
Downloaded by STONY BROOK UNIV SUNY on May 20, 2018 | https://pubs.acs.org Publication Date: June 1, 1978 | doi: 10.1021/bk-1978-0070.ch018
DENNIS H. SMITH and RAYMOND E. CARHART Departments of Chemistry, Genetics, and Computer Science, Stanford University, Stanford, CA 94305
©0-8412-0422-5/78/47-070-325$10.00/0
Gross; High Performance Mass Spectrometry: Chemical Applications ACS Symposium Series; American Chemical Society: Washington, DC, 1978.
A tremendous effort has been directed toward development of advanced instrumentation for mass spectrometric analysis. Advancements include ever-increasing sensitivities and resolving powers, new ionization techniques, metastable ion probes of ion decomposition and structure, and computer systems for rapid acquisition and reduction of data. We sometimes lose sight of the fact that these developments are designed to provide information about chemical and biochemical structures at greater depth and in greater detail than previously available. The ultimate goal in most research in mass spectrometry is to provide powerful tools for molecular structure elucidation, either directly, by exploitation of existing techniques, or indirectly, by development of new techniques. Concurrently, several computer-based techniques designed to assist chemists in the analysis and interpretation of mass spectral data have been developed. Analytical procedures for treatment of combined gas chromatographic/mass spectrometric data obtained at low resolving powers (gc/lrms) (1)
provide mass spectra of high quality for subsequent examination by manual or computer methods. Library search procedures (2) and their extensions (3) or pattern recognition programs (4) may provide clues to the identity of the structure or be used to determine the structure uniquely. A computer program for analysis of spectra based on class-specific fragmentation rules is available (5). These techniques have obvious limitations (3,4,5); in fact, the ability to interpret mass spectral data in terms of molecular structure lags far behind the capabilities of modern spectrometers to produce high quality data. There are several reasons for this lag:
1. There is no formal theory relating molecular structures to their respective mass spectra which has predictive power of use to the structural chemist.
2. (A corollary of 1) Mass spectrometry rarely provides detailed substructural information to assist in elucidating a structure except in known chemical contexts where previously developed rules may be applied retrospectively.
3. Current methods do not make adequate use of other knowledge about a particular compound; and
4. The combinatorial complexity of dealing with the actual information content of a mass spectrum has not, until now, been addressed.
Points (3) and (4) will be discussed in some detail subsequently.
In our laboratories we have been trying to bring newly developed computational tools to bear on general approaches to assisting structural chemists in interpretation of mass spectra. In this paper we will discuss the strengths and limitations of these new tools while assuming that the requisite mass spectral data are available. We are engaged in research which involves the gc/ms analysis of complex mixtures together with subsequent analysis of these data to extract spectra of individual components (1b) and search for the spectra in libraries of mass spectral data. The gc/lrms analyses provide an important pre-screening of mixtures. Combined gc/ms data obtained at high mass spectrometer resolving powers (gc/hrms) (6) yield elemental composition data for novel components. In any case the computer programs described in subsequent sections accept either low or high resolving power data (nominal masses or elemental compositions, respectively).

Computer-Assisted Structure Elucidation Based Primarily on Mass Spectral Data. Assume that an important unknown component has been observed in data from a gc/ms analysis of a mixture (e.g., Figure 1). Assume further that the spectrum of the component (Figure 2) was not found in existing libraries. This problem becomes a classic problem of structure elucidation. One might, for example, attempt to isolate larger quantities and obtain additional spectral data. For many problems this is time-consuming and difficult. Realizing that high resolving power data are less ambiguous than data provided by the low resolution spectrum (Figure 2), one might obtain a gc/hrms spectrum and determine elemental compositions for the observed ions. The data obtained in the example are presented in Table I for major ions in the spectrum.

TABLE I. Elemental Compositions for Significant Fragment Ions in the Mass Spectrum Given in Figure 2.

m/e    Composition
293    C15H19NO5
261    C14H15NO4
234    C13H16NO3
202    C8H12NO5
174    C7H12NO4
142    C6H8NO3
116    C5H10NO2
91     C7H7

These data may be the only data one can
easily obtain to determine structural information about the unknown. Many current programs operate under the assumption that these are the only data. But, of course, this is not true. In almost every instance a great deal of additional information about the unknown is available. Some of this information is factual, for example, the physical and chemical properties and the source of the materials, and isolation and derivatization procedures. Other information is judgmental; for example, knowledge of other compounds present in the same mixture, plausibility of chemical or biochemical processes which may have yielded the compound, and good intuitions. If this information can be brought to bear on the problem, it should be easier to solve.

We have developed the CONGEN program for computer-assisted structure elucidation (7). This program has a flexible mechanism for expression and use of constraints on the plausibility of certain chemical features (substructures, ring systems) (8). Although CONGEN was designed for the general problem of incorporating structural inferences from different spectroscopic techniques or other sources of chemical information, we are introducing capabilities for more thorough use of mass spectral data. The capabilities were developed initially in earlier DENDRAL work (5,9). The importance of CONGEN in problems such as that outlined above is that it gives the chemist a mechanism for exploring structural possibilities under constraints expressing his/her factual or judgmental knowledge. This knowledge can be defined, applied to a current problem, and saved for future use in related problems, as illustrated below.

Figure 1. Total ion current plot, scans 470-565, of the GC/lrms analysis of the ether/ethyl acetate extracts of an acidified human urine subsequent to a 1 hr 1 N NaOH hydrolysis. Numbers and names associated with component spectra detected by the CLEANUP program (1b) refer to the match scores and names of close library matches. Tetracosane, scan 508, is an internal standard for relative retention index calculations.

Figure 2. 70 eV low resolution mass spectrum obtained for the component eluting at scan 547 (see total ion current plot, Figure 1)
Returning to the example, it is foolhardy to consider structural possibilities in the light of the presumed molecular ion, C15H19NO5, alone. Without constraints, the number of possible structures is huge. However, knowledge that the compound is a component of a mixture of organic compounds isolated from human urine, and that the urine was subjected to basic hydrolysis prior to extraction, provides additional information which can to some extent be expressed as structural constraints (see below). More specific structural information is available from the fact that the fraction consists of ether/ethyl acetate extractable organic acids, which were subsequently esterified using diazomethane prior to gc/ms analysis. This "history" of the sample provides a tremendous reduction in the scope of possible structures. Chemists use this reasoning automatically in manual examination of structural possibilities. To be truly effective, a program must somehow provide the same capabilities when confronted with a mass spectrum. To assist chemists in making use of such reasoning, we are extending CONGEN to allow exploration of structural possibilities for an unknown
within constraints provided by the mass spectrum and by factual and judgmental knowledge supplied by the chemist. We are developing two general approaches to the use of mass spectral data. These two approaches, MSPRUNE and Mass Distribution Graphs (MDG's), mirror the classic interplay between maximum use of information in retrospective testing vs. prospective guidance (planning) toward hypothetical solutions in the problem-solving paradigm of heuristic search (5,7). In principle, the approaches are complementary. They will yield the same answers by working on a problem from two different directions. In practice we have made more progress to date on the former (see below). We will illustrate both with examples.

I. MSPRUNE - Retrospective Testing of Structural Candidates.
If it is possible to arrive at a set of structural candidates for an unknown based on constraints derived from chemical considerations, other spectroscopic data and/or characteristic ions in a mass spectrum, it should be possible to test each candidate in some detail to determine if it is capable of producing the observed spectrum. MSPRUNE makes such tests and will "prune", or reject, those structures which could not have yielded observed ions. In many problems, including the example (Figure 2, Table I), it is possible to arrive at a reasonable set of candidate structures for an unknown using available data and the following general procedure.

A. Determine the molecular weight and formula. In the example, a candidate molecular ion is found at m/e 293, of composition C15H19NO5. This ion is selected by MOLION (10), our molecular ion determination program, and makes sense chemically. The five best candidates and their rankings are given in Table II.

TABLE II. Molecular Ion Candidates and Rankings for the Spectrum Given in Figure 2.

Candidate m/e    Ranking
293              100
294              71
325              50
352              46
308              43
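As a quick consistency check on step A, the nominal mass and the rings-plus-double-bonds count of a candidate composition can be computed directly from its formula. The following is a minimal sketch, not part of MOLION or CONGEN; the helper names `parse`, `nominal_mass`, and `unsaturation` are hypothetical and chosen only for illustration.

```python
import re

# Integer (nominal) masses of the most abundant isotopes.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}

def parse(formula):
    """Turn a formula such as 'C15H19NO5' into {'C': 15, 'H': 19, 'N': 1, 'O': 5}."""
    counts = {}
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] = counts.get(elem, 0) + (int(num) if num else 1)
    return counts

def nominal_mass(formula):
    """Sum of integer isotope masses, as observed at low resolving power."""
    return sum(NOMINAL[elem] * n for elem, n in parse(formula).items())

def unsaturation(formula):
    """Rings plus double bonds of a neutral C/H/N/O formula: (2C + 2 + N - H) / 2."""
    c = parse(formula)
    return (2 * c.get("C", 0) + 2 + c.get("N", 0) - c.get("H", 0)) / 2

print(nominal_mass("C15H19NO5"))  # 293
print(unsaturation("C15H19NO5"))  # 7.0
```

For the candidate molecular ion this gives a nominal mass of 293 and seven degrees of unsaturation, which is the figure the superatom reasoning in step B has to account for.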
B. Derive superatoms (7) and constraints from available data. In the example, knowledge of the source of the sample tells us that it is an organic acid from human urine and was esterified to form methyl esters prior to gc/ms analysis. This fraction contains aromatic and aliphatic acids in addition to conjugates of these acids with basic nitrogens, such as in amino acids. We can define a set of superatoms, or building blocks, which can be used to construct structures and can be saved on a computer file for future use in related problems. Such a set of superatoms with their associated names is shown in Figure 3, where the bonds to unspecified atoms are free valences which will subsequently be bonded to other atoms including hydrogen. In the example, the abundant m/e 91 ion suggests the superatom BZ (Figure 3), with no other substituents attached to the aromatic ring. The number of oxygen atoms and degree of unsaturation suggest two methyl ester functionalities (EST, Figure 3). The single nitrogen suggests at least the part structure AMI, arising from an acid conjugate with a basic nitrogen. There are perhaps other ways to phrase this problem, and alternative assumptions, but these assumptions will suffice for illustrative purposes.

C. Generate structures under appropriate constraints from the composition of superatoms and remaining atoms.
In our example, the composition is BZ1 EST2 AMI1 C3 H5. Without constraints there are 78 structural possibilities. This list contains many implausible structures. For example, if we assume that the compound is an amino acid conjugate, then a part structure similar to ACI must be present, in fact -NH-CH-COOCH3. Implementation of this constraint leaves 16 structures.
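The bookkeeping behind a superatom composition can be checked mechanically: the superatoms plus the remaining atoms must add up to the molecular formula. The sketch below assumes elemental compositions for the three superatoms (BZ as benzyl C7H7, EST as -COOCH3, AMI as -CO-NH-); these are inferences from the text, not definitions taken from CONGEN, and `total_composition` is a hypothetical helper.

```python
from collections import Counter

# Assumed elemental content of each superatom (an illustrative reading of the text).
SUPERATOMS = {
    "BZ": Counter(C=7, H=7),             # benzyl, C6H5CH2-
    "EST": Counter(C=2, H=3, O=2),       # methyl ester, -COOCH3
    "AMI": Counter(C=1, H=1, N=1, O=1),  # amide linkage, -CO-NH-
}

def total_composition(parts):
    """Expand superatoms into elements and sum everything into one formula."""
    total = Counter()
    for name, count in parts.items():
        # Names that are not superatoms are treated as plain element symbols.
        unit = SUPERATOMS.get(name, Counter({name: 1}))
        for elem, n in unit.items():
            total[elem] += n * count
    return total

composition = {"BZ": 1, "EST": 2, "AMI": 1, "C": 3, "H": 5}
print(dict(total_composition(composition)))
```

Under these assumptions the total is C15H19NO5, the composition of the candidate molecular ion, so no atoms are left unassigned before structure generation begins.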
D. Use MSPRUNE to test the remaining structural candidates to determine which could yield key ions in the observed spectrum. MSPRUNE is an extension to CONGEN which allows interaction with the program to carry out the tests. MSPRUNE operates using the following sequence of steps:

D.1 Obtain fragmentation rules. A series of questions to the user of MSPRUNE/CONGEN elicits the
mass spectrometric fragmentation rules to be used in interpretation of the data. We are currently restricted to rules used previously in the Meta-DENDRAL program INTSUM (9). These rules include constraints on cleavage of aromatic rings, multiple bonds, more than one bond to the same atom, number of steps in a fragmentation process, hydrogen transfers and loss (or transfer) of other neutral species such as water or carbon monoxide. For the example, the constraints summarized in Table III were used. This set of constraints is particularly restrictive. The only danger in this level of restriction is that an incorrect structure may yield a simpler explanation of the spectrum than the correct structure. The correct structure in such a case may be missed.

TABLE III. Fragmentation Process Constraints Used in the Analysis of the Mass Spectrum Given in Figure 2.

Constraints                                  User Response
Allow Adjacent Breaks (a):                   No
Allow Aromatic Breaks:                       No (b)
Allow Breaks of Double or Triple Bonds:      No
Max Bonds to Break in a Single Step (c):     1
Max Steps Per Process:                       2
Max Bonds to Break in a Process:             2
Allowed H Transfers:                         -2 -1 0 1 2
Allowed Neutral Transfer:

(a) Cleavage of more than one non-hydrogen bond to the same atom.
(b) I.e., do not cleave the aromatic ring.
(c) There are no aromatic rings or multiple bonds allowed to cleave; any number >1 is meaningless because there are no other degrees of unsaturation (rings); thus every cleavage yields a fragment ion. (9)

D.2 Input mass spectral ions to be explained. The user is then asked to input the ions he/she wishes to be explained. These ions may be entered
as either nominal masses or elemental compositions. Obviously the latter form is much more effective than nominal masses because of possible compositional ambiguity. The method of selecting the ions to be used in the analysis is up to the user. In the example, we chose ions on the basis of a) high mass (intuitively of greater structural utility), and b) high abundance. Ions of low mass and low abundance have a greater chance of resulting from either several different places in the molecule or from complex processes beyond the ability of the simple rules (Table III) to explain.

D.3 Test each candidate structure to determine if it could yield ions input. All possible fragmentations allowed by the constraints are determined for each structure, using an algorithm similar to that developed for INTSUM (9). For each fragmentation the mass and composition of the resulting ion are determined, including allowed hydrogen and/or neutral transfers. A simple comparison of these ions with the ions input reveals whether or not the structure could yield all ions input. If not, the structure is rejected. If so, the structure is retained. This experimental version of MSPRUNE takes no cognizance of ion abundances in this comparison. It makes a simple existence test only.
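The existence test of step D.3 can be illustrated with a small sketch. This is not the MSPRUNE or INTSUM algorithm, which the text does not give; it is a simplified stand-in that follows the Table III settings (no aromatic, multiple-bond or adjacent breaks, at most two one-bond steps, hydrogen transfers of -2 to +2, no neutral transfers). The atom and bond encoding of candidate 1 and the function names are assumptions made for the example, and the ion compositions are those read from Table I.

```python
import re
from collections import Counter
from itertools import combinations

# Heavy atoms of candidate 1 as (element, attached H); indices in comments.
ATOMS = ([("C", 0)] + [("C", 1)] * 5                           # 0-5   phenyl ring
         + [("C", 2), ("C", 0), ("O", 0), ("N", 1)]            # 6-9   CH2, C(=O), O, NH
         + [("C", 1), ("C", 0), ("O", 0), ("O", 0), ("C", 3)]  # 10-14 CH, COOCH3
         + [("C", 2), ("C", 2)]                                # 15-16 CH2, CH2
         + [("C", 0), ("O", 0), ("O", 0), ("C", 3)])           # 17-20 COOCH3

# Bonds as (atom, atom, breakable); aromatic and double bonds are not breakable.
BONDS = [(i, (i + 1) % 6, False) for i in range(6)] + [
    (0, 6, True), (6, 7, True), (7, 8, False), (7, 9, True), (9, 10, True),
    (10, 11, True), (11, 12, False), (11, 13, True), (13, 14, True),
    (10, 15, True), (15, 16, True), (16, 17, True), (17, 18, False),
    (17, 19, True), (19, 20, True)]

def parse(formula):
    """'C7H12NO4' -> (('C', 7), ('H', 12), ('N', 1), ('O', 4))."""
    counts = Counter()
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] += int(num) if num else 1
    return tuple(sorted(counts.items()))

def components(n_atoms, bonds):
    """Connected pieces of the heavy-atom graph, each a list of atom indices."""
    adj = {i: [] for i in range(n_atoms)}
    for i, j, _ in bonds:
        adj[i].append(j)
        adj[j].append(i)
    seen, parts = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        seen.add(start)
        stack, part = [start], []
        while stack:
            a = stack.pop()
            part.append(a)
            for b in adj[a]:
                if b not in seen:
                    seen.add(b)
                    stack.append(b)
        parts.append(part)
    return parts

def predicted_ions(atoms, bonds, max_steps=2, h_shifts=(-2, -1, 0, 1, 2)):
    """Compositions reachable by breaking up to max_steps bonds, one per step."""
    breakable = [b for b in bonds if b[2]]
    ions = set()
    for k in range(max_steps + 1):
        for cut in combinations(breakable, k):
            ends = [a for i, j, _ in cut for a in (i, j)]
            if len(ends) != len(set(ends)):
                continue  # two broken bonds share an atom: adjacent break
            kept = [b for b in bonds if b not in cut]
            for part in components(len(atoms), kept):
                comp = Counter()
                for a in part:
                    comp[atoms[a][0]] += 1
                    comp["H"] += atoms[a][1]
                for dh in (h_shifts if k else (0,)):  # intact molecule: no transfer
                    ion = Counter(comp)
                    ion["H"] += dh
                    if ion["H"] >= 0:
                        ions.add(tuple(sorted((e, n) for e, n in ion.items() if n)))
    return ions

def unexplained(atoms, bonds, observed):
    """Observed compositions the structure cannot produce; empty means 'retain'."""
    ions = predicted_ions(atoms, bonds)
    return [f for f in observed if parse(f) not in ions]

OBSERVED = ["C15H19NO5", "C14H15NO4", "C13H16NO3", "C8H12NO5",
            "C7H12NO4", "C6H8NO3", "C5H10NO2", "C7H7"]
print(unexplained(ATOMS, BONDS, OBSERVED))
```

With this encoding every listed composition is reachable, so the returned list is empty and the candidate would be retained. A candidate for which the list is non-empty would be pruned. As in the text, abundances play no role; only the existence of a matching composition is tested.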
Given the ions of Table I and the constraints of Table III, the list of 16 best candidates is trimmed to five, 1-5 (see Figure 4). If the original set of 78 possibilities is tested under the above conditions for MSPRUNE, 15 structures remain; the spectrum itself is a powerful constraint on possible structures. If only nominal masses are input, the set of 16 structures is not reduced by MSPRUNE; 16 structures remain. This kind of comparison can quantitate the information content of low vs. high resolution spectra.

E. Evaluate Remaining Structures. The structures which result can be examined with the help of CONGEN to determine additional constraints or to design experiments to differentiate among the possibilities. With knowledge of human metabolic processes and the chemistry of the isolation procedure, it is easy to assign 1 (see Figure 4), phenylacetylglutamic acid dimethyl ester (6), as the correct structure. Phenylacetic acid is normally conjugated with glutamine and excreted as phenylacetylglutamine. The base-catalyzed hydrolysis converted the primary amide functionality into the observed carboxylic acid (6, see Figure 4); the
dimethyl ester was formed on subsequent derivatization. In larger problems, it is possible to use other features of CONGEN to test intuitions on possible structures. For example, in this problem where an amino acid conjugate was suspected, it was possible to test automatically every structure for the presence of one of the known amino acid skeletons. Of the five final structures (1-5, see Figure 4), only one, 1, possesses a known amino acid skeleton. Of the 16 assumed conjugates, four formally possess a glycine, two a phenylalanine, one an aspartic and one, 1, a glutamic skeleton. Possible origins of important fragmentations in the spectrum of 1 are illustrated in Scheme 1. We have no isotopic labelling data to support these suggestions.

An Application of MSPRUNE. The procedure outlined above proved extremely helpful in analysis of unknown compounds observed in a gc low resolution ms experiment. A patient exhibiting signs of mental retardation was referred to Stanford. We examined organic compounds in this patient's urine using procedures described above for the example of structure 1. In fact, these compounds were observed in the same organic acid fraction, but this time prior to any alkaline hydrolysis. The portion of the total ion current vs. scan number plot where the unknowns were observed is shown in Figure 5.
The components in question, A-C, were detected by the program CLEANUP (1b) at scans 382, 402, and 406 (Figure 5). Resolved spectra (1b) are shown in Figures 6a and 6b. The spectra bear obvious similarities; in fact the spectra at scans 402 and 406 are nearly superimposable. Gc/hrms (6) analysis of this fraction lent further evidence to support the relationship of the unknown structures; all significant ions common to the spectra possess the same elemental compositions (shown in Figure 6a for the significant, higher mass ions). The MOLION program finds m/e 207, C11H13NO3, the highest ranking molecular ion candidate for all three components. The unknowns are apparently structural isomers. Thus, they can be investigated by CONGEN and MSPRUNE in a single run. For this application, we assumed the presence of an aromatic ring, a methyl ester functionality and an amide (conjugate)
linkage (superatoms PH, EST, and AMI respectively). Constraints forbid direct connection of the amide and ester functions, and alkyl chains of two or more carbons connected to the aromatic ring. Generation of structures under these constraints yielded 52 possibilities. A number of methods were used to examine these structures. For example, automatic survey of the structural possibilities showed 6 structures which formally possess known amino acid skeletons. Use of MSPRUNE under constraints used for the previous example (Table III), and with the ions shown in Figure 6a, was not too helpful; 36 candidates remained after this test. More severe constraints were then used, specifically considering only single rather than two-step processes. This effected a dramatic reduction, leaving only four structures, 7 (see Figure 7) and the three isomers represented by 8. A possible source of each ion is shown in Scheme 2 for 8. Literature surveys revealed that 7 has been observed in the dog but never in man. Structure 8 is particularly attractive because there are three substitution isomers possible. These compounds are formally conjugates of toluic acids with glycine. We substantiated this hypothesis and proved the structures by synthesis of the three isomers.

Figure 3. A set of superatoms useful for considering structural candidates in the context of solvent extractable organic acids from human urine (derivatized to methyl esters). [Structures not reproduced; legible superatom labels: EST, OH, MEO, CO, AMI, ACI, BZ.]

Figure 4. Candidate structures for unknown represented by the mass spectrum in Figure 2. [Structures not reproduced.]

Scheme 1. [Structure not reproduced; legible ion labels: m/e 142, 174, 202, 234, 261.]

Figure 5. Total ion current plot, scans 357-426, of the solvent extractable organic acids from the urine of a patient exhibiting signs of mental retardation

Figure 6a. 70 eV low resolution mass spectrum of component A, scan 382, of the total ion current plot of Figure 5. Elemental compositions were determined by a subsequent GC/hrms experiment. [Legible peak labels: m/e 91, 118, 119, 148, 175, 207.]
The retention indices (Table IV) and mass spectra agree completely. Further investigations indicated that xylenes are excreted by the body by, first, oxidation of one of the methyl groups, and second, conjugation with glycine. Further, the relative concentrations of the three compounds closely approximate the relative amounts of ortho, meta and para isomers in commercial mixtures of xylene. The patient had somehow been exposed to quantities of xylene.

TABLE IV. Relative Retention Indexes (R.R.I.) for Unknowns A-C and Synthetic Ortho-, Meta- and Para-toluylglycines, as Determined by CLEANUP (1b).

Unknown  Scan#  R.R.I.   Synthetic Compound                  R.R.I.
A        382    2060     ortho-toluylglycine methyl ester    2058
B        402    2128     meta-toluylglycine methyl ester     2128
C        406    2141     para-toluylglycine methyl ester     2143
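The retention index comparison of Table IV amounts to a nearest-value assignment. The short sketch below is only an illustration of that comparison using the Table IV values; the function name `closest` is hypothetical, and CLEANUP's own matching procedure is not described in the text.

```python
# Relative retention indexes from Table IV.
unknowns = {"A": 2060, "B": 2128, "C": 2141}
synthetic = {"ortho": 2058, "meta": 2128, "para": 2143}

def closest(rri, references):
    """Name of the reference compound whose index lies nearest to rri."""
    return min(references, key=lambda name: abs(references[name] - rri))

for label, rri in unknowns.items():
    match = closest(rri, synthetic)
    print(label, match, abs(synthetic[match] - rri))
```

Each unknown pairs with a different synthetic isomer (A with ortho, B with meta, C with para), and no pair differs by more than two index units.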
Figure 6b. 70 eV low resolution mass spectra of components B and C (Figure 5). Elemental compositions of major ions are those given in Figure 6a. [Legible peak labels: m/e 91, 119, 148, 175, 207.]

Figure 7. Candidate structures for unknowns represented by spectra in Figures 6a and 6b. [Structures not reproduced.]

Scheme 2. [Structure not reproduced; legible ion labels: m/e 119, 175.]