Computer-Assisted Structure Elucidation

1 3 C NMR spectra combine to yield an apparently limitless .... 0 1 1 3 0. 1 0. 2 0. 1 0 and not as. CH3 CH2 CH2 C((CH(CH3 )CH2 CH3 )CH2 CH3 )CH3. 0 1...
9 downloads 0 Views 929KB Size
6 Computerized Structural Predictions from

C NMR

13

Spectra

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

HENRY L. SURPRENANT and CHARLES N . REILLEY Department of Chemistry, University of North Carolina, Chapel Hill, NC 27514

The wide chemical shift range and narrow line width of C NMR spectra combine to yield an apparently limitless number of spectra. For organic molecules, C spectra are unique finger-printing tools and in an effort to understand C chemical shift patterns, classes of compounds have been studied and spectral features correlated to structural changes within the class. These correlations are usually cast in the form of linear additivity parameters and have been largely successful for alkanes (1, 2) and monofunctional molecules, such as amines (3, 4) and alcohols (5). An unassisted human cannot effectively consider the enormous number of possibilities that exist for molecular structures and C spectra. A number of attempts have been made involving the use of computers to predict structures directly from C spectra. Burlingame, McPherson and Wilson (6) have correctly predicted the structures of seven acyclic saturated hydrocarbons up to empirical formula of C20H42. These compounds were not in the original data set used for the parameterization (2). Their procedure involved (a) nonredundantly generating codes for the structure of each geometrical isomer, (b) checking the code for the proper multiplicity (i.e., the correct number of primary, secondary, tertiary and quaternary carbons, information which can be obtained from an off-resonance decoupling experiment), (c) predicting the spectrum from the code using additivity parameters, and (d) comparing predicted and observed spectra, saving the best matches. Because their program could rapidly generate the structural codes, but was slow in computing the spectrum from that code, a multiplicity f i l t e r was employed to minimize computation time. An off-resonance decoupling experiment is particularly time consuming and i t s requirement is a major drawback of their approach. Although identification of seven structures is far from exhaustive proof of the u t i l i t y , success in these cases i s nevertheless 13

13

13

13

13

77

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

COMPUTER-ASSISTED STRUCTURE E L U C I D A T I O N

78

encouraging, p a r t i c u l a r l y i n l i g h t o f the s m a l l number of available spectra. A more e f f i c i e n t approach was i n v e s t i g a t e d by Carhart and D j e r a s s i (7_) i n the a n a l y s i s o f C s p e c t r a of amines. T h e i r program (a) enumerated the p o s s i b l e s u b s t r u c t u r e s t h a t could account f o r the observed chemical s h i f t o f each l i n e , (b) comb i n e d the s u b s t r u c t u r e s i n t o l a r g e r fragments, "pruning" i l l o g i c a l and mismatching fragments from the t r e e of p o s s i b l e s t r u c t u r e s , and (c) generated the spectrum f o r each p o s s i b l e s t r u c t u r e and compared these w i t h t h a t o f the unknown. A l l o f the 100 amine s p e c t r a s t u d i e d , w i t h the e x c e p t i o n o f s i x , were unambiguously i d e n t i f i e d , and even i n those s i x cases the c o r r e c t s t r u c t u r e was always the best or second best match. An i n t e r e s t i n g feature of t h e i r program was i t s a b i l i t y t o p r e d i c t s t r u c t u r e s from chemical s h i f t i n f o r m a t i o n a l o n e , not r e q u i r i n g knowledge of the number o f carbon atoms a s s o c i a t e d w i t h each peak. The C s t r u c t u r a l a n a l y s i s problem was examined by J e z l and Dalrymple (8_) using a c l a s s i c a l search system approach. While p o l y f u n c t i o n a l and c y c l i c s t r u c t u r e s can r e a d i l y be handled w i t h i n t h e i r search system, the a b i l i t y t o decipher s t r u c t u r e s of compounds not present i n t h e i r data f i l e i s , o f course, l i m i t e d . This paper summarizes a recent i n v e s t i g a t i o n (9_) o f the uniqueness o f C s p e c t r a and how u n c e r t a i n t i e s i n s p e c t r a l i n f o r m a t i o n a f f e c t the confidence i n the p r e d i c t e d s t r u c t u r e s . In a d d i t i o n , a method f o r h a n d l i n g f u n c t i o n a l groups as d e r i v a t i v e s of hydrocarbons has been developed, and a search based system f o r hydrocarbon, amine, a l c o h o l , and e t h e r molecules i s demonstrated. Four FORTRAN programs have been w r i t t e n t o accomplish these g o a l s . A l l programs were executed on a Modular Computer Systems Model 11/25 minicomputer. More program i n f o r m a t i o n i s a v a i l a b l e elsewhere (9_).

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

13

13

13

Spectra Generation and Uniqueness Tests A d d i t i v i t y Parameters. L i n e a r a d d i t i v i t y schemes have been used i n various forms i n many s c i e n t i f i c f i e l d s . The a p p l i c a t i o n t o C NMR chemical s h i f t s began w i t h Grant and P a u l s (1_) study o f s a t u r a t e d a c y c l i c hydrocarbons. T h e i r scheme showed s i g n i f i c a n t d e v i a t i o n s f o r h i g h l y branched a l k a n e s , and was m o d i f i e d by Lindeman and Adams (20 who a l s o expanded the observed data f i l e . This p a r a m e t e r i z a t i o n scheme has been used as a s t a r t i n g b a s i s f o r i n c l u d i n g the e f f e c t s of the presence o f f u n c t i o n a l groups such as amines (_3, MO. The standard e r r o r i n the hydrocarbon p a r a m e t e r i z a t i o n was o n l y 0.79 ppm i n d i c a t i n g a h i g h degree o f r e l i a b i l i t y o f the p r e d i c t e d s p e c t r a . This r e l i a b i l i t y and the l a r g e s i z e o f t h e i r o r i g i n a l data set were the primary reasons f o r the choice o f hydrocarbons f o r t h i s study. 13

T

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

6.

SURPRENANT AND REILLEY

Predictions

from

C NMR

13

Spectra

79

S t r u c t u r e Generation. The generation o f a l l t h e nonredundant g e o m e t r i c a l isomers f o r C H2n+2 hydrocarbons was accomplished using program ISOMER. To help e n v i s i o n the magnitude o f the problem an enumeration o f the number o f isomers vs. carbon number i s l i s t e d i n Table I . Because i t was not p o s s i b l e t o check every s t r u c t u r e code by hand, ISOMER was assumed t o execute p r o p e r l y because the t o t a l number o f isomers matched e x a c t l y w i t h previous enumerations (10). Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

n

Table I .

Number o f Geometrical Isomers: C H2n+2 n

n

No. o f Isomers

n

No. o f Isomers

1 2 3 4 5 6 7 8 9 10

1 1 1 2 3 5 9 18 46 75

11 12 13 14 15 16 17 18 19 20

159 355 802 1,858 4,347 10,359 24,894 60,523 148,284 366,319

Figure 1 g i v e s a flow chart o f the o p e r a t i o n o f ISOMER. For a given number o f carbon atoms, ISOMER f i r s t generates a m u l t i p l i c i t y (number o f primary, secondary, t e r t i a r y and quaternary carbons) and checks the v a l i d i t y o f the m u l t i p l i c i t y . A l i n e n o t a t i o n i s generated f o r t h a t m u l t i p l i c i t y , and t h a t l i n e n o t a t i o n s t o r e d on a disk f i l e . The l i n e n o t a t i o n i s then m o d i f i e d attempting t o form new g e o m e t r i c a l isomers o f t h e same m u l t i p l i c i t y . Each nonredundant isomer i s a l s o s t o r e d on the disk f i l e . Because only f o u r types o f carbons occur i n a s a t u r a t e d hydrocarbon, only two b i t s are necessary t o encode each atom: Code 00 01 10 11

= = = =

Meaning 0 1 2 3

= = = =

CH ) = CH f o r f i r s t and l a s t CH CH( C(( 3

3

2

The parentheses are used t o e l i m i n a t e ambiguity o f s t r u c t u r e s around branches. A methyl group, which i s a chain t e r m i n a t o r , ends the most recent chain s t a r t e d , not the f i r s t c h a i n . This n o t a t i o n i s analogous t o an a r i t h m e t i c e x p r e s s i o n i n which the contents w i t h i n the innermost parentheses are evaluated

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

COMPUTER-ASSISTED STRUCTURE E L U C I D A T I O N

0

ENTER -#• OF CARBONS

./GENERATE "y

7_

MULTIPLICITY/^-

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

LAST MULTIPLICITY^—YES - ^ E X I ^

/

GENERATE LINE NOTATION, BRANCHES/ PUSHED TO RIGHT /

->(ST0RE LINE NOTATION)

7

REORDER ATOMS TO (A). RIGHT OF MOVED BRANCH

/ REORGANIZE NOTATION (B) TO MINIMUM NUMERICAL V A L U E / WITHOUT BREAKING BONDS /

Analytical Chemistry

ure 1. Flowchart for program ISOMER (9)

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

6.

SURPRENANT AND REILLEY

Predictions

from

C

13

NMR

Spectra

81

p r i o r t o any outer parentheses. For example, 3-methyl-4ethylheptane would be encoded as CH CH2CH2CH(CH2CH3)CH(CH3)CH2CH 3

0 1 1 2 1 0

2 0

3

1 0

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

Another r u l e i s t h a t the arrangement o f chains a f t e r a branch should give the lowest numerical code. For example, 3,4dimethyl-4-ethylheptane would be encoded: CH CH CH C((CH )CH2CH )CH(CH )CH CH 3

2

2

3

0 1 1 3 0

3

3

1 0

2 0

2

3

1 0

and not as CH CH CH C((CH(CH )CH CH )CH CH )CH 3

2

2

3

0 1 1 3 2 0

2

1 0

3

2

1 0

3

3

0

The s t r u c t u r e s are s t o r e d bit-compressed, 2 b i t s p e r atom, 8 atoms p e r 1 6 - b i t word. Spectra Generation. The l i n e n o t a t i o n s t r u c t u r e codes from ISOMER are r e t r i e v e d as w e l l as program SPECGEN which then computes t h e expected C s p e c t r a on the b a s i s o f Lindeman and Adams a d d i t i v i t y r e l a t i o n s h i p s (2). The flow chart f o r SPECGEN i s given i n Figure 2. Each l i n e n o t a t i o n i s r e a d , unpacked, and processed t o determine the number o f bonds which separate each atom from every other atom i n the molecule. On the b a s i s o f t h i s i n f o r m a t i o n , a r e l a t i v e l y s t r a i g h t f o r w a r d a l g o r i t h m can be c o n s t r u c t e d t o add up proper constants according t o the a d d i t i v i t y parameter r u l e s and thereby compute the p r e d i c t e d spectrum. A 1 ppm. bar graph showing the chemical s h i f t d i s t r i b u t i o n f o r the hydrocarbons C2 through i s presented i n Figure 3. The f i n e s t r u c t u r e observed i n the d i s t r i b u t i o n i s i n t e r e s t i n g and probably stems from r e g u l a r i t i e s inherent i n hydrocarbon molecular s t r u c t u r e s . While the chemical s h i f t s f o r methyl groups f a l l i n a d i f f e r e n t r e g i o n from those f o r methylene groups, the e x t e n s i v e overlap prevents d e f i n i t i v e d i f f e r e n t i a t i o n between the two simply on the b a s i s o f t h e i r chemical s h i f t s . 13

1

1 3

Uniqueness Tests. In t h i s study, a C spectrum was considered unique when, w i t h each o f i t s l i n e s d e v i a t i n g by a given amount from the p r e d i c t e d v a l u e , the proper s t r u c t u r e was s t i l l the best matching and non-unique otherwise. We found t h a t , i n almost every i n s t a n c e when the proper s t r u c t u r e was not the best matching, i t was the second best matching. The matching procedure was simple when the number o f l i n e s i n

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

COMPUTER-ASSISTED

STRUCTURE

ELUCIDATION

0

READ DISC DIRECTORY

READ NUMBER OF CARBONS

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

- ^ R E A D LINE NOTATION^ / UNPACK 7 / L I N E NOTATION / CREATE POSITION MATRIX i.e. Atom i is 2 Bonds from Atom 3, etc. READ POSITION MATRIX AND PROCESS ACCORDING TO PARAMETERIZATION> RULES TO GIVE SPECTRUM

Analytical Chemistry

Figure

2.

Flowchart

for program

SPECGEN

(9)

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

SURPRENANT A N D REILLEV

Predictions

from

C

NMR

13

Spectra

Total

15,000 #

of

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

carbons

60

50

40

30

20

10

CH

1

1

60

1

50

40

IM

30

3

10

20

0

CHo

1— 60

ML 50

40

30

20

10

CH

60

50

40

20

rri-rfTmfhhp 60

50

40 1 3

Figure

3.

Computed

C

Chemical

r~

~i

30

30 Shifts,

10

1

1

20

10

ppm v s .

2

r 0

TMS

distribution of the number of carbons having C chemical shifts, C —C 13

0

various

16

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

COMPUTER-ASSISTED

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

84

STRUCTURE

ELUCIDATION

the r e f e r e n c e and unknown s p e c t r a were equal. The chemical s h i f t s o f both the r e f e r e n c e and the unknown s p e c t r a were ordered from low t o h i g h , and the sum o f the squares o f the d e v i a t i o n s computed. When the r e f e r e n c e spectrum had more l i n e s than the unknown, the chemical s h i f t s were again o r d e r e d , but t o determine which o f the r e f e r e n c e l i n e s should be omitted i n the comparison c a l c u l a t i o n i n v o l v e d more t e s t i n g . To o b t a i n the minimum sum o f the squares o f the d e v i a t i o n s each combination o f r e f e r e n c e l i n e s was t e s t e d against the unknown spectrum. These f u n c t i o n s were implemented by the program COMPARE. The r e s u l t s o f comparisons o f the s p e c t r a l f i l e w i t h normal "unknown" s p e c t r a are l i s t e d i n Table I I . The "unknowns" were the p r e d i c t e d s p e c t r a f o r each isomer but where each l i n e o f the "unknown" spectrum was s h i f t e d by a g i v e n amount, A. The term "normal" r e f e r s t o those proton-decoupled C s p e c t r a w i t h a l l chemical s h i f t s and number o f carbons ( v i a i n t e n s i t i e s ) c o r r e c t l y determined. As i n d i c a t e d i n Table I I u n t i l the number o f carbon atoms becomes l a r g e o r a s i g n i f i c a n t e r r o r e x i s t s i n the chemical s h i f t , the c o r r e c t s t r u c t u r e was p r e d i c t e d . For l a r g e molecules, C13 and above, the number o f wrong matches i n c r e a s e d . Here, i t may be necessary t o perform an off-resonance decoupling experiment t o determine the m u l t i p l i c i t i e s . 1 3

13

Table I I . Normal C Spectra

No. o f Carbons No. o f Isomers O.lia 0.22 0.33 0.44 0.55 0.66 0.77 0.88 0.99 1.10 1.21 1.32 1.43 1.54 1.65 a

% Wrong Matches 9

10

11

12

13

14

35 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

75 0 0 0 0 0 0 0 0 0 0 0 1.3 1.3 2.6 2.6

159 0 0 0 0 0 0 0 0 0 0 1.9 2.5 4.4 6.9 8.8

355 0 0 0 0 0 0 0 0 0.3 0.6 1.7 2.5 5.9 11.0 20.6

802 0 0 0 0 0 0 0.2 0.5 1.4 2.5 4.5 9.7 15.8 26.7 37.9

1858 0 0 0 0 0 0.1 0.5 0.9 2.1 4.6 11.1 20.1 32.4 44.8 56.6

V e r t i c a l column shows A, d e v i a t i o n o f each l i n e , (ppm).

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

6.

SURPRENANT AND REILLEY

Predictions

from

C

13

NMR

85

Spectra

E x c e l l e n t confidence can be p l a c e d i n the p r e d i c t e d s t r u c t u r e , even f o r C13-C20 molecules, i f the m u l t i p l i c i t y i s known. This i s evidenced by the s t r i k i n g l y low percentage o f wrong matches, even f o r s p e c t r a whose chemical s h i f t s are q u i t e d e v i a n t ; see Table I I I .

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

Table I I I .

13

C

No. o f Carbons No. o f Isomers 0.11* 0.22 0.33 0.44 0.55 0.66 0.77 0.88 0.99 1.10 1.21 1.32 1.43 1.54 1.65 a

Spectra w i t h Known M u l t i p l i c i t i e s a Wronjg Matches 12

13

14

355 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

802 0 0 0 0 0 0 0 0 0 0 0 0 .12 0 .12 0 .12 0 .12

1858 0 0 0 0 0 0 0.05 0.05 0.05 0.05 0.11 0.16 0.16 0.22 0.48

V e r t i c a l column shows A, d e v i a t i o n o f each l i n e ,

ppm.

The values given i n Tables I I and I I I assume, however, that i n t e n s i t y i n f o r m a t i o n was s u f f i c i e n t l y accurate t o permit c o r r e c t determination o f the number o f carbons under each peak, and t h a t each peak i s observable. Because o f the v a r i a b i l i t y of the n u c l e a r Overhauser enhancement (NOE) and wide range o f r e l a x a t i o n t i m e s , i n a c c u r a t e area measurements are the r u l e and not the exception w i t h t y p i c a l C s p e c t r a . Indeed, carbons w i t h no attached protons are o f t e n not observed. Two a d d i t i o n a l uniqueness s t u d i e s were c a r r i e d out t o study the influence of inaccurate i n t e n s i t y information. As i n d i c a t e d i n Table IV where i t i s assumed t h a t quaternary carbons were not observed, confidence i n p r e d i c t i o n drops c o n s i d e r a b l y . Even more dramatic i s the drop i n confidence i f i n t e n s i t y i n f o r m a t i o n i s s u f f i c i e n t l y poor so t h a t the number o f carbons f o r a given peak cannot be assigned and thus l i n e s f o r e q u i v a l e n t carbons cannot be d i s t i n g u i s h e d from other l i n e s , Table V. These r e s u l t s p o i n t t o p o t e n t i a l problems a r i s i n g from the manner i n which s p e c t r a are obtained. High pulse r e p e t i t i o n r a t e s without e i t h e r gated decoupling o r 13

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

COMPUTER-ASSISTED

86

STRUCTURE

ELUCIDATION

p u r p o s e f u l a d d i t i o n o f r e l a x a t i o n reagents o f t e n reduce the i n t e n s i t y o f quaternary carbons and d i s t o r t areas f o r other carbons.

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

Table IV.

1 3

C Spectra w i t h Unobserved Quaternary Carbons

0. Wrong Matches 0 No. o f Carbons 12 8 9 10 11 No. o f 18 35 355 Isomers 75 159 O.lia 0 0 0 0 0 0.22 0 0 0 0 0 0.33 0 0 0 0 0 0.44 0 0 0 0 0 0.55 0.5 0 0 0 0 0.66 0 0 0.5 0 0 0.77 0 0 0.5 0 0 0.88 0 0.9 0 0 1 .1 0.99 0 0 5.5 0 3 .2 1.10 0 0 10.5 0 5 .4 0 0 15.0 0 5 .4 1.21 1.32 0 5.9 22.7 5.3 12 .9 1.43 0 5.9 7.9 30.5 22 .6 1.54 0 11.8 39.5 15.8 26 .9 1.65 0 17.6 52.7 21.0 31 .2 Number o f isomers ; c o n t a i n i n g quaternary carbon(s 7 17 38 93 220 a

13

14

802 0 0 0 0 0.2 0.4 1.1 3.4 10.4 19.0 29.6 42.8 52.1 61.3 69.8 ) 537

1858 0 0 0 0.2 0.5 1.5 4.1 7.1 17.8 30.3 44.7 57.5 67.8 75.9 81.6 1306

V e r t i c a l column shows A, d e v i a t i o n of each l i n e , ppm.

The conclusions are obvious. In order t o i d e n t i f y l a r g e molecules n e a r l y unambiguously, C s p e c t r a must be obtained without s a t u r a t i n g quaternary resonances, w i t h minimum NOE e f f e c t s , and perhaps w i t h off-resonance decoupling. I t i s a l s o a d v i s a b l e t o use the same temperature and s o l v e n t c o n d i t i o n s u t i l i z e d i n o b t a i n i n g the data base employed i n t h e parameterization. 1 3

Hydrocarbon

Based

1 3

C Search System 1 3

Program MATCH i s a hydrocarbon based C search system designed t o handle s a t u r a t e d a c y c l i c hydrocarbons and r e l a t e d monofunctional amines, a l c o h o l s , and e t h e r s . The number o f C peaks observed, the type o f f u n c t i o n a l group search d e s i r e d , and the observed chemical s h i f t s are the i n p u t s t o the program. Correct i n t e n s i t y i n f o r m a t i o n i s i m p l i e d by i n p u t t i n g the proper number o f resonances. MATCH outputs the t e n best s t r u c t u r e s and the assignment o f the spectrum. In Figure 4 the o u t l i n e o f the program i s d e p i c t e d . 1 3

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

SURPRENANT A N D REILLEY

Predictions

from

C

13

NMR

Spectra

( Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

E N T E R N NUMBER OF CARBONS ) FUNCTIONAL GROUP CODEJ f ENTER N (CHEMICAL SHIFTS ) V^OF UNKNOWN J

__^/READ STRUCTURE^ FROM FILE J /UNPACK STRUCTURE"^

GENERATE REFERENCE SPECTRUM / /COMPARE REFERENCE / A N D UNKNOWN SPECTRA

/

7

f OUTPUT TEN BEST "\ \STRUCTURES AND ASSIGNMENTS/ Analytical Chemistry

Figure

4. Flowchart

for program

MATCH

(9)

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

COMPUTER-ASSISTED S T R U C T U R E E L U C I D A T I O N

88

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

Table V.

No. o f Carbons No. o f Isomers O.lia 0.22 0.33 0.44 0.55 0.66 0.77 0.88 0.99 1.10 1.21 1.32 1.43 1.54 1.65 Number of a

Equivalent Carbons Treated as One Carbon % Wrong Matches 13

14

802 355 9 18 75 159 35 3.0 0 2.0 4.8 0 0 5.8 4.1 0 3.4 0 6.0 0 7.2 6.9 4.6 0 0 4.8 9.4 7.2 0 6.2 0 9.4 7.9 6.1 8.7 0 8.0 5.9 7.2 9.1 9.4 13.0 0 5.9 9.5 12.1 12.2 9.4 13.0 0 17.6 15.6 15.9 10.9 16.0 18.8 0 17.6 15.6 20.3 12.9 20.5 29.2 0 23.5 18.8 24.6 20.4 27.5 38.3 0 23.5 21.9 26.1 23.8 34.7 49.2 12.5 23.5 21.9 31.9 30.6 43.2 58.3 12.5 23.5 21.9 39.1 37.4 50.5 65.4 12.5 29.4 25.0 44.9 44.9 56.5 71.7 12.5 29.4 31.3 49.3 53.1 63.4 77.2 25.0 35.3 46.9 50.7 57.1 69.2 80.9 isomers c o n t a i n i n g equivalent: carbons 8 760 17 331 32 69 147

1858 1.8 2.6 4.2 7.3 9.1 15.2 23.8 36.8 49.2 60.0 67.9 75.8 81.6 86.0 88.8

7

8

9

10

11

12

1779

V e r t i c a l column shows A, d e v i a t i o n o f each l i n e , ppm.

MATCH reads the f i l e o f encoded hydrocarbon s t r u c t u r e s and computes the expected spectrum f o r each s t r u c t u r e , and then compares each spectrum with t h a t o f the unknown. The storage o f the compressed s t r u c t u r e codes, but not t h e a s s o c i a t e d s p e c t r a , saves s i g n i f i c a n t amounts o f storage without undue degradation o f searching speed. Storage o f the expected hydrocarbon s p e c t r a would double the searching speed f o r hydrocarbons, but would not change the speed f o r f u n c t i o n a l group searches. The hydrocarbons l i s t e d i n Table V I , i n a d d i t i o n t o t h e 69 hydrocarbons u t i l i z e d i n forming the parameters data (2_) were c o r r e c t l y i d e n t i f i e d by MATCH u s i n g t h e i r chemical s h i f t s and i n t e n s i t y i n f o r m a t i o n . To handle f u n c t i o n a l groups, MATCH simply searches t h a t s t r u c t u r e f i l e f o r hydrocarbons which i s one carbon l a r g e r than the molecule o f i n t e r e s t . Nonequivalent carbons i n the s t r u c t u r e s are then determined and the f u n c t i o n a l group r e p l a c e s one o f the nonequivalent carbons t o o b t a i n one new molecule. Because a given hydrocarbon isomer can have more than one nonequivalent carbon, more than one isomer o f the f u n c t i o n a l groups s u b s t i t u t e d molecule i s obtained from each hydrocarbon. F o r example, a l l the monofunctional C^s a l c o h o l i s o m e r i c s t r u c t u r e s can be d e r i v e d from the C^g hydrocarbon

In Computer-Assisted Structure Elucidation; Smith, Dennis H.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.

6.

SURPRENANT AND REILLEY

Predictions

from

C NMR

13

89

Spectra

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on January 30, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0054.ch006

Table V I . Hydrocarbons I d e n t i f i e d by Program MATCH Number o f Possible Isomers 75 75 75 75 159 355 355 10,359 10,359

Number o f Carbons 10 10 10 10 Cn C12 Cl2 Cl6 Cl6

Unknown Compound n-Decane 2,7-Dimethyloctane 4-n-Propylheptane 2,4,6-Trimethylheptane 2,8-Dimethylnonane 2,5,8-Trimethylnonane 2,9-Dimethyldecane n-Hexadecane 2,5,8,11-Tetramethyldodecane

c

c

c

c

f i l e , by r e p l a c i n g , one a t a t i m e , each nonequivalent methyl group w i t h a h y d r o x y l . S i m i l a r l y , f o r ethers nonequivalent methylenes are r e p l a c e d w i t h an oxygen. Some o f the f u n c t i o n a l groups t h a t can be handled i n t h i s manner are shown i n Table V I I . Once the i s o m e r i c s t r u c t u r e s are o b t a i n e d , the s p e c t r a can q u i c k l y be generated u s i n g the a d d i t i v i t y parameters f o r t h a t f u n c t i o n a l group ( 4 , 5 ) . Table V I I . Extension o f Hydrocarbon Scheme t o Cover F u n c t i o n a l Groups Base Hydrocarbon:

R"

R"

(

R

R-CH3

1

R-ICHI-R

R-{CJ-R

t

••(CHU-R' Functional Groups:

-NH2

-NH-

-OH

-0-

-C00H

-S-

R

>