9 Computer-Assisted Structure Elucidation: Modelling Chemical Reaction Sequences Used in Molecular Structure Problems1,2

Downloaded by CORNELL UNIV on August 27, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0061.ch009

TOMAS H. VARKONY, RAYMOND E. CARHART, and DENNIS H. SMITH
Departments of Chemistry, Computer Science, and Genetics, Stanford Univ., Stanford, Calif. 94305

Our research in the applications of computer techniques to chemical problems has focused on elucidation of molecular structures of unknown compounds. We have been applying problem solving methods derived from research on artificial intelligence to create a program which emulates certain phases of manual approaches to structure elucidation. This program, called "CONGEN", provides a general mechanism for assembly of chemical atoms and structural fragments inferred from any of a variety of sources. Such fragments ("superatoms") are inferred manually and subsequently supplied to the program. Statements about structural fragments and constraints on the ways in which they may be assembled are input to CONGEN using a graphical language for representation of structures. This language has important ramifications in extensions to CONGEN, as we outline subsequently.

There are, however, several other important phases of structure elucidation, all of which are amenable to computer-assistance. A representation of major milestones in typical structure elucidation problems is presented in Fig. 1. For the purposes of the subsequent discussion we consider two different categories of structure problems, both of which fit into the scheme of Fig. 1. The first category we view as the general problem of structure elucidation, wherein an unknown compound is isolated and characterized. The second category we term "mechanistic" structure elucidation. Into this category fall synthetic reactions where the precursor, or starting material, is known but the product(s) and the precise reaction pathways are not. Of course there can be some overlap between these categories, and CONGEN handles both in a similar way, but we will

188 Wipke and Howe; Computer-Assisted Organic Synthesis ACS Symposium Series; American Chemical Society: Washington, DC, 1977.


discuss them as separate topics. Thus, the "chemical history" collected (Fig. 1) may, in the former case, be actual chemical tests or reactions carried out to characterize the unknown. For mechanistic studies, the history may include a specific reaction exercised on a known structure. Data interpretation and structure assembly will be in the former case actual assembly of inferred fragments, while in the mechanistic case it usually will involve manipulations of, or slight modifications to, a known structure.

Until recently, CONGEN performed only structure assembly and some data interpretation and provided the facilities to assist examination of structural possibilities and elimination of inconsistent structures. We are now actively pursuing other elements of Fig. 1. For example, examination of structures and subsequent design of new experiments is an interesting chemical and artificial intelligence problem currently under investigation. Although actual collection of spectroscopic and chemical data is beyond the scope of our current interest insofar as programs for symbolic reasoning are concerned, the element of data interpretation is also of major importance to us.

The use of chemical transforms, or reaction sequences, in structure elucidation (Fig. 1) in both the general and mechanistic senses mentioned previously is the subject of this report. Reactions or sequences of reactions may be carried out on an unknown for several reasons. The reaction may a) test for a specific functional group; b) simplify the problem by decomposing the unknown into smaller, more easily characterizable molecules; c) modify the skeleton or functional groups to define more accurately their respective environments or make the unknown more amenable to analysis (e.g., increase its volatility); or d) unambiguously relate the unknown to a previously characterized compound.

Mechanistic studies, usually involving rearrangements or cyclizations, employ reaction sequences to help characterize reaction pathways and establish relationships among sets of related structures. The multiplicity of pathways open to such processes frequently prevents establishing structures of products without additional collection and examination of data. In such cases, the chemical transform which forms part of the chemical history can be carried out to yield the set of candidate structures


[Figure 1: flow diagram, not reproduced. Boxes: unknown compound or rearranged known compounds; chem./biol./phys. history; spectroscopy; spectra and other data; data interpretation; common and unique features; inferences and constraints; structure assembly; candidate structures; examine structures; eliminate inconsistent structures; plan new experiments (more spectroscopy; chemical transforms, reaction sequences); new structural inferences and constraints; final structures.]

Figure 1. Major milestones in the elucidation of molecular structure. Reaction sequences carried out on a known structure or on candidate structures for an unknown are the topic of this paper.


which may then be treated just as candidate structures for an unknown. However, the application of constraints on the reaction products has different meaning in mechanistic studies as contrasted to general unknowns, as we describe in the Methods section.

PURPOSE


Similarities to and Contrasts with Computer-Assisted Synthesis.

As we will illustrate in subsequent sections, our work on chemical reaction sequences has several important similarities to and contrasts with current efforts directed toward computer-assisted synthesis. The similarities fall primarily in areas of structure and reaction definition and manipulation. We share common problems of user interaction and interfacing between the outside world and the more rigidly structured domain of the computer program. Reactions and structural constraints on them must be defined and such definitions must be saved to provide a knowledge base which can be called upon by future users. Internal to the programs are common problems of perceiving important molecular features and executing the reaction by appropriate manipulations of the structure representation according to the definition of the reaction. Algorithms common to both problems include ring perception ("cycle finding"), "path finding" to determine connectivity, some form of graph matching to detect given substructures, recognition of symmetry properties of reactants or products, avoidance of or detection and elimination of duplicate structures, and representation of chemical as opposed to graph-theoretical concepts of structure, such as aromaticity.
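Two of the shared algorithms named above, path finding and ring perception, can be illustrated on a small connection table. The following Python sketch is not CONGEN code: the adjacency-dictionary representation, the function names `shortest_path` and `ring_bonds`, and the example skeleton are all assumptions made for illustration, and the ring test used (a bond is in a ring if its atoms stay connected once the bond is removed) is only one simple way to perceive rings.

```python
from collections import deque

def shortest_path(table, start, goal):
    """Breadth-first "path finding": return one shortest atom path, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == goal:
            path = []
            while atom is not None:
                path.append(atom)
                atom = previous[atom]
            return path[::-1]
        for neighbor in sorted(table[atom]):
            if neighbor not in previous:
                previous[neighbor] = atom
                queue.append(neighbor)
    return None

def ring_bonds(table):
    """Simple "cycle finding": a bond lies in a ring if its two atoms
    remain connected after the bond itself is removed."""
    found = set()
    for a in table:
        for b in table[a]:
            if a < b:
                pruned = {k: set(v) for k, v in table.items()}
                pruned[a].discard(b)
                pruned[b].discard(a)
                if shortest_path(pruned, a, b) is not None:
                    found.add((a, b))
    return found

# Hypothetical example: a three-membered ring (atoms 1, 2, 3)
# with one substituent (atom 4) on atom 1.
TABLE = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}

print(shortest_path(TABLE, 4, 2))
print(sorted(ring_bonds(TABLE)))
```

Graph matching, symmetry recognition and duplicate elimination are deliberately left out of this sketch; they need considerably more machinery than fits in a short example.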

Despite the fact that both our efforts and those of computer-assisted synthesis design involve executing a representation of a chemical reaction in the computer, there are fundamental philosophical and methodological differences, as summarized in Fig. 2 and detailed as follows:

I) A synthesis problem has a specific target molecule. The goal in developing a synthesis is to define precursors which are in some sense simpler. The precursors become the targets for the next level and the procedure is recursive until the terminating condition of sufficiently simple precursors is


[Figure 2: tree diagrams, not reproduced. Recoverable labels:
Synthesis planning. Goals: 1) target molecule; 2) finding "good" routes. Method: 1) develop tree in antithetic direction; 2) reactions are variables.
Reaction sequences (A). Goals: 1) identify unknown(s) from among reaction products; 2) identify pathways. Method: 1) develop tree in synthetic direction; 2) reactions are constants.
Reaction sequences (B). Goals: 1) identify unknown from among candidates by constraining reaction products. Method: 1) develop tree in synthetic direction; 2) reactions are constants.]

Figure 2. Contrasts between the methods and goals of development of a synthesis tree in computer-assisted synthesis and development of a reaction sequence tree in structure elucidation. We denote two applications of reaction sequences: A) conversion of known precursors into unknown compounds, a problem encountered in mechanistic studies; B) conversion of candidate structures for an unknown into a series of products, usually employed in the general problem of structure elucidation.


achieved. Reaction sequences have no predefined target molecules. The known precursors (Case A, Fig. 2) or structural candidates (Case B, Fig. 2) represent reactants. A given reaction transforms each structure which can undergo the reaction into one or more products. The products themselves may be subjected to further reaction. The goals in these reaction sequences are:

Case A) identify unknown structures in the set of products and, by so doing, elucidate the reaction pathway(s).

Case B) identify the unknown structure from among the candidates by constraints applied to the products.

II) Reaction sequences operate in the synthetic rather than the retrosynthetic, or antithetic, direction. This difference is illustrated in Fig. 2. Each expansion of the synthesis tree represents a set of reactions applied in the reverse, or antithetic, direction. An actual synthesis would proceed backwards along one path. Each expansion of the CONGEN reaction tree, however, results from the application of a single reaction (applied in the synthetic direction) on one or more starting materials.

III) Expansion of the synthesis tree is controlled by constraints on the reactions which are applied. Expansion of the reaction sequence tree in CONGEN is controlled by constraints on structures, i.e., products.

Differences (II) and (III) are reflections of the fact that synthesis programs use reactions as variables. Several reactions from a library of possible reactions may apply to any target. We consider reactions as constants. A reaction is defined (see Methods) and applied to a list of structures. The products at any level are obtained from structures at the preceding level through one or more applications of that reaction. Although many reactions may be applied to a given list of structures, leading to branches in the reaction sequence tree (see Methods), our basic task is the exhaustive exploration or evaluation, not of reactions, but of structural possibilities.
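The idea of a reaction as a constant applied to a list of structures, with each level of the tree feeding the next, can be sketched as follows. This is an illustrative Python outline, not the CONGEN implementation: structures are stand-in strings, each "reaction" is a hypothetical lookup table, and the names `expand` and `reaction_sequence` are assumptions.

```python
def expand(reaction, structures):
    """Apply one fixed reaction to every structure in a list.

    Returns a dict mapping each reactant to its list of products;
    structures that cannot undergo the reaction map to an empty list.
    """
    return {s: reaction(s) for s in structures}

def reaction_sequence(reactions, candidates):
    """Build the tree level by level: products of one step are the
    reactants of the next step."""
    levels = []
    current = list(candidates)
    for reaction in reactions:
        level = expand(reaction, current)
        levels.append(level)
        # Collect this level's products once each, preserving order.
        current = list(dict.fromkeys(p for ps in level.values() for p in ps))
    return levels

# Hypothetical product tables standing in for two defined reactions.
STEP_1 = {"A": ["A1", "A2"], "B": ["B1"], "C": []}
STEP_2 = {"A1": ["P"], "A2": [], "B1": ["P", "Q"]}

def step_1(structure):
    return STEP_1.get(structure, [])

def step_2(structure):
    return STEP_2.get(structure, [])

LEVELS = reaction_sequence([step_1, step_2], ["A", "B", "C"])
for depth, level in enumerate(LEVELS, start=1):
    print(depth, level)
```

Note that the loop varies the structures while each reaction stays fixed, which is the reverse of a synthesis planner choosing among many reactions for one target.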


Relationship of Reaction Sequences to Structure Elucidation.

In the course of determination of the structure of an unknown compound, reactions may be carried out on the unknown to gather additional structural information. In many cases such new information can be expressed directly as constraints on the possible structures for the unknown. For example, if the base catalyzed exchange of enolizable hydrogen atoms with deuterium atoms yields a new compound whose molecular weight is three amu greater than that of the unknown, then the candidate structures for the unknown can be tested directly for the presence of three hydrogens exchangeable under these conditions without considering the structure of the transformed material. In fact, one criterion for useful chemical transforms designed to yield new structural information is that observations on the resulting products be easily translated back to the starting structural possibilities.

There is, however, an important class of reactions in which the translation of observations on the products into direct constraints on the structural possibilities is difficult if not impossible. In these cases it is essential to consider the application of the reaction to each structural candidate and the relationship of these candidates to their respective products. The most common examples of this class are reactions in which a given product or set of products may be obtained from different candidate structures for the unknown. Or, stated slightly differently, the class of reactions in which there is more than one way for a given product or set of products to undergo the reverse, or antithetic, reaction. The following are some simple, but illustrative examples.


In the absence of additional information concerning the original relationships of the alkyl groups in 1 to the double bond and keto groups of 1, there are four ways of reassembling 2 and 3 (four ways of carrying out the antithetic reaction). This is an example where the functional group which was introduced is a group already present in the molecule. Thus, there is ambiguity in referring data measured on products back to the starting structures. Similar problems arise in any fragmentation reaction where more than two fragments are produced, the worst case being mass spectral fragmentations, where the fragment ions can be reassembled in many consistent ways. Transformations exemplified by 4 -> 5 and 6 -> 7 represent removal or introduction of multiple bonds where there is ambiguity, based on the structures of 5 and 7, concerning the structures of 4 and 6, respectively. The problem becomes rapidly more complex if a sequence of reactions is used on a set of candidate structures. Keeping the data and structural possibilities organized is a difficult job to do manually.

Other important members of this class of reactions include the cyclizations and rearrangements mentioned previously with respect to mechanistic reactions. Because these reactions generally have many ways to occur, in perhaps several consecutive steps, they are not normally used to help solve an unknown structure. Rather, such reactions are carried out on known materials and the problem is to determine the structures of observed products based at least in part on knowledge of the reaction itself. Carbonium ion rearrangements and cyclization reactions such as cyclization of squalene epoxide and congeners to lanosterol and related compounds are two important examples.5

We designed the reaction sequence capabilities of CONGEN to make it simple for a user of the program to carry out reactions in either the general or mechanistic category of application. For the general category, measurements made on products of a reaction can be used directly to test the products without the necessity for translation of each observation back to the starting materials. These tests have direct effects on the immediate precursors of the products and eventually on the candidate structures in a multi-step sequence; removal of one product can result in eliminating whole branches of the reaction tree (see


Methods section). The complexities of dealing with many structural candidates and several reactions and associated products are handled by the internal bookkeeping of the program.

In the case of mechanistic studies, the reaction capabilities are important in defining the alternative structures which can arise at each step of cyclization or rearrangement. This guarantees that all plausible products will be considered in deciding the outcome of such reactions.

Note that the ambiguities of relating observed products to starting materials are not removed by using a computer program. The advantage of using a program is that all alternatives will be systematically considered. It is easy to hypothesize one particular structure which obeys all observed data; the program provides a straightforward support or denial of such an hypothesis.
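The bookkeeping described above, in which a test on products propagates back to precursors and candidates, can be sketched as a recursive pruning of the reaction tree. The Python below is a hypothetical outline only. It assumes one particular rule, that a structure is discarded once none of its products survive, and it treats any structure with no recorded products as a final product to be tested; neither assumption is confirmed as CONGEN's actual procedure, and the name `prune` and the toy tree are invented for illustration.

```python
def prune(tree, node, keep_product):
    """Return True if `node` survives the product test.

    tree maps a structure to the list of its products. A structure with no
    recorded products is treated as a final product and tested directly.
    Surviving product lists are written back into `tree`.
    """
    children = tree.get(node, [])
    if not children:
        return keep_product(node)
    survivors = [child for child in children if prune(tree, child, keep_product)]
    tree[node] = survivors
    # Assumed rule: a precursor is eliminated when no product of it survives.
    return bool(survivors)

# Hypothetical two-step tree: candidates -> intermediates -> final products.
TREE = {
    "cand1": ["a", "b"],
    "cand2": ["c"],
    "a": ["p"],
    "b": ["q"],
    "c": ["q"],
}

def observed(product):
    # Stand-in for a measurement that only product "p" satisfies.
    return product == "p"

SURVIVING = [c for c in ("cand1", "cand2") if prune(TREE, c, observed)]
print(SURVIVING)
print(TREE)
```

In this toy case rejecting product "q" removes the whole branch under "cand2", while "cand1" keeps only the pathway through "a".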

METHODS

Reaction Definition.

The initial step in carrying out a reaction on a structure or group of structures is to define the reaction and all its characteristics. This includes definition of a) the reaction site, or local environment of a molecule which will be affected by the reaction; b) the transform, or the modifications of the reaction site which yield the product(s); and c) constraints on the reaction site, or features of the local or remote environment which are either necessary for the reaction to occur or will prevent it from occurring.

The definition of the reaction is under the interactive control of the user of CONGEN (unless the reaction was previously defined to his/her satisfaction and input from a library of reactions). This introduces the need for a general, flexible and simple language of molecular structure in which the reactions can be expressed. We have adopted our general structure editor (EDITSTRUC), together with the constraints mechanisms of CONGEN, for reaction definition. Thus, the language used in CONGEN to describe reactions is an easily learned extension of the language needed to construct the original set of candidate structures.


EDITSTRUC is a graphical language for structure description. Each statement about rings, chains, branches, etc., results in construction of a graphical representation of the statement, in the form of a connection table. In this respect it differs from the ALCHEM language developed by Wipke, which allows (restricted) English language statements about a reaction; statements which are subsequently compiled into an internal form used to carry out the reaction. Our graphical representation has the advantage that it can be used directly to carry out the reaction because all statements are understood by the current graph-matcher/path-finder/cycle-finder and structure manipulation functions within CONGEN. Also, the important feature of the symmetry of the reaction can be computed from these graphical representations (see below). The full complement of the structural constraints available in CONGEN can be brought to bear to describe the reaction in more detail. This includes the capability of specifying substructures of any complexity, variable length bridges or chains, arbitrary atom names or bond orders, proton distribution and ring sizes to be good or bad for the reaction. Wipke's ALCHEM language is currently more complete, because it allows specification of three dimensional properties and the use of certain Boolean connectives, relating constraints, which we do not currently have in CONGEN.

An example illustrating the interactive definition of a reaction and related constraints is presented in Fig. 3. The text typed by the user is underlined. The reaction is dehydrochlorination (8 -> 9), assumed to be carried out in basic conditions with the relatively bulky t-butoxide ion. For illustrative purposes, assume that the skeleton 10 is known, and only the placement of a chlorine atom at one of the methylene groups is in question (candidate structures then differ by this placement).


    #EDITREACT
    NAME: DEHYDROCHLORINATION
    (NEW REACTION)
    #SITE
    >CHAIN 3
    >ATNAME 1 CL
    >HRANGE 3 1 3
    >SHOW
    NAME: DEHYDROCHLORINATION
    ATOM  TYPE  ARTYPE  NEIGHBORS  HRANGE
    1     CL    NON-AR  2
    2     C     NON-AR  1 3
    3     C     NON-AR  2          1-3
    >ADRAW
    DEHYDROCHLORINATION: (HRANGES NOT INDICATED)
    CL-C-C
    >DONE
    #TRANSFORM
    >NDRAW
    DEHYDROCHLORINATION: (HRANGES NOT INDICATED)
    NON-C ATOMS: 1->CL
    1-2-3
    >UNJOIN 1 2
    >JOIN 2 3
    >ADRAW
    DEHYDROCHLORINATION: (HRANGES NOT INDICATED)
    CL C=C
    >DELATS 1
    >DONE
    #CONSTRAINTS
    >BADLIST
    BADLIST CONSTRAINTS
    CONSTRAINT NAME: CCTBU
    CONSTRAINT NAME:
    >DONE
    #SHOW
    NAME: DEHYDROCHLORINATION
    SITE:
    ATOM  TYPE  ARTYPE  NEIGHBORS  HRANGE
    1     CL    NON-AR  2
    2     C     NON-AR  1 3
    3     C     NON-AR  2          1-3
    DEHYDROCHLORINATION: (HRANGES NOT INDICATED)
    NON-C ATOMS: 1->CL
    1-2-3
    TRANSFORM:
    UNJOIN 1 2
    JOIN 2 3
    DELATS 1
    CONSTRAINTS:
    BADLIST NAME CCTBU
    #DONE
    (DEHYDROCHLORINATION DEFINED)
    (DEHYDROCHLORINATION ADDED TO THE REACTION LIST)

Figure 3. An interactive session with CONGEN including definition of the reaction site (SITE), the reaction transform (TRANSFORM) and constraints on the reaction site (CONSTRAINTS) for the example reaction, dehydrochlorination. A summary of the complete reaction is provided by the SHOW command. User responses to CONGEN are underlined (carriage-returns terminate each command).
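The three parts of a reaction definition summarized by SHOW in Fig. 3 (site, transform, constraints) map naturally onto a small record type. The Python sketch below is an assumed modern rendering, not CONGEN's internal representation: the class names `SiteAtom` and `Reaction`, the field names, and the output layout of `show` are all hypothetical; only the atom numbering, the three transform steps and the constraint name CCTBU are taken from the figure.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

@dataclass
class SiteAtom:
    element: str
    neighbors: Tuple[int, ...]
    hrange: Optional[Tuple[int, int]] = None  # (min, max) hydrogens; None = unconstrained

@dataclass
class Reaction:
    name: str
    site: dict                                   # atom number -> SiteAtom
    transform: list                              # ordered steps such as ("UNJOIN", 1, 2)
    badlist: list = field(default_factory=list)  # names of forbidden substructures

    def show(self):
        """Return a text summary loosely modelled on the SHOW output of Fig. 3."""
        lines = ["NAME: " + self.name, "SITE:"]
        for number in sorted(self.site):
            atom = self.site[number]
            hrange = "" if atom.hrange is None else " H %d-%d" % atom.hrange
            bonded = " ".join(str(n) for n in atom.neighbors)
            lines.append("  %d %s bonded to %s%s" % (number, atom.element, bonded, hrange))
        lines.append("TRANSFORM:")
        lines += ["  " + " ".join(str(part) for part in step) for step in self.transform]
        lines.append("CONSTRAINTS:")
        lines += ["  BADLIST " + name for name in self.badlist]
        return "\n".join(lines)

DEHYDROCHLORINATION = Reaction(
    name="DEHYDROCHLORINATION",
    site={
        1: SiteAtom("CL", (2,)),
        2: SiteAtom("C", (1, 3)),
        3: SiteAtom("C", (2,), (1, 3)),
    },
    transform=[("UNJOIN", 1, 2), ("JOIN", 2, 3), ("DELATS", 1)],
    badlist=["CCTBU"],
)

print(DEHYDROCHLORINATION.show())
```

Keeping the transform as an ordered list of steps that refer to site atom numbers mirrors the way the session in Fig. 3 builds the reaction from editor commands.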


Reaction Site. The reaction site represents the segment of a molecule which will be transformed. The segment includes the atoms actually involved in the reaction transformation together with any other structural features necessary for the reaction to occur. The site is defined using the appropriate EDITSTRUC commands. The reaction (8 -> 9) involves the removal of the elements of HCl from adjacent carbons. The CHAIN and ATNAME commands (Fig. 3) define the site, which is drawn for illustration (Fig. 3). The HRANGE command requires that there be from one to three hydrogen atoms on atom 3, which is obviously necessary for the elimination of HCl. (The program actually is capable of determining this itself by examination of the transform, so the HRANGE information is redundant.) The atom numbers are critical parameters for the reaction. These numbers are "sticky" in the sense that they will always be associated with the same atoms. Subsequent definition of the reaction transform itself will make explicit reference to these atom numbers.

Reaction Transform. The reaction transform is the actual series of structural modifications which, when applied to the atoms in the reaction site, yield the products. The user defines the transformation explicitly by modifications to the previously named site. The TRANSFORM command (Fig. 3) restores the actual connection table representing that site. Then, again using EDITSTRUC commands, the modifications to that site which express the reaction are defined. In the example (Fig. 3), the reaction involves loss of HCl yielding a double bond (8 -> 9), expressed as UNJOIN (break the C-Cl bond) and JOIN to form the new bond. The DELATS command deletes the chlorine atom as an inconsequential product.

Reaction Site Constraints. These constraints refer to features of the molecule (other than those in the reaction site) which affect the reaction, either positively by allowing it to occur or negatively by preventing it from occurring. These features may be in the local environment of the reaction site or may be remote, as in the case of an interfering or competing functionality elsewhere in the molecule. In the example (Fig. 3), we know that this reaction will be hindered by the existing t-butyl group in the skeleton (10). We previously defined a substructure named, arbitrarily, CCTBU as the structure 11. Placing this substructure on BADLIST is interpreted by CONGEN as "carry out the reaction transform at the site given by


the reaction site, except when CCTBU (11) is encountered".
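The three editing steps of the transform in Fig. 3 can be mimicked on a toy connection table. The Python sketch below is illustrative only: the function names echo the EDITSTRUC commands but are not CONGEN code, hydrogens are left implicit, and the rule that UNJOIN lowers a bond order by one is an assumption (the source only shows JOIN raising a single bond to a double bond and UNJOIN breaking a single bond).

```python
def unjoin(bonds, i, j):
    """Lower the bond order between atoms i and j by one (assumed semantics);
    remove the bond entirely when the order reaches zero."""
    key = frozenset((i, j))
    bonds[key] -= 1
    if bonds[key] == 0:
        del bonds[key]

def join(bonds, i, j):
    """Raise the bond order between atoms i and j by one, creating the bond if absent."""
    key = frozenset((i, j))
    bonds[key] = bonds.get(key, 0) + 1

def delats(atoms, bonds, i):
    """Delete atom i together with any bonds still attached to it."""
    del atoms[i]
    for key in [k for k in bonds if i in k]:
        del bonds[key]

def apply_transform(atoms, bonds, steps):
    """Run the steps on copies of the tables and return the edited copies."""
    atoms, bonds = dict(atoms), dict(bonds)
    for op, *args in steps:
        if op == "UNJOIN":
            unjoin(bonds, *args)
        elif op == "JOIN":
            join(bonds, *args)
        elif op == "DELATS":
            delats(atoms, bonds, *args)
        else:
            raise ValueError("unknown operation: " + op)
    return atoms, bonds

# The site of Fig. 3: atom 1 is Cl, atoms 2 and 3 are carbons (CL-C-C).
SITE_ATOMS = {1: "Cl", 2: "C", 3: "C"}
SITE_BONDS = {frozenset((1, 2)): 1, frozenset((2, 3)): 1}
STEPS = [("UNJOIN", 1, 2), ("JOIN", 2, 3), ("DELATS", 1)]

PRODUCT_ATOMS, PRODUCT_BONDS = apply_transform(SITE_ATOMS, SITE_BONDS, STEPS)
print(PRODUCT_ATOMS, PRODUCT_BONDS)
```

Because the steps refer to the "sticky" atom numbers of the site, the same list can be replayed wherever the site has been matched in a larger structure, once site numbers are mapped to the atoms of that structure.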


The SHOW command (Fig. 3) presents the user with a complete summary of the reaction in its current definition.

Carrying Out the Reaction.

Product Constraints* I n many c a s e s , c e r t a i n ways of c a r r y i n g o u t a r e a c t i o n which a r e l e g a l a c c o r d i n g t o d e f i n i t i o n s of the reaction s i t e , constraints and t h e transform y i e l d p r o d u c t s which a r e u n d e s i r e d * In the e x a m p l e ( F i g * 3 ) we wish to avoid formation of double bonds a t t h e b r i d g e h e a d s ( B r e d t ' s r u l e ) * We s u p p l y , a s a BADLIST c o n s t r a i n t , t h e name of a superatom c a l l e d BREDT w h i c h i s p r e v i o u s l y d e f i n e d a s s u b s t r u c t u r e 1 2* The s t a r r e d a t o m s r e p r e s e n t "linknodes" and a r e used t o represent a p a t h o f atoms o f a g i v e n l e n g t h o r range o f lengths* The u n s t a r r e d atoms i n substructure J_2 a r e the bridgeheads, the linknodes the three associated paths* The d o u b l e bond i n J _ 2 i s t o one o f t h e bridgehead atoms, completing an e x p r e s s i o n of the constraint• Applying t h e T r a n s form• Actual use of the reaction transform i s straightforward with the exception o f some f e a t u r e s t o a l l e v i a t e t h e p r o b l e m s o f duplication ( s e e below)* The p r o g r a m examines each structure i n the l i s t of structures t o which the r e a c t i o n i s applied f o r the presence o f the r e a c t i o n site. 
If one or more sites are found, and the structure obeys all reaction constraints, then the reaction transform is applied to the structure, once for each unique site, and a product is created for each application. Then, if the user has specified a multi-step reaction, the product molecule may be tested again for the presence of additional reaction sites and the reaction carried out again. This effectively allows us to emulate a reaction which has been carried out with a specific molar ratio of reagent to starting material; if only one mole of reagent was used, the procedure can be stopped after a single application of the transform; alternatively, it can be applied to completion (exhaustively).

Consider the reaction summarized in Figure 3 and its effects on structures 13 - 15. In structure 13 the reaction site matches twice, once at C-6,7 and once at C-7,8. The reaction is not carried out, however, because both reaction sites lie in the undesired environment represented by 11. For 14, the reaction site matches once, at C-6,12, and the reaction is carried out. But the product constraint BREDT on BADLIST (12) rejects the Bredt's rule violator 16, resulting in no products for structure 14. The reaction site fits twice in 15 (the second fitting at C-4,5), and both fittings yield products, 17 and 18, respectively.
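The order of tests described above (locate sites, check site constraints, apply the transform, check product constraints) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical and are not CONGEN commands, and the "structures" are toy strings rather than molecular graphs. The letter "A" stands in for a reaction site, a neighbouring "X" for a forbidden environment such as CCTBU, and the pattern "BB" for a forbidden product feature such as BREDT.

```python
def apply_reaction(structures, find_sites, site_allowed, transform, product_allowed):
    """Apply a one-step reaction to each structure, once per permitted site."""
    results = []
    for structure in structures:
        for site in find_sites(structure):
            if not site_allowed(structure, site):
                continue  # a site constraint blocks this particular site
            product = transform(structure, site)
            if product_allowed(product):  # product constraint
                results.append((structure, site, product))
    return results


# Toy stand-ins (assumptions for illustration only).
def find_sites(s):
    return [i for i, ch in enumerate(s) if ch == "A"]


def site_allowed(s, i):
    # Forbid a site whose immediate neighbourhood contains "X".
    return "X" not in s[max(i - 1, 0):i + 2]


def transform(s, i):
    return s[:i] + "B" + s[i + 1:]


def product_allowed(p):
    return "BB" not in p


results = apply_reaction(
    ["XAX", "CAB", "ACCA"], find_sites, site_allowed, transform, product_allowed
)
products = [p for _, _, p in results]
print(products)
```

The three toy inputs mirror the three cases in the text: the first has a site blocked by its environment, the second reacts but its product is rejected, and the third reacts at two sites and gives two products.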

Duplication Among Products of a Reaction.

When a reaction is applied to a given list of structures, it is frequently true that some product structures occur many times in the "raw" products list. In mechanistic studies, this is the desired result because each occurrence of a product represents a unique reaction pathway (see Results and Discussion). In structure elucidation studies, though, the important information is the chemical identity of, not the pathways to, each product, and in such applications it is necessary to eliminate duplicate structures. This is not a simple matter because, although structures are chemically equivalent, their representations within the program may be different (e.g., the atoms may be numbered differently from one representation of a structure to the next).

One method of duplicate elimination which avoids costly atom-by-atom structure comparisons between all pairs of structures involves casting each product into a standard representation ("canonical form"). Duplicates can then be detected easily by direct comparison of these representations. The canonicalization process is relatively time-consuming, though, and it is desirable to explore more efficient methods of duplicate elimination wherever possible.

One type of duplication which can be detected without recourse to canonicalization is symmetry duplication, which can arise either when the reaction itself possesses some symmetry or when the starting structure is symmetrical. Our graph-matching algorithm, which is responsible for locating possible fittings of the reaction site within a molecule, takes no account of symmetry. For example, suppose the reaction is the addition of one mole of hydrogen to an alkene. The reaction site here is 19 and the transformation is 19 -> 20.

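[Structures 19 - 30. 19: C=C (atoms 1 and 2); 20: C-C (atoms 1 and 2); 26: C-C-OH (atoms 1 and 2); 28: three-membered ring of C, (C N) and N; 29: three-membered ring of C, C and N; 30: three-membered ring of C, N and N.]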


Because atoms 1 and 2 are equivalent in both the reaction site 19 and the transformed site (20), the reaction has two-fold symmetry. If the reacting structure is 21, which itself has a two-fold symmetry plane, then the four matchings 22 - 25 all yield the same product, cyclohexene. These four matchings are members of an equivalence class determined by the symmetries of the reaction and the reacting molecule. If the reaction were unsymmetrical, say with a transform of 19 -> 26 (this would be a hydration reaction), then there would be two equivalence classes among the matchings 22 - 25, one containing 22 and 23 (each of these would lead to cyclohexen-3-ol) and one containing 24 and 25 (each yielding cyclohexen-4-ol). If the reacting molecule also had an unsymmetrical structure, say 27, then each matching would constitute a separate one-element equivalence class, and four distinct structures would result.

The general problem, then, is to eliminate all but one member of each equivalence class present in the complete set of matchings. This is a form of the so-called double coset problem of combinatorial mathematics, which has been discussed previously in the context of constructive graph labeling.
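The equivalence classes in this example can be illustrated with a short Python sketch. It is not the CONGEN implementation: the tuple encoding of matchings and symmetries, the function names, and the use of ordinary lexicographic tuple order as the ordering criterion are all assumptions made for illustration. A matching is kept only when none of its images under the reaction and molecule symmetries is smaller than the matching itself, so exactly one member of each class survives.

```python
from itertools import product


def symmetry_images(matching, site_symmetries, molecule_symmetries):
    """Yield every image of a matching under the two symmetry groups.

    matching[i] is the molecule atom matched to site atom i.
    A site symmetry r permutes site atoms and a molecule symmetry g
    permutes molecule atoms; both are tuples used as lookup tables.
    """
    for r, g in product(site_symmetries, molecule_symmetries):
        yield tuple(g[matching[r[i]]] for i in range(len(matching)))


def keep_representatives(matchings, site_symmetries, molecule_symmetries):
    """Keep a matching only if no symmetry image is smaller than it."""
    return [
        m for m in matchings
        if all(m <= image
               for image in symmetry_images(m, site_symmetries, molecule_symmetries))
    ]


# Hypothetical six-membered ring, atoms 0-5, with double bonds 0=1 and 2=3.
# Its mirror symmetry swaps 0<->3, 1<->2 and 4<->5.
identity6 = (0, 1, 2, 3, 4, 5)
mirror = (3, 2, 1, 0, 5, 4)
matchings = [(0, 1), (1, 0), (2, 3), (3, 2)]

symmetric_site = [(0, 1), (1, 0)]  # the two site atoms are interchangeable
unsymmetric_site = [(0, 1)]        # identity only

one_class = keep_representatives(matchings, symmetric_site, [identity6, mirror])
two_classes = keep_representatives(matchings, unsymmetric_site, [identity6, mirror])
four_classes = keep_representatives(matchings, unsymmetric_site, [identity6])
print(one_class, two_classes, four_classes)
```

With a symmetric site and a symmetric molecule one representative remains; with an unsymmetric site two remain; with neither symmetry all four matchings are distinct, in line with the three cases discussed above.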
Our solution consists of two parts. First, before the matchings are calculated, a criterion is defined for ordering any set of matchings. This criterion provides for the comparison of two matchings and, based upon the correspondence of reference numbers of the atoms in the reacting structure to reference numbers in the reaction site, defines one matching to be "smaller" than the other. Second, as each matching is obtained, the symmetry groups of both the reaction and the reacting molecule are used to form all possible symmetry images of the matching. If none of these symmetry images is "smaller" than the matching, it is kept as the representative of its equivalence class. Otherwise it is discarded as being a duplicate of some representative elsewhere in the complete set of matchings.

The symmetry of the reacting molecule is a property of its structure and can be computed prior to the matching. The symmetry of the reaction depends upon the properties (e.g., atom names, allowable ranges of hydrogens) and interconnections of the "key" atoms in the reaction site, and upon the transform modifications. Here, "key" atoms are those atoms which are actually altered by the application of the transform to the site, as opposed to atoms which are necessary to, but do not directly participate in, the reaction. In some cases the properties of these key atoms, or the orders of the bonds between them, cover a range of possibilities as, for example, in the use of a special atom name which will match any non-hydrogen atom, or of a bond order of "any" which will match a bond of any multiplicity. In such cases it is not always possible to compute an overall symmetry for the reaction as a separate entity, but only the symmetry in the context of a particular matching. For example, consider the hypothetical reaction site 28, where the "polyname" (C N) represents an atom which can be either C or N. This site really represents two possibilities, 29 and 30, each of which has two-fold symmetry. Only after a matching has been obtained can it be determined which of the two possibilities pertains and thus which symmetry is appropriate. In these cases it is possible at least to define, before matching, a set of possible reaction symmetries which may be applicable. Then for each matching it is necessary only to make the appropriate selection from this set, not to recompute the reaction symmetry completely.

Figure 4. Development and pruning of a reaction sequence tree. Candidate structures are 31-39. Interconnecting lines and the size of a structure convey information on the fate of each candidate. Broken lines pointing to small structures mean that the product(s) and its predecessor(s) are invalid and would be removed by constraints. Regular lines mean the product(s) and its associated candidate structure remain after reaction 1; medium lines connect products and associated structures which are viable after reaction 2; and heavy lines indicate the products from the one structure, 38, which survives after all constraints are applied.

Development, Indexing and Pruning of the Reaction Sequence Tree.

A reaction sequence may be of arbitrary complexity. A convenient representation for describing a reaction sequence is a tree structure. We illustrate the development and indexing of a reaction sequence tree in Figure 4. We assume for this example that there are nine candidate structures (31 - 39) for an unknown which is a 1,1,-cycloheptane diol, possessing no gem-diol functionality. In the example (Fig. 4) we present the results (and their ultimate consequences) of the application of two reactions in a stepwise manner, a single-step oxidation (reaction 1) followed by a dehydration (reaction 2). A third reaction, exhaustive dehydration, is also applied to the set of candidate structures (31 - 39).

A reaction sequence tree has several important features. Branching of the tree occurs whenever more than one reaction is applied to a single set of structures (or products), e.g., reactions 1 and 3, Fig. 4. There is not necessarily a one-to-one correspondence of structures to products. First, a given reaction may produce more than one product from a given structure, either 1) because the reaction site applies more than once to the structure (e.g., two products are produced from each of 31, 34 and 36 - 39 by reaction 1); or 2) because it is a fragmentation reaction. Second, it may be possible to obtain the same product from two different structures (e.g., 61, 62 and 64 are each produced from the same reaction applied to two different structures, one of which is 54 in the case of 62 and 40 in the case of 64).

It is possible to develop the complete reaction sequence tree by applying a planned series of reactions to the candidate structures before any laboratory work is actually done. In real applications, however, the tree would be developed in a stepwise manner by carrying out a reaction in the laboratory, acquiring data on the products and then turning to CONGEN to explore the implications of this information. We attempt to illustrate what is a very dynamic process with the static form of Figure 4.

Each reaction yields a set of products which are indexed by pointers to their precursors. These pointers are maintained at each step in the expansion of the tree so that information (constraints) applied to products at any level automatically results in appropriate action (pruning away undesired structures) at all levels below and above the given level.

There are several types of constraints which can be applied to structures in the reaction sequence tree. One constraint is a permitted range (minimum to maximum) for the number of products. In the laboratory, oxidation (reaction 1) of the unknown structure yielded two structures. Applying the oxidation to the set of candidate structures (31 - 39) yields two products from each structure except 32, 33 and 35, which are, therefore, rejected as candidates by CONGEN. (The structures which are single products of 32, 33 and 35 produced by CONGEN prior to application of the constraint are 42, 43 and 46, respectively.) With no further constraints, the remaining candidate structures are still viable.

Any of the existing structural constraints in CONGEN can be applied to products at any step. In the laboratory, the two products of reaction 1 were separated and each subjected to a dehydration (reaction 2). A major component obtained from each product was an α,β-unsaturated ketone. Applying a GOODLIST¹² constraint expressing this observation, the reaction 2 products numbered up to 60 are pruned away by CONGEN, leaving only the α,β-unsaturated ketones 61 - 64. This pruning also results in the rejection of five products at the previous level because they did not yield α,β-unsaturated ketones. Rejecting these leads, in turn, to rejection of 31, 34 and 36 as candidate structures because their products of reaction 1 did not both yield an α,β-unsaturated ketone. This leaves only 37 - 39 as candidate structures.
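The upward propagation of a product constraint described above can be sketched as follows. This is a minimal illustration, not CONGEN's data structure: the dictionaries, the labels and the `is_good` predicate are hypothetical. A candidate is retained only if every one of its reaction 1 products has at least one reaction 2 product that satisfies the GOODLIST-style test.

```python
def surviving_candidates(r1_products, r2_products, is_good):
    """Return candidates whose reaction 1 products all lead to a 'good' product.

    r1_products maps candidate -> list of reaction 1 products.
    r2_products maps reaction 1 product -> list of reaction 2 products.
    """
    survivors = []
    for candidate, first_products in r1_products.items():
        if first_products and all(
            any(is_good(p2) for p2 in r2_products.get(p1, ()))
            for p1 in first_products
        ):
            survivors.append(candidate)
    return survivors


# Hypothetical labels; "enone" marks an alpha,beta-unsaturated ketone.
r1_products = {"cand_a": ["k1", "k2"], "cand_b": ["k3", "k4"]}
r2_products = {
    "k1": ["enone_1"],
    "k2": ["enone_2", "alkene_1"],
    "k3": ["enone_3"],
    "k4": ["alkene_2"],
}

survivors = surviving_candidates(
    r1_products, r2_products, lambda name: name.startswith("enone")
)
print(survivors)
```

Here `cand_b` is rejected because one of its two oxidation products gives no acceptable dehydration product, which is the same reasoning used to reject candidates in the example.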


Reactions can also be carried out exhaustively by repetitive application of the reaction until there are no more reaction sites remaining in the molecule. This is illustrated in the example by reaction 3, an exhaustive dehydration. In the laboratory, this reaction yielded three different dienes. Without further elaboration of the structures of these products, the correct structure can be assigned as 38 because 37 and 39 yield only one product (the same one, 66). In carrying out the reaction with CONGEN, 65 and 67 are rejected by a BADLIST constraint forbidding allenes. Products 66, 68 and 69 are produced from 38, the final structure.

When reactions are relatively simple and well understood, the kind of pruning described above can be used. As long as one has confidence in the reaction proceeding as defined, then one can use the predicted results as powerful constraints on the products and all interrelated structures in the tree. Otherwise, there would be no grounds for rejecting a structure.

A characteristic of mechanistic studies, however, is that the general direction of the reaction is known, but in insufficient detail to rule out a multiplicity of products. Otherwise one could predict the products a priori and there would be no problem. In addition, of course, one begins with a single, known structure and it makes no sense to use the pruning mechanism described above. When we prune the reaction sequence tree for a mechanistic problem we prune out individual pathways. Rejection of a particular product in the tree prunes away all structures which point only to it or to which only it points. If a structure has another source or an alternative fate, it is retained. In the process of focusing in on the structures of unknown products of cyclizations or rearrangements we also focus in on the possible pathways of formation.
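The pathway pruning rule for mechanistic problems can be sketched in Python as follows. The edge-list representation and the function name are assumptions for illustration, not CONGEN's internal form. Each edge points from a precursor to a product; rejecting a product removes it, then removes any structure that has lost its only source or its only fate, and repeats until nothing more changes.

```python
def prune_pathways(edges, rejected):
    """Remove a rejected product and every structure left without a source or fate.

    edges is an iterable of (precursor, product) pairs.
    Returns the surviving nodes and the surviving edges.
    """
    edges = set(edges)
    had_source = {child for _, child in edges}
    had_fate = {parent for parent, _ in edges}
    nodes = had_source | had_fate
    removed = set()
    work = [rejected]
    while work:
        node = work.pop()
        if node in removed:
            continue
        removed.add(node)
        edges = {(p, c) for p, c in edges if p != node and c != node}
        has_source = {c for _, c in edges}
        has_fate = {p for p, _ in edges}
        for other in nodes - removed:
            lost_source = other in had_source and other not in has_source
            lost_fate = other in had_fate and other not in has_fate
            if lost_source or lost_fate:
                work.append(other)
    return nodes - removed, edges


# Hypothetical tree: S is the starting material, P and Q are final products.
tree = [("S", "A"), ("S", "B"), ("A", "P"), ("B", "P"), ("B", "Q")]

after_q, _ = prune_pathways(tree, "Q")
after_p, remaining = prune_pathways(tree, "P")
print(sorted(after_q), sorted(after_p))
```

Rejecting Q leaves B in place because B has an alternative fate (P); rejecting P removes A, whose only fate was P, but keeps B and Q.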

RESULTS AND DISCUSSION

A. An Example of Application of Reaction Sequences to Mechanistic Problems. There are at least two ways to apply the reaction sequence capabilities of CONGEN to mechanistic problems. Some of the procedures discussed subsequently can be carried out with current computer-assisted synthesis programs whenever a single compound represents the starting point.

One application of reaction sequences involves detailed mechanistic studies of possible rearrangements of a particular compound. If a mechanism is to be elucidated in detail, it is insufficient to know merely that one compound can convert to another. One must also know the identity of each atom involved and its fate in the reaction. This is normally followed by tagging various atoms in the starting material with isotopic or substituent labels. This requirement is translated in the computer program into the facility for retaining structures (at the same level in the tree) which are formally duplicates but which are in fact different in terms of the numbering of the atoms (see Methods section). Using this approach the fate of each atom at each step of a sequence can be traced. We think that this capability will help a user to choose the best places for labelling the starting material, based on the computer's simulation of the course of a reaction.

Consider, as a brief example, possible 1,2-alkyl shifts in structure 70, under constraints forbidding formation of 3- and 4-membered rings and methyl groups. Although there are only three new structures (in the absence of labelling) produced in the first step, there are eight different ways of performing the shift to yield four pairs of formally equivalent structures, two of which, 70a and 70b, are formally the same as the starting material (70) but have different numberings. Each of the three new skeletons appears as a pair of structures (71a,b, 72a,b and 73a,b). The members of each pair could be distinguished by appropriate labelling of 70. Because there is a one-to-one correspondence between the atom numbers of the products and the starting material, 70, it is simple to visualize the course of each rearrangement.
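The distinction between formally duplicate structures and distinct atom numberings can be illustrated with a small Python sketch. The skeleton labels and numbering tuples below are invented placeholders, not the actual numberings of 70a - 73b; the point is only that a mechanistic run keeps every (skeleton, numbering) pair, while a structure elucidation run would keep one entry per skeleton.

```python
from collections import defaultdict


def group_by_skeleton(raw_products):
    """Group raw products, given as (skeleton, numbering) pairs, by skeleton."""
    groups = defaultdict(list)
    for skeleton, numbering in raw_products:
        groups[skeleton].append(numbering)
    return dict(groups)


# Eight hypothetical raw products: four skeletons, two numberings each.
raw_products = [
    ("skeleton_0", (1, 2, 3, 4)), ("skeleton_0", (4, 3, 2, 1)),
    ("skeleton_1", (1, 3, 2, 4)), ("skeleton_1", (4, 2, 3, 1)),
    ("skeleton_2", (2, 1, 3, 4)), ("skeleton_2", (3, 4, 2, 1)),
    ("skeleton_3", (1, 2, 4, 3)), ("skeleton_3", (4, 3, 1, 2)),
]

pathways = group_by_skeleton(raw_products)   # mechanistic view: all numberings kept
unique_skeletons = sorted(pathways)          # elucidation view: duplicates merged
print(len(raw_products), len(unique_skeletons))
```

Grouping in this way keeps the pathway information available while still showing how many chemically distinct products there are.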


Figure 5. The eight structures, unique in terms of the numbering of their atoms, are produced by a single 1,2-alkyl shift of 70. Transitions involving bridgehead carbonium ions were allowed, but three- and four-membered rings were forbidden.

A second application of reaction sequences to mechanistic studies involves reactions where one needs to explore possible products and interconversion pathways, but without regard to preserving identities of atoms. An example of this type of reaction is the classic problem of the interconversion of C10H16 isomers of adamantane. This problem has been the subject of several recent articles using computer-based approaches to help elucidate the course of various interconversions. A characteristic of such problems is that they involve reactions which are run to completion because the reaction and associated conditions do not allow stopping the reaction after a precise number of steps have taken place. Generally, cyclizations take place until further plausible sites for cyclization are exhausted; rearrangements take place until there is no change in the ratios of products.

The reaction capabilities of CONGEN can model such reactions. Cyclizations usually involve only a small number of steps, so they present no special problem. Rearrangements, however, can proceed indefinitely if no stopping condition is specified. In the program we carry reactions to completion by applying the reaction one step at a time, stopping when no new structures, compared to all those produced previously, are encountered. Any further steps would be circular and yield no new products or pathways.
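The stopping condition just described can be sketched as a simple closure computation in Python. This is an illustrative sketch under assumptions: `one_step` and `canonical` are hypothetical callables supplied by the caller, and the toy example uses strings in place of molecular structures, treating a string and its reverse as the same "structure" to stand in for duplicate detection by canonical form.

```python
def run_to_completion(start, one_step, canonical=lambda s: s):
    """Apply a one-step reaction repeatedly until no new structures appear."""
    seen = {canonical(start): start}
    frontier = [start]
    while frontier:
        new_structures = []
        for structure in frontier:
            for product in one_step(structure):
                key = canonical(product)
                if key not in seen:  # compare against everything produced so far
                    seen[key] = product
                    new_structures.append(product)
        frontier = new_structures  # an empty frontier ends the loop
    return list(seen.values())


# Toy "rearrangement": swap any two adjacent characters.
def swap_adjacent(s):
    return [s[:i] + s[i + 1] + s[i] + s[i + 2:] for i in range(len(s) - 1)]


# Toy canonical form: a string and its reverse count as the same structure.
def canonical_string(s):
    return min(s, s[::-1])


all_structures = run_to_completion("abc", swap_adjacent, canonical_string)
print(all_structures)
```

Because every new product is checked against all earlier ones, the loop terminates as soon as a step yields nothing new, even though the toy rearrangement could otherwise be applied indefinitely.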


Figure 6. Complete interconversion map for C10H16 ring systems devoid of three- and four-membered rings. Pathways involving no bridgehead carbonium ions (solid lines); pathways involving bridgehead carbonium ions (dashed lines).

Because five structures (71, 74, 76 - 78) were missing from the original set of adamantane isomers considered by Whitlock and Siefkin, completion of their interconversion map represents a good example of a mechanistic application. Although it is possible to do this problem as outlined above, beginning with a specific precursor and running the reaction to completion, in fact there is a much simpler way to develop the complete interconversion map. Under the structural constraints presented, there are 21 possible isomers. Whenever the complete set of possibilities is available, the complete interconversion map can be generated by subjecting all possibilities (21 isomers) to one step of the reaction, in this case a 1,2-alkyl shift. The reverse reaction is implicit in this step and all possible pathways from one structure to others are established. The complete interconversion map is shown in Figure 6.

Earlier work demonstrated that conversion of tetrahydrodicyclopentadiene (70) to adamantane (87) was not possible without invoking formation of a bridgehead carbonium ion. The existence of additional structures (71, 74, 76 - 78) which lie in the path of conversion of 70 to adamantane (87) meant that this question had to be reinvestigated. We carried out the above rearrangement reaction under the constraint of no formation of bridgehead carbonium ions, using as definitions of bridgeheads those selected previously. The results are depicted in Figure 6. We obtain a four-component map if the dashed lines (pathways involving bridgehead carbonium ions) are removed. One component is 78, the second is 74, the third is 70 - 73 and 75 - 77, and the fourth is 79 - 90. Although our complete results indicate that there are alternative pathways from 70 to adamantane (87) not considered previously, the conclusion of Whitlock and Siefkin that at least one bridgehead carbonium ion is required for the conversion, under the given structural constraints, is verified.
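The component analysis of an interconversion map can be illustrated with a short Python sketch. The map below is a hypothetical five-node example, not the C10H16 map of Figure 6, and the function name is an assumption. Each one-step interconversion is recorded as an undirected edge flagged by whether it involves a bridgehead carbonium ion; dropping the flagged edges and finding connected components shows which structures remain mutually reachable.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical map: (isomer, isomer, involves_bridgehead_ion).
steps = [("P", "Q", False), ("Q", "R", True), ("R", "S", False)]
isomers = ["P", "Q", "R", "S", "T"]

full_map = connected_components(isomers, [(a, b) for a, b, _ in steps])
restricted = connected_components(
    isomers, [(a, b) for a, b, bridgehead in steps if not bridgehead]
)
print(full_map, restricted)
```

In this toy map P and S are connected only through the flagged step, so they fall into different components once that step is forbidden; the same test applied to the real map is what separates 70 from 87.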

B. An Example of Application of Reaction Sequences to a Structure Elucidation Problem. The structure elucidation of coriolin (whose proposed structure is 91), a sesquiterpene antibiotic, represents a problem where structural information inferred from chemical reactions played a crucial role in the tentative solution. Although it is possible in this case to translate all structural inferences derived from observations on the reaction products back to constraints on the complete set of structural possibilities, it is difficult to do using the constraints mechanism in CONGEN. In fact it is much simpler and much more intuitive, chemically, to use the reaction sequence features of CONGEN to express the reaction, obtain products, test the products with constraints and have the program automatically determine which candidate structures are plausible as a result.

Extensive spectroscopic data revealed that coriolin has an empirical formula C15H20O5 and is composed of five structural fragments, 92 - 96. These fragments comprise all of the atoms in the empirical formula, so that the free valences (bonds with an unspecified terminus) of 92 - 96 must all be connected to other, non-hydrogen atoms.

The structural analysis of coriolin using CONGEN and reaction sequence information provides some interesting examples of the different ways both substructural and chemical inferences can be utilized to help solve the problem. If chemical experiments have already been carried out in the laboratory, then frequently some of the inferences can be used in CONGEN while constructing structures. For example, based on superatoms 92 - 96 with the constraint of no additional multiple bonds, there are more than 800 possible structures. But the chemical evidence reveals that the structure possesses at least one four-, one five- and one six-membered ring. Even though the precise environment of these rings cannot easily be specified until the reaction sequence is carried out in CONGEN, the number of rings of each size can be used as a constraint, resulting in 56 structural candidates prior to chemical reactions. The NMR data do not reveal the presence of cyclopropyl hydrogens. This constraint further reduces the number of possibilities to 52.


In the laboratory the following sequence of reactions was carried out:

CORIOLIN --(H2)--> DIHYDROCORIOLIN --(LiAlH4)--> HEXAHYDROCORIOLIN --(CrO3)--> triketone

[Structures 97 and 98: the epoxide reaction site (97, atoms 1 and 2) and its ring-opened hydroxyl transform (98).]

The first reaction reduced the ketone functionality in coriolin (5J.) to an alcohol. Carrying out this reaction in CONGEN yields the expected 52 products. The second reaction opened the two epoxide functionalities, yielding two new hydroxyl groups, both of which are tertiary. This allows a reaction site and transform to be defined which express this observation (2J. -> 23.), a more efficient procedure than opening both epoxides both ways followed by pruning the product list. The HRANGE restriction (i.e., no hydrogens, or H0) on atom 1 of 3J_ results in forcing the epoxide to open to a tertiary alcohol (£8.). The expected 52 products are obtained in CONGEN.

The final reaction resulted in oxidation of the three secondary alcohol functionalities to keto groups. Spectroscopic data suggested that one of the keto groups was in a four-membered ring, one in a five- and one in a six-membered ring. This constraint can be invoked on the original candidate structures for coriolin only by a complicated case analysis on the possible environment(s) of the original ketone functionality. As a constraint on the products of the final oxidation, however, it is a straightforward test whose ramifications in terms of structural candidates are determined automatically by CONGEN. This reaction, plus constraints, leaves 20 structural candidates, the proposed structure (JLL) and 19 others.

We have examined these structures (automatically, using CONGEN) for the presence of naturally-occurring tricyclic sesquiterpene skeletons because it was this reasoning by analogy which led to the proposal of 5_1 for coriolin. Structure 5_L is the only one of the candidates which possesses a known skeleton. In the group of 20 structures there are five, including 5J_, which obey a head-to-tail isoprene rule. The other four structures are - 102. We have recently investigated the scope of isomerism of terpenoid systems and find many examples in the literature where additional structural possibilities exist but where structural assignment is based on analogy with known systems. Although there may be good reasons for using analogy, no new terpenoid skeletons will be discovered this way.
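The way a constraint on the products of the final oxidation prunes the original candidate list can be sketched in a few lines. This is only a minimal illustration in Python under loud assumptions, not CONGEN itself: the `Structure` encoding (each secondary alcohol reduced to the size of the ring that holds it), the function names, and the three toy candidates are all hypothetical and are not taken from the coriolin problem.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Structure:
    """Toy stand-in for a candidate structure (hypothetical encoding)."""
    name: str
    # Size of the ring containing each secondary alcohol carbon.
    sec_alcohol_rings: Tuple[int, ...] = ()
    # Size of the ring containing each keto group.
    ketone_rings: Tuple[int, ...] = ()


def oxidize_secondary_alcohols(s: Structure) -> List[Structure]:
    """Toy transform: every secondary alcohol becomes a keto group.

    Returns a list because a real transform may give several products.
    """
    product = Structure(
        name=s.name + "-ox",
        sec_alcohol_rings=(),
        ketone_rings=s.ketone_rings + s.sec_alcohol_rings,
    )
    return [product]


def keto_rings_match(p: Structure, required: Tuple[int, ...] = (4, 5, 6)) -> bool:
    """Constraint on a product: one keto group per required ring size."""
    return sorted(p.ketone_rings) == sorted(required)


def prune_by_products(
    candidates: List[Structure],
    reaction: Callable[[Structure], List[Structure]],
    constraint: Callable[[Structure], bool],
) -> List[Structure]:
    """Keep a candidate only if at least one of its products passes."""
    return [c for c in candidates if any(constraint(p) for p in reaction(c))]


# Three invented candidates, for illustration only.
candidates = [
    Structure("A", sec_alcohol_rings=(4, 5, 6)),
    Structure("B", sec_alcohol_rings=(5, 5, 6)),
    Structure("C", sec_alcohol_rings=(6, 4, 5)),
]

survivors = prune_by_products(candidates, oxidize_secondary_alcohols, keto_rings_match)
print([c.name for c in survivors])  # prints ['A', 'C']
```

The test is written once, against the products, and the bookkeeping that maps a failed product back to its starting candidate is left to the pruning routine. No case analysis on the environments of the original functional groups is needed.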

CONCLUSIONS

We have presented an approach which is capable of emulating many of the laboratory applications of sequences of chemical reactions. Integrated with the capabilities of the CONGEN program to suggest sets of candidate structures for an unknown compound, this approach has significantly increased the power of the program to assist chemists in solving structure elucidation problems. We have used some brief but illustrative examples to show different applications of reaction sequences to both mechanistic and structural problems. As computer programs to aid in structure elucidation develop further capabilities and become more widely available, we feel that they will be utilized exactly as other analytical tools are used. CONGEN provides a means for verifying hypotheses about unknown structures and suggesting alternatives which might otherwise be overlooked.


Our inability to utilize stereochemical information is a shortcoming, but it is not a severe problem for many applications of reaction sequences. Most chemical reactions utilized to provide additional structural, as opposed to stereochemical, information are designed to have broad application. The reactions must apply in a variety of possible situations because the environments of the functionalities involved are usually not precisely defined. In addition, detailed relative or absolute stereochemical information is seldom available until considerable detail of the structure is known; reactions cannot under these circumstances be sensitive to stereochemistry.

LITERATURE CITED

1) Part XXIII of the series "Applications of Artificial Intelligence for Chemical Inference". For Part XXII see B.G. Buchanan, D.H. Smith, W.C. White, R.J. Gritter, E.A. Feigenbaum, J. Lederberg, and C. Djerassi, J. Amer. Chem. Soc., in press.
2) We wish to thank the National Institutes of Health (RR00612-05) for their generous financial support of the SUMEX computing resource (RR00785-03) on which CONGEN was developed and is made available to outside users.
3) R.E. Carhart, D.H. Smith, H. Brown, and C. Djerassi, J. Amer. Chem. Soc., 97, 5755 (1975).
4) E.J. Corey and W.T. Wipke, Science, 166, 178 (1969).
5) "Carbonium Ions", Vol. I-IV, G.A. Olah and P.v.R. Schleyer, eds., Wiley Interscience, New York, N.Y. (1968-1973).
6) E.E. van Tamelen, Accts. Chem. Res., 8, 152 (1975).
7) A document describing the CONGEN program, including EDITSTRUC and all its associated structure manipulation capabilities, is available from the authors.
8) R.E. Carhart and D.H. Smith, Computers in Chemistry, in press.
9) W.T. Wipke and T.M. Dyott, J. Amer. Chem. Soc., 96, 4825 (1974).
10) H.L. Morgan, J. Chem. Doc., 107 (1965).
11) L.M. Masinter, N.S. Sridharan, R.E. Carhart, and D.H. Smith, J. Amer. Chem. Soc., 96, 7714 (1974).
12) This example is fictional and is used for illustrative purposes only. There are probably easier ways to distinguish the candidate structures in practice. Only structural isomers, as opposed to geometric isomers, are presented.


13) H.W. Whitlock and M.W. Siefken, J. Amer. Chem. Soc., 90, 4929 (1968).
14) E.M. Engler, M. Farcasiu, A. Sevin, J.M. Cense, and P.v.R. Schleyer, J. Amer. Chem. Soc., 95, 5769 (1973).
15) R.E. Carhart, D.H. Smith, H. Brown, and N.S. Sridharan, J. Chem. Inf. Comp. Sci., 15, 124 (1975).


16) S. Takahashi, H. Iunuma, T. Takita, K. Maeda, and H. Umezawa, Tet. Lett., 4663 (1969).
17) T.K. Devon and A.I. Scott, "Handbook of Naturally Occurring Compounds. Vol. II. Terpenes," Academic Press, Inc., New York, N.Y., 1972.
18) D.H. Smith and R.E. Carhart, Tetrahedron, in press.
