gives all the spanning trees incorporating those atoms with their original links intact. If two linked atoms are removed from M, one enumerates all acyclic precursors containing the bond which links them. The normal procedure for evaluating all 011removes any one atom, which is tantamount to counting all acyclic precursors containing that one atom, i.e., all acyclic precursors. If all atoms are removed from matrix M, the evaluation is taken as 1.
(24) As noted in ref 19, cleavages which remove carbons not ultimately incorporated in the skeleton (usually as CO2) are not included in the grid; they may be regarded as functionalizing (or defunctionalizing) reactions. Thus the acceptable bonds indicated for cleavages in this discussion constitute a new ring, and their cleavage is a horizontal ring-opening line on the grid (Δr = -1; Δk = 0).
(25) Nearly 30 of the 100 syntheses in ref 4 exhibit such indirect routes in which a skeletal bond, not in the product but useful at an early stage of the sequence, is ultimately cleaved. As examples, in Corey's caryophyllene (p 70), a large ring is formed by cleaving a more accessible bicycle; in the Syntex cecropia hormone synthesis (p 79), two cleavages of a bicycle to an acyclic skeleton are used to create stereochemical control; in Johnson's progesterone (p 288), two ring sizes are changed at the same time by cleavage and recyclization (cf. Figure 6).
(26) It would be misleading, for example, to consider the Barbier-Wieland degradation as an affixation of two six-carbon skeletal synthons followed by cleavage of a 13-carbon unit. The present conception sees it as merely a functionalization of R-CH2COOR → R-COOH in which the only skeletal carbons are R-C, i.e., those appearing in the final product.
(27) The idea was proposed for picrotoxin biosynthesis years ago by H. Conroy.
Systematic Synthesis Design. IV. Numerical Codification of Construction Reactions
James B. Hendrickson
Contribution from the Edison Chemistry Laboratories, Brandeis University, Waltham, Massachusetts 02154. Received January 22, 1974
Abstract: A simple but rigorous system of codification for construction reactions is developed from structural fundamentals, free of mechanistic preconception. The system allows all constructions to be represented with a numerical representation of the involved functionality and skeletal requirements of substrate and product and their interrelation. The scheme is valuable in systematic searching for synthetic routes as well as in cataloging construction reactions and developing new ones.
An essential requirement for the development of systematic synthesis design must be a simple but rigorous numerical codification of the reactions used. Such a system must be free from prejudice about present capabilities or reaction yields. This paper develops such a system for construction reactions from the numerical characterization of structure previously presented.¹

That constructions are the central reactions of synthesis may be seen from consideration of the ideal synthesis. The ideal synthesis creates a complex skeleton from simpler starting materials² and so must link several such synthon molecules via construction reactions. Ideally, the synthesis would start from available small molecules so functionalized as to allow constructions linking them together directly, in a sequence only of successive construction reactions involving no intermediary refunctionalizations, and leading directly to the structure of the target, not only its skeleton but also its correctly placed functionality. If available, such a synthesis would be the most economical, and it would contain only construction reactions.

The previous paper in this issue³ develops mathematically the enumeration of the possible modes of construction of target skeletons. Here the actual chemistry which can be used to effect these constructions will be codified to define all possibilities in terms of their related substrate and product functionalities. Restrictive preconceptions about reaction mechanism are avoided in this development in favor of the more neutral and general conception of the net structural change occurring in any reaction.

The net structural change at any single carbon site was previously characterized¹ in terms of four kinds of attachment to that carbon: H for hydrogen, R for σ bond to carbon, Π for π bond to carbon, Z for any bond to heteroatom.
In any reaction, the change from one attachment to another was characterized by two letters, the first showing the bond made, the second showing that broken. Thus, of the 16 possible reactions so characterized, the construction reactions are RH, RZ, and RΠ,⁴ with respect to either one of the two carbons forming the carbon-carbon σ bond. A construction requires two partners, the linking carbon of each being characterized by RH, RZ, or RΠ, and these show oxidation state changes of Δx = +1, -1, and 0, respectively.¹ The RΠ construction necessarily changes the character of at least one other carbon as well, the other carbon of the π bond undergoing addition, and the oxidation state changes of all must be added to find the net change (Δx) for RΠ constructions. Thus the net change in RΠ constructions is always Δx = ±1. (For C=C → R-C-C-Z, RΠ·ZΠ, Δx = +1 but, for C=C-C-Z → R-C-C=C, RΠ·ΠΠ·ΠZ, Δx = -1.)

The overall oxidation state change (the sum of both involved components) can be either oxidative or reductive, or isohypsic,¹ with ΣΔx = +2, -2, or 0, respectively. Oxidative and reductive couplings, however, are rarely useful in synthesis since they are only effective for creating symmetrical dimers in intermolecular reactions (although they can unite dissimilar functionalities in cyclizations). The present treatment largely focuses on isohypsic constructions of one oxidative and one reductive partner. Each partner in a construction will be categorized by reaction type as RH, RZ, or RΠ, depending on the change at the carbon forming the construction link.

The numerical characterization¹ concerns the numbers of each kind of attachment to a single carbon, as summarized in Figure 1. The skeletal value (σ) shows the number of σ bonds to other carbons, i.e., σ = 0-4, and the functional value (f) shows the functionality level at that carbon site, f = 0-4. Since f = Π + Z, the sum of functional π bonds to adjacent carbon and the number (Z) of heteroatom bonds, a distinction is made by placing one or two bars over an f value to denote the number (Π) of π bonds to adjacent carbon. Thus an enol ether carbon is f = $\bar{2}$, the same functional level as the parent ketone (f = 2), and a chloroacetylene carbon is f = $\bar{\bar{3}}$, while a dichlorovinyl carbon is f = $\bar{3}$, both at the functional level of carboxyl, f = 3.

Journal of the American Chemical Society / 97:20 / October 1, 1975
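The oxidation-state bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `HALF_REACTION_DX`, `classify`, and `possible_classes` are hypothetical, and the Δx values are the ones quoted in the text (RH = +1, RZ = -1, RΠ = +1 or -1 once all affected carbons are summed).

```python
# Net oxidation-state change (delta x) per half-reaction type, as quoted in
# the text. "RPi" stands for the R-Pi type; its net change is +1 or -1.
# All names here are illustrative assumptions, not notation from the paper.
HALF_REACTION_DX = {"RH": (+1,), "RZ": (-1,), "RPi": (+1, -1)}


def classify(total_dx: int) -> str:
    """Name a construction from the summed delta x of its two half-reactions."""
    if total_dx > 0:
        return "oxidative"
    if total_dx < 0:
        return "reductive"
    return "isohypsic"


def possible_classes(type_a: str, type_b: str) -> list[str]:
    """All classes reachable when pairing two half-reaction types."""
    totals = {a + b
              for a in HALF_REACTION_DX[type_a]
              for b in HALF_REACTION_DX[type_b]}
    return sorted(classify(t) for t in totals)


print(possible_classes("RH", "RZ"))    # one oxidative and one reductive partner
print(possible_classes("RH", "RH"))    # two oxidative partners
print(possible_classes("RPi", "RPi"))  # depends on the sign of each RPi half
```

Pairing RH with RZ gives only the isohypsic case, which is the combination the treatment concentrates on.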
Type number   Skeleton (σ)         Functionality (f = Π + Z)
0             σ = 0                f = 0   hydrocarbon
1             σ = 1  primary       f = 1   -X, -OH, -NH2
2             σ = 2  secondary     f = 2   ketone/aldehyde and derivs.
3             σ = 3  tertiary      f = 3   -CN, -COOH, derivs.
4             σ = 4  quaternary    f = 4   CO2 and derivs.

Σ = 4 = h + σ + f; σ + f = 4 - h
Oxidation state, x = z - h
Figure 1. Summary of single carbon characteristics.
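As a minimal sketch of the single-carbon characterization in Figure 1, the four attachment counts can be held in a small record from which f and x follow. The class name `CarbonSite` and its field names are assumptions made for illustration; the relations f = Π + Z, x = z - h, and the requirement that the four counts total 4 are taken from the text and figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CarbonSite:
    """Attachment counts at one carbon (hypothetical helper, not from the paper)."""
    h: int      # bonds to hydrogen
    sigma: int  # sigma bonds to other carbons (skeletal value)
    pi: int     # pi bonds to adjacent carbon
    z: int      # bonds to heteroatoms

    def __post_init__(self):
        if min(self.h, self.sigma, self.pi, self.z) < 0:
            raise ValueError("attachment counts must be non-negative")
        if self.h + self.sigma + self.pi + self.z != 4:
            raise ValueError("the four attachment counts must total 4")

    @property
    def f(self) -> int:
        """Functional value, f = Pi + Z."""
        return self.pi + self.z

    @property
    def x(self) -> int:
        """Oxidation state, x = z - h."""
        return self.z - self.h


ketone = CarbonSite(h=0, sigma=2, pi=0, z=2)           # R2C=O carbon
enol_ether = CarbonSite(h=0, sigma=2, pi=1, z=1)       # C=C(OR)-C carbon
chloroacetylene = CarbonSite(h=0, sigma=1, pi=2, z=1)  # Cl-C#C carbon

for name, site in [("ketone", ketone), ("enol ether", enol_ether),
                   ("chloroacetylene", chloroacetylene)]:
    print(name, "f =", site.f, "bars =", site.pi, "x =", site.x)
```

The `pi` count plays the role of the bars over an f value: the enol ether carbon has the same f as the ketone but carries one bar.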
In the original outline of this characterization,¹ constructions were defined by the change in f solely at the carbon undergoing construction. However, a full description of a construction on one synthon must involve defining those functionalities (f values) on adjacent carbons which activate the reaction and remain for consideration in the product. This fuller description is developed in the next section.

Basis for Codification. Examination of known construction reactions shows that, in each reacting component, a linear chain of up to three carbons virtually always contains all the functionality necessary to activate the bond-forming carbon site in any particular construction reaction.⁵ Thus any generalized construction reaction consists of two partial synthons, each of three (or fewer) linear carbons variously functionalized, as summarized in eq 1.

$$
\underbrace{\cdots\,\underset{f}{\overset{\gamma}{\mathrm{C}}}-\underset{f}{\overset{\beta}{\mathrm{C}}}-\underset{f}{\overset{\alpha}{\mathrm{C}}}\;+\;\underset{f}{\overset{\alpha}{\mathrm{C}}}-\underset{f}{\overset{\beta}{\mathrm{C}}}-\underset{f}{\overset{\gamma}{\mathrm{C}}}\,\cdots}_{\text{partial synthons as substrates}}
\;\longrightarrow\;
\underbrace{\cdots\,\underset{f}{\overset{\gamma}{\mathrm{C}}}-\underset{f}{\overset{\beta}{\mathrm{C}}}-\underset{f}{\overset{\alpha}{\mathrm{C}}}-\underset{f}{\overset{\alpha}{\mathrm{C}}}-\underset{f}{\overset{\beta}{\mathrm{C}}}-\underset{f}{\overset{\gamma}{\mathrm{C}}}\,\cdots}_{\text{product}}
\tag{1}
$$

The three carbons on each side of the forming bond are labeled α, β, γ away from that bond on each side, the α carbon of each synthon being the one at which the construction occurs. Each synthon, as substrate or product, will bear functionality variously (and characteristically for a particular reaction) on sites α, β, and γ and may as well exhibit other carbons also linked to (branched from) the α, β, γ sites, but the central, linear three-carbon unit is the one which bears the minimal obligatory functionality⁶ to activate the particular kind of construction.

Each partial synthon may be considered separately and independently. The construction at each may be called a half-reaction, defined by the change in functionality from substrate to product on one synthon. Thus any half-reaction on one synthon may be coupled to a whole family of partner half-reactions on the other synthon to make up a construction reaction. The idea is implicit in Grignard or enolate synthons, which may be coupled with ketones, nitriles, vinyl sulfones, epoxides, etc., and the functional change in each half is independent of the other. Furthermore, it is important to observe that the functionality on the substrate and that on the product are specifically related for any given half-reaction since the reactions are defined by their net structural change. Thus not only is the product functionality determined by that of the substrate but also, in reverse, a given product functionality determines that of the substrate for purposes of reasoning backward.

To clarify usage, certain other definitions are adopted. Any linear run or chain of n carbons within a skeleton will be called a strand, or n-strand, specifying the number (n)
of carbons it contains (a strand is distinguished here from a chain as being specifically linear, since traditionally chains are often described as branched). The substrate is understood to refer only to the strand of three (or fewer) carbons (α, β, γ) of the starting synthon which bear the obligatory⁶ functionality for the half-reaction, while the product refers to the functionality on the same strand after the half-reaction has taken place. The substrates and products are thus just the reactive strands of two synthons linked by a construction (in a cyclization they are both on the same molecule).⁷

The span is the length, or number of carbons, in any strand, defined by and including the carbons at each end. Spans may refer to the distance between two functional groups, between two construction terminals on a synthon, etc. Any half-reaction will be characterized by a half-span (s'), the distance from the bond-forming (α) carbon to that of the outermost obligatory function. Values for half-spans are thus s' = 1, 2, or 3, for that function at α, β, or γ, respectively. The construction span (s) is therefore the number of carbons linking the outermost functions of the two joined synthons after a construction reaction, i.e., $s = s_1' + s_2'$, with values of 2 ≤ s ≤ 6. The construction span of a Michael reaction, as in eq 2, is s = 5, that of cyanohydrin formation s = 2.

$$
\begin{array}{ccccc}
\mathrm{O{=}C{-}CH} & + & \mathrm{C{=}C{-}C{=}O} & \longrightarrow & \mathrm{O{=}C{-}C{-}C{-}CH{-}C{=}O} \\
f = 2\ 0 & & f = \overline{1\,1}\ 2 & & f = 2\ 0 \cdot 0\ 0\ 2 \\
s_1' = 2 & & s_2' = 3 & & (s = 5) \\
 & \text{(substrates)} & & & \text{(products)}
\end{array}
\tag{2}
$$

In these cases, the construction span is the same as the span of functionality in the product but, in cases in which one synthon becomes functionless, this is not so. In enolate alkylation, the construction span must incorporate the bond formed and is s = 3 even though there are not two functionalized sites remaining to define a span of functionality. In acetylene anion alkylation, s = 3, even though the only product carbons remaining functionalized are adjacent, i.e., a functionality span of 2.

The foregoing conceptions are illustrated in the Michael reaction, eq 2, showing the substrates of two synthons each undergoing a half-reaction, the functionalities of both substrates and products noted and interrelated, each independent of the other. The importance of the concept of span is that it shows the relative locus of product functionalities for use in subsequent constructions. Moreover, such a span of
Hendrickson / Numerical Codification of Construction Reactions
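Table I. Possible Combinations of Product Functionality

s'      Labels (RH; RF)    Product f lists (a)                                        f' lists
1       $A_1$; $1_1$       000                                                        000
1       $B_1$; $2_1$       100                                                        100
1       $C_1$; $3_1$       200                                                        100
1       $D_1$; $4_1$       3                                                          100
2       $A_2$; $1_2$       010, 020, 03                                               010
2       $B_2$; $2_2$       110, 120, 13                                               110
2       $C_2$; $3_2$       210, 220, 23                                               110
3       $A_3$; $1_3$       001, 002, 003; 011, 012, 013; 021, 022, 023                001; 011
3       $B_3$; $2_3$       101, 102, 103; 111, 112, 113; 121, 122, 123                101; 111
3       $C_3$; $3_3$       201, 202, 203; 211, 212, 213; 221, 222, 223                101; 111
Total                      40                                                         8

a Any list with f = 3 (carboxyl family) must terminate with that carbon. Distinctions of carbon-carbon π bonds from heteroatom functionalities have not been made here.¹¹ b Pairs differing only in ketone/aldehyde (or related f = 2 functions) and carboxyl family, f = 3, as the outermost (β- or γ-) functionality.

Table II. Symbols for Construction Half-Reaction Labels

Reaction type              RH            RZ            RΠ                 RΠ
Symbol                     letter A-D    number 1-4    barred number      barred number, primed
Substrate $f_\alpha$       0-3           1-4           = number           = number
$\Delta f_\alpha$          0             -1            -1                 -1
$\Delta f_\beta$           0             0             -1                 0
$\Delta f_\gamma$          0             0             0                  0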
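The span arithmetic defined in the text (s' = 1, 2, or 3 for an outermost obligatory function at α, β, or γ, and a construction span equal to the sum of the two half-spans) can be sketched as follows. The names `HALF_SPAN` and `construction_span` are hypothetical, introduced only for this illustration; the three examples are the ones quoted in the text.

```python
# Half-span s' keyed by the site of the outermost obligatory function.
# Names are illustrative assumptions, not notation from the paper.
HALF_SPAN = {"alpha": 1, "beta": 2, "gamma": 3}


def construction_span(site_1: str, site_2: str) -> int:
    """Construction span s = s1' + s2' for two joined partial synthons."""
    return HALF_SPAN[site_1] + HALF_SPAN[site_2]


# Michael reaction: enolate carbonyl at beta, acceptor carbonyl at gamma.
print("Michael:", construction_span("beta", "gamma"))
# Cyanohydrin formation: both functions sit on the alpha carbons.
print("cyanohydrin:", construction_span("alpha", "alpha"))
# Enolate alkylation: carbonyl at beta, leaving group on the alpha carbon.
print("enolate alkylation:", construction_span("beta", "alpha"))
```

Because each half-span lies between 1 and 3, the sum necessarily falls between 2 and 6.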
functionality seen on the total product skeleton can direct attention to particular construction reactions in reasoning backward. This concept was employed by Corey and Wipke⁹ as pairwise consideration of functional groups.

The net structural change in each half-reaction is both skeletal and functional. The skeletal change is always simply a unit increase in the value of σ for the α carbon ($\Delta\sigma_\alpha = 1$). It is the change in functionality on the reactive strand which is characteristic of a particular half-reaction. In order to annotate this easily, we have only to list the three (or fewer) f values for the strand (α, β, γ) in the substrate and the related product. These characteristic f lists, as $f_\alpha f_\beta f_\gamma$, then define particular construction half-reactions. The bars over f values for π bonds are linked across two adjacent f values to avoid ambiguity about the location of the π bond. A single barred f value implies the other π-bonded carbon is adjacent but off-strand.

For the Michael reaction, eq 2, the f lists are $\overline{11}2 \to 002$ for the unsaturated acceptor and $02 \to 02$ for the enolate component. A simple variant of either uses the cyano/carboxyl activating function, as $\overline{11}3 \to 003$ for the former and $03 \to 03$ for the latter. Writing the left-hand f list backward, with a dot to symbolize the construction link, the combined product f list is 20·002 (or 30·003), showing both possible carbonyl variants,¹⁰ with a span, s = 5. Conjugate addition of Grignard reagents to unsaturated sulfones is $0 \to 0$ coupled with $\overline{12} \to 01$, or a product f list 0·01 (s = 3), while acetylene anion alkylation is $\overline{\overline{22}} \to \overline{\overline{22}}$ coupled with $1 \to 0$, or a product f list $\overline{\overline{22}}$·0 (s = 3).

This now provides a basis for cataloging all possible construction reactions in terms of net structural change, without bias from preconceptions of mechanism or current feasibility, using f lists and half-spans for definition.

Catalog of Construction Half-Reactions.
It is possible to tabulate systematically all possible functional variants for any three-strand partial synthon in order to encompass all possible constructions in terms of half-reactions. For three-strand construction products, there are 40 structurally possible three-digit f lists ($f_\alpha f_\beta f_\gamma$) since there are three f values (f = 0, 1, 2) available to nonterminal carbons and one more (f = 3) available only to terminal (σ = 1) carbons. Thus the number of mathematically possible combinations is $40 = 3^3 + 3^2 + 3^1 + 3^0$. The 40 possible product f lists are shown in the center of Table I. Being simply all possible mathematical combinations, these descriptions must necessarily include not only all known construction products but also all possible ones.¹¹

A further and useful condensation of functionality information is also shown at the right of Table I as f' lists. The simplest view of functionality, its presence or absence at any site, is embodied in the f' value of 1 or 0, respectively, i.e., for f = 0, f' = 0; for f ≥ 1, f' = 1. Thus there are $2^3 = 8$ f' lists of three binary digits each, as shown. A slightly larger set of f lists can be similarly created for all possible substrates (there are 53 since $f_\alpha$ = 3, 4 are structurally allowed for substrates).

Half-reactions with no obligatory function beyond the β carbon, i.e., s' = 2, are represented in Table I by the group of f lists ending in one zero, while those of s' = 1 (only the α carbon functionally involved, as in carbonyl addition, etc.) are those combinations of f lists ending in two zeroes. Subsequently, s' = 1 half-reactions will be represented only by one-digit f lists and s' = 2 half-reactions by two-digit f lists.

From these generated sets of substrate and product f lists, we may formulate the construction half-reactions which interrelate them. To do this, we must define the changes in f value (Δf) at each of the three carbons, α, β, and γ, which are the characteristic substrate-product interrelations for particular types of construction. The net structural change at the main or bond-forming site (the α carbon) is implicit in the reaction type.
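The counts quoted above (40 product f lists, 8 f' lists, 53 substrate f lists) can be checked by direct enumeration. The sketch below is one reading of the stated rules, offered as an assumption rather than the author's procedure: f = 0, 1, or 2 is allowed at any strand carbon, f = 3 ends the strand, the product α carbon may be 3 only when it stands alone, and substrates additionally allow $f_\alpha$ = 3 with a further strand and $f_\alpha$ = 4 alone. All function names are hypothetical.

```python
from itertools import product


def strand_tails():
    """All (f_beta, f_gamma) continuations; a carbon with f = 3 ends the strand."""
    tails = [(b, g) for b, g in product((0, 1, 2), (0, 1, 2, 3))]
    tails.append((3,))  # f_beta = 3 is terminal, so no gamma carbon follows
    return tails


def product_f_lists():
    """Product f lists: f_alpha in 0-2 with any tail, plus the lone list (3,)."""
    lists = [(a,) + tail for a in (0, 1, 2) for tail in strand_tails()]
    lists.append((3,))
    return lists


def substrate_f_lists():
    """Substrate f lists: f_alpha in 0-3 with any tail, plus the lone list (4,)."""
    lists = [(a,) + tail for a in (0, 1, 2, 3) for tail in strand_tails()]
    lists.append((4,))
    return lists


def f_prime(f_list):
    """Presence/absence view: pad to three sites, then map f >= 1 to 1."""
    padded = tuple(f_list) + (0,) * (3 - len(f_list))
    return tuple(1 if f >= 1 else 0 for f in padded)


print(len(product_f_lists()))                      # number of product f lists
print(len({f_prime(l) for l in product_f_lists()}))  # number of distinct f' lists
print(len(substrate_f_lists()))                    # number of substrate f lists
```

Under these assumptions the three printed counts reproduce the figures given in the text, which is some reassurance that the reading of the rules is the intended one.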
For RH half-reactions, there is no functionality change, i.e., $\Delta f_\alpha = 0$ while, for RF reactions (i.e., RZ and RΠ), there is a unit decrease in functionality level at the constructing carbon, i.e., $\Delta f_\alpha = -1$. These definitions therefore relate the f values of substrate and product for the α carbon in any half-reaction.

Functionality at the β and γ carbons is obligatory for activation of the construction in half-reactions of s' = 2 or 3 but, in the RH and RZ reactions, this functionality is unchanged by the construction, i.e., $\Delta f_\beta = \Delta f_\gamma = 0$ (e.g., the enolate component of the Michael reaction, eq 2, for which $02 \to 02$). In addition reactions, RΠ, however, the β carbon also changes functionality from Π to Z or H. In the former case, the functionality level is unchanged ($\Delta f_\beta = 0$) although its form changes from Π to Z while, in the latter case, $\Delta f_\beta = -1$ as the β carbon goes from Π to H (e.g., the other half of the Michael reaction, $\overline{11}2 \to 002$).

These relations allow structural definitions of construction half-reactions. All the primary information about particular half-reactions is contained in the reaction type (RH,
Table III. Examples of Labels for Construction Half-Reactions
(Column headings: reaction type (RH, RZ, RΠ), substrate, label, product, polarity. Substrates shown include RMgBr (RLi), RCH2COR', and RCOCl.)

Table IV. Selected Construction Reactions
Reactions listed: enolate alkylations; aldol condensations; Friedel-Crafts reactions; Claisen condensations; alkylations; Grignard carbonations; epoxide openings; Michael additions; carbonyl additions; conjugate additions of CN⁻; acylations; conjugate additions to carbonyl; conjugate additions of alkyl copper; alkyne alkylations; benzoin condensations; pinacol reductions.
RZ, RΠ), the functional level of the α carbon, and the half-span. Hence simple but systematic labels for all possible construction half-reactions may be developed from this information with a single symbol and a subscript to show the half-span (s'). The symbol will be a letter (A-D) for RH reactions and a number (1-4) for RF (= RZ and RΠ) reactions, the symbol indicating the functionality level at the α carbon, $f_\alpha$, in the substrate; the level in the product is then implicit from the relations above.¹² These RH and RF labels are listed on the left in Table I, as well as the half-spans, s', shown corresponding to their respective product f lists. The corresponding substrates then have the same f lists for RH half-reactions and f lists with one higher f value for the first digit ($f_\alpha$) in the RF half-reactions. It is the $f_\alpha$ value of the substrate which dictates the label; thus RF reactions $1_1$, $1_2$, $1_3$ all show $f_\alpha = 1$ in the substrate and $f_\alpha = 0$ in the product f list of Table I, while RF reactions 2 yield $f_\alpha = 1$ in the product f lists shown, etc. The RH and RF symbols are also shown in Table II to correspond with the defining values of $f_\alpha$ in the substrate.

A further distinction of RF half-reactions as RZ or RΠ is still required. The RZ symbols are plain numbers (= $f_\alpha$ in substrate), while RΠ symbols (also $f_\alpha$ numbers) are differentiated with a bar over the number. Furthermore, as noted above, the RΠ reactions must themselves be further divided into those addition reactions in which the β carbon of the π bond broken maintains its functionality level or lowers it. Those RΠ reactions with $\Delta f_\beta = -1$ (i.e., Π to H at β as in the Michael acceptor, eq 2) are labeled normally with barred numbers, while those with $\Delta f_\beta = 0$ are differentiated with primes (i.e., Π to Z or Π at the β carbon). These symbols described for the half-reaction labels are summarized in Table II, showing their relation to reaction type and $f_\alpha$ of the substrate and the functional changes (Δf at each site) that interrelate substrate and product f lists. Except for the level of activating but unchanging functionalities at β or γ carbons, these simple labels contain all the information necessary to write the structural essentials of substrate and product for any construction half-reaction.

These labels and their attendant substrate-product f lists are simple and rigorous, deriving solely from considerations of possible structural changes. However, the families of construction half-reactions which they represent correspond remarkably with current usage in describing half-reactions.¹³ Thus the simple alkyllithium or Grignard reagent is an $A_1$ half-reaction when used for construction: the substrate, R⁻, acts as RH construction with $f_\alpha = 0$ (hence label A),¹⁴ $\Delta f_\alpha = 0$, and a half-span, s' = 1; i.e., only the α carbon bears obligatory functionality. Simple alkylation (R-X → R-R') is labeled $1_1$, hence an RZ type with substrate $f_\alpha = 1$ and $\Delta f_\alpha = -1$, and a half-span, s' = 1. The common ketone reactions are $A_2$, the half-reactions of nucleophilic enolate anion construction, RH with substrate $f_\alpha = 0$, $\Delta f_\alpha = 0$, and s' = 2 (f lists of $02 \to 02$) and carbonyl additions, $2_1$ (f lists $2 \to 1$), RZ with substrate $f_\alpha = 2$, $\Delta f_\alpha$
Table III (continued). Substrates shown include CH2=CH-CN and Cl-CH=CH-COOR.

Table IV (continued). (Column headings: full reactions, half-reactions, f list.) Reactions listed: dithianes; Wittig reactions; Grignard additions; Grignard reactions; conjugate additions to acetylenic carbonyl and to nitro, sulfonyl; addition-eliminations to unsaturated carbonyl and to unsaturated sulfonyl; acetylenic couplings; Claisen rearrangements; electrophilic additions; Fischer indole synthesis.
Table V. RH Half-Reactions and Partial Synthon Characters
(Column headings: label, f list, partial synthon. Partial synthons shown include HC-, HC-C-, and HC=C-.)
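The Δf relations stated in the text for the four half-reaction types (RH: no change; RZ: $\Delta f_\alpha = -1$; RΠ: $\Delta f_\alpha = -1$ with $\Delta f_\beta = -1$ or, for the primed variant, $\Delta f_\beta = 0$) are enough to derive a product f list from a substrate f list. The following is a minimal sketch under those assumptions; the names `DELTA_F`, `product_f_list`, and `combined_f_list` are hypothetical, and the bars that distinguish π bonds from heteroatom bonds are deliberately not tracked.

```python
# Change in f at (alpha, beta, gamma) for each half-reaction type, as stated
# in the text. Keys and function names are illustrative assumptions only.
DELTA_F = {
    "RH":   (0, 0, 0),    # no functionality change
    "RZ":   (-1, 0, 0),   # unit decrease at the constructing carbon
    "RPi":  (-1, -1, 0),  # addition, beta carbon goes from Pi to H
    "RPi'": (-1, 0, 0),   # addition, beta carbon keeps its level (Pi to Z or Pi)
}


def product_f_list(reaction_type, substrate):
    """Apply the delta-f pattern of a half-reaction to a substrate f list."""
    result = tuple(f + d for f, d in zip(substrate, DELTA_F[reaction_type]))
    if min(result) < 0:
        raise ValueError("substrate lacks the functionality this type consumes")
    return result


def combined_f_list(left, right):
    """Write the left product list backward, a dot for the new bond, then the right."""
    return "".join(map(str, reversed(left))) + "." + "".join(map(str, right))


# Michael reaction: enolate (RH, 02) with unsaturated carbonyl acceptor (RPi, 112).
enolate = product_f_list("RH", (0, 2))
acceptor = product_f_list("RPi", (1, 1, 2))
print(combined_f_list(enolate, acceptor), "span", len(enolate) + len(acceptor))

# Enolate alkylation: enolate (RH, 02) with alkyl halide (RZ, 1).
halide = product_f_list("RZ", (1,))
print(combined_f_list(enolate, halide), "span", len(enolate) + len(halide))
```

Here the length of each trimmed f list stands in for its half-span, following the convention that s' = 1 and s' = 2 half-reactions are written with one- and two-digit lists.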