Chemical applications of graph theory. Part II ... - ACS Publications

Chemical Applications of Graph Theory
Part II. Isomer Enumeration

Peter J. Hansen and Peter C. Jurs
Pennsylvania State University, University Park, PA 16802

The topic of isomerism is introduced very early in most undergraduate organic chemistry courses, often in the "alkane chapter". Generally this chapter defines isomerism and gives examples of isomers of some of the smaller alkanes; frequently it also includes either a table listing alkane isomer numbers or an example such as, "for the twenty-carbon eicosane, there are 366,319 isomeric structures" (1). At the end of this chapter one is apt to find one or more exercises such as "Draw structural formulas for all of the isomers of heptane". Frequently, students wonder whether isomer numbers can be computed mathematically, but rarely does an organic textbook shed any light on this question. Even most practicing chemists are unfamiliar with methods for computing isomer numbers and are apt to employ the "draw and count" method if the need arises. For simple problems this method is quite satisfactory, but clearly one would not attempt to draw the structures of the 366,319 eicosanes. Graph theory, however, provides chemists with means for solving isomer enumeration problems that (1) possess mathematical exactness and (2) can be applied to problems that surpass the practical limits of the draw and count method.

Graph theory is a branch of mathematics that is related to both topology and combinatorics. In addition to isomer enumeration there are numerous other applications of graph theory in chemistry as well as in a wide variety of other fields in both the natural and social sciences. Readers lacking a familiarity with the basic terms and concepts of graph theory are directed to the first paper in this series (2).
Although Euler (1707-1783) is generally considered to be the father of graph theory (3, 4), the British mathematician Cayley (1821-1895) is also credited with having independently discovered graph theory (5). Cayley's first graph theory paper, published in 1857, was devoted to the enumeration of rooted trees (6). A tree is a connected acyclic graph, that is, a set of points joined by lines but containing no rings. A rooted tree is a tree that includes one point, the root, that is distinguished from the others. Acyclic alkanes can be viewed as trees, and monofunctional acyclic compounds such as alcohols and alkyl halides can be considered as rooted trees, the root being the carbon to which the functional group is attached. Figure 1 shows, as an example, all nine rooted trees containing 5 points. Cayley's work on the enumeration of trees led him within a few years to apply graph theory to the enumeration of organic chemical isomers. He was perhaps the first person to address the problem of enumerating isomers mathematically in his paper "On the Mathematical Theory of Isomers" published in 1874 (7). Cayley eventually enumerated both the alkanes and alkyl radicals through n = 13 (8-10). His methods were laborious, and not surprisingly, his results contained several arithmetic errors (11).

Constitutional Isomerism

The mathematical solution to the problem of enumerating isomers generally results in a polynomial termed a generating function. For example, the generating function for acyclic alkane isomers is

$$x + x^2 + x^3 + 2x^4 + 3x^5 + 5x^6 + 9x^7 + 18x^8 + 35x^9 + \cdots \tag{1}$$

Figure 1. The nine rooted trees containing five points.

For each term in eq 1, the exponent represents the carbon number for a set of alkane isomers, while the coefficient is the number of isomers belonging to that set. For example, the $18x^8$ term indicates that there are 18 alkane isomers of composition C8H18. Mathematically deriving this generating function is the chief task of isomer enumeration. In a slightly modified form the expression for the enumeration of rooted trees published by Cayley in 1857 (6) is

$$x\,(1 - x)^{-T_1}(1 - x^2)^{-T_2}(1 - x^3)^{-T_3} \cdots (1 - x^n)^{-T_n} = T_1 x + T_2 x^2 + T_3 x^3 + \cdots + T_n x^n \tag{2}$$

The right-hand expression of eq 2 is the generating function for rooted trees, that is, there are $T_1$ rooted trees containing only one point, $T_2$ with two points, $T_3$ with three points, and so on. To obtain this generating function (i.e., to determine numeric values for the $T_n$) requires the solution of the left-hand expression of eq 2. The method of solution is iterative and, if performed manually, very tedious. Readers are directed to Cayley's original paper (6) or to Trinajstić's book (11) for the solution to this equation for rooted trees having up to 10 and 13 points, respectively. (Each of these authors uses a slightly different notation from that employed here. Their generating function coefficients, $A_n$, correspond to rooted trees with n branches, hence the number of points is (n + 1). Furthermore, both solutions include minor arithmetic errors.) This expression is not directly applicable to organic chemical isomer enumeration since the rooted trees enumerated include those with points having degree (valency) greater than four and roots having degree greater than three (Fig. 1 includes an example of the latter). Nonetheless, the importance of this expression cannot be overstated since it opened the way to the solution of many chemical isomer enumeration problems.
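The iterative solution of eq 2 can be sketched in a few lines of Python. This is only an illustration, not Cayley's own procedure: the function name `rooted_tree_counts` and the coefficient-list representation are assumptions made here. It relies on the observation that, because of the leading factor x, $T_n$ equals the $x^{n-1}$ coefficient of the product on the left, which involves only $T_1$ through $T_{n-1}$.

```python
from math import comb

def rooted_tree_counts(max_points):
    """Return [T_1, ..., T_max_points] by solving eq 2 one coefficient at a time."""
    # prod[i] is the coefficient of x^i in the running product
    # (1 - x)^(-T_1) (1 - x^2)^(-T_2) ..., built up one factor at a time.
    prod = [1] + [0] * (max_points - 1)
    counts = []
    for n in range(1, max_points + 1):
        # The leading factor x in eq 2 shifts the product by one power,
        # so T_n is the x^(n-1) coefficient of the product so far.
        t_n = prod[n - 1]
        counts.append(t_n)
        # Multiply by (1 - x^n)^(-T_n) = sum_j C(T_n + j - 1, j) x^(n*j).
        updated = [0] * max_points
        for i, c in enumerate(prod):
            j = 0
            while i + n * j < max_points:
                updated[i + n * j] += c * comb(t_n + j - 1, j)
                j += 1
        prod = updated
    return counts

print(rooted_tree_counts(5))  # [1, 1, 2, 4, 9]
```

The fifth value, 9, agrees with the nine rooted trees on five points shown in Figure 1.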


Volume 65

Number 8

August 1988

661

During the 1930's, Henze and Blair (12, 13) developed a recursive method that represented perhaps the first major advancement beyond Cayley's work. They calculated the number of alkanes, and primary, secondary, and tertiary alcohols through n = 20. To enumerate compounds such as the monofunctional alcohols, Henze and Blair recognized interrelationships between the total number of rooted trees and the numbers of rooted trees having primary, secondary, and tertiary roots. For example, if the OH group of an alcohol is replaced with a CH2OH group, a primary alcohol will invariably result; hence the number of isomeric primary alcohols having (n + 1) carbons must equal the total number of isomeric alcohols of all types having n carbons. For alkane enumeration the Henze and Blair approach requires the partitioning of the alkanes into two classes according to whether the total number of carbons is odd or even. Each of these two classes is further subdivided according to specified rules. For example, subclass A of the even numbered alkanes includes only those alkanes that can be divided into two alkyl groups with an equal number of carbons by breaking a single chemical bond (e.g., 2-methylheptane, in contrast to 3-ethylhexane). The number of members in this subclass, N, is given by the expression,

$$N = \tfrac{1}{2}\,L\,(1 + L) \tag{3}$$

where L represents the total number of distinct isomeric alkyl groups with half as many carbon atoms as the original alkane. For example, the number of octane isomers yielding two butyl groups upon breaking one bond is (1/2)(4)(1 + 4) = 10, since there are four butyl groups (i.e., n-, sec-, tert-, and isobutyl). Of course, to complete the enumeration of the total number of isomers of octane would require the enumeration of subclass B as well, whose members cannot be split into two butyls. The reader is directed to Trinajstić's work for a more detailed treatment of this approach (11). Although the manual application of the method of Henze and Blair can be tedious, the method is amenable to computer encoding (14, 15).

Lederberg first reported a general procedure for the enumeration of isomers corresponding to any given elemental composition (16). He attacked the problem of not only enumerating chemical isomers but also of generating their structures. Lederberg and co-workers (17) developed DENDRAL (Dendritic Algorithm) for the generation of acyclic isomers. More recently Trinajstić and co-workers have developed computer programs to plot trees (alkanes) and rooted trees (substituted alkanes) (14), as well as Kekulé structures of conjugated hydrocarbons (18).
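The butyl-group count given earlier for subclass A of the octanes can be checked with a short Python sketch. It assumes one reading of the expression: that it counts unordered pairs of alkyl groups, with a group allowed to pair with itself. The helper name `subclass_a_count` is hypothetical.

```python
from itertools import combinations_with_replacement

def subclass_a_count(num_alkyl_groups):
    """Evaluate (1/2) L (1 + L) for L distinct half-size alkyl groups."""
    return num_alkyl_groups * (1 + num_alkyl_groups) // 2

# The four butyl groups named in the text.
butyl_groups = ["n-butyl", "sec-butyl", "tert-butyl", "isobutyl"]

# Each subclass A octane corresponds to an unordered pair of butyl groups
# joined by a single bond; a group may be paired with itself.
pairs = list(combinations_with_replacement(butyl_groups, 2))

print(subclass_a_count(len(butyl_groups)), len(pairs))  # 10 10
```

Both routes give 10, so under this reading the closed-form expression and the explicit pairing agree.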


Substitutional Isomerism

Pólya's enumeration theorem (19, 20) opened the way to the systematic derivation of the generating functions for many different classes of chemical compounds. One problem that arises in counting isomers stems from the fact that the symmetry of a compound reduces the number of substitutional or constitutional isomers that can exist. Many structures that may appear to be isomers are immediately recognized as identical by chemists. For example, chemists recognize that the name "2,3-dibromobenzene" is merely a misnomer for 1,2-dibromobenzene and does not refer to a different isomer. Pólya's enumeration theorem combines graph theory with group theory to account for the effects of symmetry. The theorem prescribes that the generating function is obtained by substituting a figure counting series into a cycle index, as will be demonstrated below. The application of Pólya's theorem to isomer enumeration involves the following steps:

1. The derivation of a cycle index that reflects the symmetry of the parent compound.
2. The derivation of a figure counting series that reflects the


662

Journal of Chemical Education

Figure 2. (a) A graph G containing four points. (b) The four graphs resulting from all permutations of G that belong to its permutation group, their cycle representations, and the associated D2h symmetry elements (the x and y axes are defined by points 1 and 3 and 2 and 4, respectively; the z axis is perpendicular to the plane of the paper). (c) Additional pictorial representations of graph G.

number of different atoms or groups that can occupy a substitution site.
3. The substitution of the figure counting series (after minor transformations) into the cycle index.
4. The algebraic expansion of the resultant expression to obtain the generating function.

The cycle index for a system is derived from its permutation group. A permutation group possesses the normal properties of a mathematical group related to closure, associativity, identity, and inversion. It is perhaps easiest for chemists to think of the elements of the permutation group as symmetry elements such as proper rotation axes (C_n), mirror planes (σ_h, σ_v), the center of inversion (i), and the identity (E), but more precisely they are 1-1 maps from a graph G to a graph G1 such that G and G1 are equivalent. Figure 2a shows a graph, and Figure 2b shows the four graphs obtained from it by applying all four permutations belonging to its permutation group. The cycle representation of each permutation is also shown as well as the associated symmetry elements.

The cycle representation of any permutation can be derived as follows (using as an example the graph shown in Figure 2a, and the permutation labeled C2(y)): repeatedly apply the C2(y) permutation to the original graph; the labels of the points occupying the original site of point 1 will be in turn 1, 3, 1, 3, 1, 3, ...; hence for this site a single period or cycle of the permutation can be represented by (13); the labels of the points occupying the original site of point 2 will be in turn 2, 2, 2, 2, ...; hence for this site a single cycle of the permutation can be represented simply as (2); in like manner, for the sites originally occupied by points 3 and 4, a single cycle of the permutation can be represented by (13) and (4), respectively. Combining these cycles and eliminating redundancies yields the cycle representation (13) (2) (4) for this permutation.
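The procedure just described (follow each site through repeated applications of the permutation until it returns to its starting label, then discard repeated cycles) can be sketched in Python. The dictionary encoding of the permutation and the function name `cycle_representation` are illustrative assumptions, not part of the original treatment.

```python
def cycle_representation(perm):
    """Return the cycles of a permutation given as {site label: replacing label}."""
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue  # this site already belongs to a recorded cycle
        cycle = [start]
        seen.add(start)
        nxt = perm[start]
        # Follow the site until the starting label reappears.
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        cycles.append(tuple(cycle))
    return cycles

# The example permutation from the text: points 1 and 3 exchange places,
# points 2 and 4 stay where they are.
swap_1_3 = {1: 3, 2: 2, 3: 1, 4: 4}
print(cycle_representation(swap_1_3))  # [(1, 3), (2,), (4,)]
```

Skipping sites that have already been visited is what removes the redundant second copy of (13).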

The distinction between a point group and a permutation group becomes apparent from a consideration of the graph shown in Figure 2a. The point group of this graph is D2h (assuming that its perimeter is a perfect square). The D2h point group includes eight symmetry elements (three twofold axes, three symmetry planes, a center of symmetry, and the identity operator). The permutation group of this graph, however, includes only four elements, each of which can be associated with two symmetry elements from the D2h point group. Furthermore, angles and line lengths as shown in the pictorial representation of a graph have no meaning in graph theory. The graphs shown in Figure 2c all belong to the same permutation group as the graph shown in Figure 2a (in fact they are all representations of the same graph), but clearly they do not belong to the same point group. Chemists, who are familiar with point group symmetry, will find it easier to identify the permutations of a chemical system by identifying its symmetry elements, but care must be exercised so that redundant permutations are not included. In addition, one must avoid omitting permutations that do not correspond to any single whole-molecule symmetry operation. For example, one of the permutations of the biphenyl system (assuming relatively unhindered rotation about the interring bond) can be viewed as a rotation of one ring by 180° while leaving the other ring fixed.

As mentioned before, the cycle index for a system is derived directly from the permutation group of the parent compound. Benzene, which belongs to the point group D6h, belongs to the dihedral permutation group D6. The cycle index for the D6 permutation group is

$$Z(D_6) = \tfrac{1}{12}\left(s_1^6 + 3s_1^2 s_2^2 + 4s_2^3 + 2s_3^2 + 2s_6\right) \tag{4}$$
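As a sketch of steps 3 and 4 applied to eq 4, the Python below substitutes a figure counting series into the cycle index and expands the result. Two assumptions are made here: the figure counting series is taken as 1 + x (each site either unsubstituted or carrying one kind of substituent, such as bromine), and the "minor transformations" are read as the usual Pólya substitution of $1 + x^k$ for each $s_k$. The helper names are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials stored as coefficient lists (index = power of x)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def figure_series(k):
    """Coefficient list of 1 + x^k, the value substituted for s_k."""
    return [1] + [0] * (k - 1) + [1]

# Terms of eq 4 as (coefficient, {cycle length k: exponent of s_k}).
D6_TERMS = [
    (1, {1: 6}),
    (3, {1: 2, 2: 2}),
    (4, {2: 3}),
    (2, {3: 2}),
    (2, {6: 1}),
]

def substituted_benzene_counts():
    """Return [k_0, ..., k_6]: isomer counts for 0 to 6 substituents on benzene."""
    total = [0] * 7
    for coeff, factors in D6_TERMS:
        term = [1]
        for k, exponent in factors.items():
            for _ in range(exponent):
                term = poly_mul(term, figure_series(k))
        for power, c in enumerate(term):
            total[power] += coeff * c
    # Divide by the order of the group (12), as in eq 4.
    return [c // 12 for c in total]

print(substituted_benzene_counts())  # [1, 1, 3, 3, 3, 1, 1]
```

Read as a generating function, the result is $1 + x + 3x^2 + 3x^3 + 3x^4 + x^5 + x^6$: for instance, three disubstituted isomers (ortho, meta, and para).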

This cycle index is applicable to the enumeration of the substitutional isomers of any system belonging to the D6 permutation group regardless of the substituents. The D6 cycle index is derived in the following manner (refer to the table for a complete list of the permutation cycles, associated symmetry operations, and contributions to the cycle index of each of the 12 permutations of the D6 permutation group): to derive the polynomial term corresponding to each permutation make the following substitutions: if a cycle includes only one point (e.g., (4)) replace it with $s_1$; if it includes two points (e.g., (26)), replace it with $s_2$, and so forth. Following these replacements, multiply, and combine the $s_i$ terms. Hence for the twofold axis that passes through atoms C-1 and C-4 of the benzene ring the corresponding cycle index term is

$$(1)\,(4)\,(26)\,(35) \;\Rightarrow\; s_1 \times s_1 \times s_2 \times s_2 = s_1^2 s_2^2 \tag{5}$$

And for the twofold axis that bisects both the bond between atoms C-1 and C-2 and the bond between C-4 and C-5, the corresponding cycle index term is

$$(12)\,(36)\,(45) \;\Rightarrow\; s_2 \times s_2 \times s_2 = s_2^3 \tag{6}$$
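The same bookkeeping can be carried out for all 12 permutations at once. The Python sketch below generates the permutations of a hexagon's six positions and tallies how many share each set of cycle lengths. It assumes the positions are numbered 0 to 5 around the ring rather than 1 to 6, and the function names are illustrative.

```python
from collections import Counter

def cycle_type(perm):
    """Return the sorted cycle lengths of a permutation given as a list."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, nxt = 0, start
        while nxt not in seen:
            seen.add(nxt)
            nxt = perm[nxt]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

def dihedral_permutations(n):
    """All 2n permutations of the dihedral group acting on n ring positions."""
    rotations = [[(i + k) % n for i in range(n)] for k in range(n)]
    reflections = [[(k - i) % n for i in range(n)] for k in range(n)]
    return rotations + reflections

# Each key lists the cycle lengths of a term, e.g. (1, 1, 2, 2) is s1^2 s2^2;
# each value is the number of permutations contributing that term.
terms = Counter(cycle_type(p) for p in dihedral_permutations(6))
for lengths, count in sorted(terms.items()):
    print(count, lengths)
```

The tally gives one permutation of type (1, 1, 1, 1, 1, 1), three of type (1, 1, 2, 2), four of type (2, 2, 2), two of type (3, 3), and two of type (6), matching the coefficients inside the parentheses of eq 4.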

After deriving terms (such as the above) for each of the permutations belonging to the permutation group, add them together, and multiply their sum by the reciprocal of the order of the permutation group (the order equals the number of permutations in the group). For the D6 permutation group, the result will be the cycle index as given by eq 4.

The figure counting series (itself a generating function) represents mathematically the number of different types of substituents that can occupy any given substitution site. For example, if one examines the bromobenzene congeners, each substitution site is occupied by either a hydrogen atom or a bromine atom, and the figure counting series will consist of the sum of two terms. One may optionally choose either (y + x) or (1 + x). In the case of the former, the resulting generating function will consist of polynomial terms of the type $k y^n x^m$, where k is the number of isomers with n hydrogens and m bromines. In the case of (1 + x), the terms will be of the type $k x^m$, where k is the number of isomers with m

Permutation Cycles, Symmetry Operations, and Cycle Index Terms for the D6 Permutation Group (table columns: Permutation Cycles; Symmetry Operations; Cycle Index Term)

bromines, and the number of hydrogens must be deduced by subtracting the number of bromines from the total number of substitution sites. For a case such as the bromochlorobenzene congeners in which each substitution site may be occupied by one of three different atoms (i.e., H, Br, or Cl) the figure counting series can be either (y + x + z) or (1 + x + z). As above, the generating functions that result will differ only in that for the latter case the number