

J. Chem. Inf. Comput. Sci. 1993, 33, 197-201

Graph Automorphism Perception Algorithms in Computer-Enhanced Structure Elucidation

Marko Razinger,† K. Balasubramanian,‡ and Morton E. Munk*

Department of Chemistry, Arizona State University, Tempe, Arizona 85287-1604

Received March 2, 1992

† Present address: Institute of Chemistry, 61115 Ljubljana, Hajdrihova 19, Slovenia.
‡ Camille and Henry Dreyfus Teacher-Scholar.

The concept of graph symmetry is explained in terms of the vertex automorphism group, which is a subgroup of the complete vertex permutation group. The automorphism group can be deduced from the automorphism partition of graph vertices. An algorithm is described which constructs the automorphism group of a graph from the automorphism vertex partitioning. The algorithm is especially useful for graphs which contain more than one vertex-partition set. Several well-known topological symmetry perception algorithms that yield automorphism partitions are compared. The comparison is favorable to the Shelley-Munk algorithm, developed in the framework of the SESAMI system for computer-enhanced structure elucidation.

INTRODUCTION

Most computer programs that require the manipulation of chemical structures among their operations (e.g., computer-enhanced structure elucidation, spectrum interpretation and simulation, and synthetic design) must eventually be able to recognize and process stereochemical information if they are to be truly useful.¹ There are different approaches to the treatment of stereochemistry² and related problems, but all of them share the need to consider the symmetry of the molecular structure under consideration. Symmetry properties of molecular structures are very important because they are reflected in a variety of molecular properties, interactions, and other structure-related phenomena.³ For the correct interpretation of some experimentally derived information, e.g., X-ray diffraction patterns or IR spectra, the determination of molecular stereogeometry is necessary; this implies the use of group theory formalism in the determination of appropriate space or point groups and their elements. In some other areas, like computer-enhanced structure elucidation⁴ or the enumeration of ¹³C-NMR signals,⁵ partial solutions are possible using topological symmetry (connectivity) instead of three-dimensional geometry. For such cases, a different concept of symmetry, based on graph theory, can be used to advantage.⁶

The objective of this article is to describe the connection between the automorphism partitioning of the vertices of graphs and the automorphism groups of graphs. An algorithm to construct the automorphism group from the automorphism partition of the vertices is presented. Several procedures to obtain the automorphism partitioning of graphs, including graphs containing isospectral points, are compared with the Shelley-Munk algorithm.⁷,⁸ A critical comparison shows, contrary to the perception in the literature,⁹,¹⁰ that the algorithm is rigorous and efficient and distinguishes isospectral points without difficulty.

SYMMETRY OF GRAPHS

The properties of a graph do not depend on its pictorial representation, but only on its connectivity. Thus, a graph, which can be drawn in several different ways, may appear to possess different symmetry elements in the normal sense of point group symmetry; however, the topological symmetry of each representation is identical because it is based on connectivity. This aspect makes the consideration of graph symmetry quite different from the conventional molecular point group and space group approaches. (This is discussed with examples in ref 11.) In terms of graph theory, the structural symmetry relates to the vertex automorphism group: this group is a subgroup of the complete vertex permutation group of the graph representing the chemical structure.¹²,¹³ Mathematically, these relations can be expressed simply and succinctly by using matrix algebra. The two matrices used in the following relations are the adjacency matrix A and the permutation matrix P defined as

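$$A_{ij} = \begin{cases} 1 & \text{if vertices } i \text{ and } j \text{ are neighbors} \\ 0 & \text{otherwise} \end{cases}$$

$$P_{ij} = \begin{cases} 1 & \text{if a mapping of vertices } i \to j \text{ is induced by the given permutation of vertices} \\ 0 & \text{otherwise} \end{cases}$$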

There are n! possible labelings of a graph with n vertices. Each of them can be represented by a permutation vector P, and the transformation between two different labelings is given by the permutation matrix P. In Figure 1, three of the 120 possible labelings of a graph are shown together with the corresponding adjacency matrices A and permutation vectors P. Although the graphs G1 and G2 represent the same structure, they are not identical because the different labelings of their vertices give different adjacency matrices. Since they represent the same structure, they are known as isomorphic graphs. The isomorphism can be expressed mathematically with the following relation, which holds generally for all graphs:

$$P_{12}^{-1} \cdot A(G_1) \cdot P_{12} = A(G_2) \tag{1}$$

Because the permutation matrix is an orthogonal matrix, its inverse is equal to its transpose, and the relation can be further simplified:

$$P_{12}^{T} \cdot A(G_1) \cdot P_{12} = A(G_2) \tag{2}$$

Such a transformation is called an isomorphism between the graphs G1 and G2. Figure 2 shows the matrices and the results of matrix multiplication for eq 2: the permutation matrix $P_{12}$ expresses the mapping P1 → P2, and the consecutive multiplication of A(G1) with $P_{12}$ and $P_{12}^{T}$ results in A(G2).
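The relabeling in eq 2 can be tried out with a minimal sketch in plain Python. The 5-vertex graph, the helper names (`adjacency`, `perm_matrix`, `relabel`), and the edge list are assumptions made for illustration; the graph is not necessarily the one drawn in Figure 1, although the two permutation vectors are the ones quoted for P(2) and P(3).

```python
def adjacency(n, edges):
    """Adjacency matrix A for vertices 1..n given as 1-based edge pairs."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = a[j - 1][i - 1] = 1
    return a


def perm_matrix(perm):
    """P[i][j] = 1 if vertex i+1 is mapped to vertex j+1, else 0."""
    n = len(perm)
    return [[1 if perm[i] == j + 1 else 0 for j in range(n)] for i in range(n)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def relabel(a, perm):
    """Apply eq 2: return P^T . A . P for the permutation vector `perm`."""
    p = perm_matrix(perm)
    return matmul(matmul(transpose(p), a), p)


# Hypothetical 5-vertex graph chosen for illustration only.
edges = [(1, 3), (2, 3), (3, 4), (4, 5)]
a1 = adjacency(5, edges)

a2 = relabel(a1, [4, 5, 3, 2, 1])  # relabeling that changes the matrix
a3 = relabel(a1, [2, 1, 3, 4, 5])  # relabeling that leaves it unchanged

print(a2 == a1)  # False
print(a3 == a1)  # True
```

For this assumed graph the first relabeling gives an isomorphic graph with a different adjacency matrix, while exchanging vertices 1 and 2 reproduces the original matrix exactly.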

0095-2338/93/1633-0197$04.00/0 © 1993 American Chemical Society

Figure 1. Illustration of graph isomorphism and automorphism. [Graphs G1, G2, and G3 with their adjacency matrices; permutation vectors P(1) = (1 2 3 4 5), P(2) = (4 5 3 2 1), P(3) = (2 1 3 4 5).]

Figure 2. Result of multiplying P^T(12), A, and P(12). [Matrices P(12), P^T(12) = P^-1(12), A(1)·P(12), and P^T(12)·A(1)·P(12) = A(2).]

If there exists a transformation P which transforms the graph G into itself (an isomorphism of a graph with itself), such a transformation is called an automorphism, denoted in the matrix relation as

$$P^{T} \cdot A(G) \cdot P = A(G) \tag{3}$$

Every graph possesses at least one automorphism, i.e., the one induced by the identity permutation. Besides the identity transformation, the graph G1 in Figure 1 has only one automorphism, which corresponds to the graph G3. The fact that the two graphs G1 and G3 are identical is shown by the identity of A1 and A3; it can also be verified by applying eq 3 with the permutation P3. Since every relabeling of the graph vertices corresponds to a vertex permutation, every symmetry operation on a graph can be represented by an automorphism, and conversely, every automorphism corresponds to some symmetry transformation of the same graph. Hence, it follows that the overall symmetry of any given graph (often called topological symmetry in the chemical literature) can be recognized by the identification of all graph automorphisms, which together form the automorphism group.

AUTOMORPHISM GROUP AND AUTOMORPHISM PARTITION

For some applications, like approximating the number of ¹³C-NMR signals expected for a compound of given structure, it is not necessary to identify the automorphism group of the graph. Instead, it is sufficient to identify the automorphism partition of the vertices of the graph. This partition separates the vertices into equivalence classes, which in many cases correspond to what are referred to as magnetic equivalences in the NMR literature. The vertices belonging to the same equivalence class cannot be distinguished by their graph-theoretical parameters. This property of the automorphism partition is also known as the topological symmetry.¹⁴

There is an important difference between the information content of the automorphism partition and the automorphism group which must always be kept in mind. All vertex permutations implicit in the automorphism partition (for example, the pairwise interchange of two vertices belonging to the same class) lead to isomorphic or automorphic graphs. Only those permutations which preserve the connectivity of the graph (or, equivalently, do not make or break an edge in the graph) as defined by eq 3 belong to the automorphism group. This is illustrated in Figure 3, showing a graph G and its automorphism partition. There are three classes in the partition. Three permutations of vertices within the same classes are shown (out of the total number of 6! different labelings and 8 possible combinations of intraorbit permutations). While permutations P1 and P2 lead to isomorphism, only P3 leads to automorphism since graph G3 is identical to the starting graph G. From the comparison of G and G3, one can also see that the automorphism P3 represents a symmetry operation. That is, the permutation matrix which corresponds to P3 satisfies eq 3. Permutation P3 is also the only automorphism of the graph G in Figure 3 besides the identity operation. The comparison of G1, G2, and G3 offers another, more pragmatic definition of automorphism as a permutation of graph vertices which does not disconnect any edge in the graph. This definition, while being more "graphic", is equivalent to the statement that the adjacency matrix of a graph does not change after an automorphic permutation of vertices.

Figure 3. Illustration of the generation of automorphism partition of the vertices. [Graph G and relabeled graphs G(1), G(2), G(3); automorphism orbits O(1): 1, 2; O(2): 3, 4; O(3): 5, 6; permutations P(1) = (2 1 3 4 5 6), P(2) = (1 2 4 3 5 6), P(3) = (2 1 4 3 6 5).]

ALGORITHMS FOR AUTOMORPHISM PARTITIONING

The Morgan algorithm¹⁵ is perhaps the first, and certainly the best known, of the vertex partitioning algorithms. Although originally developed to uniquely number the atoms of a chemical structure and not for automorphism partitioning, the algorithm has been used for the latter and is still popular because of its simplicity. The algorithm makes use of the concept of extended connectivity, the graph-theoretical basis of which was described later.¹⁶ A number of modifications of the original algorithm by other authors have improved its performance and applicability.

Figure 4. Some complex graphs.

Figure 5. Graphs containing isospectral points. [Panels a and b.]

The latest procedure of Shelley and Munk,⁸ a modification and extension of their earlier algorithm,⁷ is a combination of vertex classification (by the modified Morgan algorithm), subsequent depth-first construction of sequence number permutations, and the final use of automorphism to restrict the construction process. The automorphism partitioning is carried to its logical conclusion by calculating the necessary vertex permutations. Although one could label such a method as a "brute force" method, it is far from that; the combination of vertex classification and the early detection of graph symmetry by the depth-first search method of graph labeling reduces the number of permutations drastically and makes the procedure efficient.⁸ The modified Shelley-Munk algorithm⁸ has now been tested with a wide range of "difficult" graphs, starting with those which were given as counterexamples (Figure 4) for the original algorithm in a paper by Carhart.¹⁹ The modified algorithm, published 2 years later in 1979, does determine the automorphism partition correctly for these counterexamples. The same holds true for all other examples described below (Figures 5-8).
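The extended-connectivity idea behind the Morgan algorithm can be illustrated with a minimal sketch. It is a simplified textbook-style variant, not the modified classification used by Shelley and Munk: each vertex starts with its degree, values are repeatedly replaced by the sum of the neighbors' values, and the iteration stops once the number of distinct values no longer grows. The function names and the 6-vertex chain used as input are assumptions for illustration.

```python
def extended_connectivity(n, edges):
    """Return {vertex: extended-connectivity value} for vertices 1..n."""
    nbrs = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    values = {v: len(nbrs[v]) for v in nbrs}  # start from vertex degrees
    n_classes = len(set(values.values()))
    while True:
        # Each trial value is the sum of the current values of the neighbors.
        trial = {v: sum(values[w] for w in nbrs[v]) for v in nbrs}
        trial_classes = len(set(trial.values()))
        if trial_classes <= n_classes:
            return values  # no further discrimination gained: stop
        values, n_classes = trial, trial_classes


def classes_from_values(values):
    """Group vertices that share the same value into sorted classes."""
    groups = {}
    for v, x in values.items():
        groups.setdefault(x, []).append(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical 6-vertex chain 5-3-1-2-4-6 (assumed example graph).
chain = [(5, 3), (3, 1), (1, 2), (2, 4), (4, 6)]
ec = extended_connectivity(6, chain)
print(ec)                       # {1: 4, 2: 4, 3: 3, 4: 3, 5: 2, 6: 2}
print(classes_from_values(ec))  # [[1, 2], [3, 4], [5, 6]]
```

For this chain the three classes coincide with the symmetry classes of the graph; for other graphs a sketch of this kind can stop with classes that are too coarse.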

A review of the literature on topological symmetry perception revealed several papers which appeared years after the modified Shelley-Munk algorithm⁸ was published that did not cite it. Instead, referring to the original paper,⁷ they report the failure of the algorithm in the case of certain graphs, when in fact the modified algorithm performs correctly. We believe that such statements discourage use of a fast, rigorous algorithm that can be used advantageously in many chemical applications. Herndon⁹ in 1983, referring to the earlier algorithm,⁷ states: "A difficulty that remains in both methods [the second being the Morgan algorithm¹⁵] is that isospectral points will not be distinguished by either procedure". As an example, he cites graph (a) of Figure 5, with the isospectral points highlighted. As shown in the work described here, the statement is incorrect for the modified Shelley-Munk algorithm. Balaban et al.¹⁰ in 1985, also failing to recognize the 1979 paper, reinforced the perception of a flawed algorithm, incorrectly citing graph (b) (Figure 5) as a counterexample.

Balaban et al.¹⁰ compared the performance of three algorithms for vertex discrimination, ECA [Extended Connectivity Algorithm (Morgan Algorithm)], SEMA¹⁷ (Stereochemically Extended Morgan Algorithm), and HOC (Hierarchically Ordered Extended Connectivities), using the 16 graphs shown in Figure 6. Algorithms were compared by the number of vertex equivalence classes predicted for each of the graphs (Table I). The results obtained for those graphs using the Shelley-Munk algorithm⁸ have been added to those of Balaban. The Shelley-Munk and HOC algorithms perceived all automorphism equivalence classes for all 16 graphs, while the ECA and SEMA algorithms failed in nine and five of the examples, respectively. The sum of the equivalence classes for all 16 graphs (last row, Table I) can be viewed as a comparative measure of overall performance of these algorithms.

Figure 6. Test graphs. [Structures 1-16.]

Table I. Comparison of Performance of Various Algorithms for Automorphism Vertex Partitioning of Graphs in Figure 6

| structure | ECA | SEMA | HOC-1 | S-M |
|---|---|---|---|---|
| 1 | 5 | 5 | 6 | 6 |
| 2 | 4 | 4 | 5 | 5 |
| 3 | 4 | 5 | 5 | 5 |
| 4 | 5 | 6 | 6 | 6 |
| 5 | 5 | 6 | 6 | 6 |
| 6 | 7 | 8 | 8 | 8 |
| 7 | 4 | 4 | 4 | 4 |
| 8 | 5 | 5 | 5 | 5 |
| 9 | 9 | 9 | 12 | 12 |
| 10 | 6 | 6 | 6 | 6 |
| 11 | 11 | 10 | 12 | 12 |
| 12 | 6 | 6 | 7 | 7 |
| 13 | 6 | 6 | 6 | 6 |
| 14 | 9 | 9 | 9 | 9 |
| 15 | 10 | 10 | 10 | 10 |
| 16 | 9 | 9 | 9 | 9 |
| sum | 105 | 108 | 116 | 116 |

Finally, the Shelley-Munk algorithm was tested against a series of graphs containing isospectral points and highly regular graphs. The sets of graphs were taken from papers by Liu et al. and by King,²¹ respectively. It is well-known that the Morgan algorithm (and related procedures) has problems with graphs containing isospectral points since the extended connectivity values of isospectral points are equal. The modified Shelley-Munk algorithm yields complete separation of all isospectral points in the graphs shown in Figure 7. Regular graphs are graphs in which every vertex has the same degree. Figure 8 shows 19 regular graphs of degree 3. The correct number of automorphism equivalence classes was obtained by the Shelley-Munk algorithm for each of those graphs.

Figure 7. Graphs containing isospectral points. [Graphs 1-11.]

Table II shows the actual CPU times in seconds for some graphs in Figures 6-8 obtained by implementing the Shelley-Munk algorithm on a VAX workstation 3100. As seen from Table II, for all graphs in Figure 6 the CPU times range from 0.01 to 0.09 s. The existence of isospectral points did not introduce any additional complications or increase in CPU times, as seen from the CPU times for graphs in Figure 7, which contain multiple isospectral points. The most complex of the graphs in Figure 8 are those which contain several fused (clustered) cycles, such as the graphs labeled 12, 13, 15, 17, 18, and 19. The CPU times for even these graphs were 0.12-0.28 s, the maximum CPU time being for graph number 17 (0.28 s). As seen from Figure 8, the graph labeled 17 is quite clustered. On the basis of the CPU times and the correctness of the code for these cases, we conclude that the Shelley-Munk algorithm performs as well as other state-of-the-art algorithms for graph automorphism perception.
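The difficulty that regular graphs pose for extended connectivity can be seen in a small sketch. The 8-vertex graph of degree 3 below is a hypothetical example (it is not taken from Figure 8), and the exhaustive search over all n! permutations is a stand-in chosen for clarity, not the Shelley-Munk procedure. Extended connectivity assigns every vertex the same value, while the exhaustive search finds three equivalence classes.

```python
from itertools import permutations


def ec_class_count(n, edges):
    """Number of classes found by simple iterated extended connectivity."""
    nbrs = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    values = {v: len(nbrs[v]) for v in nbrs}
    count = len(set(values.values()))
    while True:
        trial = {v: sum(values[w] for w in nbrs[v]) for v in nbrs}
        trial_count = len(set(trial.values()))
        if trial_count <= count:
            return count
        values, count = trial, trial_count


def automorphism_orbits(n, edges):
    """Exhaustive orbits: test every permutation against the edge set."""
    edge_set = {frozenset(e) for e in edges}
    images = {v: set() for v in range(1, n + 1)}
    for perm in permutations(range(1, n + 1)):
        # perm[v - 1] is the image of vertex v; an automorphism maps
        # every edge onto an edge.
        if all(frozenset((perm[i - 1], perm[j - 1])) in edge_set for i, j in edges):
            for v in images:
                images[v].add(perm[v - 1])
    return sorted({tuple(sorted(s)) for s in images.values()})


# Hypothetical regular graph of degree 3 with 8 vertices.
cubic = [(1, 2), (2, 3), (1, 3), (1, 4), (2, 5), (3, 6),
         (4, 7), (5, 7), (6, 7), (4, 8), (5, 8), (6, 8)]

print(ec_class_count(8, cubic))       # 1
print(automorphism_orbits(8, cubic))  # [(1, 2, 3), (4, 5, 6), (7, 8)]
```

The exhaustive search is only practical for very small graphs, which is why methods that restrict the permutations to be tested matter in practice.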

Figure 8. Some regular graphs. [Graphs labeled A-S.]

Table II. CPU Times for Automorphism Partitioning Using Shelley-Munk Algorithm for Some Graphs in Figures 6-8ᵃ

| graph | CPU (s) | graph | CPU (s) |
|---|---|---|---|
| Figure 6 (1) | 0.01 | Figure 7 (2) | 0.03 |
| Figure 6 (2) | 0.02 | Figure 7 (3) | 0.03 |
| Figure 6 (3) | 0.03 | Figure 7 (4) | 0.03 |
| Figure 6 (4) | 0.02 | Figure 7 (5) | 0.03 |
| Figure 6 (5) | 0.04 | Figure 7 (6) | 0.05 |
| Figure 6 (6) | 0.04 | Figure 7 (7) | 0.02 |
| Figure 6 (7) | 0.02 | Figure 7 (8) | 0.04 |
| Figure 6 (8) | 0.04 | Figure 7 (9) | 0.03 |
| Figure 6 (9) | 0.02 | Figure 7 (10) | 0.01 |
| Figure 6 (10) | 0.09 | Figure 7 (11) | 0.05 |
| Figure 6 (11) | 0.02 | Figure 8 (N) | 0.12 |
| Figure 6 (12) | 0.08 | Figure 8 (O) | 0.13 |
| Figure 6 (13) | 0.06 | Figure 8 (P) | 0.13 |
| Figure 6 (14) | 0.05 | Figure 8 (Q) | 0.28 |
| Figure 6 (15) | 0.06 | Figure 8 (R) | 0.17 |
| Figure 6 (16) | 0.06 | Figure 8 (S) | 0.12 |
| Figure 7 (1) | 0.03 | | |

ᵃ All CPU times are in seconds on a VAX workstation. The number in parentheses is the structure number in the corresponding figure.

ALGORITHMS FOR CONSTRUCTION OF AUTOMORPHISM GROUP FROM AUTOMORPHISM PARTITION

Most of the algorithms developed to date yield only the automorphism partitioning of the vertices of a graph. As indicated earlier, the automorphism group and the automorphism partitioning of vertices are not the same. While knowledge of the automorphism group can provide the automorphism partitioning of the vertices, the converse is not true. The question is how the partitioning can be related to the automorphism group. An algorithm has been designed to generate the automorphism group of a graph from the automorphism partitioning of the vertices. This algorithm reduces the n! checking (n = number of vertices) to a much smaller number for several graphs of chemical interest.

Suppose the vertices of a graph G are partitioned into sets V1, V2, ..., Vm under the automorphism. Let the number of elements in Vi, denoted by |Vi|, be ui and the total number of vertices in the graph be n. Then an automorphism can only induce a permutation on the vertices of G such that all its orbits are within the same Vi set. That is, a permutation P of the vertices of a graph cannot be an automorphism permutation if it interchanges an element in a set Vi with another element in a set Vj (i ≠ j). The automorphism group of any graph is a subgroup of $S_n$, where n is the number of vertices and $S_n$ is the set of all n! permutations of n objects. However, if the vertices of G are partitioned into more than one V set as above through the Shelley-Munk algorithm or other algorithms, then the automorphism group of G, denoted by Γ, has to be a subgroup of $S_{u_1} \times S_{u_2} \times \cdots \times S_{u_m}$, where $u_i = |V_i|$. Symbolically

$$\Gamma \subseteq S_{u_1} \times S_{u_2} \times \cdots \times S_{u_m}$$
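The restriction to permutations that keep every vertex inside its own class suggests a direct search, sketched below under stated assumptions. The function name `automorphism_group`, the 6-vertex chain, and its partition into three classes of two vertices are hypothetical; they are chosen to be consistent with the orbits quoted for Figure 3, but the graph itself is not taken from the paper. The sketch tests each candidate against the edge set, which is equivalent to checking eq 3, and does not use group closure or inverses to skip candidates.

```python
from itertools import permutations, product
from math import factorial, prod


def automorphism_group(n, edges, classes):
    """Return all automorphisms (as image tuples) compatible with `classes`.

    Only permutations that rearrange vertices inside their own class are
    generated, so at most u1! * u2! * ... * um! candidates are tested.
    """
    edge_set = {frozenset(e) for e in edges}
    group = []
    for choice in product(*(permutations(c) for c in classes)):
        image = [0] * (n + 1)  # image[v] is the vertex that v is mapped to
        for cls, arranged in zip(classes, choice):
            for v, w in zip(cls, arranged):
                image[v] = w
        # Keeping every edge an edge is equivalent to P^T . A . P = A.
        if all(frozenset((image[i], image[j])) in edge_set for i, j in edges):
            group.append(tuple(image[1:]))
    return group


# Hypothetical 6-vertex chain 5-3-1-2-4-6 with classes {1,2}, {3,4}, {5,6}.
chain = [(5, 3), (3, 1), (1, 2), (2, 4), (4, 6)]
classes = [(1, 2), (3, 4), (5, 6)]

candidates = prod(factorial(len(c)) for c in classes)
print(candidates, "candidates instead of", factorial(6))  # 8 candidates instead of 720
print(automorphism_group(6, chain, classes))
# [(1, 2, 3, 4, 5, 6), (2, 1, 4, 3, 6, 5)]
```

Of the 8 intraclass combinations only the identity and the simultaneous exchange of all three pairs survive the test, which mirrors the role of the permutation (2 1 4 3 6 5) in Figure 3.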

Our algorithm generates permutations in the group $S_{u_1} \times S_{u_2} \times \cdots \times S_{u_m}$ and checks if a $P \in S_{u_1} \times S_{u_2} \times \cdots \times S_{u_m}$ satisfies

$$P^{T} \cdot A \cdot P = A$$

where A is the adjacency matrix of the graph under consideration. The maximum number of such checkings can be seen to be $u_1! \times u_2! \times \cdots \times u_m!$. However, in practice one can use the properties of a group to eliminate numerous checkings. Suppose $P_1, P_2, \ldots, P_k$ already satisfy the automorphism condition. Then it is clear that $P_1^{-1}, P_2^{-1}, \ldots, P_k^{-1}$ must belong to Γ since Γ is the automorphism group. Likewise, all possible products $P_i P_j$ (i = 1, ..., k; j = 1, ..., k) must also belong to Γ. It is also true that the powers $P_i^2, P_i^3, \ldots \neq e$ (identity) must be in Γ. This property can be used if a $P_i$ happens to contain a longer orbit. Implementation of these procedures in the algorithms to construct Γ can dramatically decrease the computational time in generating Γ. Of course, in the worst case, when the graph contains only one automorphism-partitioned vertex set, the number of computations in the algorithm can be large. However, while such graphs occur in isomerizations of chemical reactions (reaction graphs), they are not commonly found in structure elucidation problems.

A version of this algorithm was implemented on a VAX workstation. Although the present code does not exploit the closure and the existence of inverse elements in a group, we found that the algorithm does yield the correct automorphism group for all the graphs considered in this study as well as other graphs of potential use in stereoisomer generation. The CPU times for generating the automorphism groups of most of the graphs in Figure 6 were less than 0.5 s on a VAX workstation 3100. The CPU times for graphs 8 and 14-16 in Figure 6 were somewhat larger (4.5-90 s), the largest being for graph 8 in Figure 6. The CPU times for generating the automorphism groups of the graphs in Figure 4 were between 5 and 12 s on a VAX workstation 3100. The first two graphs in Figure 8 took 0.3 and 0.8 s of CPU time, respectively. Graph 9 took 7.8 s, while graph 13 took 2.8 s. Thus, we conclude that the CPU times are small for graphs with low symmetry and somewhat larger for graphs with higher symmetry. The automorphism groups thus generated are being used to enumerate and construct the stereoisomers of organic compounds. This topic will be discussed further in a future publication.

CONCLUSION

The Shelley-Munk algorithm for the perception of topological symmetry was published first in 1977 and in a modified and improved version in 1979. It is a fast, efficient, and rigorous procedure for automorphism partitioning of graph vertices. It has been shown in this paper that the modified algorithm for topological symmetry perception of Shelley and Munk⁸ can correctly solve the automorphism partitioning of all pathological graphs, including graphs containing isospectral points, found in later papers.⁹,¹⁰ Additionally, an algorithm to generate the automorphism group of a graph from its automorphism partition of the vertices has been described.

ACKNOWLEDGMENT

The financial support of this work by the National Institutes of Health (GM 37963) and Sterling Drug, Inc. is gratefully acknowledged.

REFERENCES AND NOTES