Computer enumeration and generation of trees and rooted trees

Isomer Generation: Semantic Rules for Detection of Isomorphism. Journal of Chemical ... On Characterization of Molecular Shapes. Journal of Chemical ...
0 downloads 0 Views 994KB Size
J . Chem. Znf. Comput. Sci. 1981, 21, 91-99

91

Computer Enumeration and Generation of Trees and Rooted Trees J. V. KNOP, W. R. MULLER, 2. JERICEVIe, and N. TRINAJSTIe* Computer Centre, The University of Diisseldorf, 4000 Diisseldorf, Federal Republic of Germany

Received June 20, 1980 A computer-adopted method for enumerating and plotting trees (alkanes) and rooted trees

(substituted alkanes) with Nvertices (carbon atoms) is described. Results are compared to some previous work in this area. INTRODUCTION This paper presents a computer-oriented method for enumerating and plotting trees (alkanes) and rooted trees (substituted alkanes). A tree is a connected graph with no cycles.' A rooted tree is a tree in which one vertex has been distinguished from others.z This vertex is usually called the root. Alkanes C N H ~ Nare + ~ represented by trees in which the maximal vertex degree isfour. Alkane trees used here depict only carbon skeletons of alkane hydrocarbons. Substituted alkanes CNHzN+IX are represented by hydrogen-suppressed rooted trees in which the maximal vertex degree is also four.

primary r o o t

secondary root

tertiary root

1-hydroxy-2-methylbutane

2-hydroxy-3-methylbutane

(primary alcohol1

(secoddary alcohol)

2-hydroxy-2-methylbutane (tertiary alcohol)

0 1

I

d

b tree

rooted tree

r3

P

2-methyl-propane

2-methyl-propane i lree)

graph

CH 3

G 3

c
GO10 4 0 0 no p r e v i o u s c o n d i t t o n

'

*1n1=0 i f t h e b r a n c h b e g i n n i n g a t F A T H E R ( I j e x t e n d s t o t h e end of t h e H-tYple,thiS e r t 8 b l i l h e I II new L B C IflUSEn(PATER).EP.PATER) nINI=(KK-USl+8RAI)~BRAI a s s i g n a l l r e m a i n i n g edges t o C u r r e n t v e r t e x TI-N-US1

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

OLDREFSREF OLDORG=ORG t e s t f o r h o n c o n o r p h i c a l i r r e d u c i b i l i t y (PROP3) lF(Tl.E9,1) NOlHIR=.TRUE. nCTID(l)=NOTIDT NCHIR(I)-NOTHIR t e s t i f current loop alreaoy exhausted lF(Tl.GT.01 GOTO 4 9 0 r f the f o l l o u l n g branch 1s taken,"* have fhe Only r e m s i n i n g f a i l u r e c a r e m e n t i o n e d above iF(UINI.GT.0) GOTO 920 CONTINUE If(.NOT.RERUN) GOTO 4 9 5 o t h e r v e c t o r s have been i n i t i a l i r e d a c c o r a i n g to t h e r e i d v a l u e s f o r TREE RERUN=.FALSE. If(REf.LE.0)

GOTO S O 0 i f t h e l a s t RLOC 1 3 3 tun

identical

fulfilled

by e i h s u s t r O n , u e have a non-identica:

~ ~ b f u ~ al ned s t h e r e f o r e

500

'

simply a l l rooted trees ranted G O T O 800 c l a s s i f y rooted t r e e ( t e s t whethe? the g i v e n N-tuple 1 t k r L~1 a r o e s t o b f a r n a b l e a b o v e t h r ~ 0 r r e r p o n d i n Qt r e e ) UAXTEX=.TRUE. IF(N.LT.3) GOTO 530

If(N0TREE)

c

c

t h e f i r s t component never s e t t l e r a non-rdentical automorphism

G OTO ..

a p p l y (PROPA) NOTUlR=NOHIRlI-1) 1F(TI.E9.1) NOTHIR=.TRUE. K=l+? NOHIR(l)=NOTHIR NOT1 D T * N O T l D ( 1 > D O 4 9 0 1;K.N 1FtTl.GT.Oj GOTO 1 1 0 D i e c e d i n g v e r t e x was f i n a l J=l-l i s J brother of I ? If(BRA(J).GT.l) GOTO 1 3 0 no,go b a c k one g e n e r a t r o n J=FATHER(J) happens i n f r e q u e n t l y b u t O Y C T ~ L L o n l y once f o r GOTO 1 2 0 J 7s brother of 1 REF'I P r e p a l F f o r " C Y RLOC oRG=l same f a t h e r n a t u r a l l y PATER=FATHER(J) 1 7 1 n e i r b r a n c h 01 f a t h e r BRAI=BRA(J)-I GOTO 1 8 0 pveceding vertex 7s father,no elder b r o t h e r t o be r e s p e c t e d REf.0 PATER=1-1 f i r s t brother 8RAI;TI

c

c

ORGiO

GOTO 1 0 5 I f no r o o t e d t r e e s a r e i a n t C d , d v O l d the generation o f any N - t u p l e s f o r w h i c h t h e f i r s t Conponrnt I f n o t t h e I t r o n Q l I IdlQielt IFlNOROOT) U A X I k = T l - l apply (PROPI) NOTHIR*Ti.E9.2 r v i t s h ALKANE B L I O I I no more t h a n 4 e 0 f g e ~ a t a v e r t e x lF(.NOT.ALKLNE) GOTO 1 1 0 lF(lI.GT.&) TI=& TREE(l)=TI IF(UAXIU.GT.3) UAXlM=3

c

c c

tree5

49s

OLDREF;: If(I.GT.1)

c

9G

c c

i f t h e following l l n e i s r e a c h e d f r o m a b a r e , t h e first component 3 % c o n f e r n e d , i h i c h n e v e r ha3 any R L O C ; i t r e a c h e d fTOn b e l o r , t h e r e e x i s t s no R L 0 C , b e ~ d u 1 e t h e c o m p o n e n t c o n c e r n e d h a s j u a r b e e n d e c r e m e n t e d f r o m an a l l o w e d v a l u e , u h i s h d e f i n i t e l y f u l f i l l s a n y o l d RLOC

c

I U P L I C I T INTEGER (A-1) c L O G I C A L TREEKO c LOGICAL NOROOT,NOTREE,NOTEST,NOPLOT,SPPLOT,ALKANE lo0 LOGICAL NOTID,NOHIR,NOTIDT,NOTUIR,UAXTEX,RERUN DlPlENSlON T R E E ( 3 0 ~ , 8 R b ( J O ~ , F A T H E R l 3 ~ l C DlUENSlON UfNIU(3O),USED(30) c DlUENSION NOTID(30l,NOHlR(30) C TREE(1) c o n t a i n s t h e c o n t r o l l e d v * r i i l b l e o f t h e I - t h c n e s t e d " v i r t u a l " loop and t h e r e f o r e v i t h r n t h e N - t h ( i n n e r m o s t ) l o o p TREE c o n t a i n s t h e N - t u p l e d e f i n e d c above f o r the a c t u a l (rooted) t r e e FATHERlI) c o n t a i n 3 t h e rnder of t h a t v e r t e x from which v e r t e x 1 v a s r e a c h e d f i r s t ( t r e e - t h e o r e t i c f a t h e ? o f I) i.e. 1 i S one O f t h e v e r t i c e s C o u n t e d b y T R E E ( F A T H E R ( 1 ) ) B R A C I ) c o n t a i n s t h e number o f r e m a i n i n g branches o f v e r t e x PATHER(1) i n c l u d i n g t h a t b e g i n n i n g a t I c U S E D ( 1 ) d e n o t e s t h e n u n b c r O f e d g e s ( p l u s 1 ) u s e d up t o 101 b u t n o t i n c l u d i n g 1,i.c. t h e sum T R E E ( l ) + . . . + T R E E ( I - l ) NOTLO(1) v h r n . T R U E . S a y s t h a t l R E E ( l > , . . .,TREE(I) do 110 n o t represent t h e beginning o f an i d e n t i t y t r e e N O H I R ( 1 ) when .TRUE. s a y 1 t h a t no t r e e h a v i n g t h e s a m e f i r s t 1 c o n p o n e n t r a s TREE c a n b e h o m e o m o r p h i c a l l y rrreduc7ble I I I N I U ( 1 ) d e s c r i b e s t h e Lower b o u n d c o n d i t i o n ( i n t h e c sequel a b b r e v i a t e d L B 0 , i . e . IIINIM(1) > 0 says t h a t t h e l o u e s t a l l o u a b i e v a l u e f o r TREE(1) i s 1 c TI,PATER,8RAl,USI,NOTIDT,NOTHIR,M1HI serve t o o p t i m i z e 120 a c c e s s to t h e c u r r e n t c o m p o n e n t o f t h e ~ c s pv e c t o r c TREE,FATHER,8RA,USED.hOTIO,NOH~R,IIlN~II c REf a n d ORG d e s c r i b e t h e a c t u a l r e v e r s e l e x i c o g r a p h i c a i d e r c o n d i t i o n (RLOC i n t h e r e a u c l ) c REF c o n t a i n s t h e i n d e x O f t h e v e r t e x # h o s e n u m b e r o f 130 b r a n c h e s m u s t b e >I T R E E l l ) t o a c h i e v e RLO o f b r a n c h e s c ORG C o n t a i n s t h e i n d e x o f t h e v e r t e x v h l c h w a s f i r s t S u b l e c t e d t o t h e p r e s e n t RLOC c O L D R E f a n d OLDORG c o n t a i n REF a n d O R G f r o m t h e p r e v i o u s posit ion c RERUN 7 5 .TRUE. during the f i r s t pais through the loop r f r e s t a r t i n f o r m a t i o n was d e l i v e r e d , duping such a f l r i t p a s s no H - t u p l e I S g e n e r a t e d b u t t h e o t h e r v e c t o r s c a r e VeCOnDtiUCted c HlRS,IOTS,TREES,ROOlS,TESTS,MAROTS are respectively 150 homeomorphically i r r e d u c i b l e treer,identity trees, trees,rooted treer,calla t o TREEK0,raoted trees with c maximal f i r i t component component^, U A X I M c o n t a i n s an a d d i t i o n a l u p p e r b o u n d f o r c If NOROOT i s .FALSE. t h e n UAXIM i S t Set t o N and h a s no 180 c effect,if NOROOT 1 s .TRUE. t h e n U A X I M i s s e t t o t h e f i r s t component TREE(1) m i n u s 1 t h e r e b y a v o i d i n g t h e c generation o f those rooted t r e e s which obviously are n o t the trees c UAXTEX i s o n l y u s e d t o r e m e m b e r t h e r e s u l t o f a t e s t c ( I P e t t o .TRUE. o r i g i n a l l y a n d w i t h i n a DO-lOOp,it c r e s e t t o .FALSE. i f t h e examined r o o t e d t r e e must be t e s t e d b y TREEKO t o d e c i d e a b o u t t r e e - p r o p e r t y *a** i n p u t d a t I c NOROOT w h e n .TRUE. s a y s t h a t t h e program s h a l l as f a r 190 a S Possible avoid genersting looted trees rhiCh u r l l o b v i o u s l y b e no t r e e s ( o t h e r w i r e t h o s e r o o t e d t r e e s a r e c generated i n Order t o be counted) 200 NOTREE w h e n .TRUE. I I Y 5 t h a t t h e Crogran % h a i l p l o t a l l c l o o t e d t r e e s and may t h e r e f o r e o m i t a n y t e s t s f o r 300 tree-property c N O T E S T # h e n .TRUE. s a y s t h a t t h e p r o g r a m s h a l l n o t use t h e f u n c t i o n TREEIo ( w h i c h i m p l i e s t h e n t h a t lame t r e e s c nay appear t h a n o n c e ) . T h i s was b u i l t i n a s a t e s t this f a c i l r t y t o m e a s u r e t h e t i m e s p e n t i n TREEK0,ar c i s t h e only p a r t f o r u h r c h t h e expense c o u l d not be 310 t o b e b e L o u o r e q u a l t o t h e same O r d e r c a s t h e number o f v e r t i c e s i n t h e g e n e r a t e d t r e e s . c N O P L O T w h e n .TRUE. s a y s t h a t no g r a p h i c O u t p u t s h a l l b e c p r o d Y C ed 400 S P P L O T w h e n .TRUE. say5 t h a t Only i d e n t , t y t r e e s and h o n e a n o r p h i c a l l y i r r e d u c i b l e t r e e s s h a l l be p l o t t e d c C N d e n o t e s t h e a c t u a l n u m b e r o f v e r t i c e s f o r one t r e e A L K A N E w h e n .TRUE. says t h a t o n l y (rooted) t r e e s " 4 t h atmost f o u r edges a t any v e r t e x s h a l l be g e n e r a t e d c N U I N a n d NUAX a r e l o v e r a n d u p p e r b o u n d s f o r N ( 7 f on i n p u t H M I N < = N < = N I I A X t h e n t h e p r o g r a s e x p e c t s 8% f u r t h e r i n p u t a n tuple a n d g e n e r a t e s n e w N - t u p l e s c b e g i n n i n g j u s t b e l o w t h e g i v e n one, t h e o n l y n e C e s s a r Y 420 c h e r k p o i n t l r e r t a r t i n f o r m a t i o n t o resuae t h e p r o d u c t i o n c o f t r e e s i s t h e last v a l i d ( r o o t e d ) t r e e generated) 460 ~ ~ ~ ~ ( 5 , 1 0 N0 M0 I ) N , N U A X , N , N O R O O T , N O T R E E , N O ~ ~ ~ ~ ~ N O ~ L O ~ ~ FORUAT(3I2,lOLl) IF(N.LT.NMIN) GOTO 2 0 READ(5,lOOl) (TREElI), Ii1.N) fORUAT(3012) RERUN=.TRUE.

TI'TREE(1 ) - 1 D O 52D 1 = 2 XK ;OO:?f a n o t h e r component i s a s l s i g e a2 t h e f l : s t , ? I w i l l p l o d u c e a l e x i c o g i a p h i c a l l y l a r g e r N - t U p l e when u s e d as t h e r o ~ f , t h e r e f o r e t h i s i s d e f i n i t e l y no t h e SYpprcSlel t h l l Case) tree-representative (NOROOT=.TRUE. 51O:if t h e f i r s t component i s n o t b y 2 g r e a t e r t h a n any o t h e r o n e , t h i s o t h e r v e r t e x m u i t b e t C I t C d b y TREEKO w h e t h e r 1 t a s t h e roof d e l i v e r s a l a r g e r N - t u p l e

COMPUTER ENUMERATION AND GENERATION OF TREES 510

520 530 C

c

IF(TREE(I)-Tl) 520,510,900 MAXTEX=.FALSE. CONTINUE IAROTS=MAROTS+l can tree-property I F ( M A X 1 E X ) GOTO 800

J. Chem. If. Comput. Sci., Vol. 21, No. 2, 1981 95 639 640

be seen w i t h o u t

a s a i l to T R E E K O ?

c

TESTSSTESTSIl C

C a l l t o TREEKO w a n t e d ? GOTO U O O test for tree-property 1F(TREEKO(h,TREE,NOTIDT)) GOTO 9 0 0 LO""1 cases TREES=TREES+l 1F(.NOT.NOTIQT) IDTS=IDTS+l IF(.hOT.NOTnxR) H l R s.= H . r a. r+i g r a p h i t s f o r t h i s c a s e wanted ? IF(SPPLOT.AND.NOTHlR.AhD.NOTIDT) GOTO 9 0 0 g r a p h i c s Yented 7 I F ( N O P L 0 T ) GOTO 8 9 9 p?DdYcI giephlCI CALL PLTREE(h,TREES,TREE) CONTINUE CONTINUE * * * * * + * P o s s i b l y i n s t a l l a t i o n d e p c n a a n t *+.+*+. t h i s would be t h e l i g h t p l l c c f o r a p e t l o d i s a l s t o r i n g of Checkpoint information (only vector TREE a n d ( I f n e e d e d ) t h e C n Y D l C r l t l o n C O U n t L I s HIRS,IDTS, TREES,ROOTS,TESTS,.. M A R O T S must be s a v e d ) "0

IfCNOTEST)

C

C

800 C C C

899 900 C C C C C C

c 910

650 C C

c

C C C

660 C C

670

f o r N < 3 t h e r e i s o n l y one r o o t e d t r e e GOTO 960 f i n d innermost non-exhausted "Yi?tUal" loop

IF(N.LT.3)

C

C C

FOUnt

C C

1=*-7

920 930 940

YI=T~EEII)-~ lf(1I)940,930,950 l f ~ M l N I M ~ I ~ . L E . GOTO O ~ 950

C

l=l-1

c

GOTO 9 2 0

c

advance t h i s

loop

950

TREE(I)=TI

960

Outermost " v i r t u a l " loop exhausted ? IF(TREE(l).GT.IINROl) GOTO 1 0 0 * + * + * * * 3 L i n e s i n S t a L L . t i O n dependlnt *t*+*+* i n s t a l l a t i o n dependent r o u t i n e t o get CPU-tile CPU=TIME(I)

C C C

C C

used

675 C C

c C

680

n e x t nunbar o f v e r t i c e s "anted ? lF(N.LE.NMAX1 GOTO 50 i n i t a l I a t i o n dependent p l o t t e p - t e r . i n a t i o n * + * + * + * i n s t a l l a t i o n d e p e n d i n t *+et*+. lI(.NOT.NOPLOT) C A L L PEhD STOP END L O G I C A L FUNCTION TREEKO(h,TREE,hOTIDT)

C C

...

c

C C C

c c

t e s t

c

f o r

t r e e

1=1 690 700

c

rid

C C C

T l i T. l E. f.I 1

C

c C C

C

GOTO

C 610

C C C C

720

K=l-Vtl GOlO 7 4 0 If11.6T.fINV)

C

730 7bO C

750 C

620

770 77s 780 C C

790 795

c C C

c for

valid h-tuple

730 Of

nor r o o t

smaller branches Of o l d root K.1-FINVtJ-1 PLRM(SON,K)~PERM(FATHER,I) l=OLdrK.nll position PERMUT(I)*K CONTINUE c o l l e c t t r a c e i n f o r m a t i o n o f applies p e r m u t a t i o n s DO 7 6 0 I.1,N TfI=lRACE(fATHER,I) lRACE(SON,I)*PERMUT(Tfl) happens i n f r c q u c n t i y b u t OVLIILL o n l y once for e s c h L - v a l u e GOTO 650 compare N - t u p l e s CONTINUE D O 7 8 0 1.l.N If(PERM(L,I)-PERM(l,I)) 790,780,775 NOTREE=.TRUE. GOTO 7 9 5 CONTINUE tuples arc equa1,i.e. lRACE(L,*) deicribcr I non-identical .utonorphism NOTlOT=.TRUE. CONTINUE CONTINUE TREEKOSNOTREE RETURN

.....

ttt..*tt*.t.t~.".~...**.*".~*.

t

g r a p h i c

o u t p u t

t**tt...t**.*t..~t.*.t...~..~*~~~

I M P L I C I T INlEGER ( A - 1 ) REAL X , Y , X I , Y I , X J , Y J , X O , I O , X O , Y D , S l 7 . E , Y I D E , L E N G LOGICAL V A L I D DIMENSION TREEOO),VALID(30) DIRENSION f A T ~ E R ~ 3 1 ~ , B R A N C H l 3 1 ~ , F I N ~ L S ~ 3 1 ~ DIMENSION X ~ 6 0 , 3 0 ~ , ~ ~ 6 0 , 3 0 ~ , X I ~ 3 O ~ , I ~ ~ 3 0 ~ TREE 4 % d e s c r i b e d i n t h e m a i n P r o g r a m BRANCH(1) c o n t a i n s a t a n y m O m m t , r h e n s c a n n i n g t h e Ariadne-thread,the n u s b e r Of yet unused edges S t a l t i n g

C

.

new r o o t

SUBROUTINE PLTREE(N,SERNU(I,TREE) C C C

t h e r o o t a n d must

vALID(I)=.FALSE. m a r e u s e f u l when r e a r r a n g i n g PERM(1,l)iTO f i r s t h-tuple i s defined VALID(l)=.TRUE. ieacch fifhCr,grPndf.ther,..

760

c have

620

C C

GOT0

Of

K=ItN-FINV GOTO 7 4 0

c

b r a n c h used BRASP=BRANCH(SP)-l BRANCH(SP)=BRASP lF(BRASP.GT.0) GOTO 6 2 0 r f no n o p e b r a n c h e s u n s t a c k SP-SP-1 i f TREE h a $ b e e n b u i l t a c c o r d i n g t o d e f i n i t i o n , t h e f o l l o w i n g GOTO w i l l a l w a y s b e t a k e n happens i n f r r s u e n t l y b u t o v c r l l l o n l y once f o r r a s h L - v a l u e IF(SP.GE.1) GOTO 6 1 0

GOTO 7 2 0 larger branches

rmillcr branches

C

C

630

710

>

i n i t i a t e s t a c k p o i n t e r and f i r s t s t a c k elements SPil POs(t)=l BRANCH(l)=Tl L a s t v e r t e x i s a l w a y s f i n a l and t h e r e f o r e c a n n o t 8 large? N-tuple KK=N-I inspect other vertices D O 7 9 0 L=Z,KK number o f b r a n c h e s TL=TREE(L) 1fCTL.LE.O) GOTO 610 *tack "on-final vertex SP=SP+l P O S ( S P ) =L BRAhCH(SP)=Tl

o l d root

I_"

GOTO 7 4 0 If(1.GE.J)

C

TOiTl-1 C

C

--..., "=.A

f o u n d to b e no i d e n t i t y t r e e P O S a n d BRANCH a r e s t a c k s U s C d , l h e n s c a n n i n g t h e A r i a d n e t h r e a d d e f i n e d by t h e N-tYple,tO S t o r e i n d e x and r e m a i n i n g number o f ions f o p a c t u a l f a t h e r s V A L I D ( 1 ) s a y s w h e t h e r PERM(I,t) a n d TRACE(I,*) have been defined PERRUT s t o r e s t h e l a s t p e r m u t a t i o n PERPIC1 * ) c o n t a i n s t h e h - t u p l e f o r I a s t h e r o o t TRACE(;.J) c o n t a i n s t h e p o s i t i o n Of Vertex J f r o n t h e o r i g r n l l h-tuple i n t h e N-tuple fer root I UNOEC s a y s w h e t h e r t h e c a s e i s y e t u n d e c i d e d Y A L D A T S I V I whet he^ PERM a n d TRACE h a v e a L r C # d Y b e e n in7iiali NOTREE i s a t e m p o r a r y c o n t a i n e r f o r t h e f Y n s t i o n r e s u l t VALDAT=.FALSE. r o o t e d t r e e n o t y e t found t o be n o t t h e t r e e NOTREE=.FALSE.

C C

C

UNDEC=.TRUE. CONTINUE CONTINUE DO 7 5 0 I=l,N IF(1.GE.V) G O l O 710 Larger b r a n c h e s O f

C

c c c

C

f i n d p o s i t i o n f o r new b r a n c h O f n e w r o o t ( o l d r o o t a n d remaining branches of old root) UNDEC=.TRUE. JK=J D O 690 KrJK,FINV HELP=PER(l(FATHER,I)-PERM(FATHER,K) L e a v e Loop i f f i r s t d i f f e r e n c e b a t u c c n o l d a n d n c r branch i s p o s i t i v e IF(HELP.GT.O.AND.UhDEC) GOTO 7 0 0 If(HELP.GE.0) GOTO 6 7 5 f i r s t d i f f e r e n c e i s negative.10 t h i s old branch i s L a r g e ? t h a n t h e new o n e UNDEC=.FALSE. GOlO 6 8 0 1.1+1 n e w b r a n c h o f new root i s b u i l t b y c o n c a t e n a t i n g remaining branches e f old r o o t lF(1.EP.V) I*FINVtl t e s t cod o f o l d b r a n c h o f n r r r o o t SUM=SUM*PERM(fATnER,K)-1 IF(SUM.GE.0) GOTO 6 9 0 o l d b r a n c h o f ncr m o t e X h i u i t e d , L C a v r the loop if i t w a s n o t l a r g e ? t h a n new b r a n c h ( n e w o n e m a y b e D U t i n front of this old brmsh) I F ( U N D E 0 GOTO 7 0 0 o l d b r a n c h l a ' l a r g e r t h a n n C l , S e t ~ ~to c O m p l r C new t o next old branch J=K+l

(I

tt**.*..~t~l~~*.*~***~~~~*~*

C C C C C C C C

sun=o

sun-o

tt.*t*..t*.lt**.t......tt*****t

C

JJXSP POSJJ=POS(JJ) I F ( V A L I D 1 P O S J J ) ) GOlO 650 JJSJJ-1 h a p p e n s i n f r e q u e n t l y b u t o v e r a l l o n l y once f o r c a s h L - v a l u e GOTO 6 4 0 g o a n d c o m p a r e when N - t u p l e f o r v e r t e x C o n c e r n e d h a s been developed IF(JJ.GE.SP) GOlO 7 7 0 N-tuple d e f i n e d FATHER=POS(JJ) JJ=JJ*l N - t u p l e to b e d r f i n c a SON*POS(JJ) p o s i t i o n o f t h e son i n t h e N - t u p l e O f t h e f a t h e r V=TRACE(FATHER,SON) VALID(SON)*.TRUE. 1.1 b e g i n n i n g o f f i p s t o l d b r a n c h O f n e w loot J*Vtl s e a r c h t h e e x t e n t o f t h e r u b t r e e b e g i n n i n g a t V,by v i r t u e O f (PROP1) BUM=PERM(FATHER,V)-l F1NV.V IF(SUM.LT.0) GOTO 6 7 0 sum i s n o t n e g a t i v e , t h r r r f e r e t h e next vertex belongs to t h e r u b t r e e a s r e 1 1 FINV.fINVt1 SUM~SUMtPERI(fAlHER,FlNV)-l GOTO 660

.t y c r t e x I f A T H E R ( 1 ) c o n t a i n s t h e i n d e x O f t h a t v e r t e i tiom w h i s h v e r t e x I was p e a c h e d f i r s t I t r e e - t h e o r e t i c f a t h e r o f I), 1.e. 1 i s m e O f t h e vLrtiC.s C o u n t e d b y TREE(FATHER(1)) FINALS(1) c o n t a i n s t h e number o f f i n a l v e r t i c e s u h l c h can be r e a c h e d f r o n v e r t e x 3 t h r o u g h t h i s v e r t e x I xI(I),YI(I) contain the plot-cmrdinltCO Of Vertex 1 XJ,IJ c o n t i i n the plot-ceordin.tcr o f t h e current v e r t e x of vertex 1 X0,YO c o n t a i n the plot-coordin.trr X(K J ) Y(K,J) c o n t a i n ( o n l y when VALID(J).EP..TRUE.) the p l . ; - C 0 0 ~ d i n l t C - a i f f ~ ~ ~ " f~o~r * a l l n e e d e d d i I C C t l 0 n S O f e d g e s (to s a v e S I N a n d COS I v a l Y a t i e n r ) XD,YD contain pl~t-C001dln.t~-diff~~C.SCI for V C I t C I 1 Of different trees

KNOP ET AL.

96 J. Chem. If: Comput. Sci., Vol. 21, No. 2, 1981

Table 1. Number of Trees and Rooted Trees with N Vertices and the CPU Time Needed for the Calculation

c

no. of vertices ( N )

no. of treesa

no. of rooted treesa

CPU time,b s

1 1 2 4 9 20 48 115 286 719 1842 4766 12486 32973 87811 235381 634847 1721159 4688676 12826228

1 2 4 12 32 90 24 8 689

c

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

IOiYD 81:

XCiXGiXD CCNTiIUE

c

"refL,

PATERiNil QHANCH1PATER)iPATER iINALSlPATER)=G C0Y"t f >"a 1 vert ' cer D G 830 l = l , h branclei i t a r t l r g at 1 ~ I i T R E E ( 1 ) f a t h e r i s the youngest vertex r l t h oranche~ IP'HER(I)=PATER . r ~ : i a l l y a c t b r a n c i e i a r e unused QRANCHlI)=l1 1 s Yertex 1 final 7 1FiTI.GT.G) GOTG 8 2 5

c c

C

1 1 %

C

825

c c

l r 1 t 1 a L ~ i a f 1 a n r

830

c

840 850

c

c c

c c

c

c

Temdlnlng

f'la!

FlNP=l iiNALS(li=iIhP bacr t o f a t h e r iINP=IlNP+FlNALS(PATER) i n c r e a s e f a t h e r ' s f i n a l coint b y t h a t o f son FINALS(PATER)=fINP r h l i b r a n c h has been Used BRAP=BRANCH(PATER)-1 QRANCHlPATER)=QRAP branches left ? G O T O 830 ii(BRAP.GT.0) n o more b r d n i h e s , s o b a c k t o f a : h e r PATER=iATHERlPAlER) h a o p e n s ~ n f r e a u e n t l ? b L f 0 ~ E i a 1 1o n l y o n c e f o r e a c t I - v a l u e GOTO 8 2 D v e r t e x 1 I S PO'. f i n a l a n d t h e r e f o r e h a s s o n s ( b r a n c h e s ) PATERil ilNALSli)=C CCNT:NUE n u m b e r o t f , n a l v e r t i c e s I S number C f f i n a l V e r f i i e P r e a c h a b l e through v e r t e x 1 ilNMAX~flNALS11) (+l I f verter 1 ,s I t s e l f flna!) iFlTREEll).LE.l! iINMAXiilNMAXt1

~(L,iINMAX)=LENG*SIN(XJ'FLCATlI-l)) BRANCHll)=TREE(l) start coord1ndtes XJiX0 YJXYD first u e r t e r Xl(:>=XJ I l ( l ) i Y , move and mark *+*+A+* p o s s i b l y 1 n s T d l l l t 1 o n dependant t i * + * + * C A L L SYMBOL(XJ,IJ,SIIE,l,0.,-1! I F l Y . L T . 2 ) G O T O 880 i n l t i a t e eOge d l r e c t l o n ( f i r r l edge I 1 1 f t l e lower than hoi1rontal) L o b i 1 - f 1N A L I ? > D O 870 I = ? , N a n g l e o f a n y e d g e I S t h e mean v a l u e b e t w e e n t h e f l r r t I L o Y + l ) and l a s t v a l u e s ( L O U t Z t i I N A L S l I 1 - 1 1 f i n a l edge r e a c h a b l e t h r o u g h t h i s edge LNDEX=LCY+FINALS(I)

a n g l e modulo ZIP' 1ilINDEX.LE.O) INDEX=lNDEX+FINZ XJ=XJ+XIINDEX,FINllAX) YJ'Y1*I(INDEX,FINMAX) c o o r d ~ n a t e sverter I

c

XIlI)=XJ Yl(i!ilJ

c c c

c 860 C

c

C

c

170

880

.+.+.+.

draw a l a m a r k tit**+* pr,lllb,y 1 n r t a 1 1 a t . o n dependant :ALL SYUBOL(XJ,IJ,SIIE,l,O.,-2) TIiTREEII) a u a ~ i a b l eb l a n c h e s ERAHCH(1)iTI IilT1.GT.O) G O T O 870 final brdnch,Increare iawei bound c f mean value LcY=LOY+z PATER=FATHERli1 o r d l odck t o fatbel XJ=XIlPATER) Y.=Yl(PATER) .+t+*+t p ~ i i l b l ?1 n r t a l l a t 1 o n dependsrt *+*+e+* :ALL P L G T ( X J . Y J . 2 ) b i a n c h "Sed ERAPiQRANCHlPATERI-1 QRANCH(PATER>=BRAP iFlBRAP.GT.3) GOTO 870 no more b r d n C h e i , s o b a c k t o f a t h e r ( I f e x i s t e n t ) l i ( P A T E R . E Q . 1 1 GOT0 870 PATER=iATHERlPATER1 happers ~ n f r e q u e n ; l y but O ~ e r s l lo n l y Once f o i e a c h GOTC 860 CONTINUE next ~ t a r icoora>nates 10iYOtlD l F l A B S I ~ 0 - Y 1 D E ~ . L E . ~ I D E - A B S l l D ~ + . G1 O~ T C 8 9 0 Y D i - I D '0=1O+Yc

I-value

X0=xo'xD 890

RETURN

END

Figure 4. Computer program.

the program will run into a contradiction first between RLOC and LBC. This could be excluded by additional tests for all

n

L

3 6 11 23 47 106 235 551 1301 3159 7741 19320 48629 123867 317955 823065

The corresponding diagrams of the (rooted) tree graphs may be obtained from the author on request. CDCCYBER 76.

components of all N tuples, but it proved less expensive to let the program generate all tuples and then ignore the contradicted ones. Even if many failures of this type occur before generating the next valid N tuple, the expense remains of the order N , because for every such failure the number of components generated and then canceled by searching for the innermost nonexhausted loop is less than or equal to the length of comparative branch of the current RLOC. Now this RLOC is no longer valid for the next trial (it has been satisfied by a smaller value) so that any new failure must originate from an RLOC whose comparative branch must be part of the rest of the newly generated N tuple. But this implies that the total number of backsteps cannot exceeed N . Therefore special provisions for this remaining failure case have been omitted. The logical function TREEKO (Figure 2) tests whether a given N tuple representing a rooted tree is the representative of the tree or the representative of the underlying tree. This is the case if no other vertex chosen as the root delivers a lexicographically larger N tuple. The comparison is only carried out for the nontrival cases where the new root has as many adjacent edges as the original root. To get the new N tuple, TREEKO makes extensive use of property ii to transfer the root-property edge by edge on the way from the old to the new root. The subroutine PLTREE (Figure 3) transforms the representation of a (rooted) tree as an N tuple to a graphic representation. It draws the Ariadne thread described by the N tuple and spreads final edges @e., edges which lead to vertices with no other edges) equally to all directions. Other edges take the mean value of the largest and smallest angles of final edges reachable through the edge concerned. The vertices were represented by a small octagon (SYMBOL, special symbol no. 1).

The computer program is listed in Figure 4. At the end of this section we want to say a few words about the efficiency of the program. The maximal expense to find out the next N tuple representing a rooted tree is of order N , because there are no nested DO loops and backward jumps occur perhaps infrequently but overall at most once for each step of the DO loop. The expense in PLTREE to produce graphic output from an N tuple is almost proportional to N for the same reason, and the same argument bounds the CPU time for a call to

J . Chem. Inf. Comput. Sci., Vol. 21, No. 2, 1981 91

COMPUTER ENUMERATION AND GENERATION OF TREES Table 11. Number of Identity Trees and Homeographically Irreducible Trees with N Vertices

no. of no. of

no. of vertices ( N )

identity trees“

homeographically irreducible tree%

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

1 0 0 0 0 0 1 1 3 6 15 29 67 139 310 667 1480 3 244 7241 16104

1 1 0 1 1 2 2 4 5 10 14 26 42 78 132 249 445 842 1561 2988

The corresponding diagrams may b e obtained on request.

TREEKO to N X N . Therefore the average expense to find an N tuple representing a tree is of an order less than or equal to N X N X N because there are at most N rooted trees above, each of which is produced in order N , tested by the main program in order N and by TREEKO within order N X N (only if necessary). Actual runs of the program have shown that these extensive tests are seldom enough to allow estimation of the expense for the generation of trees proportional to the total number of vertices in all generated trees (physical plotter actions as well need real time proportional to the total number of vertices). The efficiency of the method is best illustrated by test runs with dummy PLOT and SYMBOL which save about 90% of CPU time. RESULTS In Table I we give the number of all trees and rooted trees for N = 1,2, . . ., 20 vertices and the CPU time (in s) needed for the calculation on the computing machine CDC-CYBER

76. These values were compared with the numbers given by Harary in his classical book “Graph Theory”.30 In a separate table (Table 11) we give the number of identity trees and homeographically irreducible trees with N vertices. Values for trees with N = 1,2, , . ., 12 vertices in Table I1 were checked against the numbers given by Harary and The diagrams of the trees, rooted trees, identity trees, and homeographically irreducible trees may be obtained on request from the Computer Centre of the University of Diisseldorf. If we wish to enumerate only trees corresponding to the carbon skeletons of alkanes (Le,, alkane tree graphs) or the rooted trees corresponding to the structures which could be generated from alkanes on substitution (i.e., alkane rooted tree graphs), the condition of the maximal valency of vertices must be set to four. In Table I11 we give the number of alkanes CNH2N+2, the number of alkane rooted trees with the primary, secondary, tertiary, and quaternary roots, respectively, the number of substituted alkanes, CNH2N+1X,and the total number of “rooted” alkanes, respectively. These results were checked against the numbers given by Read.13 The alkane diagrams may be also obtained on request from the Computer Centre of the University of Dusseldorf. However, as an example we present in Table IV the output listing containing graphs of all alkane trees with 11 vertices. We are quite aware that it is very difficult to augment the already rich literature on the enumeration of trees and alk a n e ~ . ’ , ~ ’However, .~~ we have presented here, and well documented, a method with several excellent features. (a) It enumerates trees and alkanes with high accuracy as any existing, cumbersome or elegant, method in literature. (b) It is easily adopted for the computer calculations. (c) It possesses an important advantage in comparison with other methods: it produces directly graphs of trees and alkanes. To our knowledge this is a unique method in this respect. (d) It could be easily adopted for the classroom demonstration of the chemical enumeration problem on the computer. Therefore, we believe that the presented method should be of general interest because it offers a very efficient approach for handling a class of important chemical (and graph-theoretical) structures by computer.

ACKNOWLEDGMENT Discussions with Dr. A. J. Harget (Birmingham) and Dr. T. givkovie (Zagreb) are gratefully acknowledged. We thank

Table 111. Number of Alkanes and Substituted Alkanes w i t h N Atoms

no. of atoms (N)

no. of alkanes“

no. of alkanes with a primary root

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

1 1 1 2 3 5 9 18 35 75 159 355 802 1858 4347 10359 24894 60523 148284 366319

1 1 1 2 4 8 17 39 89 211 507 1238 3057 7639 19241 48865 124906 321198 830219 2156010

no. of alkanes with a secondary root

no. of alkanes with a tertiary root

no. of alkanes with quaternary root

no. of substituted alkanes

no. of “rooted” alkane trees

0 0 1 1 3 6 15 33 82 194 482 1188 2988 7528 19181 49060 126369 326863 849650 2216862

0 0 0 1 1 3 7 17 40 102 249 631 1594 4074 10443 26981 69923 182158 476141 1249237

0 0 0 0 1 1 3 7 18 42 109 269 691 1759 4542 11733 30559 79743 209136 549959

1 1 2 4 8 17 39 89 211 507 1238 3057 7639 19241 48865 124906 321198 830219 2156010 5622109

1 1 2 4 9 18 42 96 229 549 1347 3326 8330 21000 53407 136639 351757 909962 2365146 6172068

The corresponding diagrams may be obtained from the author on request.

98 J . Chem. InJ Comput. Sci., Vovol. 21, No. 2, 1981

KNOPET

Table IV. Alkane Graphs with 11 Vertices (Copy of the Computer Output)

1 t t

?

"t: L + L-c

t

L I

jx

'I

AL.

J . Chem. In& Comput. Sci. 1981, 21, 99-101

the reviewers for their helpful comments which led to improvement of the paper. REFERENCES AND NOTES (1) F. Harary, “Graph Theory”, Addison-Wesley, Reading, MA, 1972, p I)*

JL.

(2) (3) (4) (5) (6j (7) (8) (9) (10) (11) (12) (13)

Reference 1, p 187. G.Kirchhoff, Ann. Phvs. Chem., 72,497 (1847). A. Cayley, Philos. Mag., 13, 1-2 (1857). C. Jordan. J . Reine Anpew Math.. 70. 185 (1869). . , A. Cayley; Philos. Mag., 67, 444 (1874). A. Cayley, Ber., 8, 1056 (1875). F. Herrmann, Ber., 13, 792 (1880). S.M. Losanitsch, Ber., 30, 1917 (1898). F. Herrmann, Ber., 30, 2423 (1898). S.M. Losanitsch, Ber., 30, 3059 (1898). F. Herrmann, Ber., 31, 91 (1898). R. C. Read, in “Chemical Applications of Graph Theory”, A. T. Balaban, Ed., Academic, London, 1916, p 25. (14) G.Wlya, Acta Math., 68, 145 (1937). (15) R. Otter, Ann. Math., 49, 583 (1948). (16) L. Weinberg, Proc. IRE, 46, 1954 (1958).

(28) (29) . , (30) (31) (32)

99

R. C. Read, in “Graph Theory and Applications”, Y. Alavi, D. R. Lick, and A. T. White, Eds., “Lecture Notes in Mathematics”, No. 303, Springer-Verlag, Berlin, 1972, p 243. H. Schiff, Ber., 8, 1542 (1815). F. Tiemann, Ber., 26, 1595 (1893). H. R. Henze and C. M. Blair, J . Am. Chem. SOC.,53, 3042, 3077 ( 1932). C. M. Blair and H. R. Henze, J . Am. Chem. SOC.,54, 1098, 1538 ( 1932). D. Perry, J. Am. Chem. SOC.,54, 2918 (1932). D. D. Coffman and C. M. Blair, J . Am. Chem. Soc., 55,252 (1933). H. R. Heme and C. M. Blair, J . Am. Chem. SOC.,55, 680 (1933). H. R. Henze and C. M. Blair, J . Am. Chem. SOC.,56, 157 (1934). C. C. Davis, K.Cross, and M. Ebel, J. Chem. Educ., 48,675 (1971). See, for example, various articles in “Graph Theory and Computing”, R. C. Read. Ed.. Academic. New York. 1972. L. M. Masinter,”. S.Sridharan, J. Lederberg, and D. H. Smith, J. Am. Chem. Soc., 96, 7702 (1974). J. B. Hendrikson, J. Am. Chem. Soc.,93,6847 (1971); 93,6854 (1971); 97, 5763 (1975); 97, 5784 (1975). Reference 1, p 232. F. Harary and G.Prins, Acta Math., 101, 141 (1959). D. H. Rouvray, Chem. SOC.Reu., 3, 355 (1974).

ACS Committee on Nomenclature: Annual Report for 1980 KURT L. LOENING Chemical Abstracts Service, Columbus, Ohio 43210 Received February 17, 1981 Nomenclature committees, both national and international,were very active in 1980, resulting in substantial progress in many different fields. A summary of the more important meetings and accomplishments follows. The ACS Committee on Nomenclature held its annual meeting at CAS in November.’ Progress of the work of the divisional committees and international commissions was reviewed. In addition, ways of working more closely with ACS Divisions, journal editors and authors as well as general means of promoting good nomenclature were explored. The chairman of the committee addressed the editors of ACS journals on the subject of chemical nomenclature at their conference in Columbus; this should lead to closer cooperation between the two groups. The feasibility study on compiling an authoritative chemical dictionary was extended, while the project on visual aids for chemical nomenclature was dropped. Contact was established between the committee and its recently established British equivalent. The subcommittee on chemical pronunciation continues to be active. The IUPAC Interdivisional Committee on Nomenclature and Symbols (IDCNS) functioned effectively this year. It held its annual meeting in Cambridge in September. In addition to the IUPAC publications listed in the Appendix, specific documents in process and thus not yet recorded in this Appendix deal with the following topics: straightforward transformations, transport phenomena, biochemical equilibrium data, chemical kinetics, physicochemical quantities and units in clinical chemistry, calorimetric measurements on cellular systems, and various classes of carbohydrates. The IUPAC Inorganic Nomenclature Commission met in September in Cambridge. Topics under discussion included neutral molecules and compounds, ions and radicals, rings and chains, polyhedral clusters, isopoly- and heteropolyanions, oxo acids, inorganic polymers, and stereochemical nomenclature. These topics were discussed in the context of providing a ‘Abbreviations used, not identified in the text, are ACS, American Chemical Society; CAS, Chemical Abstracts Service; IUPAC, International Union of Pure and Applied Chemistry; JCBN, Joint Commission on Biochemical Nomenclature; NC-IUB, Nomenclature Committec of International Union of Biochemistry. 0095-2338/81/1621-0099$01.25/0

revision of the 1970 edition of the Red Book. Revised recommendations on the nomenclature of nitrogen hydrides and isotopically modified compounds are expected to be issued next year. The IUPAC Organic Nomenclature Commission met in September in Cambridge. The commission continued its study of the reorganization and revision of the present rules according to a more logical arrangement (Section R) and of a more drastic long-range approach (Section G). In connection with Section G, several specific projects are under way: nodal nomenclature, radial nomenclature, “inorganic” ring nomenclature, nomenclature for delocalized ions and radicals, nomenclature of oxo acids, and general priority rules for numbering. The following topics are so well advanced that publications should be forthcoming within a year or two: lambda convention, classical ions and radicals, cyclophanes, and a revision of the Section E rules on stereochemistry. The 1979 provisional recommendations for the revision of the Hantzsch-Widman nomenclature system for naming heteromonocycles have generated so many diverse opinions and comments that the commission requires additional time and study before issuing definitive recommendations. The IUPAC Macromolecular Nomenclature Commission met in September in Naples. The commission is continuing its work on (a) nomenclature and symbolism of copolymers, (b) subsidiary definitions of terms relating to polymers, (c) definitions for physical properties of polymers, (d) substitutive nomenclature for reacted polymers, (e) nomenclature of inorganic polymers, (f) classification and family names of polymers, and (g) interpenetrating polymer networks. Of these items (a), (e), and (f) are at the most advanced stage with recommendations expected to be issued in 1981 or 1982. A definitive version of the recommendations dealing with stereochemical definitions and notations relating to polymers will be issued in 1981. 0 1981 American Chemical Society