J. Chem. Inf. Comput. Sci. 1985, 25, 413-415


On Randić's Molecular Identification Numbers

K. SZYMANSKI, W. R. MÜLLER, J. V. KNOP,* and N. TRINAJSTIĆ†

Computing Centre, The University of Düsseldorf, 4000 Düsseldorf, Federal Republic of Germany

Received February 8, 1985

The search for counterexamples to Randić's molecular ID numbers in the field of alkane trees up to 20 atoms produced 124 duplicates and 1 triplicate. Thus, it is shown that the ID numbers, although highly discriminating indices, are not unique.

Randić has recently introduced in this journal a new topological (or, more correctly, graph theoretical) index, named the molecular identification (ID) number, in order to identify graphs by a real number.¹ He defined the ID number in the following way. Let G be a graph with the vertex set V and the edge set E. If $v_1, v_2 \in V$, we denote by $e = (v_1, v_2) \in E$ the edge connecting $v_1$ to $v_2$. By deg v we denote the degree of a vertex v in G, i.e., the number of edges incident with v.

Randić defined the mapping from the edge set to the real numbers

$$ g: E \to \mathbb{R} \tag{1} $$

by

$$ g(e) = g(v_1, v_2) = \frac{1}{(\deg v_1 \times \deg v_2)^{1/2}} \tag{2} $$
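As a small illustration of equation 2, the following Python sketch tabulates the edge weights that can occur in an alkane tree, where every carbon atom has degree 1 to 4. The function name `edge_weight` is our own choice and is not taken from the paper.

```python
import math

def edge_weight(deg_v1: int, deg_v2: int) -> float:
    """Edge weight of equation 2: 1 / sqrt(deg v1 * deg v2)."""
    return 1.0 / math.sqrt(deg_v1 * deg_v2)

# Carbon atoms in an alkane tree have degree 1 to 4, so only the
# degree products 1, 2, 3, 4, 6, 8, 9, 12, and 16 can occur.
if __name__ == "__main__":
    for a in range(1, 5):
        for b in range(a, 5):
            print(f"deg {a}, deg {b}: weight = {edge_weight(a, b):.6f}")
```

Each of these weights is a rational multiple of 1, √2, √3, or √6, because the square-free part of every possible degree product is 1, 2, 3, or 6.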

Figure 1. Some alkane trees with exactly equal ID indices. TL, TM, and TR stand for left tree, middle tree, and right tree. The first set of numbers beneath each tree is the N-tuple representation of the tree (see reference 4).

†Permanent address: The Rugjer Bošković Institute, 41001 Zagreb, Croatia, Yugoslavia.

0095-2338/85/1625-0413$01.50/0 © 1985 American Chemical Society


Figure 2. Some alkane trees with nearly equal ID indices. TL, TM, and TR stand for left tree, middle tree, and right tree. The first set of numbers beneath each tree is the N-tuple representation of the tree (see reference 4).

Table I. Distribution of the Duplicate ID Numbers

no. of carbon atoms   no. of alkane isomers   no. of duplicate ID numbers
15                    4 347                   1
16                    10 359                  1
17                    24 894                  3
18                    60 523                  8 (+1 triple)
19                    148 284                 23
20                    366 319                 88

Let $p = e_1, \ldots, e_m$ ($m > 0$) be a path in G; then, the mapping g can be extended to the set of paths in G by

$$ g^*(p) = \prod_{i=1}^{m} g(e_i) \tag{3} $$

The ID number of G is then finally defined as

$$ \mathrm{ID}(G) = N + \sum_{p} g^*(p) \tag{4} $$

where N is the number of vertices in G and the summation is taken over all different paths in G.

The ID number appears to be an attractive topological index that is relatively easy to derive and has structural significance. The question "Are the ID numbers unique?", however, was not answered by Randić. He examined over 400 graphs and found no pairs of graphs with the same ID number. In this paper we try to answer this question for alkane trees. Alkane trees are trees representing alkanes.² A tree is a connected acyclic graph. A graph is acyclic if it has no cycles.³ Since we are in a position to generate all molecular trees (and subsets of trees) up to a given number of vertices,⁴⁻⁶ we computed the ID numbers for all alkane trees up to 20 vertices

(atoms). The following result was obtained: In the field of 618 050 alkanes (all alkanes up to 20 carbon atoms) there are 1975 pairs and 10 triples of nonisomorphic structures having the same ID numbers. This computation was done in single precision (with a word length of 48 bits). In order to check this result and to find the most accurate values of the ID numbers for this class of alkanes, we used our own arithmetic (computing in terms of 1, $2^{1/2}$, and $3^{1/2}$ over the field of rational numbers),⁷ which uses only integers in computer operations. With this exact arithmetic we found only 124 pairs and 1 triple of nonisomorphic alkanes having exactly the same ID number.
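The exact arithmetic described above can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the authors' program: the helper names (`add`, `mul`, `inv_sqrt`, `id_number`, `to_float`) are hypothetical, each number is stored as four rational coefficients of 1, √2, √3, and √6 (the product √2·√3 = √6 is needed to close the set under multiplication), and the paths of equation 4 are enumerated by brute force rather than by any optimized method.

```python
from fractions import Fraction
import math

# A number a + b*sqrt(2) + c*sqrt(3) + d*sqrt(6) is stored as (a, b, c, d).
ONE = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))
ZERO = (Fraction(0),) * 4

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def mul(x, y):
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),         # sqrt(2) part
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),         # sqrt(3) part
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,               # sqrt(6) part
    )

def inv_sqrt(n):
    """Exact 1/sqrt(n) for n = s*s*f with square-free part f in {1, 2, 3, 6}."""
    s, f = 1, n
    for p in (2, 3):
        while f % (p * p) == 0:
            f //= p * p
            s *= p
    coeffs = [Fraction(0)] * 4
    coeffs[{1: 0, 2: 1, 3: 2, 6: 3}[f]] = Fraction(1, s * f)  # sqrt(f) / (s*f)
    return tuple(coeffs)

def id_number(edges):
    """Exact ID number (equation 4) of a tree with at least one edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    path_sum = ZERO
    for start in adj:
        # Walk outward from start; in a tree every other vertex is
        # reached by exactly one path, whose weight is the running product.
        stack = [(start, None, ONE)]
        while stack:
            node, parent, prod = stack.pop()
            for nxt in adj[node]:
                if nxt != parent:
                    w = inv_sqrt(len(adj[node]) * len(adj[nxt]))
                    p = mul(prod, w)
                    path_sum = add(path_sum, p)
                    stack.append((nxt, node, p))
    # Each path was counted once from each of its two ends.
    halved = tuple(c / 2 for c in path_sum)
    n_vertices = (Fraction(len(adj)), Fraction(0), Fraction(0), Fraction(0))
    return add(n_vertices, halved)

def to_float(x):
    a, b, c, d = x
    return (float(a) + float(b) * math.sqrt(2)
            + float(c) * math.sqrt(3) + float(d) * math.sqrt(6))

propane = [(0, 1), (1, 2)]
isobutane = [(0, 1), (0, 2), (0, 3)]
print(id_number(propane), to_float(id_number(propane)))
print(id_number(isobutane), to_float(id_number(isobutane)))
```

For propane this gives 7/2 + √2 and for isobutane 5 + √3. Because 1, √2, √3, and √6 are linearly independent over the rationals, two ID numbers are equal exactly when their coefficient tuples are equal, which is a test that rounded floating-point values cannot provide.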

Some of these are shown in Figure 1. In Table I we give the distribution of duplicates in each set of isomeric alkanes.

We searched the class of alkanes for counterexamples because the special structure of trees allows a very efficient method of computing the ID numbers (20 000 computer operations for a single alkane).

The obtained results may be summarized as follows:
(1) The ID numbers are highly discriminating indices, but they are not unique.
(2) There exist structures with very small differences in their ID numbers (see Figure 2).
(3) For complicated structures (e.g., polyhexes) the ID numbers are not easily computed.⁸

In concluding this paper we point out that this work once again demonstrates the usefulness of developing generating algorithms that produce all members of a given family of (chemical) graphs and thus make it easier to check many conjectures proposed in the field of (chemical) graph theory.

ACKNOWLEDGMENT

We thank Professor Milan Randić (Ames) for correspondence on ID numbers and for allowing us to examine his paper prior to publication. We are also thankful to W. J. Wiswesser (Frederick, MD) and to the referees for their comments. This work was supported in part by the German-Yugoslav scientific cooperation program. The financial support by the Internationales Büro, Kernforschungsanlage Jülich, and by the Republic Council for Science of Croatia (SIZ II) is gratefully acknowledged.

REFERENCES AND NOTES

(1) Randić, M. J. Chem. Inf. Comput. Sci. 1984, 24, 164.
(2) Trinajstić, N. "Chemical Graph Theory"; CRC Press: Boca Raton, FL, 1983.
(3) Harary, F. "Graph Theory"; Addison-Wesley: Reading, MA, 1972; third printing, Chapter 4.
(4) Knop, J. V.; Müller, W. R.; Jeričević, Ž.; Trinajstić, N. J. Chem. Inf. Comput. Sci. 1981, 21, 91.
(5) Trinajstić, N.; Jeričević, Ž.; Knop, J. V.; Müller, W. R.; Szymanski, K. Pure Appl. Chem. 1983, 55, 370.
(6) Knop, J. V.; Müller, W. R.; Szymanski, K.; Trinajstić, N. "Computer Generation of Some Classes of Molecules"; SKTH Press: Zagreb, Yugoslavia, 1985.
(7) Lang, S. "Algebra"; Addison-Wesley: Reading, MA, 1969; p 175.
(8) We carried out some computations of ID numbers for polyhexes. In this case we found that a prohibitive amount of computer time is needed. For example, in order to calculate the ID number of a polyhex with 10 rings, we needed 240 million computer operations (300 s of CPU time on a 0.8 MIPS computer). Thus, it was impossible to carry out the same analysis for the class of polyhexes.

J. Chem. Inf. Comput. Sci. 1985, 25, 415-419

End-User Searching: The Amoco Experience†

ROBERT E. BUNTROCK* and ALDONA K. VALICENTI

Amoco Research Center, Amoco Corporation, Naperville, Illinois 60566

Received December 18, 1984

The history of training scientists and engineers to do their own online searching of technical information is reviewed briefly. Searching services have been provided at Amoco for decades, computer searching for 15 years, and online services since 1973. In late 1981, it became apparent that several Amoco scientists and engineers wanted to learn to do at least some of their own online searching. We viewed this phenomenon as an opportunity, not a threat, and have provided training and assistance to our active, growing group of end-users since early 1982. We have divided the training into two parts: classroom background for local information resource awareness and individualized, hands-on training in online searching techniques. The program seems to be quite successful in retention of trained end-users, and most, if not all, participants seem to have significantly enhanced awareness of searching technical information.

†Presented, in part, before the Division of Chemical Information, “Symposium on Training Chemists To Do Their Own Computer Searching”, 187th National Meeting of the American Chemical Society, St. Louis, MO, April 11, 1984; American Chemical Society: Washington, DC, 1984; CINF 32; and “Symposium on Direct End-User Access to Chemical Information”, 3rd Joint Meeting of the ACS Great Lakes Region and Central Region, Western Michigan University, Kalamazoo, MI, May 25, 1984; American Chemical Society: Washington, DC, 1984; paper 154.

0095-2338/85/1625-0415$01.50/0 © 1985 American Chemical Society

“Can I do my own computer searching?” “Should I do my own computer searching?” Chemical information specialists have occasionally been asked these questions for the last decade, and the frequency is increasing. There seems to be no single reason for the increase in curiosity by the chemist on this subject, but greater familiarity with computers and computing is probably most important. Using titles similar to “The Library at Your Fingertips”, authors in the recently erupted microcomputer trade press would have you believe that everyone will soon have their own microcomputer and will do all of their own searching. Although we do not believe this extreme scenario will happen, we do believe that end-user searching by at least some chemists and chemical engineers is here to stay.

We will present a brief history of end-user searching and will then describe our experiences in training end-users at Amoco Corp. and the Amoco Research Center.

First, some definitions. “End-user searching” has come to mean literature searching by the eventual recipient of the information, namely, the customer, client, the expert (or the would-be expert), or decision maker. This “end-user”, if an expert in a technical subject, is usually in research. Computer-based searching, specifically online, is implied.

Apparently, end-user searching appeared on the scene shortly after the beginning of readily available online searching in general, which, by our definition, is July 1973. We recall end-users in training classes for online systems or data bases as early as 1974-1975. By 1976, a paper had appeared titled “Nonmediated Use of Medline and Toxline by Pathologists and Pharmacists”,¹ and additional work was cited. In 1977, Charles Meadow addressed the topic,² and in 1979 he published the pivotal paper of the field.³ Meadow briefly reviewed the history of programming, and more specifically programmers, and then compared the two with online searching and search intermediaries on nine key issues. It is not our intent to summarize the paper extensively, but some quotes are very interesting. Programmers and intermediaries are described as having “...the keys to the kingdom ...” and are often found