Notational Systems for Structural Formulas - C&EN Global Enterprise

Nov 5, 2010 - This is a serious disadvantage as speech plays an important role in human communication. In the days before structural formulas came int...
of the executives. It also means that there is very little "old school tie" involved in the choosing of the chemical industries' executives. In a few instances the alumni of one school dominate the management of a company, but in general there is seldom more than one representative of any one school in a company's management.

The top five universities in the production of chemical executives were all traditional eastern schools. Harvard topped the list with 34, aided by nine graduates from its famous business school. Yale came next with 23, then MIT, 20; Princeton, 18; and Columbia, 17. All in all the schools of the Ivy League along with MIT, Williams, and Amherst placed 131 men on this roster of management. By comparison the much larger Big Ten Schools plus Iowa State and Chicago contributed only 44.

The highest individual representation from outside the East came from the Mid-West's University of Michigan with 11. The West's biggest producer was the University of California, 9, and the South's best executive trainer was the University of Virginia, 5.

The technical schools, with a big boost from MIT, taught 44 of these executives. The liberal arts colleges, which have been so productive of top-notch scientists, apparently do pretty well on executives, too. They have 61 representatives on the list.

Foreign schools were still strongly represented with 32 entries. Almost half of these (14) were from universities in the British Isles, although the largest single credit goes to the University of Berlin with four.

What Does It Mean?

Admittedly, these data were not chosen by accepted sampling procedures. However, there were enough random factors present to make them at least generally indicative. They show that management in the chemical industries has become a highly specialized skill, in which some technical experience is becoming increasingly important.
Financial connections, boldness, alertness, or business shrewdness are still valuable assets to the policy maker but they are no longer adequate in themselves. The technically based industries are requiring increasing amounts of technical competence in their top men, and the complex and extensive organizations necessary in modern mass production require men who have mastered the special techniques of industrial administration. It is not correct to say that the technical man is taking over the chemical industries because the technically trained man is no longer a scientist or an engineer when he attains the top levels of management. However, the man who attains that level today must almost invariably have not only an appreciation but an understanding of the technical aspects of his company's operations, whether he obtains that understanding through formal training and practical experience or through acute observation.

VOLUME 30, NO. 5

Notational Systems for Structural Formulas

MADELINE M. BERRY AND JAMES W. PERRY, Massachusetts Institute of Technology, Cambridge, Mass.

The IUPAC picks the Dyson system as the best available answer to needs, but several others have valuable ideas which may be incorporated before making the final decision

For many purposes, the structures of molecules can be conveniently represented by the two-dimensional projections commonly known as structural formulas. Their pictographic character, so helpful in thinking through chemical problems, has prevented extensive use for comprehensive indexes. In the first place, structural formulas are difficult to print. Furthermore, devising and applying rules for arranging pictographs into ordered arrays is a difficult task. Structural pictographs are also unsuited for use with automatic equipment such as IBM punched-card machines to search chemical literature. These limitations are regrettable, as the theory of molecular structure forms the framework of modern chemistry.

Except for the simplest types, structural pictographs cannot be pronounced. They are, in the literal sense, unspeakable. This is a serious disadvantage as speech plays an important role in human communication.

In the days before structural formulas came into general use, chemists started giving compounds trivial names, such as acetic acid or urea, usually suggested by the origin of the compound or some typical property. As more and more new compounds of greater and greater complexity were synthesized, chemists began to name them by indicating their relationship to simpler compounds and to component atoms and groups. Starting with the older trivial names as the basis, chemical nomenclature developed as a specialized branch of language with rules resembling those encountered in etymology and grammar, with exceptions to the rules (some exceptions being accepted as good usage, others frowned upon as undesirable chemical slang) and with variations analogous to dialects in use by different editorial centers such as Chemical Abstracts, Chemisches Zentralblatt, and Beilstein. A completely logical system of nomenclature has never been developed and, if it were, usage would erode its contours.
FEBRUARY 4, 1952

The important role of speech in human communication makes it certain that chemists will always need names for compounds. Nevertheless, dissatisfaction with previously developed nomenclature as a means for expressing molecular structures has developed, and various chemists during recent years have worked out notational systems which express molecular pictographs as a linear sequence of symbols, most of which (if indeed not all) are found on an ordinary typewriter keyboard. (The coauthor, Dr. Perry, notes that there is also no code existing which can utilize machine methods to record or correlate reaction rates, reaction conditions, and important physical or chemical properties. He has under preparation a new system to rectify this deficiency.)

Since a completely systematic notation for representing structural pictographs should prove a valuable tool, a commission of the International Union of Pure and Applied Chemistry was appointed late in 1946 to study the possibilities. A milestone in the work of this commission was its meeting in Amsterdam in the summer of 1949, at which time the following desiderata were set up for evaluating notation systems.

1. Simplicity of usage
2. Ease of printing and typewriting
3. Conciseness
4. Recognizability
5. Ability to generate a unique organic chemical nomenclature
6. Compatibility with accepted practices of inorganic chemical notation
7. Uniqueness
8. Generation of an unambiguous and useful enumeration pattern
9. Ease of manipulation by machine methods, e.g., punched cards
10. Exhibition of associations (descriptiveness)
11. Ability to deal with partial indeterminants

At its Amsterdam meeting, the commission decided to invite the inventors of notation systems to submit them for consideration (3). Several systems, nine in all, came to the commission's attention. Following submission to the commission of their systems, the inventors were requested to apply them to a list of 700 compounds selected from every 100th page of Beilstein, and also to supplementary lists which included compounds containing radioactive isotopes, organosilicon and organophosphorus compounds, and others presenting difficult notational problems. During the spring of 1951, the writers prepared a book-length report in which the Chemical Abstracts name of each compound was followed by the notation provided by each of the nine systems under consideration (2). Last August at MIT, members of the International Union's commission devoted a week to a series of informal and unofficial conferences in which members of the ACS Committee on Scientific Aids to Literature Searching also participated. The desiderata were discussed and the nine notation systems were reviewed (9,10).

It was observed that three (1,12,13) of the nine systems were not designed to specify structural formulas in full detail, but were intended for correlating and searching procedures using mechanical aids or for decimal classification. To this end these three systems designate constituent groups, both functional and nonfunctional, by appropriate code symbols built up from numbers and letters. Without questioning their effectiveness for special purposes (NRC code for use with standard IBM machines; Wiselogle for correlation using punched cards; van Weerden for decimal classification) the International Union's commission, at its meeting last September in New York, decided that these three systems were unsuitable for an internationally useful notation as they failed to provide a complete specification of molecular structural formulas. At the same meeting, it was decided that, of the other six systems, the Dyson notation best meets the desiderata.
As a consequence, this system was accepted as the provisional international notational system. It was further decided that particular attention should be given to the possibilities of including in the Dyson system desirable features of all other notations (9,10). The American role in this project is to study the various systems from the viewpoint of coding and decoding problems, of structure searching and related problems. A committee of the National Research Council, with Walter Kirner as chairman, is initiating a development program. In this program, the Dyson system will be taken as a basis, into which will be incorporated the best ideas of all the systems submitted in order to develop a final system more satisfactory than any of them. What is involved becomes clearer on making comparisons. Differences exist between the systems with regard to the symbols, and the definitions of symbols, used to represent atoms and groups. Different procedures are also followed in showing how atoms and groups are linked together.

One approach is to build up a structure, or some part of a structure, by proceeding from one atom or group of atoms to the next. Another is to cite first the framework of the molecule and then indicate other structural features as additions and substitutions. It is particularly instructive to consider together the systems developed by Dyson (6), Gruber (8), Silk (11), and Wiswesser (14). For certain simple aliphatic structures, these four systems are in close agreement as to procedure. Consider, for example, 1-dodecanol (CH3[CH2]11OH), sometimes called lauryl alcohol:

Dyson: C12.Q
Gruber: C12.OH
Silk: QC11
Wiswesser: Q12
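
The four notations for 1-dodecanol can be set side by side in a short Python sketch. The article gives only the twelve-carbon case; extending the same pattern to any straight-chain primary alcohol is an assumption made here for illustration, and the function name `straight_chain_alcohol_notations` is hypothetical, not part of any of the four systems.

```python
def straight_chain_alcohol_notations(n_carbons: int) -> dict[str, str]:
    """Notations for a straight-chain primary alcohol with n_carbons carbons.

    Only n_carbons = 12 (1-dodecanol) is taken from the article; applying
    the same pattern to other chain lengths is an assumption.
    """
    if n_carbons < 2:
        raise ValueError("this sketch expects a chain of at least two carbons")
    return {
        # Dyson: carbon chain first, then the special hydroxyl symbol Q.
        "Dyson": f"C{n_carbons}.Q",
        # Gruber: carbon chain first, hydroxyl written as the familiar OH.
        "Gruber": f"C{n_carbons}.OH",
        # Silk: Q stands for the CH2OH group, so one carbon fewer remains
        # in the rest of the chain.
        "Silk": f"QC{n_carbons - 1}",
        # Wiswesser: hydroxyl symbol Q first, then the chain length alone.
        "Wiswesser": f"Q{n_carbons}",
    }


if __name__ == "__main__":
    for system, notation in straight_chain_alcohol_notations(12).items():
        print(f"{system}: {notation}")
```

All four strings carry the same two facts, chain length and a terminal hydroxyl group; the systems differ only in symbol choice and in whether the chain or the functional group is cited first.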

In treating condensed ring systems, Dyson (see Fig. 1) specifies that the symbol B for benzenoid or A for cycloaliphatic be first cited to denote whether the ring system is predominantly benzenoid or cycloaliphatic. Next, attention is directed to the individual rings. Full-size numerals are used to designate the number of atoms in a ring and, if two or more rings of the same number of atoms are present, a subscript is used to indicate the multiplicity. Thus Dyson uses B6₆ to show that our example is benzenoid in nature and contains six rings each built up from six atoms. To indicate how the rings are joined together, Dyson numbers the atoms in the rings and then cites the numbers assigned to atoms at ring interfaces. In order to establish a uniquely defined enumeration, Dyson numbers the largest ring first and then numbers rings fused thereto, proceeding from the atom having the lowest previously assigned number. The point at which to start numbering the largest ring is chosen so that the smallest possible numerical value shall characterize the first ring locant which would be changed if another starting point were chosen. This

In designating our example compound, Gruber (see Fig. 2) first indicates the central linear system of three benzene rings as III[6]. Ring positions in this sequence are identified with numerals assigned in succession around the outside of the frame. Then fusion of the other benzene rings, symbolized by [6], to this linear skeleton is designated by single numbers equivalent to Dyson's successive-integer locants. Gruber uses a rule similar to Dyson's for numbering subsequent rings, starting at the lowest carbon in the previous ring. Thereafter the substituent groups and

Three of the systems use the numeral 12 either alone (Wiswesser) or in conjunction with the carbon atom symbol C (Gruber, Dyson). The hydroxyl group is indicated by the familiar OH (Gruber) or by a special symbol Q (Wiswesser, Dyson). The same symbol is used by Silk to indicate the CH2OH group, while the balance of the carbon chain is indicated by C11. All four systems assume that the carbon atoms are in a straight chain and that all carbon valencies not otherwise engaged are saturated with hydrogen. It should be noted that two systems (Dyson, Gruber) cite the carbon chain first, while Silk and Wiswesser start with the hydroxyl group. A more complicated aliphatic molecule,

Figure 1. Dyson: B6₆,1,3,5,8,19.ZN,7,10.Q,25.S3,21

rule for starting the enumeration so as to minimize the value of the first ring locant which would be changed by selecting another starting point is also applied to ring systems composed of rings of equal size. In the general