

Index Chemicus Registry System: Pragmatic Approach to Substructure Chemical Retrieval*

EUGENE GARFIELD, GABRIELLE S. REVESZ, CHARLES E. GRANITO, HAYES A. DORR, MARIA M. CALDERON, and ANDREA WARNER

Institute for Scientific Information (ISI), Philadelphia, Pa. 19106

Received July 21, 1969

The Index Chemicus Registry System (ICRS), launched in 1968 with the support of a dozen industrial and government organizations, is now an operational monthly service. Subscribers receive magnetic tapes and printouts in which the weekly issues of Index Chemicus (IC) have been encoded in Wiswesser Line Notation (WLN). Over 13,000 compounds per month are provided in machine language. The canonical WLN is also provided in alphabetized printouts. Encoding of over 400,000 new chemical compounds from IC has already been completed, including all those reported in 1967, 1968, and 1969. Since the tapes also include titles and other bibliographic information, this paper describes the use of supporting software provided for SDI search systems that employ "word" and other search terms in addition to the WLN fragments. Use of the monthly and annual printouts is illustrated for those searches which do not require computer manipulation.

*Presented in part at the ACS MARM Meeting, Washington, D. C., February 14, 1969, and before the Division of Chemical Literature, 157th Meeting, ACS, Minneapolis, Minn., April 16, 1969.

The ICRS is designed to provide chemists with current and retrospective chemical information reported in the IC. As IC has been described elsewhere,¹ it is sufficient to state that IC provides detailed abstracts of journal articles which report new chemical compounds or new chemical reactions. The ICRS has not yet been described in the literature, and a brief description of its main characteristics is necessary to understand how to search for substructures, both currently and retrospectively.

ICRS consists essentially of four data files: WLN magnetic tapes, IC bibliographic tapes, WLN printouts, and IC weekly issues. The WLN magnetic tapes contain unique WLN structural descriptions of all new compounds reported in the Index Chemicus and are arranged in abstract number sequence. The WLN tapes also contain molecular formulas and IC registry numbers, which identify a specific line in the numbered IC abstract where a structural diagram and other information are given. The IC bibliographic tape provides, in machine language, most of the information provided in the printed IC: bibliographic citations; codes for new reactions and analytical instrumentation; and subject-index terms, which are assigned by chemists and include terms related to the properties, uses, and biological activity of the compounds. The WLN printout is alphabetized according to the WLN, so that similar compounds can be found by scanning adjacent entries. The corresponding article from the IC can be identified through the registry numbers associated with the notation.

Many searches can be done by simply referring to the monthly or annual ICRS printouts. The search for substituted adamantanes is shown in Figure 1. The WLN for adamantanes is L66 B6, etc. The ICRS printout identifies abstract number 101318, which contains several adamantanes, each of which is separately encoded. The IC abstract is shown in the lower portion of Figure 1. The printouts are also used to formulate machine-search questions.
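The printout search described above can be sketched in a few lines of Python. This is a minimal illustration, not the ICRS software. The record layout, the "abstract-line" form of the registry number, the use of plain character-code ordering in place of the printout's own alphabetization rules, and the names `WlnRecord`, `build_printout`, and `scan_prefix` are all assumptions made for the example. The notation strings are placeholders that only share the prefixes discussed in the text; they are not valid WLN.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class WlnRecord:
    wln: str          # notation string (placeholder values below, not real WLN)
    abstract_no: int  # IC abstract number
    line_no: int      # assumed: line within the abstract

    @property
    def registry_no(self) -> str:
        # Assumed format: abstract number, hyphen, line number.
        return f"{self.abstract_no}-{self.line_no}"


def build_printout(tape):
    """Re-sort tape records (abstract-number order) into notation order."""
    return sorted(tape, key=lambda r: r.wln)


def scan_prefix(printout, prefix):
    """Return the contiguous run of records whose notation starts with prefix."""
    keys = [r.wln for r in printout]
    start = bisect_left(keys, prefix)  # first entry not sorting before prefix
    hits = []
    for rec in printout[start:]:
        if not rec.wln.startswith(prefix):
            break  # sorted order: no later entry can match
        hits.append(rec)
    return hits


# Hypothetical tape, in abstract-number sequence.
tape = [
    WlnRecord("T6NJ placeholder-1", 101317, 1),
    WlnRecord("L66 B6 placeholder-2", 101318, 1),
    WlnRecord("L66 B6 placeholder-1", 101318, 2),
    WlnRecord("L6TJ placeholder-1", 101319, 1),
]

printout = build_printout(tape)
hits = scan_prefix(printout, "L66 B6")
for rec in hits:
    print(rec.wln, rec.registry_no)
```

Because the list is sorted by notation, all entries sharing a prefix sit next to one another, which is the property that makes manual scanning of the printed list practical; the binary search merely locates the start of that run.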

JOURNAL OF CHEMICAL DOCUMENTATION, VOL. 10, NO. 1, FEBRUARY 1970

[Figure 1. ICRS printout, headed INDEX CHEMICUS REGISTRY SYSTEM, with columns for WLN, abstract number, and compound number, listing the L66 B6 entries; the corresponding IC abstract appears in the lower portion.]