
Lana S. Rattet and J. H. Goldstein, Emory University, Atlanta, Georgia 30322


Computerizing Scientific Bibliographies

This note describes the application of a digital computer to the storage and selective retrieval of bibliographical material. We believe that the method employed will be of assistance to both students and researchers faced with the necessity of efficiently handling the ever-growing volume of scientific literature. The specific problem was that of assembling, abstracting, and cataloging a large number of literature references pertaining to 13C-H satellite NMR spectra. For this purpose a program was written which would: (1) accept and store the bibliographical material according to a preassigned coding system; (2) permit search and print-out of the stored material according to a selected indexing and cross-indexing scheme; and (3) allow continuous updating of the bibliography whenever desired.

The computer system available was an IBM 1620-II with paper tape input, typewriter output, 40,000 positions of core storage, indirect addressing, and external (random access) memory (IBM 1311 Disk Drive). The necessary software included the Monitor I system, which incorporates the SPS II-D assembler. We found that Fortran II-D, the normal mode of programming this computer configuration, was not designed to handle the combination of numeric and linguistic information involved. This was overcome by employing the disk-oriented Symbolic Programming System (SPS II-D), which is more flexible than Fortran, yet less tedious than machine language.

The desired functions were achieved by writing two major programs, LOAD and SEARCH, and two smaller programs, CORRECT and TOTAL TYPEOUT (or PUNCH OUT), all of which were permanently stored in core-image form on the disk. In practice these programs have performed with surprising speed and effectiveness. After a brief period of initial instruction, anyone can use the full search procedure without involvement in the details of the program. Updating of the bibliography requires a small amount of additional instruction and experience.
One of the most gratifying results obtained is the output of unexpected correlations, arising both from the flexibility of the procedures and the speed with which index correlations can be carried out. The errors and oversights encountered in the manual handling of large quantities of bibliographical data are essentially absent in the computerized procedure.

Search

The indices used for classifying the material and their corresponding codes are listed in Table 1. As the program was written, only one year or author can be examined in a single search, but such searches can be entered sequentially at the typewriter. Otherwise, any combination of codes can be linked to guide the search. For example, if the code input is 01 03 06 49‡ (the final character being the record mark), the articles in Table 2 will be selected and typed out, and all others will be rejected. As should be apparent, the variety of combinations possible, and hence the specificity of the overall selection code, is very great. In its final form the program also permits one to combine the year (a single year) with any set of subject-data codes to provide even greater selectivity.

Table 3 shows a condensed listing of the author phase, and the figure gives a corresponding detailed flow chart of these instructions. The five-digit numerals of the chart correspond to the instruction numbers in the typeout. DISK is the storage area into which the desired author's name is placed; it remains constant during the search, whereas SEARCH is the area into which each successive sector of disk data is transmitted.
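The code-linked selection described above can be illustrated with a short modern sketch. It is written in Python, not in the SPS II-D of the original program, and it rests on several assumptions: that "linking" codes means a reference must carry every code typed, that the record mark can be represented by a double dagger, and that the two sample entries carry only the codes shown for them in Tables 2 and 4. The names Reference, parse_code_input, and search are hypothetical.

```python
from dataclasses import dataclass

RECORD_MARK = "\u2021"  # double dagger; an assumed glyph for the record mark


@dataclass(frozen=True)
class Reference:
    authors: tuple      # names as stored, e.g. ("RATTET, LS", ...)
    year: int           # last two digits of the year of publication
    codes: frozenset    # two-digit index codes, kept as strings
    journal: str


def parse_code_input(line):
    """Turn typed input such as '01 03 06 49' plus a record mark into a code set."""
    text = line.strip()
    if not text.endswith(RECORD_MARK):
        raise ValueError("input must end with a record mark")
    codes = text[:-1].split()
    for code in codes:
        if len(code) != 2 or not code.isdigit():
            raise ValueError(f"codes must be two digits: {code!r}")
    if codes != sorted(codes):
        raise ValueError("codes must be typed in numeric order")
    return frozenset(codes)


def search(references, codes=frozenset(), year=None):
    """Return the references carrying every requested code (and the year, if given)."""
    hits = []
    for ref in references:
        if not codes <= ref.codes:          # assumed AND semantics
            continue
        if year is not None and ref.year != year:
            continue
        hits.append(ref)
    return hits


# Sample data: the first entry is given only the codes of the Table 2 search,
# the second the codes listed in Table 4. The real entries may carry more.
BIBLIOGRAPHY = [
    Reference(("RATTET, LS", "MANDELL, L", "GOLDSTEIN, JH"), 67,
              frozenset({"01", "03", "06", "49"}), "JACS, 89, 2253 (1967)"),
    Reference(("WATTS, VS", "LOEMKER, JE", "GOLDSTEIN, JH"), 65,
              frozenset({"01", "07", "16", "23", "27", "42", "45", "90"}),
              "J MOL SPEC, 17, 348 (1965)"),
]

if __name__ == "__main__":
    wanted = parse_code_input("01 03 06 49" + RECORD_MARK)
    for ref in search(BIBLIOGRAPHY, wanted):
        print("; ".join(ref.authors), "|", ref.journal)
```

Calling search with an empty code set and a year corresponds to the date phase, and supplying both corresponds to the combined date-and-code phase; the author phase is deliberately left out of this sketch because the program treats it as a separate search.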

Table 1. List of J(13C-H) Codes

Subject Matter
00  General (not J information)
01  J(13C-H)
02  J(13C-F)
03  J(H-H)
04  J(13C-13C), J(F-F), and J(H-F)
05  J-General
06  Nonunique PMR Analysis
07  PMR Analysis
08  F NMR Analysis
09  13C NMR Analysis
80  Analysis Only
10  Angle Dependence
17  Charge Distribution
11  Conformers, Ring Size, Steric Effects
12  Diamagnetic Anisotropy
14  Dipole Moment
13  Double Resonance, Signal Enhancement, Overhauser

Compounds and Functional Groups
35  Methyl: General Substituents
36  Methyl: Halogen
37  Methyl: Others
38  Alkanes: R
39  Alkanes: O, Halo
40  Alkanes: S, N, etc. (amines, sulfides, metal substituents)
41  Alkenes: R
42  Alkenes: O, Halo
43  Alkenes: S, N, etc.
44  Vinyl (Ethylenes): General Substituents
45  Vinyl (Ethylenes): Halogens
46  Vinyl (Ethylenes): Other
47  Alkynes: R
48  Alkynes: Others
49  Alcohols, Ethers
50  Acetates, Esters, Noncyclic Anhydrides
    Carbonyl (Aldehydes, Ketones, Acyl)
    Amides, Imines, any =N
    Lactones, Lactams, Cyclic Anhydrides
    SO, SO3, NO2, NO, CO3, SO2, etc.
    Nonbenzenoid Aromatics
    Rings: Saturated
    Rings: Unsaturated
    Rings: Bridged
    Rings: O, N, S, etc.
    Rings: O, N, S, etc. and Double Bond
    Heterocyclics: O
    Heterocyclics: N
    Heterocyclics: S, etc.
    Polymers
    Ions, Free Radicals
    Complexes, Metals, Grignards
    F Analogs: Saturated
    F Analogs: Unsaturated
    F Analogs: Rings

Table 2. Sample Output of Subject Phase of SEARCH

Select Phase From 0-Date, 1-Author, 2-Code, 3-Date and Code
Type Codes, In Numeric Order, In The Form XX XX XX XX . . .
Use 0 in Front of Single Digits and Place a Record Mark Directly after the Last Code.

Index of Articles Pertaining To
J(C-H)
PMR Analysis
Alcohols and Ethers

Rattet, LS
Mandell, L
Goldstein, JH
JACS, 89, 2253 (1967)

Sheppard, N
Turner, JJ
Proc Roy Soc, A 252, 506 (1959)

Type in next set of codes to be indexed

Index is Completed
Is Another Phase Desired-NO

End of Job
Enter Monitor Cntl Rec. Job Card Group Only

Figure. Flow Chart of Author Phase of SEARCH

IN+5 is the address of the disk sector being read, and NUM (its original value) is the total number of sectors required by a specific reference. Since one sector holds the code line and one the journal entry, NUM minus 2 is the number of authors of the reference. TEMP is the first sector address of the specific reference which is being compared. Core storage requirements for the total search program are 13,050 positions for its 807 statements.

Loading

The data are originally loaded onto the disk by a second program, which requires 7100 core positions for its 336 statements. An example of the input form for the loading program is given in Table 4. One sector is assigned to each line of input data; five sectors, for example, are required for the entry in Table 4. The first line lists the number of authors, the last two digits of the year of publication, and the subject matter and compound codes, the codes being listed in numerical order. The entries which follow are the authors' names, each in an individual sector. The last line contains the journal entry, including the year. It may be noted that all but the first line constitute the output of the search program. The final entry of the data tape is an extra end-of-line symbol, which signals the conclusion of both the search and loading programs. The loading program will also add or delete any reference or group of references. Additions are placed in alphabetic order according to the first author's last name, with the most recent addition first.
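The input form and the ordering rule just described can be sketched as follows. This is a Python illustration under stated assumptions, not the SPS II-D loader itself: the asterisk stands for the end-of-line symbol as it is printed in Table 4, each list element stands for one disk sector, "most recent addition first" is read as applying among entries whose first authors share a surname, and the names parse_entry, insert_entry, and find_by_author are hypothetical.

```python
END_OF_LINE = "*"  # end-of-line symbol as printed in Table 4 (assumed glyph)

# The Table 4 entry, one input line (one disk sector) per element.
TABLE_4 = [
    "3 65 01 07 16 23 27 42 45 90*",
    "WATTS, VS*",
    "LOEMKER, JE*",
    "GOLDSTEIN, JH*",
    "J MOL SPEC, 17, 348 (1965)*",
]


def parse_entry(lines):
    """Split one reference (code line, author lines, journal line) into fields."""
    sectors = [line.strip().rstrip(END_OF_LINE).strip() for line in lines]
    header = sectors[0].split()
    n_authors, year, codes = int(header[0]), int(header[1]), header[2:]
    if len(sectors) != n_authors + 2:
        raise ValueError("expected a code line, one line per author, and a journal line")
    return {
        "year": year,
        "codes": codes,
        "authors": sectors[1:1 + n_authors],
        "journal": sectors[-1],
        "sectors": sectors,
        "num_sectors": len(sectors),   # NUM in the text: authors + 2
    }


def surname(entry):
    """Last name of the first author, taken as the text before the comma."""
    return entry["authors"][0].split(",")[0]


def insert_entry(bibliography, entry):
    """Insert alphabetically by first author's surname, newest first among equals."""
    pos = 0
    while pos < len(bibliography) and surname(bibliography[pos]) < surname(entry):
        pos += 1
    bibliography.insert(pos, entry)


def find_by_author(bibliography, name):
    """Return the printable lines (all but the code line) of each matching reference."""
    return [e["sectors"][1:] for e in bibliography if name in e["authors"]]


if __name__ == "__main__":
    bibliography = []
    insert_entry(bibliography, parse_entry(TABLE_4))
    for lines in find_by_author(bibliography, "LOEMKER, JE"):
        print("\n".join(lines))
```

Stopping the scan at the first entry whose surname is not smaller than the new one is what places a new addition ahead of older entries with the same first author; a binary search would serve equally well on a larger list.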

Table 3. Listing of Author Phase of SEARCH

Columns: NUMBER, LABEL, OP, OPERANDS
Instruction numbers: five-digit numerals running from the 15000s through the 17000s
Labels: AUTHOR, NZMAN, GOGO, LOPPER, COD, YUP, READER, INTIAL, OUTPUT, FUR, NNAM, PUR
Operation codes: BT, RCTY, WATY, RATY, TF, TFM, BNR, AM, SM, CM, BE, C, B
Storage symbols named in the operands: DISK, SEARCH, IN+5, NUM, TEMP, SSEC, READ, OLLIE, END

Table 4. Example of Input for Loading Program

3 65 01 07 16 23 27 42 45 90*
WATTS, VS*
LOEMKER, JE*
GOLDSTEIN, JH*
J MOL SPEC, 17, 348 (1965)*

Volume 45, Number 11, November 1968 / 735

Operating Details

For the guidance of the user, certain explicit instructions are typed out at various stages of the program in execution. Examples are shown at the top and bottom of Table 2. For SEARCH these instructions involve a choice of index classifications, and for LOAD a choice of jobs to be accomplished. A typical search takes 20 sec, plus the necessary output time. A record mark terminates all phases of the program with the option of selecting another phase. (Otherwise, control is returned to the supervisor program.) The extent to which the bibliography may be expanded is indicated by the fact that three disk drives of the 1311 variety would allow simultaneous searching of approximately 18,000 references. For more detailed information, inquiries should be sent to the authors.

736 / Journal of Chemical Education