Lana S. Rattet and J. H. Goldstein, Emory University, Atlanta, Georgia 30322
Computerizing Scientific Bibliographies
This note describes the application of a digital computer to the storage and selective retrieval of bibliographical material. We believe that the method will be of assistance to both students and researchers faced with the necessity of efficiently handling the ever-growing volume of scientific literature.

The specific problem was that of assembling, abstracting, and cataloging a large number of literature references pertaining to 13C-H satellite NMR spectra. For this purpose a program was written which would: (1) accept and store the bibliographical material, according to a preassigned coding system; (2) permit search and print-out of the stored material according to a selected indexing and cross-indexing scheme; and (3) allow continuous updating of the bibliography whenever desired.

The computer system available was an IBM 1620-II with paper tape input, typewriter output, 40,000 positions of core storage, indirect addressing, and external (random access) memory (IBM 1311 Disk Drive). The necessary software included the Monitor I system, which incorporates the SPS II-D assembler. We found that Fortran II-D, the normal mode of programming this computer configuration, was not designed to handle the combination of numeric and linguistic information involved. This was overcome by employing the Disk-oriented Symbolic Programming System (SPS II-D), which is more flexible than Fortran, yet less tedious than machine language.

The desired functions were achieved by writing two major programs, LOAD and SEARCH, and two smaller programs, CORRECT and TOTAL TYPEOUT (or PUNCH OUT), all of which were permanently stored on the disk in core-image form. In practice these programs have performed with surprising speed and effectiveness. After a brief period of initial instruction, anyone can use the full search procedure without involvement in the details of the program. Updating of the bibliography requires a small amount of additional instruction and experience.
One of the most gratifying results obtained is the output of unexpected correlations, arising both from the flexibility of the procedures and the speed with which index correlations can be carried out. The errors and oversights encountered in the manual handling of large quantities of bibliographical data are essentially absent in the computerized procedure.

Search

The indices used for classifying the material and their corresponding codes are listed in Table 1. As the program was written, only one year or author can be examined in a single search, but such searches can be entered sequentially at the typewriter. Otherwise, any combination of codes can be linked to guide the search. For example, if the code input is 01 03 06 49‡ (the last character being the record mark), the articles in Table 2 will be selected and typed out, and all others will be rejected. As should be apparent, the variety of combinations possible, and hence the specificity of the overall selection code, is very great. In its final form the program also permits one to combine the year (a single year) with any set of subject-data codes to provide even greater selectivity.

Table 3 shows a condensed listing of the author phase, and the figure gives a corresponding detailed flow chart of these instructions. The five-digit numerals of the chart correspond to the instruction numbers in the typeout. DISK is the storage area into which the desired author's name is placed. This remains constant during the search, whereas SEARCH is the area into which each successive sector of disk data is transmitted.
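The subject phase described above can be pictured as a filter over the stored references. The following is a minimal Python sketch, not the authors' SPS program. It assumes that a reference is selected only when it carries every code typed in, and that the optional year is compared against the two-digit year stored with the reference. The names `Reference`, `parse_code_input`, and `subject_search` are hypothetical, and the sample entries are made up for illustration.

```python
from dataclasses import dataclass

RECORD_MARK = "\u2021"  # stand-in for the record mark typed after the last code


@dataclass(frozen=True)
class Reference:
    year: int          # last two digits of the publication year
    codes: frozenset   # two-digit index codes, e.g. {"01", "49"}
    authors: tuple     # author names
    journal: str       # journal entry, including the year


def parse_code_input(typed, record_mark=RECORD_MARK):
    """Split typed input such as '01 03 06 49' + record mark into its codes."""
    body = typed.strip()
    if not body.endswith(record_mark):
        raise ValueError("input must end with a record mark")
    codes = body[: -len(record_mark)].split()
    for code in codes:
        if len(code) != 2 or not code.isdigit():
            raise ValueError(f"codes must be two digits: {code!r}")
    if codes != sorted(codes):
        raise ValueError("codes must be typed in numeric order")
    return codes


def subject_search(refs, codes, year=None):
    """Keep references that carry every requested code (assumed AND logic)."""
    wanted = set(codes)
    return [
        ref for ref in refs
        if wanted <= ref.codes and (year is None or ref.year == year)
    ]


# Made-up entries, for illustration only.
sample = [
    Reference(66, frozenset({"01", "03", "06", "49"}), ("DOE, J",),
              "J HYPOTHETICAL, 1, 1 (1966)"),
    Reference(67, frozenset({"01", "06"}), ("ROE, R",),
              "J HYPOTHETICAL, 2, 10 (1967)"),
]

codes = parse_code_input("01 03 06 49" + RECORD_MARK)
for ref in subject_search(sample, codes):
    print(*ref.authors, ref.journal, sep="\n")
```

Passing `year=67` together with a shorter code list would model the combined date-and-code selection; whether the original program matched codes in exactly this way is not stated in the text.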
Table 1. List of J(13C-H) Codes

Subject Matter
00 General (not J information)
01 J(13C-H)
02 J(13C-F)
03 J(H-H)
04 J(13C-13C), J(F-F), and J(H-F)
05 J-General
06 Nonunique PMR Analysis
07 PMR Analysis
08 F NMR Analysis
09 13C NMR Analysis
80 Analysis Only
10 Angle Dependence
17 Charge Distribution
11 Conformers, Ring Size, Steric Effects
12 Diamagnetic Anisotropy
14 Dipole Moment
13 Double Resonance, Signal Enhancement, Overhauser

Compounds and Functional Groups
35 Methyl: General Substituents
36 Methyl: Halogen
37 Methyl: Others
38 Alkanes: R
39 Alkanes: O, Halo
40 Alkanes: S, N, etc. (amines, sulfides, metal substituents)
41 Alkenes: R
42 Alkenes: O, Halo
43 Alkenes: S, N, etc.
44 Vinyl (Ethylenes): General Substituents
45 Vinyl (Ethylenes): Halogens
46 Vinyl (Ethylenes): Other
47 Alkynes: R
48 Alkynes: Others
49 Alcohols, Ethers
50 Acetate, Esters, Noncyclic Anhydrides
Carbonyl (Aldehydes, Ketones, Acyl)
Amides
Imines, any C=N
Lactones, Lactams, Cyclic Anhydrides
SO, SO2, NO2, NO, CO3, SO2 ... etc.
Nonbenzenoid Aromatics
Rings: Saturated
Rings: Unsaturated
Rings: Bridged
Rings: O, N, S, etc.
Rings: O, N, S, etc. and Double Bond
Heterocyclics: O
Heterocyclics: N
Heterocyclics: S, etc.
Polymers
Ions, Free Radicals
Complexes, Metals, Grignards
F Analogs: Saturated
F Analogs: Unsaturated
F Analogs: Rings
Table 2. Sample Output of Subject Phase of SEARCH

Select Phase From 0-Date, 1-Author, 2-Code, 3-Date and Code
Type Codes, In Numeric Order, In The Form XX XX XX XX . . .
Use 0 in Front of Single Digits and Place a Record Mark Directly after the Last Code.

Index of Articles Pertaining To J(C-H)
PMR Analysis
Alcohols and Ethers

Rattet, LS
Mandell, L
Goldstein, JH
JACS, 89, 2253 (1967)

Sheppard, N
Turner, JJ
Proc Roy Soc, A 252, 506 (1959)

Type in next set of codes to be indexed
R
*S
Index is Completed
Is Another Phase Desired-NO
S
End of Job
Enter Monitor Cntl Rec. Job Card Group Only
Table 3.
Flow Chart of Author Phase of SEARCH
IN+5 is the address of the disk sector being read, and NUM (its original value) is the total number of sectors required by a specific reference. NUM minus 2 is, therefore, the number of authors of the reference. TEMP is the first sector address of the specific reference which is being compared. Core storage requirements for the total search program are 13,050 positions for its 807 statements.

Loading

The data are originally loaded onto the disk by a second program, which requires 7100 core positions for its 336 statements. An example of the input form for the loading program is given in Table 4. One sector is assigned for each line of input data; i.e., five sectors would be required for the entry in Table 4. The first line lists the number of authors, the last two digits of the year of publication, and the subject matter and compound codes, the latter being listed in numeric order. The entries which follow are the authors' names, each listed in an individual sector. The last line contains the journal entry, including the year. It may be noted that all but the first line constitute the output of the search program. The final entry of the data tape is an extra end-of-line symbol, which signals the conclusion of both the search and loading programs. The loading program will also add or delete any reference or group of references. The additions are placed in alphabetic order according to the first author's last name, with the most recent addition first.
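The sector layout just described (one header sector, one sector per author, one journal sector) is what the author phase walks through. The following is a rough Python sketch of that walk, not a transcription of the SPS listing. The variable names `pos`, `temp`, and `num` are chosen to echo IN+5, TEMP, and NUM. The function name `author_search`, the exact-match comparison of names, the use of an empty string for the closing end-of-line symbol, and the sample entries are all assumptions made for illustration.

```python
END_OF_DATA = ""  # stand-in for the extra end-of-line symbol that closes the data


def author_search(sectors, wanted):
    """Return the printable sectors of every reference that lists `wanted`."""
    hits = []
    pos = 0  # plays the role of IN+5: the sector currently being read
    while pos < len(sectors) and sectors[pos] != END_OF_DATA:
        temp = pos  # TEMP: first sector of the reference being compared
        n_authors = int(sectors[temp].split()[0])
        num = n_authors + 2  # NUM: header + author sectors + journal sector
        author_sectors = sectors[temp + 1 : temp + 1 + n_authors]
        if wanted in author_sectors:
            # Everything except the header sector is typed out.
            hits.append(sectors[temp + 1 : temp + num])
        pos = temp + num  # skip to the header of the next reference
    return hits


# Made-up entries, for illustration only.
disk = [
    "2 66 01 07", "DOE, J", "ROE, R", "J HYPOTHETICAL, 1, 1 (1966)",
    "1 67 01 49", "ROE, R", "J HYPOTHETICAL, 2, 10 (1967)",
    END_OF_DATA,
]

for lines in author_search(disk, "ROE, R"):
    print("\n".join(lines))
```

Because each header states how many author sectors follow, the scan can jump from one reference to the next without examining the journal sector, which is consistent with the role the text gives to NUM and TEMP.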
Listing of Author Phase of SEARCH

* AUTHOR INDEXING

NUMBER: 15000 15001 15065 15068 15069 16080 15087 15160 15180 16190 15210 16010 16020 16030 16082 16035 16040 16061 16062 16065 16067 16070 16080 16090 16100 16110 16120 16130 16140 16150 16151 16171 16180 16190 16200 16220 16280 16236 16240 18260 16260 17010 17020 17030 17082 17034 17010 17045 17050

LABEL: AUTHOR NZMAN GOGO LOPPER COD YUP READER INTIAL OUTPUT FUR NNAM PUR

OP: BT RCTY WATY RATY BNR B TF TFM BT BD BNR B TF AM TF SM BD B BT TFM CM BE C BNE AM AM D CM BNE AM AM C BNE TF BT BD B RCTY WATY AM B SM BD AM B AM TFM B

OPERANDS: PUTUP,PUTUP-1 PUP15 DISK *+20,DISK END IN+5,SSEC BOII,DISK READ,READ-1 SIC1,SEARCH-1 COD,SEARCH NZMAN NUM,SEARCH IN+5,1,10 TEMP,IN+5 NUM,2,10 YUP,NUM SIC2 READ,READ-1 OLLIE,SEARCH OLLIE,23,610 INTIAL ~011,OLLIE,611 NNAM R01J,2,10 OLLIE,2,10 READER 1101J,23,610 NNAM IlU!4,4,10 OLLIE,4,10 IIOH,OLLIE,611 NNAM IN+5,TEMP READ,READ-1 FUR,SEARCH-1 LOPPER
Table 4. Example of Input for Loading Program

3 65 01 07 16 23 27 42 45 90*
WATTS, VS*
LOEMKER, JE*
GOLDSTEIN, JH*
J MOL SPEC, 17, 348 (1965)*
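To make the input form concrete, the following Python sketch decodes the Table 4 entry and places it in an alphabetically ordered list. It is only an illustration under stated assumptions. The `*` is taken to be the end-of-line symbol. The first author's name is partly illegible in the source and is read here as WATTS, VS. The rule "most recent addition first" is interpreted as placing a new entry ahead of existing entries with the same first-author last name. `Entry`, `read_tape`, `parse_entry`, and `add_entry` are hypothetical names, and `read_tape` handles a tape holding a single reference.

```python
from dataclasses import dataclass

END_OF_LINE = "*"  # end-of-line symbol as printed in Table 4


@dataclass
class Entry:
    year: int      # last two digits of the publication year
    codes: list    # subject and compound codes, in numeric order
    authors: list  # one name per sector
    journal: str   # journal entry, including the year


def read_tape(tape):
    """Split the tape into lines (one per sector); an empty line ends the data."""
    sectors = []
    for line in tape.split(END_OF_LINE):
        line = line.strip()
        if not line:
            break
        sectors.append(line)
    return sectors


def parse_entry(sectors):
    """Decode the sectors of one reference: header, authors, journal."""
    header = sectors[0].split()
    n_authors, year, codes = int(header[0]), int(header[1]), header[2:]
    if len(sectors) != n_authors + 2:
        raise ValueError("expected a header, one line per author, and a journal line")
    return Entry(year, codes, sectors[1 : 1 + n_authors], sectors[-1])


def add_entry(bibliography, entry):
    """Insert by first author's last name; a new entry precedes equal names."""
    key = entry.authors[0].split(",")[0]
    i = 0
    while i < len(bibliography) and bibliography[i].authors[0].split(",")[0] < key:
        i += 1
    bibliography.insert(i, entry)


tape = """3 65 01 07 16 23 27 42 45 90*
WATTS, VS*
LOEMKER, JE*
GOLDSTEIN, JH*
J MOL SPEC, 17, 348 (1965)*
*"""

entry = parse_entry(read_tape(tape))
bibliography = []
add_entry(bibliography, entry)
print(entry.year, entry.codes, entry.authors)
```

The five lines of the entry become five sectors, matching the count given in the text; the final lone `*` is the extra end-of-line symbol that stops the reader.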
Operating Details

For the guidance of the user, certain explicit instructions are typed out at various stages of the program in execution. An example of this is shown at the top and bottom of Table 2. For SEARCH this involves a choice of index classifications, and for LOAD the choice of jobs to be accomplished. A typical search takes 20 sec, plus the necessary output time. A record mark terminates all phases of the program with the option of selecting another phase. (Otherwise, control is returned to the supervisor program.)

The extent to which the bibliography may be expanded is indicated by the fact that three disk drives of the 1311 variety would allow simultaneous searching of approximately 18,000 references. For more detailed information, inquiries should be sent to the authors.

Journal of Chemical Education, Volume 45, Number 11, November 1968