RECENT DEVELOPMENTS IN KEYSORT CARDS¹
GERALD J. COX, Corn Products Refining Company, Argo, Illinois
ROBERT S. CASEY and C. F. BAILEY, W. A. Sheaffer Pen Company, Fort Madison, Iowa
We² have described an adaptation of the Keysort punch card, used primarily in commercial applications, for facilitating recovery of information from files of bibliographic cards with a minimum of cards for recording the data. The cards were described, together with the simple punches and sorting needles, or "tumblers," required for their preparation and manipulation. The types of slotting of the cards, single hole or groups of holes ("fields"), were described and illustrated, together with use of a double row of holes to increase coverage of subject matter. An abbreviated outline used for coding writing ink references was given as an example. Two alphabetical codes were given, one of them being new. We pointed out that punch cards of the Keysort type can be used in a wide variety of ways in the analysis not only of bibliographic material but also of observational and experimental data, and suggested broad applications to scientific use.

In a second paper³ we called the utility of this powerful indexing tool to the attention of workers in other fields of science. In a third paper⁴ the details of the applications of Keysort cards were given. Further possible uses were suggested, such as in correlation of the properties and use of chemicals and materials of construction and processes. Some new techniques were described.

There is no essential difference in the treatment by punch cards of scientific data and those from commercial or sociological sources. Hoag⁵ has described the use of Keysort cards for indexing union contracts and has referred to a California Institute of Technology Bulletin for details of coding. This bulletin, as revised by Jamieson,⁶ describes the coding of 317 items and makes use, front and back, of two 8 x 8-inch cards. The more usual type of code printed directly on the card is used rather than the "off-the-card" code necessitated for our bibliographic cards, which must provide space for an abstract.

We have been asked questions and have encountered problems in our own work which have not been satisfactorily answered in published data on punch cards. We will attempt to clarify some of these briefly.

FURTHER EXPERIENCES AND TECHNIQUES
Our experiences with the various indexes on our general bibliographic card² over a period of two years have been as follows:

Numerical Index. The "numerical index" (note this index at the upper left in Figure 1) is for a general class, and as such is seldom used for sorting. When a number of new cards have been prepared and slotted, the numerical index is useful for separating the cards for filing, for different classes usually will be filed separately. The uniformity of the slots in each separate portion of the file is visual assurance that no card is misfiled.

For general classification one at once thinks of such plans as the Dewey decimal system. Difficulties arise with the requirement of at least six four-hole fields for recording the class number. For an individual user it will readily be seen that wide gaps will occur in the file, however diverse his interests may be. Large numbers require excessive slotting. For these reasons we very early abandoned such general schemes in favor of simple personalized methods in which items of the greatest interest can be represented by the simplest slotting combinations. As an example, the subject "amino acids" is represented by a single punch in the numerical index, though supplemented in the "classified index" by two four-hole fields. In these latter fields the individual amino acids and combinations of related amino acids are represented. The "direct index" is the same for all the amino acids but with the possible variation at the extreme left of the index.

1 Presented before the Division of Chemical Education at the 110th meeting of the American Chemical Society in Chicago, [date illegible].
2 COX, G. J., C. F. BAILEY, AND R. S. CASEY, Chem. Eng. News, 23, 1623 (1945).
3 BAILEY, C. F., R. S. CASEY, AND G. J. COX, Science, 104, [remainder illegible].
4 [Beginning illegible] 23, 495 (1946).
5 HOAG, M., Special Libraries, 37, 106 (1946).
6 "Method of indexing provisions of collective agreements," Industrial Relations Section, California Institute of Technology, Pasadena, 1941; revised by MAY E. JAMIESON, 1945, 16 pp.

[Figure 1. Punch card figure not reproduced; the legible part of the caption reads "... publication of junior authors."]

JOURNAL OF CHEMICAL EDUCATION

Alphabetical Index. The "alphabetical index," used for authors, permits alphabetizing of long bibliographies. It is, of course, necessary to make the exact final arrangement by hand, as the sorting is only for the first three letters of each name, but this operation is not much more than a check on the correct order of the names for most bibliographies. The multiple authorship hole facilitates alphabetizing of junior authors.

Selection of all the cards bearing the name of a given author is accomplished by several procedures. The system used at W. A. Sheaffer Pen Company provides slotting of the second and third names of multiple authorship papers in the second and third fields, respectively, but with only the first letter of each name used. The Corn Products Refining Company system records by slotting only the first author. For this latter system "see" cards (Figure 1) are made for the remaining authors. These cards are filed under "1" in the "numerical index." A "see" card bears the name of the pertinent author, and with it are listed all associate senior authors together with the "numerical index" under which the cards occur.

We have made very little use of punch cards for compiling lists of papers by a given author. For such a special purpose, the cards sorted for the author and for his known associates will bring to light most, if not all, the cards in the file. If it is the intention to use a file of punch cards for the compilation of such bibliographies, then "see" cards for authors should be made extensively if the system which records only a single author is used. For the completion of a personal bibliography, recourse will be had to other indexes, such as Chemical Abstracts, or to library compilations, and, as last resort, to inquiry of the author.
A very frequent use of the author index is made in searching for a card to avoid its duplication when more than one abstractor is contributing to the file or when abstracts or titles are seen in other journals. Furthermore, though such cards are known to be in the file, it may be desirable to add the references of the abstract journal or add abstracts from such sources. The logical method of search for such cards is by author.

Date Index. The section on date is useful in the arrangement of long bibliographies since a preliminary chronological sort will result in orderly arrangement of all subsequent sorting for the authors. Chronological ordering of cards is frequently useful, for example, in the development of any topic coded in the "direct index." Search for papers published in any year or period is obviously facilitated. Selection of all the cards of the current and preceding year to avoid duplication, because of repeated appearance of abstracts, is another use of the date section.

We have used our general bibliographic card for the critical analysis of reports and correspondence arranged chronologically in monthly periods. This was accomplished by punching decades in the two-hole section ordinarily used for centuries. "No punch"
signified prior to 1920; "18" stood for 1920-30; "17" for 1930-40; and both "17" and "18" slotted, for 1940 to date. The unit years for each decade were punched in the tens field and months in the units field. The arrangement was flawless for chronological development of topics slotted in the "direct index." It was necessary to assign a new number in the "numerical index" for this special study to distinguish regular literature references in the same subject matter because of this different treatment of the dates.

Direct Index. As previously described,⁴ Holes 1-6 at the right of the card have been given a general meaning for most of the classes of the "numerical index." These meanings relate to availability, patents, whether or not the card bears an abstract, and ideas suggested by reading the paper.

Coding Practice. Coding the "direct index" is of greatest importance, as the chief use of the cards is in locating specific subject matter. Here, too, is the greatest limitation of space and the greatest diversity of items to be recorded. It has been our practice to code at the lower right of the card information that is most generally found for a wide variety of subjects. Thus by Hole 7 the melting point of organic compounds is shown and, by slotting Hole 8, it is indicated that one or more boiling points have been given. Progressively toward the left the meaning of the holes becomes more specific for the substance or group of substances that is being recorded. Thus for Hole 26, in dealing with lactic acid, "lactide" is indicated; there would be no counterpart to lactide for most other organic substances. This method of developing codes is of value as it gives a pattern which serves well in developing codes for other materials and makes it possible to work out satisfactory codes much earlier in the compilation of a file on new material (see Casey, et al.,⁴ for lactic acid code).
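The decade scheme described under Date Index for the file of reports and correspondence can be sketched in Python as follows. This is an illustration only, not a description of any actual equipment: the function names are hypothetical, and the sketch assumes that "1920-30" means 1920 through 1929 and "1930-40" means 1930 through 1939.

```python
# Rank of each slotting of the two-hole section, in chronological order.
DECADE_RANK = {
    frozenset(): 0,               # no punch: prior to 1920
    frozenset({"18"}): 1,         # 1920 through 1929 (assumed boundary)
    frozenset({"17"}): 2,         # 1930 through 1939 (assumed boundary)
    frozenset({"17", "18"}): 3,   # 1940 to date
}

def encode_date(year, month):
    """Return (holes slotted in the two-hole section, unit year, month)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be 1..12")
    if year < 1920:
        holes = frozenset()
    elif year < 1930:
        holes = frozenset({"18"})
    elif year < 1940:
        holes = frozenset({"17"})
    else:
        holes = frozenset({"17", "18"})
    # Unit year goes in the tens field, month in the units field.
    return holes, year % 10, month

def chronological_key(code):
    """Sort key that restores chronological order from the slotted code."""
    holes, unit_year, month = code
    return DECADE_RANK[holes], unit_year, month
```

Sorting a list of such codes with `chronological_key` arranges them by decade, then unit year, then month. Because everything before 1920 shares the "no punch" slotting, two such years ten apart would receive the same code in this sketch.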
TABLE 1
Direct Index Code for Analytical Chemistry

Hole*   Subject
7       Apparatus
8       Ashing; combustion: (a) Wet; (b) Dry
9       Chromatographic
10      Colorimetric: (a) Tintometer; (b) Photometer
11      Conductometric
12      Cryoscopic
13      Ebulliometric
14      Electrophoretic
15      Fractional distillation
16      (a) Gasometric; (b) Manometric
17      Gravimetric
18      Hydrometry
19      Ion exchange
20      Polarimetric
21      Polarographic
22      Potentiometric
23      (a) Refractometric; (b) Interferometric
24      Reagents
25      Spectrographic: (a) Absorption; (b) Emission
26      Turbidimetric: (a) Nephelometric; (b) Scattering
27      Volumetric: (a) Acidimetry, alkalimetry, precipitation; (b) Oxidation, reduction
28      X-ray
29      Treatment of data

* If two subjects are listed for one hole, the one following "a" is punched in the outer hole and the one following "b" is punched in the inner hole.
FEBRUARY, 1947
As a further example of coding, a system, worked out in collaboration with Mr. R. J. Smith of Corn Products Refining Company, for dealing with analytical chemistry in general is shown in Table 1. An entire hundred was assigned in the "numerical index" for the chemical elements, and the numbers of each class were determined by numbering the elements in their alphabetical order. (For this purpose, it is advisable to use a table that is complete; most handbook tables do not list all the elements.) For each of the elements the code of Table 1 was followed in general, but deviations were possible where obviously the analytical procedure was seldom or never applied. Section A of the "classified index" was reserved for indication of joint determination of elements or their compounds, i.e., the halogens.

The double holes in the "direct index" may be used without the intermediate punch to mean different degrees. Thus, by using the shallow punch, qualitative tests can be indicated and the deep punch means quantitative analysis. This is based on the reasoning that if a quantitative method is applicable no qualitative test would be necessary. The use of both holes in one position is illustrated by the "a" and "b" designations in Table 1.

Miscellaneous. We have decided against recording patent numbers by slotting because of the number of holes necessary for the record and because we have found that subject matter is usually the information that is being sought. In most cases, if a patent number is known, the author or the subject matter is also known and these serve as adequate clues with which the card can be isolated. If it is necessary to code patent numbers in a small punch card file, we suggest coding only thousands, ten thousands, and hundred thousands, in four-position selector fields described later. Since the million digit will be "0," "1," or "2" and will not go beyond "3" for some years, one double-hole position can code this digit.
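The suggested patent-number code can be illustrated with a short Python sketch. The function names are hypothetical, and the estimate assumes that patent numbers are spread uniformly over the available codes and that only the million digits 0, 1, and 2 are in use.

```python
def patent_selector_code(patent_number):
    """Split a patent number into the digits that would be slotted.

    Returns (million digit, (hundred thousands, ten thousands, thousands)).
    The hundreds, tens, and units digits are deliberately not coded.
    """
    millions = patent_number // 1_000_000
    if millions > 3:
        raise ValueError("one double-hole position codes only 0 to 3")
    hundred_thousands = (patent_number // 100_000) % 10
    ten_thousands = (patent_number // 10_000) % 10
    thousands = (patent_number // 1_000) % 10
    return millions, (hundred_thousands, ten_thousands, thousands)

def expected_cards_per_selection(file_size, million_values_in_use=3):
    """Average number of cards sharing one code, assuming a uniform spread."""
    distinct_codes = million_values_in_use * 10 ** 3
    return file_size / distinct_codes
```

Under these assumptions a file of 2,000 to 3,000 cards is spread over 3,000 possible codes, so a selection drops about one card on the average.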
In a file of between two and three thousand cards, with the patent numbers randomly distributed and coded as described above, there would be an average of one card drop for each number selected. The million digit could be ignored, which would increase the hand sorting slightly.

Bibliographic punch cards are not for use in card trays intended for the general public, even though the methods for sorting them are extremely simple. They are preeminently for individuals or for small specialized groups. Punch cards should be of particular interest to college students, since they probably have not accumulated extensive files of cards or notes in other forms as yet.

If two or more individuals have started collections independently on different subjects and it is desired to bring them together in a single file, conflicts in the "numerical index" can be resolved by "advancing" one or more of them. Thus, if two general classes, such as bacteriology and analytical chemistry, have been given numbers in the same range, such as 1 to 50, one of them can be advanced to 101 to 150 by slotting the 1 in the hundreds field. Blank cards for such general classes can be preslotted in quantity or purchased preslotted from the manufacturers.

Successful typewritten carbon copies can be made from the cards for use in two files. Clippings can be mounted on punch cards by the use of nitrocellulose or other plastic base cement, as it does not curl the cards or leave tacky residues. On-the-card forms can be mimeographed with perfect register even between the double holes for purposes where only a few hundred cards are needed.

Double holes, when used for direct coding, permit each position to be given any one, but only one, of three meanings (in addition to 0, no punch): 1, shallow punch; 2, middle punch; and 3, deep punch. Two passes of the single-tine sorting needle are required to segregate. Needling the inner hole drops out No. 3 and allows No. 2 to be swung to one side and separated; then needling the outer hole separates 1 from 0. Another way of coding double holes is to slot the outer hole with the shallow punch and the inner hole with the middle punch, which enables the outer and inner holes to be coded independently.

SELECTING
In the use of punch cards for scientific purposes, selecting from the pack of cards those bearing specific data or subjects is more often required than in business use, where sorting the cards into serial order seems to be more frequent. Below are some suggestions for improvements in selector codes and in their manipulation.

Use of Multiple Needles. Selection is effected by inserting a number of sorting needles in certain holes in a specially coded section so devised that the desired card or cards drop, and all others stay on the needles. The selector unit previously described for this purpose⁴,⁷ requires that the individual needles be attached to the unit in the proper spaces before each selection. The following improvement was recently suggested.⁸ There are needed only a conventional single-tine sorting needle and a group of loose needles without handles, such as No. 1, 10-inch knitting needles. The sorting needle is inserted in one of the holes to be selected, near the center of the edge of the card, and the loose needles are inserted in the other appropriate holes. When the sorting needle is lifted, the selected cards drop. The above procedure is specific. The same multiple needle technique can be used with nonselector codes, such as shown in Figure 1, but with the slight disadvantage that cards with slotting additional to the pattern being selected will also drop. We have found that an unslotted card at the front and at the back of the pack from which cards are to be selected helps support the selector needles.

7 ANONYMOUS, Mfg. Bull. [title partly illegible], The McBee Company, Athens, [remainder illegible].
8 [Name illegible], P. F., The McBee Company, New York City, private communication, April, 1946.

Selector Codes. Five-position alphabetical codes, NZ,7,4,2,1 and O,I,E,C,B, and the four-position numerical code, 7,4,2,1, enable cards to be sorted serially, but do not permit simple selection of a given combination of letters and digits. The six-position selector code, 7,4,2,1,0,SF,⁴,⁷ permits numerical selection as well as serial sorting, but can code only to 9 in one field, so is not adaptable to alphabetical coding.

Triangle Selector Codes. More economical of card space than the six-position selector code just mentioned is the five-position triangle code applied to a single row of holes, which will select or sort up to ten in each field
as shown in Figure 2. In each field the two holes are slotted whose diagonal columns intersect at the symbol to be coded. To select a given number from a pack of cards coded in this way, one inserts needles in the two appropriate holes in each field, and on lifting the sorting needle, the cards bearing the selected number drop. A triangular code was first seen by us on a card used by the Dow Chemical Company. The numbering was different from that shown in Figure 2 and did not permit serial sorting as described below.

Serial sorting of a group of cards small enough to be handled on the sorting needle at one time is accomplished by needling each hole with a single-tine needle from right to left, keeping the cards which drop always in the same position relative to one another, and placing them at the back of the pack after each pass of the needle. Preliminary rough sorting of a larger group of cards is accomplished by needling the holes from left to right, accumulating in separate piles those which drop when each hole is needled, and later fine sorting each pile by needling each hole from right to left, as described above.

NEW SELECTOR CODE

We suggest the use of the triangle code combined with double holes as illustrated in Figure 3. To code the upper symbol in a given square, the left-hand position is slotted with the shallow punch and the right-hand position is slotted with the deep punch. To code the lower symbol, the left-hand position is slotted with the deep punch and the right-hand position slotted with the shallow punch. The slotting of the left position, shallow or deep, indicates whether the upper or lower symbol, respectively, is coded. This new system permits selecting by inserting the needles in the appropriate holes, then lifting the sorting needle. The cards may be serially sorted by needling the upper and lower holes in each position in order from right to left, placing at the back of the pack the cards which drop out after each pass of the needle, keeping them always in the same position relative to one another. Large groups of cards may be rough sorted into groups for subsequent fine sorting by needling the lower, then the upper, hole in each position, left to right.

Fields of four positions each may be used to code units, tens, thousands, etc., in a numerical index, and permit sequence sorting and also selection. This takes no more positions than the single hole 7,4,2,1 numerical code, which does not permit selecting. This system is sufficiently economical of card space that it may also be used for alphabetical selecting and sorting, as will be described below.

[Figure 3. Improved [word illegible] code, using double holes, permits selecting and serial sorting of twice as many symbols as single hole triangle code. Field labels legible in the figure: TENS, UNITS.]

TABLE 2
Percentages of Initial Letter of Proper Names

          C. A. author      "American Men     Private bibliographies
Letter    index, 1927-36    of Science"       No. 1      No. 2
A         3.5               3.1               2.7        3.3
B         9.2               10.0              10.3       10.2
C         5.4               7.0               5.2        6.1 (?)
X         In all cases < 0.01

[The remaining rows are not recoverable from the source. Fragments that survive, in source order and without certain column assignment: "1.3 1.1 (?) 9.2 6.8 3.4 2.9"; "U 0.3 0.5 1.3 1.5"; "W 7.0 5.8"; "Y 0.6 0.4"; "Z 0.5 1.2". Row labels Sch and St (?) are also legible.]
* Omitting "Imperial," "I. G. Farben," "Inter-."

Alphabetical Selecting. To facilitate alphabetical coding, we made a study of the alphabetical distribution in the first, second, and third letters of proper names, as they occur in the Author Index of Chemical
Abstracts, in "American Men of Science," and in two small private bibliographies (Tables 2, 3, and 4). Mc- (and Mac-) and Sch- are the initial combinations of letters which occur most frequently and are coded separately so that such names may be subdivided further by coding in the second and third letter fields the two letters which follow Mc- and Sch-. Six double-hole positions permit the initial letter to be coded as indicated in Figure 4. M₁ and S₁ in the figure are "M before Mc" and "S before Sch," respectively, and M₂ and S₂ are "M after Mc" and "S after Sch," respectively.

TABLE 3
Percentages of Second Letter of Proper Names
[Legible column headings: Private bibliographies No. 1, No. 2. Table body not recoverable from the source.]

TABLE 4
Percentages of Third Letter of Proper Names
[Legible column headings: Private bibliographies No. 1, No. 2. Table body not recoverable from the source.]
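The double-hole triangle code applied to initial letters can be sketched in Python as below. This is an illustration under stated assumptions, not the layout of Figure 4: the assignment of symbols to hole pairs in `CODEBOOK` is hypothetical, the rule that names sorting before "Mac" or "Sch" fall in M₁ or S₁ is our reading of "M before Mc" and "S before Sch," and a deep punch is assumed to free the card at both the outer and the inner hole while a shallow punch frees it at the outer hole only.

```python
from itertools import combinations
from string import ascii_uppercase

def double_hole_codes(n_positions):
    """Two codes per pair of positions: upper and lower symbol of a square."""
    codes = []
    for left, right in combinations(range(n_positions), 2):
        codes.append({left: "shallow", right: "deep"})   # upper symbol
        codes.append({left: "deep", right: "shallow"})   # lower symbol
    return codes

# 24 ordinary initials plus M1, Mc, M2 and S1, Sch, S2: 30 symbols in all.
SYMBOLS = []
for letter in ascii_uppercase:
    if letter == "M":
        SYMBOLS += ["M1", "Mc", "M2"]
    elif letter == "S":
        SYMBOLS += ["S1", "Sch", "S2"]
    else:
        SYMBOLS.append(letter)

# Hypothetical assignment of symbols to codes; Figure 4 may differ.
CODEBOOK = dict(zip(SYMBOLS, double_hole_codes(6)))

def initial_symbol(surname):
    """Class a surname by initial letter, splitting M and S around Mc and Sch."""
    s = surname.upper()
    if s.startswith(("MC", "MAC")):
        return "Mc"
    if s.startswith("SCH"):
        return "Sch"
    if s[0] == "M":
        return "M1" if s < "MAC" else "M2"
    if s[0] == "S":
        return "S1" if s < "SCH" else "S2"
    return s[0]

def needles_for(code):
    """Outer-hole needle for a shallow slot, inner-hole needle for a deep slot."""
    return {(pos, "inner" if depth == "deep" else "outer")
            for pos, depth in code.items()}

def drops(card_code, needles):
    """A card falls only if its slotting frees it at every needle."""
    for pos, row in needles:
        depth = card_code.get(pos)
        if depth is None or (row == "inner" and depth != "deep"):
            return False
    return True

def select(surnames, wanted_symbol):
    """Return the cards (surnames) that drop when one symbol is needled."""
    needles = needles_for(CODEBOOK[wanted_symbol])
    return [n for n in surnames if drops(CODEBOOK[initial_symbol(n)], needles)]
```

With six positions the sketch yields 2 x 15 = 30 codes, which matches the 30 symbols needed once M and S are each split three ways. A call such as `select(names, "Mc")` drops only the Mc- and Mac- names from the pack.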
[Figure 4. Improved double-hole triangle code applied to initial letter of proper names permits not only serial sorting but also selecting cards according to name.]

If the principal interest is in selecting, a three-position double-hole code may be used for the second letter as indicated in Figure 5. This is justified by the alphabetical distribution in the second letter of proper names shown in Table 3. In cases where the letter is "A," "E," "I," or "O," the selection will be exact. Where "C," "H," "L," "R," or "others" represents the second letter, some hand sorting will be required, although where the first letter and the third letter are selected precisely such hand sorting certainly will not be excessive. Coding "Mc" and "Sch" as the first letter would greatly reduce "C" as a second letter.

[Figure 5. New code applied to second letter of proper names; caption only partly legible ("... distribution of second letter permits ... and economy of card space").]

The six-position double-hole field shown in Figure 6 can be used to code the third letter. Table 4 shows that E, L, N, and R occur most frequently as the third letter in proper names. The six-position double-hole field can code 30 symbols, four more than the 26 letters of the alphabet, so the extra four places are used in Figure 6 to split E, L, N, and R according to whether the fourth letter of the name, which follows E, L, N, or R, is in the first half of the alphabet or the second half.

Thus, three letters may be coded using only 15 positions, the same number as is required by three five-position single-hole fields (NZ,7,4,2,1 or O,I,E,C,B), and the latter codes cannot be used to select a combination of letters with only a single pass of the selecting needles. The above 15-position field for alphabetical selection may be used for serial sorting, but the second letter will not sort into perfect order and will require some hand sorting. If it is desired to sort into perfect order as well as effect perfect selection, three six-position fields may be used, requiring only three more positions. An additional advantage of the triangle type of code is that the digit or letter may be read directly from the code on the card without the necessity of referring to, or remembering, a combination of letters or figures, or both.
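The serial-sorting procedure given earlier (needle each hole from right to left, place the cards that drop at the back of the pack, and keep them in the same relative order) behaves like a stable least-significant-digit radix sort. The Python sketch below simulates it with the single-hole 7,4,2,1 numerical code, chosen only because its slotting is unambiguous. Representing each card by the number it carries, and treating zero as an unslotted field, are assumptions of this sketch.

```python
# Holes of one field, left to right on the card, and the slots cut per digit.
HOLES = (7, 4, 2, 1)
DIGIT_SLOTS = {
    0: set(), 1: {1}, 2: {2}, 3: {2, 1}, 4: {4},
    5: {4, 1}, 6: {4, 2}, 7: {7}, 8: {7, 1}, 9: {7, 2},
}

def digit(number, field):
    """Digit coded in a field; field 0 is units, field 1 is tens, and so on."""
    return (number // 10 ** field) % 10

def needle_pass(pack, field, hole):
    """Needle one hole: slotted cards drop and go to the back, order kept."""
    stay = [c for c in pack if hole not in DIGIT_SLOTS[digit(c, field)]]
    drop = [c for c in pack if hole in DIGIT_SLOTS[digit(c, field)]]
    return stay + drop

def serial_sort(pack, n_fields):
    """Needle every hole from right to left, units field first."""
    for field in range(n_fields):
        for hole in reversed(HOLES):   # holes 1, 2, 4, 7
            pack = needle_pass(pack, field, hole)
    return pack
```

A pack of two-digit numbers takes eight passes of the needle (four holes in each of two fields), after which the cards read in ascending order from front to back.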
Subject Coding. The same system with slight changes can be used to code three or more letters of subjects. Instead of initial symbols Mc- and Sch-, one might need Ch-, Chem., Di-, Tri-, or Ph-, etc. Vowels would still occur most frequently as the second letter.
OTHER SELECTOR CODES

There are other compromises and devices for effecting selection in numerical and alphabetical sequences which, in our opinion, are inferior to the above system, but which may be applicable to some uses, particularly cards which are already printed.

Since the 7,4,2,1,0,SF selector field can code only 0 to 9 in one field, we suggest an adaptation as shown in Table 5 which can code 15 numbers in one field. To code a number in the left vertical column, the holes to be slotted are indicated by the numbers in the same horizontal line in the table. The SF (Single Figure) position is slotted when a single figure (0, 1, 2, 4, or 8) is to be indicated. The zero position is used also to indicate "double figures."

TABLE 5
Improved Six-Position Selector Code
[Table body not recoverable from the source. Legible heading fragments: No.; Positions; 8; 1; 0-DF; SF.]

To code the alphabet in a single field, this code can be used with a double row of holes by slotting the outer row (A to M) with the shallow punch and the inner row (N to Z) with the middle punch. It is not as easy to select or sort cards slotted with the middle punch as it is with a punch which permits the cards to drop completely free of the needle. When sorting such a file, the needle or needles are inserted in the appropriate inner hole positions, and after the slotted cards have dropped about a quarter of an inch, it is then necessary to insert another sorting needle into one of the corner holes (which are never slotted), then withdraw the sorting needles, and lift the needle in the corner hole, which permits separating the cards which dropped.

Another device for selecting names from an alphabetical list is to assign numbers, usually 1 to 100, to alphabetical intervals representing approximately equal distribution. Table 6 gives lists compiled from Chemical Abstracts and "American Men of Science." A name is coded with the number representing the alphabetical interval in which that name occurs. In order to select cards bearing a specific name, the number is selected which represents the interval in which that name occurs. The cards which drop from the pack are then hand sorted.

The alphabetical distribution in the Author Index of Chemical Abstracts differs from "American Men of Science" in that the former contains more foreign names, and also patents issued in the name of companies ("Imperial," "I. G. Farben," "Inter-" occupy about 36 pages in the last decennial Author Index). Neither list contains "Anon." If a bibliographic file contains many anonymous references, it is possible to figure out an extra combination in most selector files to provide an additional number for "Anon."

TABLE 6
Equal Alphabetical Intervals

Columns: No.; A*; B*

[The table is not reliably recoverable from the source. For Nos. 1 to 51 the entries of lists A and B are run together; for Nos. 52 onward the two lists survive as separate sequences but cannot be aligned with certainty to the interval numbers. The legible interval prefixes are given below in source order, with doubtful readings marked (?).]

Nos. 1 to 51, lists A and B interleaved: A, A, Ah (?), MI (?), Andr, And, Bksk (?), Bak, Barn, Bart, Bat, Bei, Bel, Bert, Bi, Blak, Blo, Bow, Bond, Bri, Bram, Bru, Bro, Bue, Bur, Cam, C, Ce, Cam, Chi, Char, Cod, Clau, Cor, Coll, Cur, Cop, Day, Cre, Dev, D, Doo, Daw, Dun, Deb, Edw, Don, Eng, Dun, Fal, Edw, Fin, Enl (?), Fom (?), Far, Fne (?), Fir, Garn, For, Germ, Fri, Glo, Gar, Gmf (?), Gie, Goo, Gros, Gre, Hae, Har, Gui, Ham, Hay, Hep, Harr, Hill, Hav, Holm, Hend, Hue, Hig, Igf (?), Hof, Ind, Hos, Jao (?), Hum, Jon, Irv, Kap, Jen, Kem, Kis, Kod, Kin, Knu, Kot.

Nos. 52 onward, list A: Ku, Lam, Leb, Lev, Lib, Lue, Mc(Mac)H, Mdl (?), Mas, Mei, Mid, Moe, Mo (?), Nd (?), Nev, Nos, Oll (?), Pal, Pen, Pb (?), Pol, Pro, Ran, Rel, Rip, Ros, Rum, San, Sche, Scho, Scr, Shei, Sil, Smith, Som, Stap, St (?), Sup, Tan, Thi, Top, Tw, Van D, Vie, Wall, Weig, White, Win, Wri, Z.

Nos. 52 onward, list B: Knn (?), Lar, Leh, Lil, Lor, M, Mc(Mac)F, Mc(Mac)N, Mark, Mm (?), Mes, Milli, Mar (?), Mu (?), Nee, No, Ols (?), Pan, Pec, Phi, Por, Pueh (?), R (?), Rin, Rog, Rov, Sad, Sch, Scl, Sh, Si, Smith, Smo, Spr, Stev, Str, Sw, Tea, Ti, Tru, V, Vog, Wan, Web, Wi, Wils, Worn (?), X, Y, Z.

* A = C. A. Author Index, 1927-36; B = "American Men of Science."
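The interval method lends itself to a short Python sketch. The interval starts below are hypothetical and far fewer than the hundred or so intervals of Table 6, the function names are ours, and the special treatment of Mc- and Mac- is not modeled.

```python
from bisect import bisect_right

# Hypothetical interval starts in alphabetical order; NOT the values of Table 6.
INTERVAL_STARTS = ["A", "BE", "CA", "DA", "FA", "GR", "HO",
                   "LA", "MC", "PA", "SA", "ST", "WA"]

def interval_number(surname):
    """1-based number of the alphabetical interval containing the surname."""
    key = surname.upper().replace(" ", "")
    return bisect_right(INTERVAL_STARTS, key)

def select_interval(cards, surname):
    """Cards that drop when the interval number of the surname is selected."""
    wanted = interval_number(surname)
    return [c for c in cards if interval_number(c) == wanted]

def hand_sort(dropped, surname):
    """The final hand sort: keep only the cards bearing the exact name."""
    return [c for c in dropped if c.upper() == surname.upper()]
```

With these illustrative intervals, selecting for "Cox" also drops "Casey," since both fall between "CA" and "DA"; the hand sort then isolates the wanted name. Finer intervals reduce the hand sorting at the cost of more code combinations.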