J. Chem. Inf. Comput. Sci. 1990, 30, 169-173
User Needs in Chemical Information

GUNTER POTZSCHER*
FIZ Chemie GmbH, Steinplatz 2, D-1000 Berlin 12, FRG

A. J. C. WILSON
Crystallographic Data Centre, University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, U.K.

Received December 5, 1989

Information is of great value in our modern industrial society. In chemistry, a discipline in which compounds and compound classes play the most important role, information on about 10 million compounds has been registered. This number increases annually by 0.5 million compounds published in about 0.5 million documents. In this context it is a prerequisite that the information hidden in books, scientific and technical journals, conference proceedings, dissertations, etc. be evaluated and presented by abstracting and indexing services and database producers. Timeliness, accuracy, and completeness of the information are high on the list of desiderata of the users in chemistry. It is of great importance that all new information published in a primary source be considered and made searchable. Furthermore, properties of compounds including stereochemistry, toxicity, environmental behavior, etc. have to be made searchable, too. Factual and/or numerical data and reviews are requested as well. User friendliness of the services is of high priority. In the future, the electronic media will dominate the field of information more and more and therefore many improvements in search methods are necessary.

1. INTRODUCTION

Modern industrial society depends on the availability of the commodity "information". Information is of incalculable value to industry, government, and universities; its value far exceeds its cost, even though only the latter is readily expressed in monetary terms. The difficulty in putting a monetary estimate on its value explains to some extent the reluctance of government and industry to support adequately the necessary but "uneconomic" institutions that participate in the flow of information from producer to user.
In scientific and technological fields the producer of information of one type is often the consumer of the same or of a different type.

1.1. Institutions Involved in Information Flow. The following institutions participate in the flow of information from producer to user:

1.1.1. Newspapers, Popular Journals, Radio, and Television. The popular media disseminate "news" rather than "information". News is often sensationalized, and even if it is reported seriously, it may be distorted by the failure of the reporter to understand it fully. For the chemist it can at most be an indication that fuller details must be sought elsewhere.

1.1.2. Commercial and Learned-Society Publications. Many commercial publishers and learned societies produce books and scientific and technical journals that contain original research, compilations, and reviews. Until recently such publications have been "hard copy" on paper, but some are now available in electronic form.

1.1.3. Patent Offices Publishing Patents and Related Works. Patents present special problems, as their purpose and mode of drafting differ completely from those of a journal paper. The purpose of the latter is to convey as much information as possible, and commercial considerations do not restrict the freedom of the author. An ideal patent, on the other hand, would contain no information or "know-how" that might be useful to a competitor, and the claims would be drafted so that all literature searches would lead to the patent. Patent law ordinarily prevents the patent draftsman from reaching this ideal, though some patents approach it.

1.1.4. Abstracting and Related Services. Abstracting and indexing services, data-bank compilers, handbook producers, and textbook authors analyze and report the original literature, thus making it more readily accessible. Although the activities of the organizations described here and in section 1.1.2 are conceptually distinct, the same parent organization often engages in activities of both types. For example, the American Chemical Society produces both the Journal of the American Chemical Society and Chemical Abstracts, and the International Union of Crystallography publishes two journals, an annual volume of abstracts, and the multivolume handbook International Tables for Crystallography.

1.1.5. Libraries. In principle, libraries should collect and store all types of literature, make them available to the user, and provide on-line access to electronic abstracting services and databases. In practice, however, underfunding restricts the services that libraries can provide.

0095-2338/90/1630-0169$02.50/0 © 1990 American Chemical Society

1.2. Categories of Literature. For clarity, the literature in the information system is often divided into three categories:

1.2.1. Primary Literature, Consisting of Reports of Original Research. Original research ordinarily appears as papers in the learned journals but may also take the form of research and conference reports; patents; serial but nonperiodical publications from universities, learned societies, or commercial publishers; and dissertations and theses (ref 1). Books, especially of the type known as monographs, sometimes embody original research, and even textbooks may include some new work undertaken by the author to fill gaps that became obvious in writing a connected account of some topic. Books, however, ordinarily come into the third category (section 1.2.3). New results in dissertations and theses are ordinarily extracted by the author and published as journal articles, but in a small proportion of cases this does not happen, because the author loses interest or no longer has access to university facilities. User needs are discussed and some recommendations are made in section 4.1.

1.2.2. Secondary Literature. In the secondary literature certain portions of the primary literature are processed and offered in printed or machine-readable form. Traditionally, the abstracting and indexing services have produced bibliographic databases, but in recent years factual databases, and in particular numeric databases, have been growing in number and importance. Examples are given in section 4.2, together with a discussion of user needs and some recommendations.

1.2.3. Tertiary Literature. The secondary literature is based on an almost mechanical process. The units of the primary literature are classified, summarized, indexed, and made accessible, but only rarely is there any attempt to judge the quality of the information processed or to relate the units to one another.
The latter functions, plus digestion and synthesis, are the province of the tertiary literature. It consists of such things as progress reports, review papers, textbooks, encyclopedias, handbooks, and critical tables of measured quantities.

The above lists of types of literature in the three categories cannot claim to be complete, and the borderlines between the categories are not always sharply defined (ref 2). In particular, textbooks and review papers are tertiary, but many contain sections (ordinarily brief) that are primary: original material not otherwise published.

2. GENERAL USER NEEDS

The information needs of users vary considerably and depend significantly on the field of interest or on the specific problem to be solved. Attention will be focused on user needs in the field of chemistry (section 3 onward), but there are some general requirements that are independent of the field of interest, and these will be discussed first.

2.1. Completeness of Coverage. Information should not be lost through failure of the transmission process from the primary literature to the secondary. If a unit of the primary literature is not abstracted and indexed, the information in it is almost irretrievably lost, particularly with increasing reliance on electronic searching. Only old-fashioned browsing in likely (or unlikely) sources will find it. There is ordinarily no problem with periodicals and patents, and little with established nonperiodical serials, but books and the so-called "gray literature" (the other sources listed in section 1.2.1) are rarely fully abstracted (ref 3). Physical properties and measured values should be abstracted precisely. Vague or inaccurate data are useless (ref 4).

2.2. Rapid Processing. Information should have a high currency and must be available rapidly. This is particularly important for patents, since the periods for appealing against patent claims are rigorously controlled.
Provision should be made for the rapid ordering and quick delivery of original documents or photocopies when full-text databases are not available. Ordering on-line from within the abstracting database is a highly desirable facility; otherwise, orders by telex or telefax should be possible.

2.3. Costs and Cost Effectiveness. The retrieved information should be worth its cost. It is not necessarily cheap, or without charge to the user, but "poor" users (universities, third-world countries) often experience difficulties in paying the economic cost, so that provision must be made to provide them with information services at a lower price. This makes sense from a general enlightened economic viewpoint; they are important sources of original research and information on raw materials.

The above needs and requirements are to some extent contradictory and cannot be realized completely. However, this is no reason for failing to work toward fulfilling them as far as practicable.

3. GENERAL REQUIREMENTS FOR CHEMICAL INFORMATION

There are some requirements that are relevant to the whole field of chemistry.

(a) The international rules for the nomenclature of chemical compounds should be processed more rapidly than at present by IUPAC. Their use in the primary literature should be encouraged by the national bodies and enforced rigorously by the editors of journals. As things are at present, names must be invented by the authors or abstracting services, without waiting for IUPAC action, and may be difficult to eradicate if found to conflict with later IUPAC standards.

(b) After a CAS Registry Number has been assigned, it should be quoted in all later publications.

(c) The encoding of chemical structures in graphical form, internationally accepted as the most important representation of compounds, should be standardized and unambiguous. Stereochemistry must be included.

(d) The encoding of reactions and the provisions for searching for them should be improved, so that consecutive reactions can be retrieved even if distributed through various literature sources.

(e) More attention should be paid to safety and environmental matters. For example, an explosion occurring during the processing of a mixture should always be reported both in the primary literature and in the secondary literature.

(f) Legal claims made in patents should be abstracted as comprehensively as possible. Computerized search methods are of particular value in this field.

The above desiderata relate particularly to the abstracting and indexing services. In addition, education in chemical information in universities, colleges, and institutes of further education, as well as on-the-job training in industry, should be improved (see section 7). Some of the above requirements are discussed from other points of view in other paragraphs.

4. CONTENT AND SCOPE OF CHEMICAL INFORMATION
4.1. Primary Literature. Surveys of user requirements for chemical information (refs 5-7) have established that the primary literature, especially journal papers, is still the most important source of information and that users preferentially scan titles, followed by looking at the abstracts or summaries. The following requirements are derived from these surveys:

4.1.1. The literature explosion should be stemmed, and research results should be published only once and not, as is often the case, in slightly modified forms in several journals. Nevertheless, preliminary communications describing important results have their place (refs 8, 9).

4.1.2. The tendency of journals to concentrate on specialized fields is to be encouraged; it drastically reduces the number of journals that have to be scanned by any particular user. Within any issue of a journal it is advantageous to group together papers in a particular field, especially in the list of contents. Many studies have shown that the major specialized journals turn out to be "core" journals.

4.1.3. The title of each paper should be informative, almost a summary of the abstract. Each paper should have an abstract in which the important findings are described precisely. The content of the paper should be arranged clearly, to allow "diagonal reading".

4.2. Secondary Literature. 4.2.1. Bibliographic Services. There are many bibliographic services in chemistry and related fields. Those mentioned in the following list of types of service are intended as illustrations only and do not form a complete list. The main types of service are as follows: (1) title lists (Chemical Titles, Current Contents); (2) patent lists; (3) abstracting and indexing services [Chemical Abstracts Service (CAS), JICST, PASCAL, Derwent's Chemical Patents Index] serving all fields and subfields of chemistry, selective services for particular fields [ChemInform, Index Chemicus], and progress bulletins; (4) services adapted to inverse and other special searches (Science Citation Index); and (5) business services (Chemical Industry Notes).

Title lists have a high currency and are important in bridging the gap between the publication of a paper in a primary source and the appearance of an abstract of it in the secondary literature.

Detailed abstracting services, such as CAS, claim to cover the whole field of chemistry comprehensively but cannot, and do not intend to, replace the primary literature.
However, the abstracts are arranged in sections in accordance with conceptual criteria (a classification system), which permits rapid scanning and evaluation. The loss of detailed information in the abstracts is partially compensated for by a comprehensive indexing system that allows the information to be accessed from many directions. Nevertheless, there are several improvements that should be made:

(1) Every compound about which new information is given in the primary source should be indexed and made searchable.

(2) Stereochemistry (configuration, conformation) should be indexed and made searchable to the same extent as in the primary source.

(3) Safety, environmental behavior, toxicity, etc. should always be searchable, even if they are not the main objective of the primary source. The economic damage that may result from lack of information on such matters can be much greater than the cost of providing the information.

(4) The "legal" compounds contained in patents should be indexed in addition to the "real" ones.

(5) The CAS Registry Number assigned to each clearly defined substance should be quoted by all services processing chemical information.

(6) Similar systems of numbering should be established for other materials, such as plant extracts or minerals.

(7) The abstracting services should make agreements among themselves to ensure that "borderline areas", for example, the medicine/chemistry interface, are covered in at least one of their products.

In addition to the comprehensive abstracting services, there are selective services that cover the primary literature in accordance with clearly defined selection criteria. The apparent loss of information resulting from selection may be more than compensated for by the fact that the user receives only a manageable quantity of important and high-quality information. Adherence to the criteria is monitored, for example, by an advisory board at FIZ Chemie.

Though there has been some improvement in recent years, current business services concentrate too much on the American market and neglect both the European and the Asian ones.

The special problems of patent literature have already been discussed (sections 1.1.3 and 3). Patent claims concerning chemical Markush structures are widely drafted and may generate a "hit" in nearly all substructure searches. Reform of patent law to restrict such claims to a "reasonable" extent is highly desirable, but may be difficult to achieve.

4.2.2. Numeric and Factual Databases. In many applications the user needs information of a factual or numerical character, not references to the original literature. These needs are met by collections of factual and/or numerical data; the collections may be printed or machine-readable. Machine-readable collections are usually called databases. Examples of factual collections are Beilstein's Handbook of Organic Chemistry, Gmelin's Handbook of Inorganic Chemistry, and Ullmann's Encyklopaedie der technischen Chemie. Examples of numeric data collections are Wiley's Registry of Mass Spectral Data and the Cambridge Crystallographic Data File. There are also lists of producers and products of various kinds.

Such collections may replace reference to the original literature if they contain only high-quality and critically evaluated data. The saving in cost of tracing and consulting the original publication should be taken into account in analyses of information cost effectiveness. Such collections will be of increasing importance in the future. Generation of new data collections should be encouraged, in fields such as toxicity, optical rotation, and reactions. All new collections should be machine-readable, and existing collections should be converted to machine-readable form as rapidly as possible.
At present the effectiveness with which different fields are covered varies widely; crystallographic data are particularly well organized, with major databases for organic substances, inorganic substances, and metals, as well as several specialized databases (see, for example, ref 10).

4.3. Tertiary Publications. The factual and numeric databases described in section 4.2.2 occupy a position that is in some ways intermediate between secondary and tertiary publications. They are secondary in that they are based directly on the primary literature, but tertiary in that the data have been digested and critically evaluated. There is, however, a need for review papers (Progress in ..., Advances in ...) on general or specific topics. These are especially useful to those engaging in new research areas, but authors able and willing to write good reviews are in short supply. Chemistry is well served by textbooks at all levels from the most elementary to the most advanced; the ambition of authors and the self-interest of publishers ensure this.

5. NEW METHODS OF INFORMATION DISSEMINATION
Revolutionary progress has been made in the past 25 years, since the introduction of computers for processing information. What used to take many hours in a library, such as a retrospective search, is now completed in a few minutes, and the results are usually better and more comprehensive than were achieved by manual search. Before the introduction of computers, selective dissemination of information was rarely (if ever) attempted. Even full-text databases, such as CJACS (Chemical Journals of the American Chemical Society), can now be searched by computer, though indexing and search techniques are still under development; for example, they do not yet provide for graphics, such as structural formulas. It is of the greatest importance for chemists that chemical structures be encoded in machine-searchable form and that they can be retrieved in graphic form. Even now it is possible to search for structural components (substructures), thus retrieving classes of compounds.

5.1. Searching Bibliographic Databases. In addition to the classical Boolean operators (OR, AND, NOT) some special operators have been developed that improve the precision of search results without necessarily involving information loss.

The question of whether an information specialist or the user should undertake the search (see refs 7, 11, and 12) cannot be answered in the abstract. Many factors influence the decision, for example, (1) the type of query, (2) the complexity of the database, (3) the qualifications and expertise of the searcher, and (4) the attitude of the administration to this problem.

In user meetings the following improvements (not necessarily in order of importance) have been proposed by chemists:

(1) A single command language should be introduced for all publicly accessible hosts and for nonpublic in-house systems.

(2) Both a standardized (controlled) and a free vocabulary (keywords) should be usable simultaneously.

(3) Thesauri should be available and filed with related databases.

(4) Automated transliteration of different spellings (colour versus color, sulfur versus sulphur) should be implemented.

(5) The CAS Registry Number or other relevant numbering systems should be given in all material databases.

(6) Cross-file searching should be installed.

(7) Automated switching to full-text databases or on-line ordering of copies of original documents should be made possible.

(8) Substructure searches should be extended to Markush structures in patents.

(9) Menu-driven searches should be an available option for beginners or occasional users.
(10) Learning aids, such as manuals, diskettes, and software, should be improved, and more seminars and workshops should be provided.

(11) It should be legally possible to down-load search results to the user's personal computer for off-line evaluation and manipulation.

5.2. Searching Factual Databases. Numerous factual databases already exist, for example, for crystallography,'* thermodynamics, and spectra. They contain, in addition to the concepts themselves (such as densities), the numerical measured values, which may in many cases depend on external conditions and are accompanied by error limits (ideally estimated standard deviations). Other data may contain ranges of values, such as 10.0-10.2% in the content of a material in a mixture. In such cases algorithms are required to search numerical values, their tolerances, or ranges, in addition to the normal search methods used for bibliographic databases.

Factual databases require other new search features, such as (1) interconversion of units from those in the database to those of the user (inches to centimetres and vice versa; Celsius to Kelvin; psi to pascal) and (2) searching of "empty fields": the user must be informed if failure to retrieve information is due to the fact that the data required have not been measured or reported as yet.

Methods should be developed for presenting search results on monitors and print-outs (1) as tables in a format predefined by the program or set up by the user or (2) in graphical form, as curves or diagrams. (3) In appropriate cases the databases should contain programs for the manipulation of search results, such as the interpolation and extrapolation of thermodynamic data or similarity searches in spectra.
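The numeric search features described above can be illustrated with a minimal sketch. It is not taken from any existing database system: the record layout, the names `Measurement`, `with_error`, and `search_property`, and the sample values are hypothetical and serve only to show how a range query might take error limits, stored ranges, and empty fields into account. The conversion helpers cover only the unit pairs named in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Unit conversions for the pairs named in the text.
INCH_TO_CM = 2.54          # exact by definition
PSI_TO_PASCAL = 6894.757   # rounded conversion factor


def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert a Celsius temperature to kelvin."""
    return t_celsius + 273.15


@dataclass
class Measurement:
    """Hypothetical record: a property stored as an interval [low, high].

    A value with error limits v +/- e is stored as [v - e, v + e];
    a reported range such as 10.0-10.2% is stored directly.
    low = high = None marks an "empty field" (not measured or not reported).
    """
    substance: str
    prop: str
    low: Optional[float] = None
    high: Optional[float] = None


def with_error(substance: str, prop: str, value: float, esd: float) -> Measurement:
    """Build a record from a measured value and its estimated standard deviation."""
    return Measurement(substance, prop, value - esd, value + esd)


def search_property(
    records: List[Measurement], prop: str, q_low: float, q_high: float
) -> Tuple[List[str], List[str]]:
    """Return (hits, empty): substances whose stored interval overlaps the
    query range, and substances for which the property has no stored value."""
    hits: List[str] = []
    empty: List[str] = []
    for rec in records:
        if rec.prop != prop:
            continue
        if rec.low is None or rec.high is None:
            empty.append(rec.substance)      # tell the user the data are missing
        elif rec.low <= q_high and q_low <= rec.high:
            hits.append(rec.substance)       # intervals overlap
    return hits, empty


# Hypothetical sample data (densities in g/cm^3).
records = [
    with_error("substance A", "density", 1.00, 0.02),
    Measurement("substance B", "density", 1.10, 1.20),
    Measurement("substance C", "density"),   # empty field
]
hits, empty = search_property(records, "density", 1.01, 1.05)
print("hits:", hits)
print("no data:", empty)
```

Reporting the empty fields separately from the hits lets the searcher distinguish "no substance matches" from "the property has not been measured or reported", which is the distinction the text asks for.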
As for bibliographic data, it should be legally possible to down-load data for off-line manipulation, thus relieving the mainframe computer and saving telecommunication costs. When the database is not too large, it (or appropriate subsets) can be made available to users on diskettes or CD-ROMs.

6. NEW DEVELOPMENTS
6.1. Software Developments.'~ The ability of computers to manage and manipulate large quantities of data rapidly has stimulated the development of new research areas and new methodology. Examples of such developments are (1) synthesis planning, which makes it possible to search for chemical products from given starting materials or vice versa; (2) quantitative structure/activity relationships, for example, in the prediction of pharmacological activities, design of drugs, or of plausible metabolites; and (3) quantitative structure/property relationships, useful, for example, in the interpretation of spectra.

With the aid of such developments reliable structure-derived predictions can be made, and the risks of chemical, physical, or other experiments can be avoided, thereby saving time, trouble, and expense. The tools employed for such purposes are, for example, (1) chemometrics (mathematical methods for the interpretation of such relationships), (2) molecular modeling (for evaluating the electronic and spatial relationships in such interactions), and (3) expert systems (man/machine systems for the treatment of special problems).

The efficiency of these methods is based on knowledge obtained by a comparison of observed and calculated values and by successive approximations after the acquisition of new measured data. Predictions will therefore become more precise with increasing size of the data files of chemical structures and of their properties or activities and with increasing quality of the measured data. It is therefore desirable (1) to transcribe the structural data of CAS, Beilstein, Gmelin, and even Chemisches Zentralblatt from their very beginning into machine-readable form; (2) to update existing databases (for example, of spectra) with data of high quality on a continuous basis; and (3) to create new factual databases, for example, of optical rotation (dispersion and circular dichroism).

6.2. Networks. A large step into the future was taken when the international network STN (Scientific and Technical Network) was founded. At present it comprises three computer centers (hosts) where the main chemical databases are loaded. This development must continue, so that all large and small existing hosts are linked into a single network. Prerequisites for this goal are (1) a single command language that can also be extended for application to the factual databases and (2) a directory (a database of databases) to make possible a rapid survey of the files and the concepts (inclusive postings) used in them. A highly desirable development would be (3) an increase in the routine rate of data transfer via the telecommunication lines to at least 9600 baud (from the 2400 or even 1200 baud usual at present).

6.3. Data Compilation for the User. The use of diskettes or CD-ROMs on users' personal computers has already been mentioned. The development of more efficient, faster, and less expensive computers will make the management of large data collections or subsets of such compilations possible "at home" (or on in-house databases). It is thus of secondary importance whether these data are transferred by down-loading or offered on diskettes or CD-ROM. Of prime importance, however, is the provision of legal safeguards for preventing misuse. The software for searching and processing such data needs to be made available. Graphic representation of chemical structures during input and retrieval and topological search for (sub)structures is especially important for chemists.
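The effect of the line speeds mentioned in section 6.2 on down-loading can be estimated with a short calculation. The sketch below rests on assumptions that are not in the text: asynchronous transmission with 10 bits per character (8 data bits plus start and stop bits), the quoted baud figures treated as bits per second, and no allowance for protocol overhead or host response time. The 100 000-character result set is a hypothetical example.

```python
def transfer_seconds(n_chars: int, bits_per_second: int, bits_per_char: int = 10) -> float:
    """Idealized transfer time for n_chars characters over a serial line.

    Assumes 10 bits per character (8 data + start + stop) unless overridden,
    and ignores protocol overhead and host response time.
    """
    return n_chars * bits_per_char / bits_per_second


RESULT_SET_CHARS = 100_000  # hypothetical size of a down-loaded search result

for rate in (1200, 2400, 9600):
    seconds = transfer_seconds(RESULT_SET_CHARS, rate)
    print(f"{rate:>5} bit/s: {seconds / 60:.1f} min")
```

Under these assumptions the same result set takes roughly 14 minutes at 1200, 7 minutes at 2400, and under 2 minutes at 9600, which indicates why the higher rate matters for telecommunication costs.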
7. EDUCATION IN INFORMATION, USER TRAINING, AND USER FEEDBACK

Even today the value of information is not assessed as highly as it should be, although ease of access to information adds greatly to productivity. When in commercial enterprises and educational institutions financial cuts are necessary, it is usually documentation departments or libraries that are the first to suffer. Their "products" cannot be marketed and do not yield an obvious profit.

A long-term change in this attitude can only be achieved if, in schools and later in colleges and universities, information is taught as a serious subject, and steps are taken to ensure in-service training in firms and institutions. Only in this way can new developments be made known to those who should make use of them. Information disseminators and vendors must participate in ensuring that the commodity "information" is employed optimally.

The general public can be reached by (1) lectures and discussions at a level intelligible to all and (2) provision of educational materials, such as articles in newspapers and magazines, and of textbooks at an appropriate level.

For the user in a specialized field the following should be available: (1) user handbooks and/or learning diskettes; (2) menu-driven searches for beginners or occasional users; (3) "help" functions incorporated in the software for search and processing; (4) regular seminars or workshops for training users; (5) meetings of users for exchange of ideas and experience; (6) questionnaires to users for changes or improvements; and (7) establishment of advisory boards or panels as forums for consulting users.

8. LIBRARIES
The most important task of libraries is to supply information as quickly and cheaply as possible, and this task will not change in the foreseeable future. The "paperless" community that has been visualized will not come into being; since the introduction of the new technologies more paper has probably been consumed (not to say wasted) than before. Also, for economic reasons it will be impossible to convert all existing material into machine-readable form. Such material is especially important for long-lived sciences like chemistry, where solidly based research results retain their validity for many years.

In spite of these considerations, however, limited monetary resources are a driving force toward rationalization of library and archive structures. There has been a change of attitude on the part of the authorities during the past generation: older books and long runs of journals used to be regarded as a valuable resource; now they are regarded as a waste of expensive storage space. In the United Kingdom, for example, university libraries have been instructed to adopt the principle of self-renewal, a euphemism for throwing away an old book whenever a new one is acquired. It does, however, reflect the fact that in the long run it will be physically impossible for any library to store all the literature required by its users, and it will have to concentrate on their main specialisms. The inconvenience of incompleteness can be mitigated by the use of on-line search and ordering services.

9. COSTS AND PRICES
There are two important facts about information. (1) Information and its availability to the user are of the utmost economic importance; without access to information “reinvention of the wheel” will take place repeatedly. (2) Information cannot be provided for nothing; those who work in the information area need an adequate income.
If the literature explosion continues, and there is every indication that the literature output will double every 5-10 years, and if the financial resources of the user do not grow at the same rate, only a drastic rationalization at all levels can redress the situation. Much can be done by cooperation among services to reduce duplication. Requests often expressed by colleges and universities for low-cost or free information cannot, therefore, be fulfilled. On the other hand, there is no doubt that industry and other nonacademic organizations that depend heavily on information benefit not only from the research carried out in educational establishments but also from the knowledge and experience in information science that engineers, chemists, and other scientists have acquired in these institutions. Other information services should be encouraged to follow the CAS practice of a discount to academic institutions under appropriate conditions, and all services should extend similar help to academic and nonacademic institutions in the developing countries. In the latter case the rebate may be made to depend on the local economic situation and the kind of information required, but it should not be less than 50%. The problem is evident: the state, which subsidizes the printed output of books, journals, etc. in public libraries, cannot be expected to finance the same literature a second time in other forms, such as magnetic tapes or on-line services. CAS and FIZ Karlsruhe within STN operate a practice whereby, under certain circumstances, a discount is given to college and university members. With CAS, nonacademic users indirectly subsidize such discounts via their on-line fees, whereas in the FIZ Karlsruhe procedure the subsidies are covered by the German government. Such practices cannot find universal application because many services are required to run at a profit or at least to recover their own full expenses.
A possible method of supporting college and university use of information services and document delivery would be a public fund to which both industry and government would contribute. Such a practice would also stimulate rational and cost-effective use of search methods, which is possible only when libraries are able to buy and use the electronic equipment for on-line search and on-line ordering.
ACKNOWLEDGMENT
This paper was presented to the Working Group in Chemistry at a meeting of the International Council for Scientific and Technical Information in Orléans on May 21, 1989, and has profited from the resulting discussion.
REFERENCES AND NOTES
Dissertation and thesis are now practically synonymous, one word being favored by some universities, the other by others. In strict usage a dissertation is a discussion of a topic, a thesis is an assertion to be justified.
Gannett, E. K. NFAIS Newsl. 1989, 31(2), 26-36.
A cynic has defined “gray literature” as what the abstracting services ignore. It may perhaps be better defined as literature produced by bodies whose main business is not publishing and which have no established distribution systems.
Jochum, C.; Hicks, M. G.; Sunkel, J. (Eds.) Physical Property Prediction in Organic Chemistry; Springer-Verlag: Berlin, 1988.
Kresze, G.; Potzscher, G. Die Zukunft der Chemiedokumentation (An Evaluation of a Questionnaire of the Gesellschaft Deutscher Chemiker); 1971; p 39.
Weiske, Ch. Nachr. Chem. Tech. 1970, 18(12), 250-252.
Jochum, C.; Moricz, P. Database 1987, 10(4), 41-46.
Luckenbach, R. Chemical Information: Quo Vadis? Bibliographical and Factual Databases. Oesterr. Chem. Z. 1986, 87, 284-289.
Luckenbach, R. The Free Flow of Information: A Utopia? Ways To Improve Scientific and Technological Information and Its International Exchange. J. Chem. Inf. Comput. Sci. 1988, 28, 94-99.
Allen, F. H.; Bergerhoff, G.; Sievers, R. (Eds.) Crystallographic Databases; International Union of Crystallography: Chester, U.K., 1987.
Buntrock, R. E.; Valicenti, A. K. J. Chem. Inf. Comput. Sci. 1985, 25, 203-207.
Huber, M. K. Chem. Future, Proc. IUPAC Congr., 29th, 1983 1984, 439-449.
GDCh-Fachgruppe Chemie Information Software-Entwicklung in der Chemie 1; Springer-Verlag: Berlin, 1986.