Chapter 1
Introduction: Expert System Applications in Chemistry

Bruce A. Hohne¹ and Thomas H. Pierce²

¹Rohm and Haas Company, Spring House, PA 19477
²Rohm and Haas Company, Bristol, PA 19007

Downloaded by UNIV OF NEWCASTLE on March 11, 2017 | http://pubs.acs.org Publication Date: September 1, 1989 | doi: 10.1021/bk-1989-0408.ch001
This symposium series volume is the second of its kind (1). Both symposia were organized with several purposes in mind. The first, and most general, is simply to expose the chemical community to expert systems. Expert systems should be of interest to people in a wide variety of fields of chemistry. The second is to educate chemists in the capabilities of expert systems: both what they can and cannot do. Finally, by presenting a variety of applications, it was hoped that attendees would generate further new ideas for expert system applications. The second symposium presented the additional opportunity to review the progress of some of the work described in the first symposium.

The first symposium, presented at the fall meeting of the American Chemical Society in 1985, was open to any type of artificial intelligence application. The second symposium was restricted to expert systems. This is only a minor restriction because most practical artificial intelligence applications to date have used expert systems in one form or another.

Below is a brief overview of expert systems. More information on expert systems for chemistry can be found in reference 2. General expert system information can be found in references 3 and 4. Because artificial intelligence has developed its own vocabulary, a short glossary is also included.

Expert Systems

Expert systems are programs that attempt to solve problems in a way similar to how a human expert would solve them. They incorporate "rules of thumb" that experts in the field have developed through years of experience. The problems attacked are not necessarily procedural; they are often vague and complex, and can contain incomplete or inexact information.

Expert systems contain three basic pieces: a knowledge base, an inference engine, and a user interface. The knowledge base contains the information which the program uses to reach decisions.
0097-6156/89/0408-0002$06.00/0 © 1989 American Chemical Society
Hohne and Pierce; Expert System Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1989.

A key difference between expert systems and classical computer programs is the fact that the knowledge is separated from the program. The inference engine is the program that manipulates the knowledge base to reach these decisions. The user interface allows the program and chemist to communicate with each other in an effective manner. These three pieces of an expert system are described below.

Knowledge Base. The knowledge base of an expert system is a resource of information about a specific domain, or problem area. The knowledge base is the most important part of the expert system, and the most difficult to construct. The completeness and accuracy of the knowledge base will determine how well the expert system performs when solving problems. The scope of problems which can be solved is completely determined by the scope of the knowledge base.

Knowledge must be encoded in such a way that it can be (a) entered into the computer; (b) manipulated by the inference engine; and (c) understood by the expert. There are several common ways to encode the expert's knowledge, each with its own advantages and disadvantages. The encoding scheme must match the underlying structure of the knowledge. The two most common methods of encoding knowledge are production rules and frames. Sometimes both methods are used.

Production rules take the form:

    IF x is true THEN y is true.

For example:

    IF the pH is less than 7 THEN the solution is acidic.

The left-hand side (x) may contain any number of clauses combined by Boolean operators. In all but the simplest expert systems it is possible to include the expert's confidence in the conclusion.

A frame describes hierarchical dependencies between objects. In a frame, the upper level (or parent) object passes attributes to the objects beneath it in the hierarchy (children). In other words, children inherit attributes from their parents.
For example:

    (FRAME - Alcohol
      (SLOT - Reactions
        (VALUE - "Oxidized by permanganate")
        (VALUE - "..."))
      (SLOT - Infrared Absorbances
        (VALUE - 3600 cm⁻¹ - 3100 cm⁻¹)))
    (FRAME - Ethanol
      (SLOT - Chemical Class
        (VALUE - Alcohol)))

Inference Engine. The inference engine is the central program which manipulates the rules and facts in the knowledge base to reach conclusions. The structure of the inference engine depends strongly upon the type of knowledge base which the expert system incorporates. Inherent in all of the program structures, however, is a basic set of functions which expert systems perform. These functions are described below using an example knowledge base which incorporates production rules.

The inference engine may approach the problem from either the top or the bottom, beginning with either facts or conclusions. If a user begins with several hypotheses and wants to determine which, if any, are correct, then the program should examine all the facts using a goal-directed (also called goal-driven) approach. However, if the user begins with a series of facts which are known to be true and wants to determine what conclusions can be reached, the program should use a data-directed (also called data-driven) approach.

A goal-directed expert system begins with a limited set of possible hypotheses and attempts to prove the validity of each one. This type of expert system uses a reverse-chaining (also called backward-chaining) algorithm. The knowledge base is searched to find a rule which concludes the initial hypothesis. The IF-clauses from this rule then become the hypotheses for the next level of the search. The process continues until all of the remaining IF-clauses are known to be true (hypothesis is true) or until no more rules apply (hypothesis is false). This approach starts at the bottom (conclusions) and works its way to the top (facts). An example of this mechanism is illustrated by an HPLC trouble-shooting problem:

A scientist is experiencing problems with an HPLC. One possible problem is a broken pump. The expert system searches its rules for one that concludes the pump is broken. It may find a rule like:

    IF   "there is no solvent flow"
         "all tubing is properly connected"
         "there are no obstructions"
         "the solvent program module is functioning"
    THEN "the pump is broken".

The next step would be to search for rules which conclude "there is no solvent flow".
The process would continue until the conclusion "the pump is broken" is either proved or disproved. In this example, the chemist would be asked to input information whenever an IF-clause was an observable symptom of an HPLC problem.

Data-directed expert systems begin with a list of the facts known to be true, and see what conclusions can be drawn from those facts. This type of expert system uses a forward-chaining mechanism. Each rule in the knowledge base is tested to see if all of its IF-clauses are contained in the list of known facts. When such a rule is found, the system adds the THEN-clauses from the rule to the list of known facts. All the rules in the knowledge base are scanned repeatedly until no new facts can be concluded. An example of forward-chaining is illustrated by a structure elucidation problem based on an IR spectrum:
A scientist wants to identify a compound for which the peak locations in an IR spectrum are known. The list of known facts is the location and intensity of the peaks in the spectrum. The program searches the knowledge base for a rule which has these peak locations in its IF-clause. For example, assume there are peaks at 3300, 1410, 1300, and 1050 cm⁻¹. That would cause the following rule to fire:

    IF   "there is a large peak between 3000-3500 cm⁻¹"
    THEN "there is evidence for an alcohol".

This would place the fact "there is evidence for an alcohol" on the list of known facts. On the second pass through the knowledge base the following rule could fire:

    IF   "there is evidence for an alcohol"
         "there is a peak between 1460-1400 cm⁻¹"
         "there is a peak between 1350-1260 cm⁻¹"
         "there is a peak between 1080-1010 cm⁻¹"
    THEN "there is evidence for primary alcohol".

The system continues to search its knowledge base for rules which will identify probable structures of the compound.
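The two chaining strategies described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the rule wording follows the HPLC and IR examples, while the tuple representation of a rule, the names `forward_chain`, `backward_chain`, and `peak_facts`, and the simplification that peak intensity is ignored are all assumptions made for the sketch.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical representation: a production rule is (IF-clauses, THEN-clause).
Rule = Tuple[FrozenSet[str], str]

# Peak windows (cm^-1) taken from the IR example; intensity is ignored here.
PEAK_RANGES: Dict[str, Tuple[int, int]] = {
    "large peak 3000-3500": (3000, 3500),
    "peak 1400-1460": (1400, 1460),
    "peak 1260-1350": (1260, 1350),
    "peak 1010-1080": (1010, 1080),
}

IR_RULES: List[Rule] = [
    (frozenset({"large peak 3000-3500"}), "evidence for an alcohol"),
    (frozenset({"evidence for an alcohol", "peak 1400-1460",
                "peak 1260-1350", "peak 1010-1080"}),
     "evidence for a primary alcohol"),
]

HPLC_RULES: List[Rule] = [
    (frozenset({"no solvent flow", "tubing connected", "no obstructions",
                "solvent program module functioning"}), "pump is broken"),
]


def peak_facts(peaks: Iterable[float]) -> Set[str]:
    """Turn observed peak positions into the facts used by the IR rules."""
    observed = list(peaks)
    return {name for name, (low, high) in PEAK_RANGES.items()
            if any(low <= p <= high for p in observed)}


def forward_chain(rules: List[Rule], facts: Iterable[str]) -> Set[str]:
    """Data-driven: rescan the rules until no new fact can be concluded."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its IF-clauses are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(rules: List[Rule], goal: str, observed: Set[str],
                   seen: FrozenSet[str] = frozenset()) -> bool:
    """Goal-driven: prove `goal` directly or through a rule that concludes it."""
    if goal in observed:
        return True
    if goal in seen:  # guard against circular rule sets
        return False
    for conditions, conclusion in rules:
        # The IF-clauses of a matching rule become the next hypotheses.
        if conclusion == goal and all(
                backward_chain(rules, c, observed, seen | {goal})
                for c in conditions):
            return True
    return False


ir_conclusions = forward_chain(IR_RULES, peak_facts([3300, 1410, 1300, 1050]))
pump_broken = backward_chain(
    HPLC_RULES, "pump is broken",
    {"no solvent flow", "tubing connected", "no obstructions",
     "solvent program module functioning"})
```

In this sketch the `observed` set is supplied up front; a working goal-directed system would more plausibly prompt the chemist for each observable symptom only when the search reaches it.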
The two examples above show that the structure of a problem determines what type of search to use. In the first example, there is only a limited number of components which could be broken, and a large number of possible symptoms. A data-driven approach to this problem would be to ask the scientist for all the symptoms, some of which may require disassembling the HPLC to determine. The problem, therefore, demands a goal-directed approach. In the second example, the scientist began with a set of facts and was looking for all possible conclusions. The goal-directed approach to this problem would be to attempt to prove whether each one of the millions of compounds reported in the chemical literature could be responsible for the observed spectrum. Again, the structure of the problem determines the structure of the expert system.

The performance of an expert system may be increased by using heuristic rules to eliminate solutions which may be possible, but are unlikely given the constraints of the problem. This is how human experts solve complicated problems. This added knowledge is called meta-knowledge. In knowledge bases which use production rules, meta-knowledge takes the form of meta-rules. These rules instruct the expert system how to choose which rule to use when more than one is relevant. Using organic synthesis as an example, a meta-rule might take the form:

    IF:   "multiple reactions have the same product"
    THEN: "use the one with the highest product yield".

User Interface. The user interface can be separated into two parts: the part used by the expert to build the knowledge base, and the part used by the novice to solve domain problems. The user interface must provide the expert with the following functions: a method to
enter knowledge, a method to modify rules, and a method to test and manage the knowledge base. The contents of the knowledge base must be easily expandable. When an expert adds knowledge, or rules, the program must check for contradictions with previous rules. It should also test these new rules against previously defined problems. When the system produces an incorrect answer, the expert must have ways to examine the knowledge base and make corrections to it.

The user interface must make it possible for the novice to solve problems quickly and effectively. The functions required for this include: prompting for information, explaining why the information is needed, and telling how a conclusion is reached. Explanations usually display the progression of rules used by the system to solve the problem. Some of the more sophisticated commercial expert systems take advantage of computer graphics to perform these functions.

Developing Your Own Expert System

The references contain a great deal of information about developing expert systems. Rather than repeat their advice, this section presents the issues from a chemist's point of view. There are five major issues which must be considered when developing an expert system:

1. Should you develop in-house expertise in expert systems, rely on outside consultants, or some combination of the two?
2. Should your organization be involved in expert system development? Be aware of the limitations of the technology.
3. Application selection, including the domain expert.
4. Select what computer platform(s) the system must be able to run on.
5. Select what software package to purchase.
The first two issues are political, and must be decided within your organization. The remaining issues have a political component, but are mostly technical issues.

Selecting an application to automate using an expert system is a detailed process, covered in the references. The entire process reduces to determining whether the project is cost justifiable, and determining that expert systems are the best technology to meet your needs. Areas which should be rich with possible applications include: instrumentation (optimization, trouble-shooting, operation); chemical structure elucidation; computerized manuals (government regulations, corporate guidelines); product specific (applications specific to one vendor's product or product line); safety (risk assessment, safety reviews, emergency procedures); process control (optimization, trouble-shooting, control).

The choices of hardware and software to run an application are interrelated, because some expert system software runs only on certain hardware. Hardware options vary from personal computers to mainframes. Software pricing runs from about $100 to 500 times
that. With the power of today's expert system shells, there is almost no reason to develop your own shell or inference engine. Most applications can be prototyped on a PC, and then migrated to more powerful hardware, if and when necessary. The need to migrate applications favors shells which run on multiple computer platforms.
The Future Impact of Expert Systems in Chemistry

Expert systems offer the possibility of revolutionizing the practice of chemistry. There are, however, two major problems which must be overcome. The first, which is easy to overlook until you develop your first expert system, is the time required to extract the knowledge from the expert and encode it into an expert system. The second major problem is the amount of computation required to solve a given problem. This problem is compounded because the most useful applications tend to be the most complex.

The time required to build the knowledge base may be the most severe problem, when you consider both the amount of chemical information available today and the rate at which more is being discovered. Fortunately, small systems which do not contain a broad base of chemical information can still be very useful. Even so, unless there are major improvements in knowledge engineering, the rate at which knowledge bases can be compiled will be a limiting factor.

Several approaches are being investigated to solve the problem of execution speed. Work is proceeding on two hardware solutions: faster processors that are designed specifically for artificial intelligence, and parallel architectures which work on multiple parts of the problem simultaneously. Software solutions attempt to reduce the number of computations using more efficient algorithms and better heuristics. The possible benefits of expert systems in all fields are so great that these problems will be solved, making expert systems more cost effective.

The cost of "doing chemistry" has skyrocketed. This increase has been caused by increased costs of equipment and raw materials, and the increased cost of handling chemicals safely. Expert systems applied to the areas listed above, and in this volume, can help increase productivity and safety in the field of chemistry.
The future of expert systems in chemistry will bring many exciting new applications. The time will come when the use of expert systems will be an integral part of the practice of chemistry. Expert systems will not replace chemists; rather, they will be very useful assistants which handle the details and allow the chemist to concentrate on the more challenging problems.

Glossary

Artificial Intelligence (AI): The field of computer science which attempts to develop computer systems which solve problems traditionally thought to require "human" intelligence.

Backward-chaining: A technique in logical inference using IF-THEN
rules. The strategy is to prove a conclusion (THEN-clause) by proving all the IF-clauses in the rule. These IF-clauses become the conclusions in the next level of the search. The search moves backwards in this fashion, using the knowledge base to prove or disprove the initial hypothesis. For example, the reactants in a chemical reaction may be the products of another reaction, thereby allowing the scientist to trace the reaction back to simple starting materials.

Certainty factor: The confidence the scientist has in a given piece of information. Certainty factors are usually values in a range of numbers (e.g., 0 to 1 or -1 to 1) and may or may not have a statistical basis. They may apply to the confidence the expert has in the conclusion in a rule, or to the input supplied by the end-user during a problem solving session.

Confidence factor: see Certainty factor.

Data-driven: see Forward-chaining.

Domain: The area of study, or expertise, covered by an expert system. In this case, a field of chemistry.

Domain expert: The person whose expertise is encoded into the expert system. This person's expertise, or knowledge, is the basis for the knowledge base.

Explanation facility: The part of an expert system's user interface which describes how the program reached a particular conclusion. This ability is a key advantage of expert systems, and is an important part of end-user acceptance of the system.

Forward-chaining: A technique in logical inference using IF-THEN rules. All possible conclusions are drawn from a list of known facts. Because each new conclusion (THEN-clause) can be used as a fact (IF-clause) to derive other conclusions, the program iterates until no new facts can be concluded.

Frames: A hierarchical method of representing knowledge in an expert system. Using the hierarchy, information is either explicitly stated or inherited from a higher level. Rules are also used to connect facts throughout the frames.
Expert systems using frames will search hierarchies to associate facts with conclusions.

Goal-driven: see Backward-chaining.

Heuristic: An expert's "rule of thumb" used in solving a problem. Heuristics provide directions which can reduce the time a program takes to solve a problem. Using heuristics does not guarantee that the best solution will be found. This limitation is acceptable in cases where more possible solutions exist than can be explored. The term can also refer to self-learning.

Inference engine: The part of an expert system which performs the logic calculations using the knowledge base and user's input. The engine determines which rules are appropriate and under which conditions to apply each rule.

Knowledge base: The part of an expert system which contains the information about the problem area. Different methods for
representing this information are used by expert systems. Currently, the two most common methods are rules and frames.

Knowledge engineering: The study of techniques to acquire the domain expert's knowledge, and encode it into the expert system.

LISP: A programming language used to write AI systems. LISP is an acronym for LISt Processor.

Production rule: Another name for IF-THEN rules used in representing information in an expert system's knowledge base.

Production system: Another name for an expert system.

PROLOG: A programming language commonly used to code AI application programs. PROLOG is an acronym for PROgramming in LOGic. It is used predominantly in Europe.

Reverse-chaining: see Backward-chaining.

Shell: An expert system shell is a package of programs including an inference engine, user interface, and knowledge base maintenance tools. It is, in essence, an expert system with no knowledge base. The expert begins here, and then creates their own knowledge base.

References

1. Pierce, T. H., Hohne, B. A., Eds.; Artificial Intelligence Applications in Chemistry; American Chemical Society Symposium Series Vol. 306; American Chemical Society: Washington, DC, 1986.
2. Hohne, B. A., Pierce, T. H., Bright, M. A.; in Encyclopedia of Microcomputers Volume 1; Marcel Dekker, Inc.: New York, 1987, p 342-368.
3. Weiss, S. M., Kulikowski, C. A.; A Practical Guide to Designing Expert Systems; Rowman & Allanheld: New Jersey, 1984.
4. Hayes-Roth, F., Waterman, D. A., Lenat, D. B., Eds.; Building Expert Systems; Addison-Wesley: Massachusetts, 1983.
RECEIVED June 9, 1989