Expert System Applications in Chemistry
Chapter 16
ES-EPA: Environmental Pollutant Analysis
Hikaru Hirayama 1, Robert Wohlsen 2, and Constance Brede 2
1 Osaka Gas Research Center, 6-19-9 Torishima, Konohana-ku, Osaka 554, Japan
2 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025
ES-EPA (Expert System for Environmental Pollutant Analysis) is an expert system for producing laboratory test plans, including the appropriate sampling methods, pretreatments, test methods and their order. ES-EPA generates test plans in a stepwise manner from abstract plans called "templates" to detailed plans, using a hierarchical planning mechanism. The knowledge base contains information on analysis items, test methods, test equipment, pretreatments and other necessary information. The prototype system has been successfully tested for various cases in the domain. The development of a delivery version has been completed and it will soon be used in the field on a daily basis to further verify its feasibility.
0097-6156/89/0408-0200$06.00/0 © 1989 American Chemical Society
The Analysis Center of Osaka Gas Co., Ltd., performs laboratory tests to analyze environmental pollutants on about 1,000 samples per year. The goal of these tests is to measure the concentration and other characteristics of regulated substances in the sample. Prior to executing the tests, correct test plans must be made in order to get correct results. A test plan consists of sampling methods, pretreatment methods, test methods and their order.
Test planning procedures for pollutant analysis are so complex that only experts who have been doing this job for more than ten years perform adequately. First, an expert must collect a broad range of necessary information on the sample: type, source, amount, sample conditions, purpose of the analysis and analysis items. The information needed varies with each case. Then, the expert makes a test plan using the information, his knowledge of the chemical characteristics of the analysis items, and the conditions and constraints for application of the laboratory test procedures.
A great deal of effort was expended in an attempt to write a manual for this job. It has not been completed because of the wide variety of cases which must be described in it. For example, procedures for a very typical example of waste water analysis might be given in a 20-page manual. But if the sample contains a substantial amount of suspended solids, special pretreatment is required, while if you want to know the amount and contents of the suspended solids, then you need two more bottles of the sample and additional test procedures for analysis of the solids. If the sample contains oil or sea water, yet other kinds of pretreatments and test procedures are required. There are many more conditions and requirements which may affect the test procedures. If you consider the various possible combinations, hundreds of different test procedures are possible for waste water analysis alone.
The development of ES-EPA (Expert System for Environmental Pollutant Analysis) started in September 1986. At the beginning of the project, we spent two months on a feasibility study on the application of expert system technologies to this domain, and concluded that this problem perfectly matched the methodologies of both knowledge representation and inference in expert systems. The development of the prototype system took thirteen months and was completed in October 1987. The prototype system was implemented on a dedicated Lisp machine and had limited knowledge, only enough for verification of the system concepts. It was tested by human experts for various cases in the limited domain and the generated plans proved to be practical. The development of a delivery system with larger knowledge bases and a better user interface, implemented on a conventional engineering workstation, started in January 1988 and was completed in March 1989. It will soon be tested in the field on a daily basis.
This paper describes the methods of planning and the implementation of the ES-EPA system.
Planning Methods Used by Human Experts
After extensively interviewing human experts and doing experiments using an early version of the prototype system, we found two important characteristics of the test planning methods used by human experts in this domain. Firstly, experts do not always make a detailed plan. They make a plan at an appropriate level of abstraction, a rough plan for one purpose and a detailed plan for another purpose. Secondly, they do not make any plan from scratch. Experts have various kinds of "templates" which they use as the starting point of their planning or as part of a plan. These two characteristics are explained in more detail below.
Different Abstraction Level Plans. The experts in this domain can change the abstraction level of the test plan according to the amount of available information and the purpose of the plan. Below are three typical situations that require plans with different levels of abstraction:
(1) An analysis case starts with inquiries from a client about cost and duration. Because information about the sample at this stage is often incomplete and occasionally inaccurate, the expert usually makes a rough plan which is only enough to estimate cost and time and to give sampling instructions.
(2) When the sample arrives, the expert can get most of the information necessary to produce a detailed plan. At this stage the detailed plan is necessary for instructions to the laboratory equipment operators. This detailed plan includes pretreatments, preliminary tests, final tests and their order. However, some parts of the plan may still be in abstract form if the necessary information could not be obtained prior to a pretreatment or a preliminary test.
(3) As the laboratory tests are carried out, the test plan becomes more and more detailed. It may be modified and added to according to the results of previous tests and pretreatments. There are three reasons to change the test plan: first, if new information is revealed after executing these tasks; second, if a measured value is outside of the range of the test equipment; and third, if a measured value is so close to the regulated concentration that a more accurate test method must be applied to further verify it.
Plan Templates. When a new class of test plan made by an expert works well, he seems to store abstract forms of the plan along with the conditions for using them. The next time he meets similar conditions, he starts planning from the stored abstract plan. He uses these as "templates" for test planning. This kind of stored abstract plan was discussed first by Friedland (1). Templates range from general and abstract ones to specific and detailed ones.
General, abstract templates are used to represent a large part of a plan or even a whole plan. For example, if a sample is some kind of industrial waste, a human expert starts planning with a very rough plan which is typical for industrial wastes, instead of making a plan from scratch. Later, he will modify this plan considering any atypical requirements and constraints of the analysis.
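The three reasons for changing a test plan, given under situation (3) above, can be sketched as a simple check on a test result. This is a minimal illustration and not the ES-EPA implementation: the `Measurement` fields, the function name and the `closeness` tolerance are hypothetical, and the paper does not say how "close to the regulated concentration" is quantified.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Result of one executed test (all fields are illustrative assumptions)."""
    value: float             # measured concentration
    range_low: float         # lower limit of the test equipment's range
    range_high: float        # upper limit of the test equipment's range
    regulated_limit: float   # regulated concentration for the analysis item
    new_information: bool = False  # True if the task revealed new facts about the sample


def reasons_to_replan(m: Measurement, closeness: float = 0.1) -> list[str]:
    """Return the reasons (if any) for revising the test plan.

    `closeness` is a hypothetical relative tolerance: a value within 10% of
    the regulated limit is treated as "close" and re-tested more accurately.
    """
    reasons = []
    if m.new_information:
        reasons.append("new information revealed")
    if not (m.range_low <= m.value <= m.range_high):
        reasons.append("value outside equipment range")
    if abs(m.value - m.regulated_limit) <= closeness * m.regulated_limit:
        reasons.append("value close to regulated concentration")
    return reasons
```

An empty list means the plan can continue unchanged; any other result would prompt the planner to modify or extend the remaining tasks.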
Specific, detailed templates are used for smaller parts of the plan. For example, if the test plan should include a series of procedures related to an IC (ion chromatography) test, a typical template for IC test procedures will be applied to the plan. These typical procedures for the IC test will be modified later, too.
Hardware and Software Environment of the ES-EPA System
The prototype system of ES-EPA was implemented using IntelliCorp's KEE and Common Lisp on a Symbolics 3600 series computer. Before we chose the development environment, we spent two months on a needs analysis. During this stage, we first made a conceptual design of the system, then we clarified the hardware and software requirements. The major requirements were: first, the software tool should have a frame representation, forward chaining production rules and an object-oriented programming language; second, both hardware and software should be able to handle Japanese characters, especially kanji characters; third, the combination of software and hardware should be fast enough to handle large knowledge bases. We made very small toy prototypes using several different combinations of software and hardware, and did experiments on these to choose the prototype development platform.
For the delivery environment, we continued to use KEE and Common Lisp as the software platform, but chose a conventional engineering workstation, the Sun 3/60, as hardware. The Sun 3/60 is powerful enough to deliver ES-EPA. We considered the possibility of using a 386 machine, but we gave up on this idea because the Japanese language was not supported in KEE on 386 machines when we had to decide. If we have to choose less powerful machines, we will probably have to consider porting ES-EPA to a lighter software environment.
Although we used KEE, we restricted ourselves to using only the fundamental functions of KEE, namely, the frame representation, the production rules and the object-oriented programming. We did not use the KEE-provided user interface and other facilities, because we thought that avoiding them would reduce the amount of work if porting to a different environment became necessary.
Architecture and Knowledge Bases of ES-EPA
The system consists of three major modules: the Control Module, the Knowledge Base Module, and the User Interface Module. Figure 1 shows a schematic of this architecture. Arrows between modules represent the flow of information. The flow of the session and dialogue with the user are controlled by the Control Module. The Knowledge Base Module provides all the necessary information for making test plans: analysis items, sample types and purposes of analysis, analysis tasks and customers. All input and output, in various forms including menus, tables and graphics, are processed by the User Interface Module.
ES-EPA has four major knowledge bases in its Knowledge Base Module: Analysis Item KB, Sample Type and Purpose KB, Analysis Task KB, and Customer KB. Frames and their slot values are used for expressing declarative knowledge. Procedural knowledge is expressed as methods of the object-oriented programming language or as forward chaining production rules. ES-EPA currently has about 1,000 frames and 300 production rules.
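The split between declarative knowledge (frames with slot values) and procedural knowledge (methods attached to frames) can be sketched in a few lines of Python. This only illustrates the frame idea and is not KEE's API: the `Frame` class, its `get` method, and the example frame names and slot values are hypothetical.

```python
class Frame:
    """A minimal frame: named slots plus inheritance from a parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        """Look up a slot value, falling back to the parent frame (inheritance)."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(f"{self.name} has no slot {slot!r}")


# Declarative knowledge: slot values (illustrative, not taken from ES-EPA).
analysis_item = Frame("Analysis-Item", bottle_material="plastic")
metal_ion = Frame("Metal-Ion-Concentration", parent=analysis_item,
                  applicable_methods=["ICP-Test"])

# Procedural knowledge: a function stored in a slot acts as a method.
metal_ion.slots["describe"] = lambda f: f"{f.name}: {f.get('applicable_methods')}"
```

Here `metal_ion.get("bottle_material")` is answered by the parent frame, which is the kind of default inheritance a frame system provides.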
In the Analysis Item KB, analysis items of environmental pollutant analysis, such as the concentration of elements or ions, are represented by frames. Each analysis item frame has information about applicable analysis methods or equipment, legal regulations, appropriate material for sample bottles and other applicable information.
The Sample Type and Purpose KB contains knowledge about sample types and possible analysis purposes for each sample type. For example, there are many kinds of industrial wastes: polluted mud, slag, embers and so on. Possible analysis purposes differ according to the type of sample, and typical analysis items differ according to the analysis purpose.
The Analysis Task KB contains all kinds of analysis plan elements from abstract task components to specific tasks. Abstract components have information for plan refinement in the form of templates. Specific tasks include sampling methods, pretreatments, equipment tests, analysis methods defined in the Japanese Industrial Standards and others. A specific task has information on the conditions and constraints of the task application, instructions for the task execution and so on.
In the Customer KB, an index of all past customers and the results of past analyses are stored. Past records are retrieved easily and used as references and sometimes as a large template for a new plan.
The ES-EPA Planning Mechanism
In order to have multiple representations of a plan with different abstraction levels, a hierarchical planning mechanism is necessary. The method is first to sketch a plan that is complete but vague and then gradually refine the vague parts into more detailed sub-plans until the plan consists of a complete sequence of detailed components (2). A simplified schematic of the hierarchical planning mechanism of ES-EPA is shown in Figure 2. There are four representations of a plan with different abstraction levels: Levels 1, 2, 3 and 4. In this figure, the rectangular boxes represent specific tasks that cannot be expanded any further, and boxes with rounded corners represent abstract components.
In the ES-EPA system, all of the abstract test plan components have templates. Some abstract components have only one template, while others have multiple templates. If a component has only one template, the same template must be used for all situations. In this case, the template is stored as a default value in the Template Slot of the frame. An example of such a template is shown in Figure 3. If a component has multiple templates, then the most suitable template for the situation must be chosen. In this case, multiple templates are represented as a set of production rules. The antecedents (the left-hand side or the "if part") of this kind of rule represent the conditions for using that template, and the consequents (the right-hand side or the "then part" of the rule) put the template in the Template Slot of the frame. Figure 4 shows an example of such rules.
On Level 1 in Figure 2, the most abstract test plan is represented by a single component which is always the Test-Plan-of-Sample.
The Test-Plan-of-Sample has tens of different templates in production rule form. In this case, as the sample is some kind of water sample and the purpose is the ordinary waste water analysis, the Waste-Water-Task-Set-Rule shown in Figure 4 is applied.
On Level 2, as a result of the execution of the Waste-Water-Task-Set-Rule, the Test-Plan-of-Sample of Level 1 is expanded into a set of less abstract components: Sampling, Waste-Water-Task-Set and Report. The Report component is specific and cannot be expanded any more, but Sampling and Waste-Water-Task-Set are still abstract components on this level. Both components have multiple templates as sets of production rules, and the most appropriate ones are applied.
On Level 3, Sampling is changed to a specific task which is Sampling-in-Plastic-Bottle. Waste-Water-Task-Set is expanded into four task components. The IC-Test-Set has multiple templates. Among the template rules, the IC-Filter-Rule is selected because there are suspended solids in the sample. The ICP-Test-Set has only one template as a slot value of the frame which is shown in Figure 3.
On Level 4, all the abstract components are expanded into specific task components. The hierarchical planning mechanism stops on this level. Then rules for detailed modification are applied to the plan. Certain components might be added or deleted after the application of these rules.
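The level-by-level refinement described above can be sketched as repeated template expansion. This is a simplified illustration, not the ES-EPA code: each template is reduced to an ordered list of sub-components, the choice among multiple templates by production rules is omitted, and the four components of Waste-Water-Task-Set and the contents of the lower-level templates are assumptions loosely based on the labels in Figures 2 to 4.

```python
# Abstract component -> ordered sub-components (its template).
# Anything not listed here is treated as a specific task.
# The entries below are assumptions for illustration only.
TEMPLATES = {
    "Test-Plan-of-Sample": ["Sampling", "Waste-Water-Task-Set", "Report"],
    "Sampling": ["Sampling-in-Plastic-Bottle"],
    "Waste-Water-Task-Set": ["pH-Test", "IC-Test-Set",
                             "ICP-Pretreatment-Set", "ICP-Test-Set"],
    "IC-Test-Set": ["Filtering", "Liquid-Take", "IC-Test"],
    "ICP-Pretreatment-Set": ["Acid-Decompose"],
    "ICP-Test-Set": ["Make-Std-Solution", "Select-WaveLength",
                     "Quantitative-ICP-Test"],
}


def refine_once(plan):
    """Expand every abstract component in the plan by one level."""
    refined = []
    for component in plan:
        refined.extend(TEMPLATES.get(component, [component]))
    return refined


def plan_levels(root):
    """Return the plan at each abstraction level, from rough to detailed."""
    levels = [[root]]
    while True:
        refined = refine_once(levels[-1])
        if refined == levels[-1]:   # nothing left to expand
            return levels
        levels.append(refined)


levels = plan_levels("Test-Plan-of-Sample")
for number, plan in enumerate(levels, start=1):
    print(f"Level {number}: {plan}")
```

Stopping at an earlier entry of `levels` gives the kind of rough plan that is sufficient for cost and time estimates, while the last entry contains only specific tasks.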
Figure 1. Architecture of ES-EPA
Figure 2. Hierarchical Planning in ES-EPA (Levels 1 to 4; components shown include Test-Plan-of-Sample, Sampling, Waste-Water-Task-Set, Report, Sampling-in-Plastic-Bottle, Take-Liquid, pH-Test, Filtering, ICP-Pretreatment-Set, Acid-Decompose, Qualitative-ICP-Test, Make-Std-Solution and Quantitative-ICP-Test)
(Unit: ICP-Test-Set
  (Slot: Instruction
    (Value ))
  (Slot: Template
    (Value ((:Actions Make-Std-Solution Select-WaveLength Quantitative-ICP-Test)
            (:Order Make-Std-Solution Quantitative-ICP-Test)
            (:Order Select-WaveLength Quantitative-ICP-Test)
            (:Entry Make-Std-Solution Select-WaveLength)
            (:Exit Quantitative-ICP-Test)))))
Figure 3. Single Template
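One way to read the template in Figure 3 is as a partial order: `:Actions` lists the tasks, each `:Order` pair says that the first task must precede the second, and `:Entry` and `:Exit` name the tasks with no predecessor and no successor. A minimal sketch of deriving an execution order from such a template, using Python's standard `graphlib` module (Python 3.9 or later), is given below. The dictionary layout and this interpretation of the keywords are assumptions; the paper does not describe how ES-EPA processes the slot.

```python
from graphlib import TopologicalSorter

# The Figure 3 template, transcribed into a Python dictionary (assumed layout).
template = {
    "actions": ["Make-Std-Solution", "Select-WaveLength", "Quantitative-ICP-Test"],
    "order": [("Make-Std-Solution", "Quantitative-ICP-Test"),
              ("Select-WaveLength", "Quantitative-ICP-Test")],
    "entry": ["Make-Std-Solution", "Select-WaveLength"],
    "exit": ["Quantitative-ICP-Test"],
}


def execution_order(tmpl):
    """Return one task order that satisfies every (before, after) pair."""
    sorter = TopologicalSorter({action: set() for action in tmpl["actions"]})
    for before, after in tmpl["order"]:
        sorter.add(after, before)   # `after` depends on `before`
    return list(sorter.static_order())


def entries_and_exits(tmpl):
    """Tasks with no predecessor and tasks with no successor."""
    has_pred = {after for _, after in tmpl["order"]}
    has_succ = {before for before, _ in tmpl["order"]}
    entries = [a for a in tmpl["actions"] if a not in has_pred]
    exits = [a for a in tmpl["actions"] if a not in has_succ]
    return entries, exits


order = execution_order(template)
```

Under this reading, the `:Entry` and `:Exit` lists are derivable from the `:Order` pairs, so `entries_and_exits` can serve as a consistency check on a hand-written template.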
User Interface of ES-EPA
Generally speaking, if an expert system is expected to be used in the real world on a daily basis, users demand better user interfaces and knowledge engineers tend to spend substantial amounts of time adding and changing user interface facilities. The User Interface Module of ES-EPA was developed not only to enable efficient interaction between the user and the system, but also to reduce the amount of effort needed for knowledge engineers to add and modify the user interface. The User Interface Module has many functions for generating input and output windows and these can easily be modified and combined. The module also has knowledge base editing facilities.
Figure 5 shows a typical layout of some user interface windows. Window 1 is the "ES-EPA Logo Window". Clicking on this window with the mouse brings up the main menu, which is Window 2. Window 3 is the "Test Plan Graphic Window". It shows the order of tasks and represents the duration of each task by the width of the box. Window 4 is the "Table Window". In this example, the Table Window shows the test methods, analysis items and costs in table form. Window 5 is the "Menu Window". In this example, analysis items can be selected from this window. Window 6 is the "Help Window". The Help Window guides users through ES-EPA. For example, if a user cannot understand the meaning of a question from ES-EPA, he can get a detailed explanation of the question in this window by clicking on the question.
Test-Plan-of-Sample-Templates
(Rule: Waste-Water-Set-Rule
  (IF (AND (Sample of Test-Plan is ?Sample)
           (?Sample is in Class Waste-Water)
           (Purpose of Test-Plan is Ordinary-Waste-Water-Analysis))
   THEN (Template of Test-Plan is (:Entry Sampling))
        (Template of Test-Plan is (:Order Sampling Waste-Water-Task-Set Report))
        (Template of Test-Plan is (:Exit Report))))
(Rule: Industrial-Waste-Rule ...)
(Rule: Sewage-Rule ...)
IC-Test-Set-Templates
(Rule: IC-Filter-Rule
  (IF (AND (Test of ?Task-Set is IC-Test)
           (Sample of ?Task-Set is ?Sample)
           (?Sample is in Class Water)
           (Suspended-Solid of ?Sample is Present))
   THEN (Template of ?Task-Set is (:Entry Filtering))
        (Template of ?Task-Set is (:Order Filtering Liquid-Take IC-Test))
        (Template of ?Task-Set is (:Exit IC-Test))))
Figure 4. Multiple Template
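The rules in Figure 4 choose a template by testing conditions on the sample and on the purpose of the analysis. The selection step can be sketched as below. This is a simplification and not KEE's rule language: rules are plain Python predicates tried in order rather than forward-chained, the class hierarchy of samples is flattened into a set of class names, and the `Rule` class, the `select_template` function and the keys of the `facts` dictionary are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]   # the "if part"
    template: list                      # ordered tasks placed in the Template slot


# One rule set per abstract component, as in Figure 4 (only two rules shown).
RULES = {
    "Test-Plan-of-Sample": [
        Rule("Waste-Water-Set-Rule",
             lambda f: "Waste-Water" in f["sample_classes"]
             and f.get("purpose") == "Ordinary-Waste-Water-Analysis",
             ["Sampling", "Waste-Water-Task-Set", "Report"]),
    ],
    "IC-Test-Set": [
        Rule("IC-Filter-Rule",
             lambda f: f.get("test") == "IC-Test"
             and "Water" in f["sample_classes"]
             and f.get("suspended_solid") == "Present",
             ["Filtering", "Liquid-Take", "IC-Test"]),
    ],
}


def select_template(component: str, facts: dict) -> Optional[Rule]:
    """Return the first rule of the component whose condition holds, else None."""
    for rule in RULES.get(component, []):
        if rule.condition(facts):
            return rule
    return None


facts = {
    "sample_classes": {"Water", "Waste-Water"},
    "purpose": "Ordinary-Waste-Water-Analysis",
    "test": "IC-Test",
    "suspended_solid": "Present",
}
chosen = select_template("IC-Test-Set", facts)
```

With these facts, `chosen` is the IC-Filter-Rule, matching the walk-through in which filtering is added because the sample contains suspended solids.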