
Downloaded by NORTH CAROLINA STATE UNIV on December 29, 2017 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch003

A Rule-Induction Program for Quality Assurance-Quality Control and Selection of Protective Materials

L. H. Keith and J. D. Stuart
Radian Corporation, Austin, TX 78766-0948

This chapter describes two prototype expert systems for chemical applications being developed using RuleMaster (1). The first, QualAId, is a traditional type of system that captures knowledge on how much and what type of quality assurance (QA) and quality control (QC) is needed for various types of environmental analyses. The second, GloveAId, is being developed to help select the best glove material(s) for protection against a wide variety of hazardous chemicals. Unlike the first system, however, the knowledge needed for selecting the best glove materials is not yet known. Therefore, experimental data are being subjected to the rule-induction process of RuleMaster, and the resulting correlations are examined and tested to help formulate the rules that are, in turn, used to build the expert system.

QualAId

The prototype of QualAId currently in existence is one small part of the total framework needed for a useful expert system. The objective of QualAId is to provide advice on how much and what type of QA/QC is needed for various types of environmental analyses. The rules for determining these needs have been derived from the American Chemical Society (ACS) publication "Principles of Environmental Analysis" (2) and from various protocols and recommendations of the U.S. Environmental Protection Agency (EPA).

This particular demonstration module incorporates only decisions involving analysis of volatile and semivolatile organic compounds from water. These compounds are, by definition, volatile enough to be separated by gas chromatography (GC). The complete expert system will incorporate decisions based upon any type of chemical in any type of matrix and will also be capable of providing advice specifically for selected EPA methods commonly in use, e.g., EPA Methods 624, 625, 1624, 1625, and the various non-mass spectrometric 600 Methods (Figure 1).
0097-6156/ 86/ 0306-0031 $06.00/ 0 © 1986 American Chemical Society

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

[Figure 1a. Diagram of Modules for QualAId Expert System (first half). Legible labels: Litigation (Yes/No); Importance (high, medium, low); QA/QC; methodAId Routine; sampleAId Routine; Inorganic Routine; Specific Methods Routine; Advice.]

[Figure 1b. Diagram of Modules for QualAId Expert System (second half). Legible labels: Determine Extent of Method Verification and Validation; Determine Number of Samples Planned; Determine Probable Analyte Concentration Range.]

The purpose of this expert system is to provide consistently good advice on both the types and amounts of QA/QC to use. There are many decisions to make, and errors are very expensive in terms of time and money. The expert system comprises a series of modules encompassing the many varied aspects of decision-making. Information from each of these modules is available to the other modules so that they can make decisions requiring interrelated knowledge.

For example, the first module, Confidence Level, is key to many of the decisions that will be made in other modules. The first query by the computer asks the user whether the resulting analytical data will be used for enforcement or litigation actions. If the answer is "yes," then a high level of confidence will be needed and the user is advised of this assignment. If the answer is "no," then the user is asked to specify how important he/she views the accuracy and precision of the data.

Routines awaiting future development will provide advice on the best analytical methodology and sampling procedures, and on QA/QC needs for inorganic, nonvolatile organic, and selected methodologies (Figure 1). For the present system, these are skipped and the routine for general QA/QC advice for volatile organics in water is entered.

The second module, Method, involves determining the level of verification and validation to which the user's methodology has been subjected. Verification is the general process used to decide whether a method in question is capable of producing accurate and reliable data. Validation is an experimental process involving external corroboration of methods by other laboratories (internal or external) or the use of reference materials to evaluate the suitability of methodology (1). A menu of choices includes: (1) the method has only been verified, (2) the method has been both verified and validated, or (3) the method has been neither verified nor validated.

The third module, Samples, queries the user for how many samples will be taken, and the fourth, Concentration, for the expected range of probable concentration values. The choices of probable concentration values are: (1) High [> 10,000 parts-per-billion (ppb)]; (2) Medium [10-10,000 ppb]; or (3) Low [< 10 ppb]. The fifth module, Detector, queries the user for the detector that will be used in conjunction with the GC analysis (Figure 2).

The information from these five modules is then used to provide a series of advisory statements relating to whether the user will or will not meet the stated confidence levels and, if not, what the options are. Figure 3 is the resulting advice for an example of a good QA/QC match with the user's needs. In this example, a high level of confidence was established, the methodology was both verified and validated, and two samples were to be taken and analyzed by gas chromatography-mass spectrometry (GC-MS) at levels below 10 ppb. These conditions might be typical of analyses for 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) in polluted water.
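The first and fourth modules described above can be sketched in a few lines of Python. This is only an illustration of the decision logic as the text states it, not the Radial code of QualAId; the function names are hypothetical, and mapping the user's stated importance directly onto the confidence level is an assumption.

```python
def confidence_level(for_enforcement_or_litigation: bool, importance: str = "medium") -> str:
    """Confidence Level module: enforcement or litigation use forces a high level.

    Otherwise the user's own view of the importance of accuracy and precision
    is used (assumed here to map one-to-one onto the confidence level).
    """
    if for_enforcement_or_litigation:
        return "high"
    if importance not in ("high", "medium", "low"):
        raise ValueError("importance must be 'high', 'medium' or 'low'")
    return importance


def concentration_range(ppb: float) -> str:
    """Concentration module: High > 10,000 ppb, Medium 10-10,000 ppb, Low < 10 ppb."""
    if ppb > 10_000:
        return "high"
    if ppb >= 10:
        return "medium"
    return "low"


if __name__ == "__main__":
    # The Figure 3 scenario: litigation-grade data at levels below 10 ppb.
    print(confidence_level(True), concentration_range(5))
```

Keeping each module as a small function mirrors the modular structure described in the text, where the result of one module is available to the others.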


We need to establish what instrument you plan to use for the analysis.

Since the compound(s) you are analyzing are sufficiently volatile to be separated by gas chromatography, I am assuming that you will use a GC for your separations.

Here are the detector choices we have to consider:

   a = Mass spectrometer (general purpose)
   b = Hall detector (in the halogen, nitrogen or sulfur mode)
   c = Flame photometric detector (for phosphorus)
   d = Photoionization detector (for olefins and aromatics)
   e = Electron capture detector (for pesticides, halogens, etc.)
   f = Flame ionization detector (general purpose)

Choose one of these by typing the corresponding menu letter: [a,b,c,d,e,f] a

Figure 2. Queries for the Module Detector.


Advice:

You have your methodology both verified and validated as required for measurements needing a high level of confidence. But, you must also assure that your analyst is experienced in performing the type of analysis you need, that you have standards for the analytes available, and that you have a written quality assurance plan that documents good laboratory practice.

You indicated that you need a high level of confidence in your data and GC-MS is an instrument combination that will provide you with the capability of generating that degree of quality. You should also use capillary GC columns if at all possible in order to obtain the best chromatographic resolution. When you have analytes that are completely, or almost completely resolved from other compounds, the quality of the mass spectra generated will be better. You will also need documentation that the mass spectrometer is operating and is tuned up correctly and that the resolution obtained from the gas chromatograph meets your requirements. These requirements should be met with QC standards for each instrument prior to analyzing your samples.

Less than 10 ppb is the most difficult range of analytes to quantify with a high degree of confidence. Quantitative measurements which do not fall within the "Region of Quantitation" cannot be reported. This region is defined as resulting from signal measurements which are 10 or more standard deviation units above the background noise. In order to meet this requirement at levels below 10 ppb, very good cleanup procedures are usually needed in addition to capillary GC columns and experienced analysts. All measurements reported must also be within the range of your calibration curves. If you cannot meet these stringent requirements, then you will probably have to lower your desired level of confidence in the data.

(RETURN continues)

Advice:

The total number of blanks you would need, based on the number of samples you plan to take, is: 4.

For a high level of confidence you will need to have both "field" and "method" blanks. Field blanks are blanks from a similar source that do not contain the analytes of interest. Control sites (uncontaminated sites) are used to obtain field blanks and if field blanks are not available, every effort should be made to obtain blank samples that best simulate a sample that does not contain the analyte (such as a simulated or synthetic field blank). Your method blanks will consist of all solvents, resins, etc. that you will use for extracting, concentrating and cleaning up the samples prior to analysis. You may want about half of these unspiked and the remainder spiked with known levels of your analyte standards. Similarly you may want to spike about half of your field blanks with known levels of your analyte standards so that any matrix effects will be identified during the analysis. This plan would provide you with:

   • 25% unspiked field blanks for control samples,
   • 25% spiked field blanks for monitoring matrix effects,
   • 25% unspiked method blanks for workup/instrument QC, and
   • 25% spiked method blanks for workup/instrument QC.

This will give you the required quality control.

[Note: 4 blanks were recommended even though only 2 samples were planned.]

Figure 3. Example of Advice Provided for a Good QA/QC Match with User Needs.
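Two of the quantitative statements in the Figure 3 advice lend themselves to a short worked sketch: the "Region of Quantitation" test (signal at least 10 standard deviation units above the background noise) and the even four-way split of blanks. The Python below is a minimal illustration under stated assumptions. The function names are hypothetical, the background is assumed to be summarized by a mean and a standard deviation, and the handling of totals not divisible by four is an assumption that the chapter does not address. The chapter also does not give the rule that turns the number of samples into the total number of blanks, so the total is taken as an input.

```python
def in_region_of_quantitation(signal: float, background_mean: float, noise_sd: float) -> bool:
    """True when the signal is 10 or more standard deviations above the background."""
    if noise_sd <= 0:
        raise ValueError("noise_sd must be positive")
    return (signal - background_mean) >= 10 * noise_sd


BLANK_TYPES = (
    "unspiked field blank",
    "spiked field blank",
    "unspiked method blank",
    "spiked method blank",
)


def blank_plan(total_blanks: int) -> dict[str, int]:
    """Split the blanks evenly (25% each) over the four types named in Figure 3.

    Any remainder is handed out one at a time in the order of BLANK_TYPES;
    that tie-breaking choice is an assumption made for this sketch.
    """
    if total_blanks < 0:
        raise ValueError("total_blanks must be non-negative")
    base, extra = divmod(total_blanks, len(BLANK_TYPES))
    return {kind: base + (1 if i < extra else 0) for i, kind in enumerate(BLANK_TYPES)}


if __name__ == "__main__":
    print(blank_plan(4))  # the Figure 3 case: one blank of each type
```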


GloveAId

GloveAId is an expert system being developed for the National Toxicology Program. It has been programmed to choose from seven glove materials the one most likely to provide the greatest protection at the lowest cost against a variety of chemicals. Chemical input is selected by choosing one of seventeen chemical classes. Glove tactility needs and the desired amount of protection (in units of minutes) are also input. The computer provides advice as to the probable best glove to select and, if none meets the requested criteria, it advises the best choice it has available and explains the limitations of that choice with respect to the user's request. Factors used in making the decisions include chemical class, molecular weight, volatility (boiling point), reaction with glove materials (weight change), tactility, and glove cost.

The prototype GloveAId system was developed using a data base generated from chemical permeation measurements performed at Radian. Experimental data from these tests were entered into a LOTUS-1-2-3 spreadsheet and sorted by all classifiable respects in order to make visual correlations with the protective character of seven different glove materials. The data base consisted of 90 chemicals with associated physical properties (molecular weight, boiling point, and linearity of the molecule), chemical class, and measurements of breakthrough times, steady-state permeation rate, and degradation characteristics. The latter consisted of percent weight change when a piece of the material was immersed in the test chemical for four hours.
Each of the chemicals was tested against all seven glove materials for weight change but only against four of the glove materials for breakthrough and permeation rate data, so that 1,300 measured values and 540 associated pieces of information were available. Visual correlation of these data produced the protective rating approximations listed in Table I.

It is time consuming and difficult for humans to make visual comparisons of a numerical data set and draw the simplest possible correlations between them; the larger the data set, the more difficult this is to do. A lot of time and effort was expended to make the approximate evaluations listed in Table I. When the data set is a dynamic one, i.e., it is changing due to the addition of new data, the problem is compounded. However, one strength of computers is that such tasks can be performed with ease and, when this capability is coupled with the ability to induce correlations or "rules" from a data set, an extremely powerful tool for evaluating data is created. This second way of evaluating the data is currently being pursued and is described in more detail in the next section.

The ratings in Table I are based only on the safety aspects of the glove materials, i.e., protection from exposure to chemicals as indicated by the majority of breakthrough times observed within the members of a chemical class. However, tactility is often an additional important ergonomic factor; it is impossible to perform delicate tasks with thick, bulky gloves. Tactility of the gloves was rated subjectively using a dime. If the features of a dime could be readily felt through the glove, it was assigned a rating of "very good." If the features were not very distinguishable through

[Table I. Protective rating approximations of the glove materials by chemical class. The table body did not survive extraction.]

[The opening lines of the induction module did not survive extraction; the first legible fragment is the action "=> (good, GOAL)" of an example whose condition values are lost. The columns below are glove material, molecular weight, boiling point, shape, and percent weight change.]

   Butyl_rubber     ?   144   0   180   => (fair, GOAL)
   Butyl_rubber   106   138   0   181   => (fair, GOAL)
   Butyl_rubber   106   136   0    80   => (fair, GOAL)
   Butyl_rubber   106   133   0   188   => (fair, GOAL)
   Neoprene       148   183   0    64   => (fair, GOAL)
   Nitrile        148   183   0    11   => (best, GOAL)
   Nitrile        106   136   0    95   => (fair, GOAL)
   Nitrile        106   144   0    60   => (poor, GOAL)
   Nitrile         78    80   0    58   => (fair, GOAL)
   Nitrile        106   138   0    77   => (fair, GOAL)
   Nitrile        106   138   0    82   => (fair, GOAL)
   Nitrile        130   185   0    63   => (fair, GOAL)
   PVA            130   185   0    62   => (best, GOAL)
   PVA            148   183   0     0   => (best, GOAL)
   PVA            106   138   0     3   => (best, GOAL)
   PVA            106   144   0     0   => (best, GOAL)
   PVA            106   136   0     0   => (fair, GOAL)
   PVA             78    80   0     0   => (fair, GOAL)
   PVA            106   138   0     0   => (best, GOAL)
   PVC             78    80   0    40   => (very_poor, GOAL)
   PVC            106   138   0     8   => (very_poor, GOAL)
   Viton          106   138   0     1   => (best, GOAL)
   Viton          106   138   0     1   => (best, GOAL)
   Viton          148   183   0     0   => (best, GOAL)
   Viton           78    80   0     3   => (best, GOAL)
   Viton          130   195   0     0   => (best, GOAL)
   Viton          106   136   0     0   => (best, GOAL)
   Viton          106   144   0     1   => (best, GOAL)

ACTIONS:
   best        [ advise "This glove has a *best* rating." ]
   good        [ advise "This glove has a *good* rating." ]
   fair        [ advise "This glove has a *fair* rating." ]
   poor        [ advise "This glove has a *poor* rating." ]
   very_poor   [ advise "This glove has a *very poor* rating." ]

Figure 5. Induction Module for Nonhalogenated Aromatic Compounds. The symbol => means "then".


The fourth column pertains to the designation of a non-linear shape (0), and the last column of data lists the percent change in weight (gain or loss) when the material is soaked in the test chemical for 4 hours. The data within a row are associated with a specific compound, but the compounds were listed in random order within a glove material group in order to emphasize an important feature of RuleMaker: information (data) can be entered as it is thought of. This is an extremely important (and powerful) difference between RuleMaster and other artificial intelligence programs, which are written in a highly structured, interrelated fashion. The powerful inductive logic of RuleMaker enables this limitation to be ignored and frees the user to add, change, or delete, easily and at will, example data that influence the rulemaking logic. This feature is very important when working with a growing or changing data base.

The part of the example to the right of the arrow (=>) is an action-next-state pair. It indicates what will happen when the specified combination of condition values occurs. In this example the action is the designation of the relative protection of the material (good, fair, etc.), and the word "GOAL" indicates that the goal of the module will have been reached when the action section of the module has been carried out, so that the computer can exit this particular module. Since there is only one module in this simple example, the program would then end.

The ACTIONS section of the module comprises two parts:

   • the action keyword corresponding to the third part of the EXAMPLE section, and

   • the action that is to be carried out (for example, to advise the user by a print on the screen and/or a printer that "This glove has a *best* glove rating").
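The structure just described, condition values followed by an action-next-state pair, with a separate ACTIONS table keyed by the action keyword, can be mirrored in Python as follows. This is a hedged sketch of the data layout only, not RuleMaster's own parser; the class, field, and function names are hypothetical, and the one-line text format is an assumption modeled on the rows of Figure 5.

```python
from dataclasses import dataclass

# ACTIONS table: action keyword -> advice text, as listed in Figure 5.
ACTIONS = {
    "best": "This glove has a *best* rating.",
    "good": "This glove has a *good* rating.",
    "fair": "This glove has a *fair* rating.",
    "poor": "This glove has a *poor* rating.",
    "very_poor": "This glove has a *very poor* rating.",
}


@dataclass(frozen=True)
class Example:
    glove: str
    molwt: int
    boilpt: int
    shape: int          # 0 designates a non-linear shape
    weight_change: int  # percent weight change after a 4-hour soak
    action: str         # keyword looked up in ACTIONS
    next_state: str     # "GOAL" means the module is finished


def parse_example(line: str) -> Example:
    """Parse one row such as 'Neoprene 148 183 0 64 => (fair, GOAL)'."""
    conditions, outcome = line.split("=>")
    glove, molwt, boilpt, shape, weight_change = conditions.split()
    action, next_state = (part.strip() for part in outcome.strip().strip("()").split(","))
    return Example(glove, int(molwt), int(boilpt), int(shape), int(weight_change),
                   action, next_state)


def advise(example: Example) -> str:
    """Carry out the action named by the example's keyword."""
    return ACTIONS[example.action]


if __name__ == "__main__":
    row = parse_example("Neoprene 148 183 0 64 => (fair, GOAL)")
    print(advise(row))
```

Because each row is independent of the others, rows can be added, changed, or deleted in any order, which is the property the text emphasizes.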

After the information in the induction module is entered, the program is assembled by the computer. During this phase, two actions take place automatically with no further input from the user:

1. Rules are induced from the examples given the computer, and

2. The actual program for running the computer is COMPILED AND WRITTEN by the computer itself!

These two actions by the computer are key to the success of this project, because it would be impossible for a human to consider all the possibilities of a large data set and to deduce the best (simplest and therefore most cost-effective) rules for choosing the best protective materials. And when the data base is dynamically growing, it would be impossible to use a highly structured artificial intelligence system in which the user had to rewrite the program himself every time there was a change in the information.

The rules induced by the computer are shown in Figure 6. The program which the computer wrote for itself (in "Radial," which is similar to a C-type language) is shown in Figure 7. Both of these abbreviated notations say the same thing, which in English is as follows:

"The rules induced from the example data given are:

1. If the glove material is PVC, the rating is VERY POOR.

2. If the glove material is nitrile, and the compounds have a molecular weight of at least 92 but <118 and a boiling point >=142°C, the rating is POOR.

3. If the glove material is neoprene, the rating is FAIR.

4. If the glove material is PVA, and the compounds have a molecular weight of at least 92 but <118 and a boiling point >=137°C, the rating is BEST."

It is interesting to compare these rules with the first rules, which were estimated with no help from RuleMaster. These were the rules used to construct the first prototype expert system, GloveAId, for non-halogenated aromatic compounds:

1. If the glove material is PVC, the rating is VERY POOR.

2. If the glove material is nitrile, the rating is POOR.

3. If the glove material is butyl rubber, the rating is FAIR.

4. If the glove material is PVA, the rating is FAIR.

5. If the glove material is neoprene, the rating is FAIR.

6. If the glove material is Viton, the rating is BEST.

0 (all states)
1 only

   [glove]
      Butyl_Rubber : [molwt]
            <118  : => ( fair, GOAL )
            >=118 : => ( good, GOAL )
      Neoprene : => ( fair, GOAL )
      Nitrile : [molwt]
            <92  : => ( fair, GOAL )
            >=92 : [molwt]
                  <118 : [boilpt]
                        <137  : => ( fair, GOAL )
                        >=137 : [boilpt]
                              <139  : => ( fair, GOAL )
                              >=139 : [boilpt]
                                    <142  : => ( fair, GOAL )
                                    >=142 : => ( poor, GOAL )
                  >=118 : [molwt]
                        <139  : => ( fair, GOAL )
                        >=139 : => ( best, GOAL )
      PVA : [molwt]
            <92  : => ( fair, GOAL )
            >=92 : [molwt]
                  <118 : [boilpt]
                        <137  : => ( fair, GOAL )
                        >=137 : => ( best, GOAL )
                  >=118 : => ( best, GOAL )
      PVC : => ( very_poor, GOAL )
      Viton : => ( best, GOAL )

The induced rule has 11 test nodes and 16 leaf nodes.

Figure 6. The Induced Rules for Nonhalogenated Aromatic Compounds. The following are meanings assigned to symbols: [...] means "If ... is"; => means "then"; and a colon means "and".

MODULE : classIO

   STATE: only
      IF [ask "What is the glove type?" "Butyl_Rubber,Neoprene,Nitrile,PVA,PVC,Viton"] IS
         "Butyl_Rubber" : IF [ integer.read "What is the molecular weight?" < "118" ] IS
               "T" : [ advise "This glove has a *fair* rating.", GOAL ]
               ELSE [ advise "This glove has a *good* rating.", GOAL ]
         "Neoprene" : [ advise "This glove has a *fair* rating.", GOAL ]
         "Nitrile" : IF [ integer.read "What is the molecular weight?" < "92" ] IS
               "T" : [ advise "This glove has a *fair* rating.", GOAL ]
               ELSE IF [ integer.read "What is the molecular weight?" < "118" ] IS
                  "T" : IF [ integer.read "What is the boiling point?" < "137" ] IS
                        "T" : [ advise "This glove has a *fair* rating.", GOAL ]
                        ELSE IF [ integer.read "What is the boiling point?" < "139" ] IS
                           "T" : [ advise "This glove has a *fair* rating.", GOAL ]
                           ELSE IF [ integer.read "What is the boiling point?" < "142" ] IS
                              "T" : [ advise "This glove has a *fair* rating.", GOAL ]
                              ELSE [ advise "This glove has a *poor* rating.", GOAL ]
                  ELSE IF [ integer.read "What is the molecular weight?" < "139" ] IS
                     "T" : [ advise "This glove has a *fair* rating.", GOAL ]
                     ELSE [ advise "This glove has a *best* rating.", GOAL ]
         "PVA" : IF [ integer.read "What is the molecular weight?" < "92" ] IS
               "T" : [ advise "This glove has a *fair* rating.", GOAL ]
               ELSE IF [ integer.read "What is the molecular weight?" < "118" ] IS
                  "T" : IF [ integer.read "What is the boiling point?" < "137" ] IS
                        "T" : [ advise "This glove has a *fair* rating.", GOAL ]
                        ELSE [ advise "This glove has a *best* rating.", GOAL ]
                  ELSE [ advise "This glove has a *best* rating.", GOAL ]
         "PVC" : [ advise "This glove has a *very poor* rating.", GOAL ]
         ELSE [ advise "This glove has a *best* rating.", GOAL ]

GOAL OF classIO

Figure 7. The Computer-Generated Program for Using the Rules Induced by RuleMaker in an Expert System for Advising Glove Materials To Be Used for Protection Against Nonhalogenated Aromatic Compounds.
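For readers unfamiliar with Radial, the decision logic of Figure 7 can be restated in Python. This is a translation for illustration, not output of RuleMaster; the function name is hypothetical. Two small liberties are taken and should be read as assumptions: the three successive nitrile boiling-point tests (137, 139, 142), which all lead to "fair" below 142, are collapsed into one comparison, and Viton is matched explicitly instead of through the final ELSE branch, so that an unknown glove name raises an error.

```python
def induced_rating(glove: str, molwt: float, boilpt: float) -> str:
    """Rating of a glove material against a nonhalogenated aromatic compound."""
    if glove == "Butyl_Rubber":
        return "fair" if molwt < 118 else "good"
    if glove == "Neoprene":
        return "fair"
    if glove == "Nitrile":
        if molwt < 92:
            return "fair"
        if molwt < 118:
            # Figure 7 tests 137, 139 and 142 in turn; only >= 142 changes the result.
            return "fair" if boilpt < 142 else "poor"
        return "fair" if molwt < 139 else "best"
    if glove == "PVA":
        if molwt < 92:
            return "fair"
        if molwt < 118:
            return "fair" if boilpt < 137 else "best"
        return "best"
    if glove == "PVC":
        return "very_poor"
    if glove == "Viton":
        return "best"
    raise ValueError(f"unknown glove material: {glove}")


if __name__ == "__main__":
    # Nitrile against a compound with molecular weight 148 and boiling point 183.
    print(induced_rating("Nitrile", 148, 183))
```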


As can be seen by a comparison of the two rule sets, the one induced by RuleMaster is significantly more refined and will come much closer to making accurate predictions than the human-induced rule set. It is useful to display these rules as a series of bar charts in order to view them in relation to one another. This is presented in Figure 8 so that the human-induced ranges can be compared with the ranges induced by RuleMaster. It is readily seen that there is good agreement between the two in that all of the initial human assignments are still present in the RuleMaster assignments. The notable difference is that there is considerably more refinement to the possible choices in the RuleMaster chart.

The significance is that, based on the simpler human-induced rules, if long-term protection (more than 1 hour) was needed for working with nonhalogenated aromatics, Viton was the only good choice. However, Viton gloves are not only very expensive ($30 a pair), but they also have poor tactility, so work requiring much dexterity is precluded when wearing them. With the RuleMaster information, new possibilities are now available for consideration:

   • If the compounds have molecular weights >138, then nitrile may be used; nitrile gloves offer greater tactility and they are much less expensive than Viton.

   • If the molecular weight of the compounds is at least 92 but <118, with boiling points of 137°C or higher, then PVA may be used; PVA gloves have no better tactility properties than Viton gloves but they are cheaper, so expenses could be lowered.

Thus, the rules induced by RuleMaster offer possibilities for reducing cost and allowing more dexterous work to be performed than would have been available using the human-induced rules. The important caveat to remember, however, is that the computer has produced the best rules possible from the data it was given and has extended those rules to cover examples beyond that data set where possible. Thus, until proven with a sufficient number of examples, any set of rules must always be viewed simply as the best ADVICE available. There can always be "outliers" caused by additional factors that have not yet been discovered.

Once the computer has induced the rules governing a particular set of complex data, it is easy for a human to check whether they are true. This can be done in two ways:

1. a simple Rule Table can be constructed, and

2. additional known examples can be analyzed to challenge the rules and see if they hold true; if they don't, then additional data are given to the computer so that modified rules can be induced.
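The second check, challenging a rule set with known examples, amounts to comparing each prediction with the observed rating and collecting the disagreements. The Python sketch below illustrates this with the six human-estimated rules and four example rows taken from Figure 5. The names `HUMAN_RULES`, `EXAMPLES`, and `challenge` are hypothetical, and the dictionary layout is an assumption made for this sketch.

```python
from typing import Callable

# The human-estimated rules: one rating per glove material.
HUMAN_RULES = {
    "PVC": "very_poor",
    "Nitrile": "poor",
    "Butyl_Rubber": "fair",
    "PVA": "fair",
    "Neoprene": "fair",
    "Viton": "best",
}

# Four known examples (glove, molecular weight, boiling point, observed rating)
# as they appear in the Figure 5 induction module.
EXAMPLES = [
    {"glove": "Nitrile", "molwt": 148, "boilpt": 183, "observed": "best"},
    {"glove": "Nitrile", "molwt": 106, "boilpt": 144, "observed": "poor"},
    {"glove": "PVC", "molwt": 78, "boilpt": 80, "observed": "very_poor"},
    {"glove": "PVA", "molwt": 148, "boilpt": 183, "observed": "best"},
]


def challenge(rule: Callable[[dict], str], examples: list[dict]) -> list[dict]:
    """Return the examples whose observed rating disagrees with the rule."""
    return [ex for ex in examples if rule(ex) != ex["observed"]]


def human_rule(example: dict) -> str:
    """Predict a rating from the glove material alone."""
    return HUMAN_RULES[example["glove"]]


if __name__ == "__main__":
    for miss in challenge(human_rule, EXAMPLES):
        print("rule fails for", miss["glove"], miss["molwt"], miss["boilpt"])
```

Any example returned by `challenge` is a candidate for feeding back into the induction step so that modified rules can be induced, as the text describes.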

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

KEITH A N D STUART

A Rule-Induction Program for QA-QC

[Figure 8: bar charts of protective rating (Good, Best) based on breakthrough time for each glove material, in two panels: "Human Induced Protective Ranges" and "RuleMaker Induced Protective Ranges".]

Figure 8. Protective Ranges of Six Glove Materials Against Nonhalogenated Compounds.


An example of the Rule Table that can be constructed from this data set is Table II. Once the Rule Table is constructed, it is easy to check the data again and visualize these relationships; that is, to verify that they are true. But remember the lack of obvious relationships when the example data was first examined. The use of artificial intelligence, and specifically a rule-inductive program such as RuleMaster, is an excellent way to derive meaningful relationships from the large and diverse mass of data being produced. The use of artificial intelligence in this way is referred to as "knowledge manufacturing". Thus, the strongest features of a computer (to remember and correlate large numbers of data) and of humans (to be creative and to use reasoning capabilities beyond that of a computer) are being used to solve very complex problems.

Summary

In summary, RuleMaster is an expert system building package intended to solve many of the problems involved in the construction of large knowledge-based programs. Its inductive learning system (RuleMaker) allows rapid and effective acquisition of expert knowledge. The Radial language allows structured organization of large quantities of knowledge. Radial also provides a facility for presenting ordered explanation of reasoning to any level of elaboration required.

Use of an expert system in conjunction with a statistical program for pattern recognition such as Ein*Sight or SIMCA is a concept that offers an excellent probability of success in (1) finding, (2) ordering, and (3) using the most selective chemical and physical parameters for choosing the best protective materials to use with a wide variety of hazardous chemicals. No other program can be used both to help develop the rules needed for analysis of a complex data base (by induction) and then to use these rules in a logic sequence to provide a diagnostic decision. Furthermore, the basis of any and all decisions made by the computer is completely available on demand so that it can easily be checked and/or verified.

The first prototype system used rules which were derived as "best estimates" from a data base of about 1300 tests using 90 different chemicals. However, the prototype system is being revised using computer-generated rules. Thus, it is becoming "smarter" and better as its data base, and the resulting rules derived from it, are expanded. Using a computer to evaluate large masses of data is not novel, but using it to help generate rules by an inductive logic process from large masses of data is an important new achievement. One of the significant advantages of this expert system will be a consistent, unbiased interpretation of the data in a rapid manner once the expert system has been developed. And lastly, RuleMaster is structured so that it is easy to add, change, or delete data from the expert system so that it can continue to grow and improve with use and experience. These features will be invaluable as the data base continues to grow and change.
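The idea of generating a rule from examples by induction can be made concrete with a deliberately small sketch. It is not RuleMaker's algorithm, which this section does not describe; it is a generic single-attribute threshold search, and the name `induce_threshold` is a hypothetical choice. Given examples of one numeric property (for instance molecular weight) labeled as acceptable or not, it finds the cut point `t` for which the rule "value >= t means acceptable" makes the fewest errors.

```python
from typing import List, Tuple


def induce_threshold(examples: List[Tuple[float, bool]]) -> Tuple[float, int]:
    """Return (threshold, errors) for the best rule "value >= threshold -> True".

    Generic illustration of inducing a rule from labeled examples;
    not the induction method used by RuleMaker.
    """
    # Candidate cut points: every distinct observed value, plus infinity
    # (the rule that never predicts True).
    candidates = sorted({value for value, _ in examples}) + [float("inf")]

    best_threshold, best_errors = candidates[0], len(examples) + 1
    for threshold in candidates:
        # Count examples that the rule with this cut point misclassifies.
        errors = sum((value >= threshold) != label for value, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors
```

A non-zero error count for the best threshold indicates that one property alone does not separate the examples, which is the situation in which a second property such as boiling point is brought into the rule.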


TABLE II. RULE TABLE FOR NONHALOGENATED AROMATIC COMPOUNDS

Glove Material Rating: Best, Good, Fair, Poor, V Poor

BuR if MW >= 118

BuR if MW < 118

Neoprene

Nitrile if MW >= 139

Nitrile if MW < 118 and bp = = 118 -< 139

PVA if MW >= 118, or MW >= 92 and < 118 and bp >= 137

PVA if MW < 92, or MW >= 92 and < 118 and bp < 137

Nitrile if MW < 118 and bp >= 142

Viton

PVC

MW = Molecular Weight; bp = Boiling Point; BuR = Butyl rubber; PVA = Polyvinyl alcohol; PVC = Polyvinyl chloride

Literature Cited

1. D. Michie, S. Muggleton, C. Riese and S. Zubrick, "RuleMaster: A Second Generation Knowledge Engineering Facility," Proceedings of the First Conference on Artificial Intelligence Applications, Denver, Colorado, 5-7 December 1984.

2. L.H. Keith, W. Crummett, J. Deegan, Sr., R.A. Libby, J.K. Taylor and G. Wentler, "Principles of Environmental Analysis," Anal. Chem., 55, p. 2210-18, 1983.

RECEIVED January 15, 1986
