7
An Expert System for the Formulation of Agricultural Chemicals

Bruce A. Hohne and Richard D. Houghton

Rohm and Haas Company, Spring House, PA 19477

Downloaded by UNIV OF PITTSBURGH on February 15, 2016 | http://pubs.acs.org
Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch007
An expert system has been written which helps the agricultural chemist develop formulations for new biologically active chemicals. The decision-making process is segmented into two parts. The first is which type of formulation to use. The second is how to make a formulation of that type with the chemical of interest. The knowledge base currently contains rules to determine which formulation type to try and how to make an emulsifiable concentrate. The next phase will add rules on how to make other types of formulations. The program also interfaces to several FORTRAN programs which perform calculations such as solubilities.

What Is An Agricultural Formulation
An essential part of the development of a new pesticide is establishing a good, dependable formulation. The product's active ingredient and physical properties should remain acceptable for two years or more. These formulations are often subjected to storage conditions of extreme heat, cold, and humidity. Once sold to the applicator, the concentrated formulation should dilute easily to field strength and pass freely through conventional spray equipment.

Agricultural (Ag) formulations that are commonly diluted and applied by means of spray equipment include water soluble liquids, emulsifiable concentrates, wettable powders, and flowable suspensions. The choice of which formulation to develop normally depends upon the solubility properties of the technical pesticide. Scientists often must also consider manufacturing costs, field efficacy, and product toxicity.

A water soluble liquid formulation (WSL) is prepared from pesticides that are highly water soluble. This is, by far, the simplest type of formulation. One distinct advantage of WSLs over other formulations is that the field spray dilutions are infinitely stable as true solutions. Pesticides that are hydrophilic and ionic, such as inorganic or organic metallic salts, often fall into this category. Unfortunately, only a small portion of all pesticides are adequately soluble in water.
0097-6156/86/0306-0087$06.00/0 © 1986 American Chemical Society
In Artificial Intelligence Applications in Chemistry; Pierce, Thomas H., et al.; ACS Symposium Series; American Chemical Society: Washington, DC, 1986.
An emulsifiable concentrate is prepared from pesticides that are soluble in common organic solvents, such as xylene and kerosene. Using emulsifiers in the composition causes the formulation to disperse into small particles, called an emulsion, when diluted in water.

Pesticides that are not soluble or have limited solubility in common solvents are formulated as wettable powders (WP) or flowable concentrates (flowables). A wettable powder has the capacity for high active ingredient content, often between fifty and eighty percent by weight, and is made by blending and grinding dry ingredients. Wettable powders are best prepared from pesticides that are high melting, friable solids. Diluents, such as natural clays and synthetic silicates, are used to improve the powder's physical properties. The disadvantages of a WP are: messy handling properties; potential dust inhalation hazard for field personnel; and the need to measure the powder on a weight basis. In some cases these problems can be overcome by formulating the pesticide into a suspension. Water and other ingredients are added to the composition to suspend and disperse the active compound into a flowable.

Regardless of what type of formulation is employed in the field, the formulation must wet, disperse, and remain homogeneous in the application spray equipment. Careful selection of formulating agents, commonly called inerts, is extremely important. These ingredients have no biological activity of their own, but combined, they function as the delivery system for the pesticide. In addition to solvents and diluents, formulations may contain emulsifiers, dispersants, chelating agents, thickeners, defoamers, and more.
The large number and variety of each type makes selecting the components for a formulation difficult and time consuming.

Why Is This A Good Area for an Expert System

The process of choosing application areas for expert system development has been detailed elsewhere, both for the general case and the corporate environment [1]. There are several specific advantages in the formulations application. Experts on one type of formulation are not necessarily experts on other formulation types. Expertise in Ag formulations tends to be in the form of 'rules of thumb', based on experiences with similar chemical systems. Incremental growth, like this, is ideal for expert system development. Formulation scientists are also likely to be more tolerant of the program's mistakes because their skill is measured by how few bad formulations they make before they make a good one.

Multilevel expert systems offer additional advantages over traditional expert systems. Multilevel expert systems draw on computational computer programs to solve parts of the problem. The Ag formulation expert system does this in the areas of computational chemistry, bookkeeping, and communication.

There are numerous computational programs available to chemists today. These programs are algorithmic by nature, and solve problems that do not lend themselves to expert systems. However, a great deal of expertise may be needed by the chemist to decide which program to use and how to actually use it. Most chemists do not
have, and are not willing to gain, this computer expertise. Some would rather use traditional, noncomputational methods than navigate the maze of available computer programs and users' manuals. Expert systems can be extremely valuable in providing this expertise to chemists. The Ag formulations expert system has the ability to execute the appropriate computational programs, giving it an advantage over the formulation chemist.

Bookkeeping tasks are generally handled better by a computer than a chemist. For example, timetables must be met for long-term storage studies, toxicology data, and government registrations. These tasks are easily handled by the computer.

The expert system fills several potential communication gaps. Molecular modeling calculations which are performed by the synthetic chemists, outside the formulation area, can be accessed by the expert system. Through this interface, the expert system can extract useful structural information directly. Also, if a structure has not been entered, the formulation chemist can use the modeling program to enter the structure into the computer. In addition, the system safeguards against communication gaps between the chemist and management/marketing by including marketing and production considerations in the rule base. In this way, management can determine which new formulations are possible, and what characteristics they will sacrifice with a particular formulation.

Structure of the Problem

The problem of developing a new formulation is highly structured. The structure tends to be hierarchical, although this hierarchy does not resemble a traditional decision tree. Each branch point may have any number of branches. The decision about which 'branch' to take at each level can be viewed as an independent expert system.
The ability to break the overall problem into smaller, simpler subproblems is desirable for expert systems. Many of the facts in the system are shared by several subproblems, and subproblems must be developed by starting at the top of the hierarchy and working down. Other than these stipulations, they are independent problems. Each branch of the tree can be used independently, and need not be complete to be useful in the formulation study. The expert system's competence on each subproblem can be judged independently. In many cases different experts are used to develop the knowledge bases for different subproblems. Figure 1 shows the structure of the problem, tracing one branch from each level.

Structure of the Expert System

The program was written on an Apollo computer in LISP. Apollo's Domain LISP, a version of Portable Standard LISP, was the dialect available. The expert system has been written to follow the natural structure of the Ag formulation problem. Figure 2 shows the overall structure of the expert system. One nice feature of the program is that at each branch point the user can override the computer's choice, and can also select as many branches to pursue as desired.
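The hierarchy of branch points, with a user override at each level, might be sketched as follows. This is an illustration in Python rather than the authors' LISP. The class `BranchPoint`, its `choose` method, and the idea that the system supplies a ranked list of branch names are assumptions made for the example. The branch names are taken from the formulation types and solvent level named in the text, and a branch with no rules yet is marked `None`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BranchPoint:
    """One decision level; it may have any number of branches (hypothetical class)."""
    name: str
    # Branch name -> next branch point, or None where no rules exist yet.
    branches: Dict[str, Optional["BranchPoint"]] = field(default_factory=dict)

    def choose(self, ranked: List[str],
               user_choice: Optional[List[str]] = None) -> List[str]:
        """Return the branches to pursue.

        `ranked` is the system's own ordering, best first. If the user
        supplies `user_choice`, it overrides the system and may name
        several branches at once.
        """
        if user_choice:
            unknown = [b for b in user_choice if b not in self.branches]
            if unknown:
                raise KeyError(f"unknown branches: {unknown}")
            return list(user_choice)
        return ranked[:1]


# Lower level: the solvent decision under the emulsifiable concentrate branch.
solvent_level = BranchPoint(
    "Determine Solvent", {f"Solvent {i}": None for i in range(1, 6)}
)

# Top level: which type of formulation to try.
formulation_level = BranchPoint("Formulation type", {
    "emulsifiable concentrate": solvent_level,
    "water soluble liquid": None,
    "wettable powder": None,
    "flowable concentrate": None,
})

system_ranking = ["emulsifiable concentrate", "wettable powder"]
default_pick = formulation_level.choose(system_ranking)
override_pick = formulation_level.choose(
    system_ranking, ["wettable powder", "flowable concentrate"]
)
```

Because each branch point is its own object, a branch can be used, tested, or left unfinished without touching the others, which mirrors the independence of subproblems described above.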
[Figure 1 (labels only): Emulsifiable Concentrate; Determine Solvent; Solvent 1 ... Solvent 5]
[...] 40%
THEN
  1. There is suggestive evidence (-0.5) that the formulation type should not be emulsifiable concentrate
  2. There is suggestive evidence (-0.5) that the formulation type should not be water soluble liquid
  3. There is suggestive evidence (-0.5) that the formulation type should not be flowable concentrate
BECAUSE: EC's, WSL's and Flowables rarely have that high an AI level

Table III. Structure of Rules

AgRule_13
  If-1     (Isequal Solvent Req_EPA_Clear Value C)
  Then-1   (Avoid NotEqual EC_Solvent EPA_Clear C -1)
  Why      It's the law
  Date     12/20/83
  Author   Hohne

Agrule13
IF
  1. The value of the solvent's required EPA clearance is C
THEN
  1. Avoid (-1) emulsifiable concentrate solvents where EPA clearance is not equal to C
BECAUSE It's the law
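The stored form of AgRule_13 and its English rendering in Table III can be imitated with a small record type and two templates. The sketch below is in Python, not the original LISP, and is only one plausible reading of the table. The `Rule` class, the synonym and phrase tables, and the two template strings are assumptions introduced here. Only the rule's tokens and the target English text come from Table III.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Rule:
    """Hypothetical record mirroring the fields shown in Table III."""
    name: str
    if_clauses: List[Tuple]    # (predicate, fact, property, slot, value)
    then_clauses: List[Tuple]  # (action, predicate, hypotheses, property, value, cf)
    why: str
    date: str
    author: str


AG_RULE_13 = Rule(
    name="AgRule_13",
    if_clauses=[("Isequal", "Solvent", "Req_EPA_Clear", "Value", "C")],
    then_clauses=[("Avoid", "NotEqual", "EC_Solvent", "EPA_Clear", "C", -1)],
    why="It's the law",
    date="12/20/83",
    author="Hohne",
)

# Assumed synonym table: internal names -> English phrases.
SYNONYMS = {
    "Solvent": "solvent",
    "Req_EPA_Clear": "required EPA clearance",
    "EC_Solvent": "emulsifiable concentrate solvents",
    "EPA_Clear": "EPA clearance",
}
# Assumed wording for the two predicates this rule uses.
PREDICATE_PHRASES = {"ISEQUAL": "is", "NOTEQUAL": "is not equal to"}

# Assumed templates; this sketch covers only IF clauses and AVOID actions.
IF_TEMPLATE = "The value of the {fact}'s {prop} {pred} {value}"
AVOID_TEMPLATE = "Avoid ({cf}) {hyp} where {prop} {pred} {value}"


def render(rule: Rule) -> List[str]:
    """Substitute synonyms into the templates, one output line per clause."""
    lines = ["IF"]
    for i, (pred, fact, prop, _slot, value) in enumerate(rule.if_clauses, 1):
        lines.append(f"{i}. " + IF_TEMPLATE.format(
            fact=SYNONYMS[fact], prop=SYNONYMS[prop],
            pred=PREDICATE_PHRASES[pred.upper()], value=value))
    lines.append("THEN")
    for i, (_action, pred, hyp, prop, value, cf) in enumerate(rule.then_clauses, 1):
        lines.append(f"{i}. " + AVOID_TEMPLATE.format(
            cf=cf, hyp=SYNONYMS[hyp], prop=SYNONYMS[prop],
            pred=PREDICATE_PHRASES[pred.upper()], value=value))
    lines.append(f"BECAUSE {rule.why}")
    return lines


rendered = render(AG_RULE_13)
```

Keeping the synonyms in a table separate from the rule itself means a knowledge engineer can adjust the wording without editing any rule.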
The rule structure allows simple Boolean functions to be performed. Multiple numbered IF clauses are logically ANDed together. Multiple clauses which are part of the same numbered IF are logically ORed. The logical NOT does not exist, but can be simulated using predicates with the opposite meaning in the IF clause (i.e., BIGGER is equivalent to NOT SMALLER). Table IV lists the currently available predicates for IF clauses.
Table IV. Relationships (predicates)

Predicate   Meaning
BIGGER      Bigger than
SMALLER     Smaller than
MEMB        Member of the list
NOTMEMB     Not a member of the list
ISEQUAL     Is equal to
NOTEQUAL    Not equal to
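The AND/OR convention for IF clauses and the six predicates of Table IV can be illustrated with a short evaluator. This is a Python sketch under stated assumptions, not the program's LISP code. The function name `if_part_holds`, the nested-list layout of the clauses, and the fact names in the example (`ai_level`, `physical_state`, `melting_class`) are all hypothetical.

```python
import operator

# The six predicates of Table IV, mapped onto Python comparisons.
PREDICATES = {
    "BIGGER": operator.gt,
    "SMALLER": operator.lt,
    "MEMB": lambda value, items: value in items,
    "NOTMEMB": lambda value, items: value not in items,
    "ISEQUAL": operator.eq,
    "NOTEQUAL": operator.ne,
}


def if_part_holds(numbered_ifs, facts):
    """Evaluate the IF part of a rule.

    `numbered_ifs` is a list of numbered IFs; each numbered IF is a list of
    (predicate, fact_name, target) clauses. Clauses inside one numbered IF
    are ORed together, and the numbered IFs are ANDed together.
    """
    return all(
        any(PREDICATES[pred](facts[name], target) for pred, name, target in group)
        for group in numbered_ifs
    )


# Hypothetical facts and rule body, for illustration only.
facts = {"ai_level": 45, "physical_state": "solid", "melting_class": "high"}
rule_ifs = [
    [("BIGGER", "ai_level", 40)],                      # IF 1
    [("MEMB", "physical_state", ["solid", "wax"]),     # IF 2, first alternative
     ("ISEQUAL", "melting_class", "high")],            # IF 2, second alternative
]
holds_at_45 = if_part_holds(rule_ifs, facts)
holds_at_30 = if_part_holds(rule_ifs, {**facts, "ai_level": 30})
```

There is no NOT in the table, so a negated test is written with the opposite predicate, for example NOTMEMB in place of a negated MEMB.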
The ACTIONS available to the THEN clauses are listed in Table V. These ACTIONS give rise to two types of THEN clauses. The first type affects the VALUE of only one property. The THEN clauses in Table II show the construction of one-property THEN clauses. The second type of THEN clause deals with all of the current branch points. Table III shows the construction of this type of THEN clause.

Table V. Actions

Action      Meaning
SUGGEST     Adjust the property's value using the listed confidence factor
SET_EQUAL   Set the property's value equal to the listed value
ORDER_BY    Order the hypotheses by the value of the listed property
AVOID       Avoid conclusions where the requirement listed
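The second type of THEN clause, which acts on all current branch points at once, might work roughly as in the Python sketch below. How the original program applies AVOID and ORDER_BY is not spelled out in the text, so the choices here are assumptions: AVOID lowers a hypothesis's confidence to the listed factor, and ORDER_BY sorts from the highest property value down. The solvent names, the clearance code "D", and the solubility numbers are invented for illustration.

```python
import operator


def avoid(hypotheses, predicate, prop, value, cf):
    """Lower the confidence of every hypothesis whose property meets the test.

    Assumption: the confidence is lowered to `cf`, so cf = -1 rules it out.
    """
    for h in hypotheses:
        if predicate(h[prop], value):
            h["cf"] = min(h["cf"], cf)
    return hypotheses


def order_by(hypotheses, prop):
    """Return the hypotheses sorted by the listed property, largest first."""
    return sorted(hypotheses, key=lambda h: h[prop], reverse=True)


# Invented candidate solvents; none of these values come from the source.
solvents = [
    {"name": "Solvent 1", "EPA_Clear": "C", "solubility": 12.0, "cf": 0.0},
    {"name": "Solvent 2", "EPA_Clear": "D", "solubility": 30.0, "cf": 0.0},
    {"name": "Solvent 3", "EPA_Clear": "C", "solubility": 25.0, "cf": 0.0},
]

# Then-1 of AgRule_13: (Avoid NotEqual EC_Solvent EPA_Clear C -1)
avoid(solvents, operator.ne, "EPA_Clear", "C", -1)

# Rank the solvents that were not ruled out by their solubility.
ranked = [h["name"] for h in order_by(solvents, "solubility") if h["cf"] > -1]
```

In this example the most soluble candidate is dropped because its clearance differs from the required one, and the remaining two are ranked by solubility.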
The inference engine was designed to use multivalued logic, i.e., it handles inexact reasoning. Confidence factors (CF) are contained in the THEN clauses of each rule. The equation for combining positive confidences is:

CF = old_value + new_value - (old_value × new_value)

The equation for negative confidences is:

CF = old_value + new_value + (old_value × new_value)

For mixed positive and negative confidences, a simple sum is used. The advantage to these functions is they are bounded by -1 and +1. The program also handles exact reasoning through the SET_EQUAL
ACTION in the THEN clause. This ACTION can be used to set a confidence value to +1 (true) or -1 (false), regardless of previously compiled confidences. SET_EQUAL can also be used to set FACT values equal to nonnumeric values, where required.

The natural language interpretation of the rules given at the bottom of Tables II and III was generated by the program. The natural language generator uses synonyms for FACT names and properties. The synonyms are simply substituted into one of several templates to generate a sentence. The template used is determined by the value of the confidence factor and the combination of ACTIONS and PREDICATES.
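Returning to the confidence arithmetic, the three combining cases and the SET_EQUAL override can be written out as a short Python sketch. The function names `combine` and `set_equal` are hypothetical, and treating a value of zero as belonging to either sign is an assumption that does not change any result.

```python
def combine(old_value, new_value):
    """Combine an accumulated confidence with a newly suggested one."""
    if old_value >= 0 and new_value >= 0:
        # Both positive: old + new - old * new
        return old_value + new_value - old_value * new_value
    if old_value <= 0 and new_value <= 0:
        # Both negative: old + new + old * new
        return old_value + new_value + old_value * new_value
    # Mixed signs: a simple sum
    return old_value + new_value


def set_equal(_old_value, new_value):
    """Exact reasoning: the listed value replaces whatever was accumulated."""
    return new_value


two_positive = combine(0.5, 0.5)        # 0.5 + 0.5 - 0.25
two_negative = combine(-0.5, -0.5)      # -0.5 - 0.5 + 0.25
mixed = combine(two_positive, -0.5)     # 0.75 - 0.5
forced_false = set_equal(two_positive, -1)
```

The positive case can be rewritten as 1 - (1 - old_value)(1 - new_value), which shows why repeated supporting evidence approaches +1 without exceeding it. For instance, two rules that each suggest -0.5 against a formulation type leave it at -0.75 rather than -1.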
Current Status of the Project

The project is still in the prototype stage. It is being used, but not widely. Presently, the knowledge base for the system has fewer than 100 rules. This number is misleading because none of the work performed by the FORTRAN programs is counted in the number of rules. These programs give the system far more knowledge than would be expected from the 'small' knowledge base.

The system can help scientists reliably determine what type of formulation to make. However, the only branch of the decision tree which has rules is the emulsifiable concentrates (EC) branch. The system can determine which solvents to try to make an EC. Its decision relies heavily on rules and solubility calculations. Work is just beginning on the rules to determine which emulsifiers to use.

The program has been interfaced to two FORTRAN programs. The first, MOLY, is a locally developed product for chemical structure entry, display, and molecular modeling [2]. The expert system only takes advantage of the chemical structure handling portion of the program. The other program, UNIFAC [3], performs solubility calculations for the active ingredient in a group of solvents of interest to formulation chemists.

The inference engine performs both forward and reverse-chaining. The reverse-chain algorithm is a depth first search. Using this algorithm, questions asked by the system are grouped by subject, making the program appear more logical to the user. The program handles exact and inexact logic calculations and explains, in English, why a question was asked and how a conclusion was reached. The program also allows the scientist to change answers in case of mistakes, or to investigate "what if" scenarios.
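A depth-first reverse-chaining loop of the kind described can be sketched in a few lines of Python. This is a simplified illustration and not the authors' engine: it uses true/false values only, with no confidence factors. The function `prove`, the rule layout, and the goal and question names are all hypothetical.

```python
def prove(goal, rules, facts, ask, log):
    """Depth-first reverse chaining over true/false goals.

    rules: goal -> list of alternative premise lists.
    facts: answers and conclusions found so far (updated in place).
    ask:   called for a goal that no rule concludes, i.e. a user question.
    log:   explanation of how each conclusion was reached.
    """
    if goal in facts:
        return facts[goal]
    if goal not in rules:
        facts[goal] = ask(goal)
        return facts[goal]
    for premises in rules[goal]:
        # all() stops at the first failed premise, so the search goes deep
        # into one subject before it moves on to the next.
        if all(prove(p, rules, facts, ask, log) for p in premises):
            facts[goal] = True
            log.append(f"{goal} because {' and '.join(premises)}")
            return True
    facts[goal] = False
    return False


# Hypothetical rules and answers, for illustration only.
rules = {
    "try emulsifiable concentrate": [["soluble in a solvent", "solvent is cleared"]],
    "soluble in a solvent": [["structure entered", "solubility adequate"]],
}
answers = {"structure entered": True, "solubility adequate": True,
           "solvent is cleared": True}
asked = []


def ask(question):
    """Stand-in for the interactive prompt: record the question, look up the answer."""
    asked.append(question)
    return answers[question]


log = []
result = prove("try emulsifiable concentrate", rules, {}, ask, log)
```

Both solubility questions are asked before the clearance question, which is the grouping by subject that the depth-first order produces. A "what if" run corresponds to changing one entry in `answers` and calling `prove` again with an empty `facts` dictionary.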
Directions for Future Development

Future developments fall into two classes: additions to the knowledge base and enhancements to the program. As the program is used by more people, fine tuning of the rules to select which type of formulation to try will be needed. Work, from that point, will continue on the emulsifiable concentrate branch. The solvent selection portion will require some fine tuning, but the major work is in adding to the list of solvents. The emulsifier selection portion of the knowledge base will definitely require additional rules, to be followed by considerable tuning as it is used. The remaining four formulation types have yet to be started. They will
require different experts and can be developed concurrently with the EC portion.

The first major enhancement to the program will be the ability to stop sessions at any point and restart at the same point at a later time. This capability will be more than just a convenience; it will be necessary to make the laboratory results requested by the program useful. After this addition, the next major enhancement will be to develop a method of using the rules to trouble-shoot field problems. This enhancement will involve adding some rules, but most of the knowledge should already be in the knowledge base.

As the program becomes widely used, the ability to generate reports and data sheets for laboratory results will be a valuable addition. The added ability to remind the scientist about certain deadlines for a project may be easily included, but will not be useful until scientists use the Apollo computer regularly.

The expert system currently has no rule entry or maintenance facilities. Rules are entered and modified using the Apollo computer text editor. This is acceptable for a prototype, but not for a production system. Before these facilities are added, the system's cost and capabilities will need to be compared to those of commercial expert systems.

Literature Cited

1. Prerau, D.S., "Selection of an Appropriate Domain for an Expert System", AI Magazine, 6(2), 1985.
2. Dyott, T., Stuper, A.J., Zander, G.S., "MOLY, an Interactive System for Molecular Analysis", J. Chem. Inf. Comp. Sci., 20(28), 1980.
3. Fredenslund, A., Jones, R.L., Prausnitz, J.M., "Group-Contribution Estimation of Activity Coefficients in Nonideal Liquid Mixtures", AIChE Journal, 21(6), 1975.

RECEIVED December 17, 1985