A Rule-Based Declarative Language for Scientific Equation Solving

the solution of a set of Ν algebraic equations - not necessarily linear - in Ν unknowns. ... how best to solve the problem posed and in what order t...
0 downloads 0 Views 885KB Size
9 A Rule-Based Declarative Language for Scientific Equation Solving

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

Allan L. Smith Chemistry Department, Drexel University, Philadelphia, PA 19104

Procedural languages for scientific computation are briefly reviewed and contrasted with declarative languages. The capabilities of TK!Solver are explained, and two examples of its use in chemical computations are given. Most of the applications of artificial intelligence in chemistry so far have not involved numerical computation as a primary goal. Yet there are aspects of the AI approach to problem-solving which have relevance to computation. In scientific computation, one could view the knowledge base as the set of equations, input variable values, and unit conversions relevant to the problem, and the inference engine the numerical method used to solve the equations. This paper describes such a software system, TK!Solver. Brief Review of Software for Scientific Computation Since the beginning of electronic computing, one of the major incentives for developing computer languages has been to improve the ease of solving mathematical problems arising in science and engineering. Many such problems can be reduced to the solution of a set of Ν algebraic equations - not necessarily linear - in Ν unknowns. The earliest ways of doing this involved direct hand coding in hexadecimal machine language or in assembly language mnemonics, specifying in excruciating detail the procedures needed to transform input data into results. My first experience with computers (I) was on a Bendix laboratory computer, generating three-component polymer-copolymer phase diagrams in assembly language. After a summer of this I became quickly convinced that there must be a better way. In the early 1960's the first compiled procedural programming language for scientific computation, FORTRAN, became widely used in the US, with a parallel development of the use of ALGOL in Europe. Later in the decade, the interpretive procedural language BASIC emerged, followed by the powerful algebraic notational language APL. The first structured, procedural language developed to teach the concepts of programming, Pascal, appeared in 1971, followed later in the decade by the C language. In all of these procedural languages (also called imperative languages (2), one of the basic elements of syntax is the assignment statement, in which an algebraic 0097-6156/ 86/ 0306-0111 $06.00/ 0 © 1986 American Chemical Society

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

ARTIFICIAL INTELLIGENCE APPLICATIONS IN CHEMISTRY

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

112

expression is evaluated and stored in a named storage location called a variable. Although both BASIC and FORTRAN use the equality symbol " = " for the assignment statement, Pascal emphasizes the procedural nature of the assignment statement by using the symbol " := " , thus distinguishing it from an algebraic equation. Another characteristic of procedural languages is that they specify in detail the procedures and flow of control needed to solve a problem, using such structures as conditionals and loops. Parallel to, but largely independent of, this development of procedural computational languages was the evolution of non-procedural or declarative languages used for symbolic processing. Eisenbach and Sadler (2) have reviewed the evolution of declarative languages, which began with LISP in 1960 and includes such recent languages as Prolog. One of the characteristics of declarative languages is that problems are defined in terms of logical or mathematical relationships, rather than assignment statements and flow of control, and that the language itself then decides how best to solve the problem posed and in what order to use the information provided. Declarative languages have not so far been widely used in scientific computation because of their computational inefficiency. There have also been important developments in the past decade in scientific applications software as scientists and engineers have looked for other ways of solving problems than by writing a program in FORTRAN or another procedural language. Libraries of mathemetical procedures commonly used in science and engineering became available for those who wanted to write their own procedural software but needed robust numerical algorithms in an easily used form. One of the first full scientific software packages which freed the user from writing in a compiled or interpreted procedural language was RS/1, which evolved as a part of the Prophet Network established by NIH for its research grantees in the 1970's and now runs as a separate package on DEC sur^rminis and personal computers (2). Another was the electronic spreadsheet, first embodied in its simple tabular format in Visicalc but now enhanced with plotting and sorting capabilities in Lotus 1-2-3 and several other packages. A third example is statistical software such as SPSS (4) or Minitab (5). Symbolic processing languages such as LISP led to the development of symbolic mathematics packages such as MACSYMA; their use in chemistry has been reviewed by Johnson (©. A recent ACS Symposium on symbolic algebraic manipulation contains a full description of MACSYMA among other systems, and a variety of applications in chemistry (2). Scientific applications software packages are often characterized by close attention to the design of the user interface, sometimes at the expense of program size or execution time. By far the dominant computational idiom in these packages, however, is procedural. For example, RS/1 has an internal language called RPL, modelled after the procedural language PL/1, in which specialized procedures and functions not available in the package may be written by the user. In spreadsheets, the cell is the basic storage location for either data or formulae. Cells are provided with data by an assignment process, and formulae reference other cell locations as variables.

TK!SQlygr TKîSolver (S) is a high-level computer language for solving sets of algebraic equations and tabulating or plotting their results. In TKîSolver, equations are viewed as relationships or rules, not as assignment statements, and in that sense it may be viewed as a declarative language. The basic computational approach taken by TKîSolver grew out of the research of textile engineer Milos Konopasek in the 1970's. It was realized early on by Konopasek and Papaconstadopoulos (2) that a high level computational langauge need not be procedural but could be declarative;

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

9. SMITH

A Rule-Based Declarative Language for Equation Solving

113

this point has been recently amplified by Konopasek and Jayaraman (1Q), who also make the case for TKîSolver's being an expert system for equation solving. To produce TKîSolver, the problem-solving methodology implemented by Konopasek in his Question Answering System (2) was combined with the experience in designing full-screen user interfaces of Software Arts, Inc. (the originators of the electronic spreadsheet). The goal of the language was to obviate three of the time-consuming stages of procedural program development (11): (1) algebraic transformations necessary for formulating assignment statements; (2) sequencing assignment statements to secure desiredflowof information through the program; and (3) setting up input and output statements. The capabilities of TKîSolver, which runs on a number of different personal computers, are as follows (10,11): (1) It parses entered algebraic equations and generates a list of variables. (2) It solves sets of equations using a consecutive substitution procedure (the direct solver). (3) It solves sets of simultaneous (non-linear) algebraic equations by a modified Newton-Raphson iterative procedure when consecutive substitution fails (the iterative solver). (4) It searches through tables of data and evaluates either unknown function values or arguments when required in solving. (5) It performs unit conversions with definable conversion factors. (6) It detects inconsistencies in problem formulation and domain errors. (7) It generates series of solutions for lists of input data and displays results in tabular or graphical form. To see how such a language can speed up the process of equation-solving, consider the steps needed to solve a set of algebraic equations when using a procedural language. First, you must identify the variable or variables for which you need to solve. Next, you must use algebraic substitution methods to express the variables to be solved for in terms of the known variables using assignment statements. Finally, you must write a program to input valuesforthe known variables, evaluate the unknown variables, and output the results. There are several disadvantages to this method. If a different combination of variables serves as input for another similar problem based on the same set of equations, the algebra must be reworked to solve for those new variables. In many cases it may not be possible to obtain analytic expressions usable in assignment statements, so you must find some numerical approximation algorithm suitable for the problem at hand and either obtain or write the code based on that algorithm. A Chemical Example: The van der Waals Gas Take, for example (12), the problem of solving for the P-V-T properties of a real gas obeying the van der Waals equation of state, 2

P =nRT/(V-nb)-n a/V

2

(1)

where a and b are coefficients characteristic of a given gas. Solving for P, given n, V, and Τ is a simple assignment statement, but solving for η given Ρ, V, and Τ requires considerable algebraic manipulation, followed either by applying the formula for the roots of a cubic equation or by using a numerical technique for determining roots (the latter usually requires more mathematical analysis - for example, finding first derivatives using the Newton-Raphson method). Figure 1 shows the Rule Sheet for a TKîSolver model REALGAS.TK (12). Thefirstrule is the van der Waals equation of state. The second defines the gas constant, and the third rule defines the number density. The fourth defines the compressibility factor z, a dimensionless variable which measures the amount of

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

114

ARTIFICIAL INTELLIGENCE APPLICATIONS IN CHEMISTRY

departure of a real gas from ideality. The next three rules give the critical pressure, molar volume, and temperature of a van der Waals in terms of the coefficients a and b. The Van der Waals equation can be recast in a form which uses only reduced, dimensionless variables; these are defined in the next three rules. The last two rules provides values for the van der Waals coefficients a and b when the name of the gas is given (user-defined functions with symbolic domain elements and numerical range elements can be used in any model which requires reference to built-in data tables).

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

S Rule "Equation of State of a van der Waals Gas. Chap. 4. Model name: REALGAS.TK * R = 0.0820568 "Value of gas constant *nd = n/V "Number density *z = P* V / ( n * R * T ) "Compressibility factor *Pc = a/(27*b 2) "Critical Pressure * Vc = 3 * b "Critical Molar Volume *Tc = 8*a/(27*b*R) "Critical Temperature *Pred = P/Pc "Reduced pressure * Vred = V/Vc "Reduced volume *Tred = T/Tc "Reduced temperature A

* a = acoeff ( gasname ) * b = bcoeff ( gasname )

"Function for Van der Waals a coefficient "Function for Van der Waals b coefficient

Figure 1. Rule Sheet for Model REALGAS.TK A typical use for this model would be to solve for the number of moles of a gas, given its identity, pressure, volume, and temperature. The iterative solver is used for this purpose. You must decide which variable to choose for iteration and what a reasonable initial guess is. Real gases approach ideal behavior at low pressure and moderate temperatures. Since the compressibility factor ζ is 1 for an ideal gas, and since knowing ζ along with Ρ, V, and Τ allows a calculation of n, we choose ζ as the iteration variable and 1.0 as the initial guess. The Variable Sheet with the solution to such a problem is shown in Figure 2. Unit conversions from psi to atmospheres, from cubic feet to liters, and from Fahrenheit to Kelvins have been built into the model via the Units Sheet. For input values of 100 cubic feet of acetylene at 300 psi and 66°F, there are 728.9 moles of acetylene and the value of ζ of 0.874 indicates that the deviationfromideality is 12.6%. Another Example: Acid Rain Problems in chemical equilibrium with many reactions involving many species often generate mathematical models containing large sets of simultaneous, nonlinear equations which must be solved by numerical means. TKîSolver is a good tool for solving such problems. For example, consider the acid-base chemistry of a raindrop. Vong and Charlson (12) have developed an equilibrium model which predicts the pH of cloud water, assuming an atmosphere with realistic levels of three soluble, hydrolyzable gases: SO^, C 0 , and NH . Also included is the effect of acidic dry aerosols, particles of sub-micron diameter containing high concentrations of sulfuric and nitric acid. 2

3

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

9.

SMITH

St

A Rule-Based Declarative Language for Equation Solving

Input

Name

Output

Ρ V η 728.92419 R .0820568 66.000000 Τ 'acetylen gas nam 'text ζ .87417992 nd .97443505

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

300 100

a b Pc Vc Tc Pred Vred Tred

4.39 .05136 61.638310 .15408 308.63925 .33118688 4854.9325 .94624676

115

Comment

Unit

pressure psi volume ft 3 number of moles mol gas constant l*atm/(mo temperature oF name of the gas compressibility factor decimal molar density mol/1 A

A

atm*l 2/m 1/mol atm 1 Κ decimal decimal decimal

van der Waals a coefficient van der Waals b coefficient critical pressure critical molar volume critical temperature reduced pressure reduced volume reduced temperature

Figure 2. Variable Sheet for REALGAS.TK with Solution There are five laws of chemical equilibrium relevant to the Charlson-Vong model: (1) the ideal gas law, relating gas species density to its temperature and partial pressure; (2) Henry's law, relating the partial pressure to the concentration of dissolved gas; (3) the law of mass action, giving equilibrium constant expressions for the hydrolysis reactions of the dissolved gases; (4) conservation of mass for species containing sulfur(IV), sulfur (VI), carbon(IV), nitrogen(V), and nitrogen(-IH); and (5) conservation of charge. Applying these laws, Vong and Charlson were able to calculate the pH of a raindrop by solving a set of 17 equations in 29 variables (cloud water content, temperature, partial pressures, and species concentrations) and 9 parameters (Henry's law constants, equilibrium constants, and the gas constant). They wrote a FORTRAN program which solved all equations but one, that of charge conservation. The pH at electrical neutrality was determined by a graphical method, in which the total positive and negative charge concentrations were calculated and plotted for a series of assumed pH's and the crossing point found. A TKîSolver model called RAINDROP.TK has beendeveloped to incorporate the full Charlson-Vong model of cloud water equilibrium (12), including the temperature dependence of all equilibrium constants. The iterative solver makes it possible to compute the pH at charge neutrality without having to make plots of intermediate results. The Rule Sheet is shown in Figure 3. The Unit Sheet contains a number of conversions necessary to accommodate the variety of units used in experimental atmospheric chemistry. The Variable Sheet is arranged so that the variables at the top are the ones normally chosen as input variables. Since the usual goal of running the model is to determine the pH of the raindrop, the variable p H is chosen as the one on which to iterate. The following problem, taken to match the conditions in Figure 2 of reference 13, is typical of those solved in less than one minute on an IBM PC with this model: "a cloud at 278 Κ contains 0.5 grams of liquid water per cubic meter of air. The atmosphere of the cloud contains 5 ppb sulfur dioxide, 340 ppm carbon dioxide, 0.29 μg/m of nitrogen base, 3 μg/m of sulfate aerosol, and no nitrate aerosol. What is the pH of the cloud water?' Figure 4 shows the Variable Sheet after solution. 3r

3

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

ARTIFICIAL INTELLIGENCE APPLICATIONS IN CHEMISTRY

116

Rule

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

Equilibrium pH of a Raindrop, Charlson-Vong model. Chap. 8. Model name: RAINDROP.TK PS02 = NS02 * R * Τ "Ideal gas law for S02 PNH3 = NNH3 * R * Τ "Ideal gas law for NH3 PC02 = NC02 * R * Τ "Ideal gas law for C02 PS02 = CS02 * KHS PNH3 = CNH3 * KHN PC02 - CC02 * KHC

"Henry's law for S02 "Henry's law for NH3 "Henry's law for C02

K1S = CHS03m * CHp / CS02 K2S = CS032m * CHp / CHS03m KB = CNH4p * COHm / CNH3 K1C = CHC03m * CHp / CC02 K2C = CC032m * CHp / CHC03m KW = CHp * COHm

"Mass action law for S02 - HS03m "Mass action law for HS03m - S032m "Mass action law for NH3 - NH4p "Mass action law for C02 - HC03m "Mass action law for HC03m - C032m "Mass action law for water

NTS4 = NS02 + L * (CS02 + CHS03m + CS032m ) "Mass balance for sulfur(IV) NTN3m = NNH3 + L * ( CNH3 + CNH4p ) "Mass balance for nitrogen(-ffl) NTC4 = NC02 + L * (CC02 + CHC03m + CC032m ) "Mass balance for carbon(IV) NTN5 = L * CN03m "Mass balance for nitrogen(V) NTS6 = L * CS042m "Mass balance for sulfur(VI) C^4p+CHp=ŒS03m+2*CS032m+COHm+CHC03m+2*CC032m+CN03m+2*CS042m "Chargebalance pH = -log(CHp)

"Definition of pH

"Tenperature-dependent equilibrium constants KHS = 0.379 * exp ( -3145.99 * (278 - Τ ) / ( 278 * Τ ) ) KHC= 16.6* exp (-2367.65* ( 2 7 8 - Τ ) / ( 2 7 8 * Τ ) ) KHN = 7.11E-3 * exp( -3730.87 * (278 - T ) / (278 * T) ) K1S = 2.06E-2 * exp(2003.54 * (278 - T ) / (278 * T) ) K2S = 8.88E-8 * exp (1461.46* ( 2 7 8 - Τ ) / ( 2 7 8 * T ) ) KlC = 2.94E-7*exp(-1716.92*(278-T)/(278*T)) K2C = 2.74E-11 * exp (-2217.49 * ( 278 - Τ ) / ( 278 * Τ ) ) KB = 1.5E-5 * exp ( -685.59 * ( 278 - Τ ) / ( 278 * Τ ) ) KW = 1.82E-15 * exp ( -7057.27 * ( 278 - Τ ) / ( 278 * Τ ) ) R = 0.0820565 "Ideal gas constant

Figure 3: Rule Sheet for Model RAINDROP.TK Summary of Other Chemical Applications In addition to the two examples above, I have developed TKîSolver models for the ideal gas, for two-component mixture concentrations, for acid base chemistry (including the generation of titration curves), for transition metal complex equilibria, for general gaseous and solution equilibria, and for linear regression (12). Drexel undergraduate students in both the lecture and the laboratory of physical chemistry have been using TKîSolver for such calculations as least squares fitting of experimental data, van der Waals gas calculations, and quantum mechanical computations (plotting particle-in-a-box wavefunctions, atomic orbital electron densities, etc.). I use TKîSolver in lectures (on a Macintosh with video output to a 25" monitor) to solve simple equations and plot functions of chemical interest.

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

9. SMITH

A Rule-Based Declarative Language for Equation Solving

117

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

TKîSolver has also had heavy use in the material balance course in chemical engineering, and in a mathematical methods course in materials engineering. Graduate students in chemistry are using it in research projects in spectroscopy and kinetics. In the teaching of quantum mechanics, TKîSolver has proved especially useful. For example, Berry, Rice, and Ross Q4) give several problems on the regions of Unit

Comment

Τ L

Κ g/m 3

temperature liquid water content of the cloud

PS02 PC02 NTN3m NTS6 NTN5

ppb ppm ug(N)/m 3 ug(S04)/m ug(N03)/m

partial pressure of S02 partial pressure of C02 total nitrogen base concentration sulfate aerosol concentration nitrate aerosol concentration

Input

Name

278 .5 5 340 .29 3 0

Output

A

A

pH

4.0190385

decimal

pH of water in the cloud

PNH3

2.9020E-4

ppb

partial pressure of NH3

CN03m 0 CS042m .0000625

mol/lw mol/lw

concentration of N03 anion concentration of S04 anion

NS02 CS02 CHS03m CS032m

2.192E-10 1.3193E-8 2.8395E-6 2.6344E-9

mol/la mol/lw mol/lw mol/lw

concentration of gaseous S02 concentration of dissolved S02 concentration of HS03 anion concentration of S03 anion

NC02 CC02 CHC03m CC032m

1.4905E-5 2.0482E-5 6.2915E-8 1.801E-14

mol/la mol/lw mol/lw mol/lw

concentration of gaseous C02 concentration of dissolved C02 concentration of HC03 anion concentration of C03 anion

NNH3 CNH3 CNH4p

1.272E-14 4.082E-11 3.2197E-5

mol/la mol/lw mol/lw

concentration of gaseous NH3 concentration of dissolved NH3 concentration of NH4 cation

CHp COHm

9.5711E-5 1.902E-11

mol/lw mol/lw

concentration of hydrogen ion concentration of hydroxide ion

NTS4 NTC4

2.206E-10 1.4905E-5

mol/la mol/la

total concentration of sulfur (IV) total concentration of carbon (TV)

KHS KHC KHN

.379 16.6 .00711

atm*la/mo atm*la/mo atm*la/mo

Henry's law constant for S02 Henry's law constant for C02 Henry's law constant for NH3

K1S K2S K1C K2C KB KW R

.0206 8.88E-8 2.94E-7 2.74E-11 .000015 1.82E-15 .0820565

decimal decimal decimal decimal decimal decimal la*atm/(m

equilibrium constant for S02 - HS03 equilibrium constant for HS03 - S03 equilibrium constant for C02 - HC03 equilibrium constant for HC03 - C03 equilibrium constant for NH3 - NH4 ionization constant for water ideal gas constant

Figure 4 : Variable Sheet for Solution to Sample Problem Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.

118

ARTIFICIAL INTELLIGENCE APPLICATIONS IN CHEMISTRY

bonding and anti-bonding in diatomics, one of which requires the calculation and plotting of contours of constant bonding force. They suggest calculation of the bonding force on a large grid of points and then connecting points of constant force, but with TKîSolver it is possible to solve directly the set of parametric equations in r and theta and to plot the resulting contours. In summary, the rule-based, declarative approach to solving sets of algebraic equations presented by TKîSolver has proved to be a fruitful medium for chemical computations.

Downloaded by UNIV OF CALIFORNIA SAN DIEGO on April 4, 2016 | http://pubs.acs.org Publication Date: April 30, 1986 | doi: 10.1021/bk-1986-0306.ch009

Literature Cited 1. S. Krause, A. L. Smith, and M. G. Duden, J. Chem. Phys. 1965, 43, 2144-45. 2. S. Eisenbach and C.Sadler, Byte 1985, 10, 181-87; see also other articles on declarative languages in the same issue of Byte. 3. "RS/1 Command Language Guide"; BBN Research Systems, Cambridge, Mass, 1982. 4. "SPSS X User's Guide"; McGraw-Hill, New York, 1983. 5. T. A. Ryan, Jr. ,B. L. Joiner, and B. F. Ryan, "Minitab Student Handbook"; Duxbury Press, Boston, 1976. 6. C.S. Johnson, J. Chem. Inf. Comp. Sci. 1983, 23, 151-7. 7. R. Pavelle, ed. "Applications of Computer Algebra"; Kluwer Academic Publishers, Hingham, Mass, 1985. 8. TK!Solver is a software product developed by Software Arts, of Wellesley, Mass., and introduced in late 1982. At the present time (November, 1985), the rights to distribute TK!Solver belong to the Lotus Development Corporation. 9. M. Konopasek and C. Papaconstadopoulos, Computer Languages 1978, 3, 145 155. 10. M. Konopasek and S. Jayaraman, Byte 1984, 9, 137-145 . See also M. Konopasek and S. Jayaraman, "The TK!Solver Book"; Osborne/Mc-Graw Hill, 1984. 11. M. Konopasek, "Software Arts' TK!Solver: A Message to Educators"; unpublished manuscript, October, 1984. 12. The examples in this paper are from A. L. Smith, "TK!Solver Pack in Chemical Equilibrium and Chemical Analysis"; McGraw-Hill, to be published. 13. R. J. Vong and R. J. Charlson, J. Chem. Ed. 1985, 62, 141-3. 14. R. S. Berry, S. A. Rice, and J. Ross, "Physical Chemistry"; John Wiley, 1980; Chap. 6. R E C E I V E D January 16, 1986

Pierce and Hohne; Artificial Intelligence Applications in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1986.