edited by RUSSELLti. BAT^ Kenyon College Gambier, OH 43022
Do Loops in Spreadsheets Benson R. Sundhelm New Yark Unlverslty 4 Washington Place. New York. NY 10003 The forte of computers is their ability to do highly repetitive calculations rapidly. The instruction to repeat some set of instructions many times is usually given by the DO loop; e.g., in FORTRAN in BASIC in PASCAL DO 1001=1,N FORI=l toN WHILE.. .DO 100 .
..
NEXT I
END
The common spreadsheet1 does not ind u d e looping in its syntax (except within macros-see below), so that it not always recognized that many computations usually programmed in high-level languages, (1-3) can also be readily carried out in spreadsheets. Here we discuss four computations of this sort that are useful to chemists and that entail various sorts of looping. This discussion is directed toward teachers. The emphasis is on the way in which repetitive structures may be expressed in a spreadsheet lavout. Snecific realizations of each examplc and write-ups directed toward students are givcn in another place: In itssimplest form theDO loupsmakesit possible to program a repetitive operation with a compact statement. The equivalent in a spreadsheet4is the (relative) COPY operation. Any instruction or group of instructions can he copied as many times as required by the loop range.
Rate Equations An example of this form of DO loops is afforded by the integration of a set of chemical rate equations using the simple Euler a l g ~ r i t h mThe . ~ scientific problem is the expression of the conservation of matter. This takes the form of "master equations"; ie., a set of simultaneous first-order (nonlinear) differential equations, e.g., Child (5). For example, consider the integration of a kinetic schema for a free radical reaction, well known to be difficult and awkward to handle analytically:
A300
Journal of Chemical Education
Br,
k 4 2Br
B~.+H,% HB~+H. H.
+ ~ r , % HBr + Br. b
H+HB~+H,+BI. Br.
+ Br.
k
H.+H.% H.
Br,
H,
+ Br. % HBr
Arate equation is required for each chemical species. Thus
While analytic integration is preferable, it is frequently (as here) infeasible so that we must have recourse to numerical integration. In thiscase that means the integration of this set of equatronr for specified initial cunditionr throurh a series ofdrscrete tmw steps. A ~ortran;ealization of this process would uodate the concentrations at each time step, carrying out the repetition by a DO loop. In a spreadsheet realization each coucentration is stored in a specified cell along with a formula for calculating its value at the next time step. For [B.] in this example we would have
Each of the concentrations is identified with a relative cell address and each of the constants with an absolute cell address. A set of initial concentrations and rate constants is specified, and then the set of basic eoneentration cells is replicated with a single "COPY" instructions for, say, 100 steps. Within a few seconds the complete set of computed data appears. I t may be inspected and then plotted with simple instructions. The rate constants, initial conditions, time step and number of steps can be changed at will, and the spreadsheet will automatically update itself within seconds. The only modification required to study a different schema is the rewriting of the formulas in the basic cells (one for each of the species appearing in the rate equations) and repeating the copying step. Since the relation hetween the expressions inserted in the cells and the model is very direct, student attention is focused on the model rather than on the computational details. The complexity of the reaction scheme is irrelevant. I t is as easy to set up a free radical chain reaction as a first-order decomposition. The results appear rapidly and in convenient form, so that a feel for the dynamics can be obtained by varying the input data.
As a very simple (and powerful) example ofaslightly d~fferentkind ofreperition usuallv carried out hv a DO l o o ~we . look at the process of couvoiution. As described by Savitsky and Golay (61,smoothing andlor differentiation of data by least-squares procedures can be conveniently accomplished by calculating the convolution of the set of equally spaced data points with the appropriate function, given numerically in their various tables. The sample imnlementation. given in their program 1, is-for a ~ ) - ~ o i n t cubic smoothine of Ndatanoints. That is. 9 suucessive puintssre fit, in the least-squares sense, t u a pdynumial,snd the central point is replaced in the smoothed set by its value calculated from the polynomial. The set of points under consideration is slid down one step and the procedure repeated. In their Fortran listing, the essential operation is
-
carried out in two nested DO loops, the auter .-one aver the data noints and the inner over the convoluting coefficients. According to their prescription the smoothedvalue at a given point is composed of a weighted sum over the set of 9 points centered on the given point, the weighting coefficients heing given in their Tahle I as 59 for a,, 54 for a,,,, 39 for a,,~, 14 for a,*% and -21 for a,*&, all divided by 231. Each succeeding point is treated in the same way, the six endpoints being treated separately. ~
In order to carry out this operation in a spreadsheet, we hegin with thedata listed in a column, say A3 to A103. I n cell H5 enter
This leads to the smoothed point being entered in cell B5. By the (relative) COPY instruction, eell B5 is replicated from B6 to B96. The automatic updating immediately fills column B with the 9-point least-squares smoothing of the raw dsta listed in column A. (The three pointsat either end cannot be smoothed by this algorithm.) In addition to smoothing, the convolution procedure as implemented here can also provide formulas for filtering and differentiating dsta by modifications of the weighting coefficients as described in ref 6 and for computing the effect of (linear) instrumental response functions on inputs.
R o d Extraction Slightly more complex DO loops occur when a variable used at some step has been changed by an earlier step within the loop ("reflexive looping"). This may include IF statements whose outcome depends on calculations made earlier within the loop. These may also be dealt with in spreadsheets by use of so-called "circular references". An example of their utility isprovided by root extraction. Since the determination of mots of algebraic andlor transcondental equations is one of the most common computational
tasks arising in the early part of chemical training, a reliable method is an important part of the computational armentarium. Root bisection is the surest (albeit the slowest) method (& e.g., Press e t al. (7)) and it has the additional virtue of heing easily and completely understandable even to students who have not yet had calculus. We restrict ourselvesto the case where the root is known to be real and positive on physical grounds (e.g., a concentration). Then the steps are to bisect the assimed ranee. - . evaluate the function at the midpoint, test its sign. and, depending on the result, move eithrr the upper or lower boundary and then repeat the procedure. I t is possible to set the defaults so as to iterate,say 10times, at a keystrokeand to specify the order of calculations, say by
gramming language permitting a group of instructions to be executed upon a single command. These include a rudimentary subroutine facility and a DO instruction that is limited to subroutine calls. As an example of the power of this facility we turn to calculation of a Hermite polymomial via series expansion. (A similar ealeulation of Legendre polynomials is described in detail by Orvis ( 8 ) ) .The relevant expression to he evaluated is
TOWS.
The summation, which has a finite number of terms, requires the calculation of a number of factorials. In the soreadsheet calculation, the fartnrrals are computed in a rubroutine that is called from n DO loop in the macro.
..
~~~
In order to prevent the repetitive loop from occurring until we a r e ready, s "switch" must be installed. For example, the switch eell may he loaded with a zero and the upper and lower limit cells given the instruction to use the corresponding initial values if the switch cell contains a zero. If that cell value is not zero, then the values calculated in the loop are used, so that entering 1in the switch cell will then start the recalculation procedure. The number of loops may be limited either by setting the number of repetitions in the DEFAULT1 RECALCULATION/ITERATION menu or, alternatively, the recalculation may beset to manual and the loop executed one step at a time (Function key 9 inQUAITRO) so that the student can watch the process of convereence. The nromam. aside from Labels that .. may be added fur rlarn,, occupies only 8 cells. Once the ~ t u d e n has t srudird the process by stepwiac manual operation, he she may then use a formulation in which a specified number of cycles is invoked sutomatically or until a predetermined accuracy is obtained and for which the signs of the funetion a t the upper and lower boundaries ere automatically checked. (See the macro discussed below.)
.
'
For example. QUATTROor OUATTRO PRO by Borland International,LOTUS 123by Lotus, EXCEL by Micrasolt Submined to J. Chem. Educ.: Software. For convenience we will use the Fonran erpressions. it should beposeibleto transcribethese readily to the corresponding expressions in Pascal or Basic. We assume some degree of familiarity with Fortran by the reader. F o r definitenesswe will usethe notation of t h e Borland prOdUCI OUATrRO Transcrlplion to otnsr popular s ~ e a a s h e esnould t ~ be easy We sssdme some degree of famlliardy with spreadsheeloperations. See, for example, ref 4. Shce any desired accuracy can be obtained by varying the time step, for this kind of calculation there is no significant advantage to using highw order infearation rwtins. Furthermore. the direct expression is physically appealing to students unsophisticated in numerical analysis.
Macros Spreadsheets also include a so-called macro facility that provides a kind of proVolume 67
(Continued on page A3021
Number 12 December 1990
A301
the computer bulletin board
The macro carries out an initialization Loop over the terms and computation of each term. Looping is carried out by the macro instruction FOR, which identifies a cell to be used as the counter, gives the starting value, s t e p a n d terminal value of t h e counter, and the cell location where the suhroutine far calculating and summing each term begins, This subroutine in turn calls a second subroutine for calculating factorials, the variables being passed to the subroutine by giving the addresses where they are stored. In operation, n and r are loaded into designated cells and the macro executed by entering its name, the result appearing in another designated cell. By judicious use of a copy operation, the polynomial may be evaluated for a series of values of x and then plotted for inspection. Any of the commonly encountered polynomials (Hermite, Legendre, Laguerre, Chehyehev, etc.) as well as thespherical harmonies can be computed in this way. Concluslons For most of the calculations of interest to chemists, particularly for undergraduates, spreadsheets afford a simple (because menu driven) and flexible computing "language". Rather sonhistieated DrOEIamS can be devised withbut resorting &ore highly stylized compiled languages. In the crowded chemical curriculum, there is a real advantage to introducing entering students to spreadsheets and then continuing their use as the physical and mathematical problems mow more challenping. As shown here, it is not necessary to s&rifice computing power in this process.
A Comparison of Three Software Programs to Solve Acid/ Base Problems
set up of the prohlem, and they give results that are equivalent to those obtained from a spreadsheet. The two problems illustrated here involve finding the pH of a solution of acetic acid and the same prohlem illustrated hy Parker and Breneman. The three programs used to solve the problems are Quattro, a LOTUS 1-2-3 compatible spreadsheet from Borland', Eureka: The Solver also from Borland, and the Student Edition of MathCAD Version 2.0 from Addison-Wesleyz. All three of these programs run on PC-compatible computers, and the Borland programs are available at educational nrices ~rovidedvou are either a student or faculty membir at an educational institution. The simultaneous equations used to ealculate the pH of monosodium sslt solutions of diprotic acids are illustrated in the article by Parker and Breneman and will not he reproduced here. The equations usedto calculate the pH of an aeetie acid solution are as follows:
a
K, = [H30t] [OH-]
+ [A-]
(3)
+ [OH-]
(4)
C, = [HA] [H30+]= [A-]
These equations are rearranged togive eq 5, used to make the successive approximation calculations for the spreadsheet solution to the prohlem.
+ [OH-]) + K,))'.'
($KAS(A4 - C4
+$KW/B4
Journal of Chemical Education
(6)
+ 04) + $KW)^0.5(7) (8) +$KAa(A4 - C4 + D4)/B4
Edgar H. Nagel Valparalsa Unlversily Vslparaioo. IN 46383
A302
(5)
S p r e a d s h e e t Solutlons The Quattro results for the monosodium sslt of malic acid are the same as in the anicle by Parker and Breneman The results for aceucarid are calculated ualng the following equations. Initial Acid Concentration
A recent artide bv Parker and Breneman (9) suggests the w e of a spreadsheet to generate numericalsolutions for the pH of ROIUtions of monosodium s a l u of diprotic acids. Spreadsheetsare good at solving these types of problems. However, difficulties arise with circular logic and the authors are careful to point these out. Circular Logic arises when two different cells reference each other. Equation solvers allow you to type in the equations so that they look like the original
(2)
-@LOG(B4)
(9)
(10)
A P E 4 are cell references. Column B contains [H30t], C contains [A-], D contains [OH-], and E contains the pH. 5K.4 and
'
BorIand imernatlonal. Inc., 4585 Scotts Valley Drlve, Scom Valley, CA 95066. Z A d d l ~ ~ n - W e ~ I Publishing ey Company. Inc.,
$KW are references to cell names that contain the appropriate constants. An example of circular logic occurs in eq 7 since it uses the resulta of eqs 8 and 9, which use the results of eq 7. The spreadsheet calculation, for aeetie acid, fails when the initial concentration of acid is 10-5 or less. 0uattro indirates failures by placing ERR incells. When ERR occurs in cells that arc wed in circular calculations, the only way to continue is to erase the circular cells that have ERR in them, change the cell that caused the error, copy the equations hack into the appropriate cells, and recalculate. ~~~~~~~
~
~
Eureka Solutlons The Eureka equations for solving the monosodium salt prohlem are similar to the defining relationships in the article hy Parker and Breneman and the results are identical to those of the spreadsheet for both high (0.10) and low (0.00001) concentrations. The differences between Eureka and the soreadsheet are that it is more difficult to obtain output ior a series of roncentrntims in t a l ~ l a form r and thr Eurekn Iterations are hidaen from the user. In gen~ral. equation sblvers set u p their own internal iterations and spreadsheets require the user to set up the iterations. The acetic acid equations for Eureka are entered as they appear in eqs 1 to 4 except that subscripts and superscripts are not used, all [ ] are removed, and [HsOt] becomes H. Initial guesses of H = (K.*C.)O5, HA = C,, OH = K,/H, and A = K.*HA/H are added slang with the constraints H > 0 and OH > 0. The final results from Eureka may sometimes give physically impossible answers. The results depend on the initial guesses and on the "accuracy" setting under the Options menu. Eureka's first attempt gave amaaimum error of 2.83-07,pHof 2.84 and [OH-] = 2.963-4. Even though the maximum error is small, the value for [OH-] is impossible. After several iterations, the maximum error was reduced to 9.993-15 with a pH of 2.88. However, 1.323-17 for [OH-] is still impossible. If the OptionsSetting-Accuracy is changed to 1E-25, correct answers are obtained for all concentrations of acetic acid. Eureka has a tendency to "remember" results. If you use the default accuracy to solve a prohlem and then change the accuracy to get a better answer, you also have to rewrite one of the equations to force a new calculation. If that does not work, you may have to return to DOS, start Eureka, make changes to the options, and then solve the problem. Student Edltlon of MathCAD The monosadium sslt problem for the Student Edition of MsthCAD are the same as Eureka. The acetic acid equations for MathCAD are similar to the Eureka equa-