The use of nonlinear least squares analysis - ACS Publications

leads to a solution of a polynomial arising in a gas-phase, chemical-equilibrium problem will broaden the arsenal of equation-solving tools for chemists and chemistry students who may not be familiar with the technique. There are, however, several aspects of this letter that require comment.

The author states that bisection is superior to other successive approximation techniques such as successive substitution and the Newton-Raphson method. (The author also includes Aitken's method, which is actually a technique for accelerating convergence during use of the successive substitution method.) A claim of this kind should be based on such factors as: (a) the required level of mathematical skills, i.e., the simplicity of the method, (b) the likelihood of convergence on the desired root, (c) the rate of convergence, and (d) the applicability of the method to systems of simultaneous, nonlinear equations.

Like successive substitution, the bisection method requires no knowledge of calculus and thus both methods are attractive for chemistry students, even at an elementary level. In contrast, the Newton-Raphson method requires calculus.

Regarding the likelihood of convergence, the method of successive substitution often diverges rather than converges. It is claimed that the method of bisection, in contrast, guarantees convergence. This statement, however, requires further examination. If the function in question has opposite signs at the upper and lower bound of the interval, this does not guarantee that there is one root between them, but rather an odd number of roots. If an interval is selected which does indeed contain more than one root, then the program presented in the letter will only find one of those roots. Thus, the "guarantee" of convergence does not appear to be absolute. As a related matter, the author claims as an advantage of the bisection method its lack of dependence on a single first guess. Clearly, however, the success of the bisection method depends strongly on the judicious choice of two "first guesses".

Although bisection certainly gets "high marks" for simplicity and "likelihood" of convergence, it scores very poorly on rate of convergence. Norris (1) points out that bisection is a ½-order process and thus is very slow in its convergence. Successive substitution and the Newton-Raphson method, however, are first and second order processes, respectively, and as a result are both more rapidly converging than bisection.

Finally, the method of successive substitution and the Newton-Raphson method are both easily extended to systems of simultaneous, nonlinear equations (2, 3). By contrast, it appears very difficult to extend the bisection method to such problems without a very great increase in complexity. For simultaneous equations other "search" procedures, like the simplex method (4), would appear to be much preferred.

In summary, then, bisection is mathematically simple, will usually converge, has a very slow rate of convergence, and is extremely difficult to apply to two or more simultaneous equations. Certainly there are many chemical problems where bisection is an appropriate method of solution, especially when a microcomputer is available to remove the frustration of slow convergence.
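
The trade-off described above, a bracketing method that is simple but slow against a derivative-based method that is fast, can be illustrated with a short Python sketch. The cubic x^3 - 2x - 5 = 0 used here is a hypothetical test equation, not the equilibrium polynomial from the letter under discussion, and the function names, starting values, and tolerances are assumptions chosen for the example. The last call runs bisection on an interval that brackets three roots, to show that only one of them is returned.

```python
def bisect(f, lo, hi, tol=1e-10):
    """Bisection: return (root, iterations). f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo == 0.0:
        return lo, 0
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    steps = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        steps += 1
        if f_mid == 0.0:
            return mid, steps
        if f_lo * f_mid < 0:
            hi = mid                  # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # sign change lies in the upper half
    return 0.5 * (lo + hi), steps


def newton(f, df, x, tol=1e-10, max_steps=50):
    """Newton-Raphson: return (root, iterations). Needs the derivative df."""
    for steps in range(1, max_steps + 1):
        dx = f(x) / df(x)
        x -= dx
        if abs(dx) < tol:
            return x, steps
    raise RuntimeError("Newton-Raphson did not converge")


def cubic(x):
    # Hypothetical test equation, chosen only for illustration.
    return x**3 - 2.0 * x - 5.0


def cubic_slope(x):
    return 3.0 * x**2 - 2.0


def three_roots(x):
    # Roots at 1, 2 and 3; opposite signs at 0 and 3.5.
    return (x - 1.0) * (x - 2.0) * (x - 3.0)


root_b, steps_b = bisect(cubic, 2.0, 3.0)
root_n, steps_n = newton(cubic, cubic_slope, 2.5)
one_root, _ = bisect(three_roots, 0.0, 3.5)

print(f"bisection: {root_b:.8f} after {steps_b} iterations")
print(f"Newton-Raphson: {root_n:.8f} after {steps_n} iterations")
print(f"bisection on an interval holding three roots returns {one_root:.8f}")
```

Both routines reach the same root of the cubic, but bisection needs many more iterations because each step only halves the bracket. On the three-root interval, bisection converges without complaint to a single root and gives no hint that two others exist.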

J. G. Eberhart
University of Colorado
Colorado Springs, CO 80933

"The Use of Nonlinear Least Squares Analysis"

To the Editor:

In his article in this Journal [1984, 61, 778] on "The Use of Nonlinear Least Squares Analysis", Copeland discusses a specific program for medium-to-large computers and comments that there are less sophisticated programs that will run on microcomputers. I have no idea which particular programs he has in mind in this regard. I would like to point out, however, that there is at least one such statistical data treatment program that has been available for some time, that is readily used with microcomputers, and that gives parameters identical with those quoted for the treatment of the "simulated experimental data" listed in Table 4 of Copeland's article (1). I am referring to the use of the Marquardt algorithm (2) in the CURFIT program described by Bevington (3).

For a number of years, I have been making use of this program in a version which is appropriate for the Commodore PET microcomputer series (32K). This version was prepared by Robin Cox of this department by translation of the original Fortran listing of Bevington into the appropriate Basic language; I am grateful to Cox for generously making it available. Modifications for use on essentially any microcomputer would be quite simple. This program can be used for fitting essentially any function of a single variable containing up to 10 parameters by simply providing the algebraic function and the first derivatives of this function with respect to each parameter. I have used this procedure for fitting a wide variety of sets of kinetic data, pH-rate profiles, spectrophotometric and potentiometric titration data, first- and second-degree rate equations for steady-state enzyme kinetic data, etc.

I also wish to point out that the use of simulated data such as that listed in Copeland's Table 4 is not a realistic test of the usefulness of any particular data fitting technique to a real set of experimental data.
The data in this table are correct to five significant figures and show no random error such as would be present in any real set of experimental data. In a practical two-site binding experiment, such accuracy is quite out of the question.

To demonstrate the usefulness of the CURFIT technique for data fitting on the microcomputer, I have modified Copeland's data set in two ways and subjected it to parameter evaluation. First, I corrected the individual data points to four, three, and two significant figures and individually fitted these data sets to the appropriate four-parameter equation. Results of these fits are given in Table 1. Second, I randomly assigned an error in the range ±1, ±2, ±5, and ±10% to each data point and evaluated the parameters. These parameter evaluations are listed in Table 2.

In both Tables 1 and 2, the n1 and k1 parameters are determined more precisely than n2 and k2 at all levels of experimental data accuracy. This observation will hold no matter what technique of parameter evaluation is adopted and is inherent in the relative values of these four parameters in the current case. Note that the n1[A]/(k1 + [A]) term makes a far greater contribution to v than does n2[A]/(k2 + [A]) for all values of [A] in the range 0.1-2.0. The contribution of the latter term varies from 50% of v (at [A] = 0.1) to only 13% (at [A] = 2.0), and for 18 of the 20 data points this contribution is less than 30%. In such circumstances, there is little chance of precisely determining the n2 and k2 parameters, especially when realistic experimental errors are present as in the latter two entries in Table 2.

I make no claims that the CURFIT technique is the most statistically sound data treatment in all cases, but I have found it useful for parameter evaluation in a variety of experimental situations and can recommend its use highly. Like the PAR program for mainframe computers (1), the same program can be used for a wide variety of mathematical functions, with only minor changes to accommodate the specific equation of interest. I feel that the availability of this convenient data-fitting technique for use on microcomputers should be publicized more widely.

Volume 65 Number 9 September 1986 839

Table 1. Parameters for s Significant Figures (a)

s    n1                     n2               k1                   k2                   RMS
5    100.00002 ± 0.00009    9.997 ± 0.002    0.99989 ± 0.00006    0.00998 ± 0.00002    0.00032
4    99.99 ± 0.01           10.00 ± 0.02     0.999 ± 0.0006       0.0100 ± 0.0002      0.0032
3    100.1 ± 0.1            10.0 ± 0.2       1.002 ± 0.006        0.010 ± 0.002        0.033
2    99.6 ± 0.9             12 ± 2           1.07 ± 0.06          0.02 ± 0.01          0.25

(a) Data of Table 4 of ref. (1) corrected to s significant figures, and fitted to the two-site binding equation. Parameters n1, n2, k1, k2 as defined in ref. (1); RMS is the root mean square error.

Table 2. Parameter Evaluation after Assigning ±p% Random Error

p     n1            n2         k1             k2              RMS
1     99.8 ± 0.8    8 ± 2      0.91 ± 0.05    -0.01 ± 0.01    0.333
2     98 ± 2        13 ± 5     1.1 ± 0.2      0.03 ± 0.04     0.64
5     97 ± 3        5 ± 5      0.8 ± 0.2      -0.04 ± 0.06    1.62
10    90 ± 30       10 ± 40    0.9 ± 0.8      0.1 ± 0.4       2.98

Literature Cited
1. Copeland, T. G. J. Chem. Educ. 1984, 61, 778.
2. Marquardt, D. W. J. Soc. Ind. Appl. Math. 1963, 11, 431.
3. Bevington, P. R. Data Reduction and Error Analysis for the Physical Sciences; McGraw-Hill: New York, 1969; p 237.

John W. Bunting
University of Toronto
Toronto, ON M5S 1A1
Canada
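
The percentages quoted in the letter for the second binding term can be checked with a short calculation. The Python sketch below rests on two assumptions that are not stated explicitly in the letter: the parameter values n1 = 100, n2 = 10, k1 = 1, and k2 = 0.01 (inferred from the fitted values in Table 1) and a grid of 20 evenly spaced concentrations, [A] = 0.1, 0.2, ..., 2.0 (Copeland's Table 4 is not reproduced here). The names `site_terms` and `site2_fraction` are illustrative only.

```python
# Assumed parameter values, inferred from the fitted values in Table 1.
N1, N2, K1, K2 = 100.0, 10.0, 1.0, 0.01


def site_terms(a):
    """Return the two terms of v = n1*[A]/(k1 + [A]) + n2*[A]/(k2 + [A])."""
    return N1 * a / (K1 + a), N2 * a / (K2 + a)


def site2_fraction(a):
    """Fraction of v contributed by the second (n2, k2) term."""
    term1, term2 = site_terms(a)
    return term2 / (term1 + term2)


# Assumed concentration grid: 20 points from 0.1 to 2.0 in steps of 0.1.
concentrations = [round(0.1 * i, 1) for i in range(1, 21)]
fractions = [site2_fraction(a) for a in concentrations]
below_30 = sum(1 for f in fractions if f < 0.30)

for a, f in zip(concentrations, fractions):
    print(f"[A] = {a:.1f}: second term contributes {100.0 * f:.1f}% of v")
print(f"points with contribution below 30%: {below_30} of {len(fractions)}")
```

Under these assumptions the second term supplies half of v at the lowest concentration, about 13% at the highest, and less than 30% at 18 of the 20 points, in line with the figures given in the letter.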

To the Editor:

In his letter, Bunting raises two important issues. The first is scientific, and the second is philosophical.

First, I fully agree with his additional analysis of my simulated data. The analysis that I used on the simulated data was not meant to be a thorough test of the capabilities of the PAR program. Rather, it was meant to show how the program could be used. The type of analysis that Bunting does is a very useful exercise that should always be done before designing an experiment. To complete the analysis, I would also vary the number of points used and introduce error values for each point based on actual experimental conditions. The last step is necessary if points have different relative errors depending on their values. This is often the case at one extreme or at both extremes of the data. This pre-experiment exercise can often greatly increase the usefulness of the experiment being done.

The second issue is whether to use locally written programs (on any size computer) or to use the nationally recognized statistical packages that are available at most research institutions. Bunting has implemented a sound, well-known nonlinear least squares method from a standard text. The program could be implemented on almost any microcomputer with a modest amount of memory. Unfortunately, I have seen many other published papers that use outdated or statistically unsound methods or that never describe the methods used. Many people simply do not have the time or the expertise to implement good computer programs for complicated statistical analysis. The use of high quality statistical packages (such as BMDP, Minitab, SAS, or SPSS) allows many people to use very well established methods without the large investment in time required to understand thoroughly all the subtleties of the methods and to write the programs.

In addition, once you are familiar with a statistical package, it is very easy to do many different analyses. For example, the BMDP package, in addition to having a nonlinear regression program (PAR), has programs for doing linear regression, polynomial regression, multiple-linear regression, stepwise regression, and all possible subsets regression. All of these programs use the most current statistical techniques for analyzing outliers and non-normally distributed residuals. SAS has similar programs and, in addition, the SAS nonlinear regression routines implement four different methods of calculation (modified Gauss-Newton, Marquardt, gradient, and multivariate secant (DUD)). Minitab and SPSS also provide a wide range of regression analysis options but do not contain nonlinear regression analysis.

These packages are updated on a regular basis and new state-of-the-art methods are added routinely. Statistics is a very active field of research and new methods are routinely developed. No research chemist can be expected to be familiar with these new techniques as soon as they are developed. Nor is it useful for chemists across the country to be implementing all these new techniques. The companies that put out the major statistical packages have professional statisticians and programmers on their staffs. We should recognize and use their expertise.

A final point concerns the use of statistical packages in undergraduate and graduate teaching. If students learn to use one of these statistical packages as a regular part of experimental science, they can carry that knowledge to most major research centers and continue their work easily. The same often cannot be said for locally produced programs that may or may not be easily transported.

More information on these packages can be obtained from the following:

BMDP
1964 Westwood Blvd., Suite 202
Los Angeles, CA 90025

Minitab Project
213 University Pond Laboratory
University Park, PA 16801

SAS
Box 8000
Cary, NC 27511

SPSS
Suite 3300
444 North Michigan Avenue
Chicago, IL 60611

Thomas G. Copeland
Middlebury College
Middlebury, VT 05753

840 Journal of Chemical Education
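
The exercise discussed in this exchange, fitting simulated two-site binding data and then degrading it before an experiment is designed, can be sketched in Python. What follows is a minimal Marquardt-style iteration written for illustration; it is not Bevington's CURFIT listing, the Basic translation, or the PAR program. The parameter values (n1 = 100, n2 = 10, k1 = 1, k2 = 0.01), the 20-point concentration grid, the starting guess, and the uniform distribution used for the ±p% error are all assumptions, and every function name is hypothetical. As in the description of CURFIT above, the user supplies the function and its first derivatives with respect to each parameter.

```python
import random

TRUE = (100.0, 10.0, 1.0, 0.01)  # assumed n1, n2, k1, k2 (not from the letters)


def model(a, p):
    """Two-site binding curve: v = n1*a/(k1 + a) + n2*a/(k2 + a)."""
    n1, n2, k1, k2 = p
    return n1 * a / (k1 + a) + n2 * a / (k2 + a)


def derivs(a, p):
    """First derivatives of v with respect to n1, n2, k1, k2."""
    n1, n2, k1, k2 = p
    return [a / (k1 + a), a / (k2 + a),
            -n1 * a / (k1 + a) ** 2, -n2 * a / (k2 + a) ** 2]


def chi_square(xs, ys, p):
    """Unweighted sum of squared residuals."""
    return sum((y - model(x, p)) ** 2 for x, y in zip(xs, ys))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def marquardt_fit(xs, ys, p0, max_iter=500):
    """Damped least squares: scale the diagonal by (1 + lam), adapt lam."""
    p, lam = list(p0), 1e-3
    chi = chi_square(xs, ys, p)
    n = len(p)
    for _ in range(max_iter):
        alpha = [[0.0] * n for _ in range(n)]   # curvature matrix
        beta = [0.0] * n                        # gradient-like vector
        for x, y in zip(xs, ys):
            d, resid = derivs(x, p), y - model(x, p)
            for i in range(n):
                beta[i] += resid * d[i]
                for j in range(n):
                    alpha[i][j] += d[i] * d[j]
        improved = False
        while lam < 1e12 and not improved:
            damped = [row[:] for row in alpha]
            for i in range(n):
                damped[i][i] *= 1.0 + lam
            step = solve(damped, beta)
            trial = [pi + si for pi, si in zip(p, step)]
            trial_chi = chi_square(xs, ys, trial)
            if trial_chi < chi:                 # accept and relax the damping
                p, chi, improved = trial, trial_chi, True
                lam = max(lam / 10.0, 1e-12)
            else:                               # reject and damp harder
                lam *= 10.0
        if not improved:
            break
    return p, chi


def add_percent_error(ys, percent, rng):
    """Perturb each value by a uniform relative error within +/- percent %."""
    return [y * (1.0 + rng.uniform(-percent, percent) / 100.0) for y in ys]


xs = [round(0.1 * i, 1) for i in range(1, 21)]   # assumed grid: 0.1 to 2.0
ys = [model(x, TRUE) for x in xs]                # error-free simulated data
fit, chi = marquardt_fit(xs, ys, (95.0, 12.0, 0.9, 0.015))
noisy = add_percent_error(ys, 5.0, random.Random(0))
print("fit to error-free data:", [f"{value:.5g}" for value in fit])
```

Passing `noisy` (or data rounded to fewer significant figures) to `marquardt_fit` in place of `ys`, and repeating with different numbers of points or point-specific error levels, gives the kind of pre-experiment check recommended above. The sketch reports no parameter uncertainties; those would require the inverse of the curvature matrix, which is omitted here for brevity.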