Convergence of Generalized Simulated Annealing with Variable Step

borious and time consuming or frequently converge to sets of local optima. Optimizations based on generalized simulated annealing. (GSA) have been sho...
1 downloads 3 Views 493KB Size
Anal. Chem. 1001, 63, 2303-2306

attempted to do electrochemistrywith this composite. Work is underway to use a glassy-carbon rod instead of a plate. Composites formed with the rod can then be used in the current probe design for electrochemical characterization.

2383

To whom correspondence should be addressed. ’Department of Chemistry, Iowa State Universlty, Ames, IA 50010.

Duane E. Weieshaar* Brian Lamp1 Pat Merrick Scott Lichty

LITERATURE CITED (1) Kaaret, T. W.; Evans, D. H. Anal. Chem. 1988, 60, 657-662. (2) SChiaVOn, G.; ZOtti, G. A M I . chlin. AClo 1989, 211, 27-41. (3) Katayama-Aramata, A.; Ohnkhl, R. J . Am. Chem. Soc. 1983. 105, 658-859. (4) Ogumi. 2.; Nlshb, K.; Yoshlzawa, S. €lecPo&lm. Acta 1081, 26, 1779-1782. (5) Sanazin, J.; T a l k , A. J . Eleciroenel. W”. I n t w f a d e l E ~ . 1982, 137, 183-188. (6) SchleffW. G. W. A M I . Chem. 1985. 57, 2745-2748. (7) Wdmagen, J. L.; Soltes, E. J. J . Chem. Educ. 1977. 5 4 , 619. (8) Wassemggw, H.: In Anhalt, D.: Jaeger, K. US. Patent 2204538, June 11. 1940.

Department of Chemistry Augustana College Sioux Falls, South Dakota 57197 RECEIVED for review April 2, 1991. Accepted July 19, 1991. We are grateful for supporting granta from William and Flora Hewlett Foundation of Research Corp. and the Augustana Research and Artists Fund.

Convergence of Generalized Simulated Annealing with Variable Step Size with Application toward Parameter Estimations of Linear and Nonlinear Models Sic Numerous chemical processes often require some sort of optimization. Over the years, diverse optimization techniques have evolved. Unfortunately, many have limited uses (1-4). That is, various optimization practices are often laborious and time consuming or frequently converge to sets of local optima. Optimizations based on generalized simulated annealing (GSA) have been shown to alleviate handicaps concerned with alternate methods of optimizations (4-8). GSA was developed for ascertaining global maxima or minima located on continuous multidimensional response surfaces (5). The procedure is b e d on simulated annealing, a modified Monte Carlo method (6,8-10). Recently, GSA was adapted for optimization of discrete functions (11). GSA, in its present form, must maintain a constant step size as it sequentially searches a multidimensional response surface for optimal conditions. Typical optimizations by GSA necessitate the algorithm to first locate near-global conditions. After which, another GSA optimization could be started at the near global conditions with a reduced step size. The processes can continue until GSA identifies the exact global position on the response surface. This approach has been shown to be quite successful (11). Obviously, significant enhancements are conceivable if GSA is able to automatically reduce or increase the step size on its own accord. Most importantly, convergence to global optima without human intervention would become achievable. This paper is concerned with the development of a variable step size GSA (VSGSA). The process is tested on a mathematical function previously optimized by GSA (4, 5). Additionally, mathematical model parameters are estimated for a linear and nonlinear model. VARIABLE STEP SIZE GENERALIZED SIMULATED ANNEALING GSA was developed by Bohachevsky et al. (5) and represents a variation of simulated annealing as related to the annealing processing of a solid. The theory and applications of simulated annealing and GSA have been amply discussed and reviewed in the literature (4-11). Here, we will modify GSA to incorporate variable step sizes during an optimization experiment. Specifically, VSGSA will be capable of automatically increasing or decreasing the step size. Toward the end of an optimization,step sizes should continuously shrink 0003-2700/9 1/0363-2383$02.50/0

allowing for the convergence of VSGSA to the exact global optimum. Briefly, with regard to “ h a t i o n of an objective function, GSA begins by computing the objective function C, for an initial random current position on the response surface. A new random position is then selected in the neighborhood of the current followed by computation of its objective function C,. If the new objective function is less than the current, acceptance of the new position as the current position ensues and the process continues. If the new objective function increases, then it is accepted as the current with probability I

P

-)

P = exp( -PAC

cc - c,

where AC = C, - C,, C, denotes the global minimum for the objective function, and j3 symbolizes a control parameter empirically chosen. Other sequential optimization techniques do not accept detrimental movements that produce convergences to local optima. GSA, however, accepts detrimental positions if P is greater than or equal to a random number p acquired from a uniform distribution on (0,l). Otherwise, selection of a new random position occurs and the process repeata. Thus, GSA can be described as a b d random walk and can walk off of local optima and enter the global area. The subjective choice of j3 depends on the initial magnitude of AC. Past studies have shown that in order for 5040% of early detrimental positions to be accepted, j3 should be chosen so that 0.5 P < 0.9 ( 4 , 5 , 7,11). As GSA advances towards C,, P diminishes in value, resulting in acceptance of less detrimental positions. At C,, P equals zero and detrimental positions are no longer accepted. In order to determine, n, the vector containing the d coordinates for C,, a corresponding d X 1 vector of random numbers, v, must be generated. The random numbers are chosen from an N(0,l) distribution and normalized to unit length. With a step size equal to Ar, n = c + Arsv where c signifies the vector encompassing the coordinates for C,. Termination of GSA can occur after a specified number of unaccepted consecutive evaluations. Alternatively, GSA may be terminated when IC, - C,l < c or IC, - C,l < c where c expresses the respective desired level of precision. Another Q 199 1 American Chemical Soclety

2384

ANALYTICAL CHEMISTRY, VOL. 63, NO. 20, OCTOBER 15, 1991

stopping criterion can be when the parameters undergoing optimization are no longer changing beyond the proper number of significant digits. Further explicit discussions on the operation of GSA can be found in refs 4 and 5. Since past studies have shown that GSA was unable to converge exactly to global optima and only locate near global optima, a VSGSA was developed. Reference 7 investigated a variable step size simulated annealing based on a Choleski decomposition of the covariance matrix for the evaluation positions. The decomposition approach was attempted, but results were far from acceptable. Instead, this investigation adjusts step sizes depending on an acceptance ratio computed as the ratio of the number of positions accepted to the total number of positions evaluated. If more than 90% of the evaluation positions for a block of standard GSA evaluations are accepted, GSA is generally far from the global optimum and the step size should increase. If less than 20% of the evaluation positions are accepted, GSA is probably near the global optimum and the step size should decrease. Hence, an essential decision involves determining how large of a block of standard GSA evaluations is necessary before an acceptance ratio should be calculated. The fewer the number of evaluations, the faster the algorithm can proceed. However, if insufficient evaluations are attempted, improper estimates of the acceptance ratio transpires. After trial and error, it was established that a compromise block of 20 GSA evaluations was suitable for an appropriate estimate of the acceptance ratio. Suitable empirical selection of control parameter P for VSGSA to operate properly requires new limits on P. With the suggested standard GSA values 0.5 < P < 0.9, the selected j3 continuously prompted the step size to grow in magnitude after the first acceptance ratio calculation. Adjusting 0for the GSA-recommended P values reflects an approximate average 70% acceptance ratio. This almost guarantees an overall acceptance ratio greater than 90% and an increase in the step sue. Since enlargement of the step size is not always relevant after calculation of the first acceptance ratio, especially when the initial parameters are near-global, further modifications to VSGSA become imperative. The solution consisted of adjusting j3 until 0.2 < P < 0.7. This reduced the chance of a forced step expansion and, yet, permitted VSGSA to walk off local optima with the acceptance of detrimental evaluations. A recent study revealed that as the step size for GSA was manually changed, j3 required additional adjustments (11). Similarly, in order for VSGSA to operate spontaneously, automatic adjustments of j3 becomes essential. When step sizes change, significant differences between C,, and C,are possible. Consequently, the consistency of P is not preserved, since this difference plays a major factor in the calculation of P. That is, P does not necessarily approach zero as the global optimum is attained. Instead, P can drastically increase or decrease, causing VSGSA to indiscriminately wander or converge to local optima. To overcome this difficulty, a new j3 was calculated for each step size change according to p, = PL\C/AC, where P, signifies the new P, j3 represents the original P used to establish 0.2 < P < 0.7, denotes the average difference for the first three detrimental evaluations attempted with the original P, and AC, describes the difference for the first detrimental evaluation with the new step size. A new @ value is not calculated for a step size change until a detrimental evaluation occurs. After P, is determined, VSGSA proceeds with a block of 20 evaluations. Termination of VSGSA cannot occur after a specified number of successive rejected evaluations. Instead, termination becomes practical when VSGSA progresses to IC, - C,I < c, or IC, - C,,l < c, or Ar < c, or the

Table I. GSA and VSGSA Optimization of Mathematical Function

optimization technique

initial coor-

final coor-

0

dinate

dinate

GSA

3.5

VSGSA

0.80

VSGSA

1.0

0.85 0.85 0.85 0.85 -0.50 -0.50

0.058 0.033 3.9 x -1.5 X -4.5 X 10" -3.9 X lo4

final

R

0.083

no. of evaluations 105

1.3 x lo-'*

1302

1.6 X lo-'*

3310

optimization parameters are only changing beyond the proper number of significant digits. Again, c denotes the respective precision sought. EXPERIMENTAL SECTION For all VSGSA optimizations, the initial P was adjusted until 5 blocks of 20 evaluations maintained an average P value approximately equal to 0.45. During this process, step sizes were allowed to change along with P if acceptance ratios dictated. The initial P for the 100 evaluations was then used for the correspondingoptimization. If C, was not known,an estimate was uaed. With regard to minimization, if VSGSA located a lower C,, then a new reduced estimate was used. For all of the studies, blocks of 20 VSGSA evaluations were performed before adjustment of 0 and the step size. At this point, if the acceptance ratio mandated, a factor of 2 increase or decrease in step size ensued. If a step size change occurred, then after the next detrimental step, B was modified to its proper value and another block of 20 evaluations was executed. Previous mathematical data studied by GSA were used in this investigation (4). Boundary conditions for the optimized parameters were equivalent to those used previously. Termination of VSGSA emerges when objective functions were within the precision of the computer. For double precision this constitutes approximately 16 digits (12). Hence, VSGSA terminated when the first 16 digits of a parameter did not change. Parameters for a linear and a nonlinear model were estimated by VSGSA. These two models were previously analyzed by a simplex algorithm (13). Calculations were done on an IBM PSj2 Model P70 computer using FORTRAN as the language. Numerical Algorithms Group (The Numerical Algorithms Group Inc., Downers Grove, IL) software was used for computations. The GSA and VSGSA programs are available from the authors. RESULTS AND DISCUSSION Mathematical Example. To first evaluate VSGSA, a complicated mathematical function containing many local optima and one global optimum was explored. This function has been rigorously tested with GSA ( 4 , 5 ) and is described as R = x 2 + 2y2 - 0.3 cos ( 3 ~ x 1 0.4 cos (4xy) + 0.7, where x and y are the parameters optimized and R designates the objective value. The function has a global minimum a t R = 0.0 and ( x , y) = (0, 0) and refs 4 and 5 contain plots and contour diagrams. Prior applications of GSA showed that only near global conditions could be located. Table I lists results for GSA and VSGSA starting from the same set of coordinates and the previously used initid step size 0.15. Clearly, significant enhancements have been obtained by VSGSA in terms of locating the global optimum. As a sacrifice, 10 times the number of evaluations were necessary. However, after VSGSA performed an approximate equivalent number of evaluations as GSA, VSGSA was near the global minimum and essentially, the remaining evaluations represent convergence to the exact global minimum. The number of evaluations reported in this paper reflect typical optimizations. Because VSGSA operates randqmly, the total number of evaluations for repetitive optimizations most often ranged f200. Table I also presents an optimization result when VSGSA was initialized a t local minima (4.5, 4.5). Repeating

ANALYTICAL CHEMISTRY, VOL. 63, NO. 20, OCTOBER 15, 1991

0.01 -

A

B

0.4-

-0.03

i

i,

0.J

b ,

s

0.5

-

,

2

!4

1 \

-? 1

012

0:4

1.2

0:6

1.4

1.6

1.8

4'

x (shear rate)

Flgure 1. Plot of the fitted data for Model I using simplex (a)( 73)and VSGSA (b). For VSGSA, @ = 15 and Ar = 0.5. The experimental data were x = 0.001, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 1.00, and 2.00; y = 0.961, 0.789, 0.017, 0.378, 0.233, 0.145, 0.092, 0.017, and 0.010.

the optimization revealed similar results when beginning VSGSA at different coordinates. Final Ar values for optimization of this function ranged from 10-loto 10-13. If VSGSA was not stopped at this point, then with this magnitude of At-,C, and C, appear equal to the computer due to the scaling operation of the exponents for the optimization parameters (14). The acceptance ratio commences to increase because C, = C, is considered an accepted evaluation. Hence, Ar eventually expands. The acceptance ratio ultimately becomes less than 20%, Ar descends to approximately 10-l0, and the cycle reiterates. Various studies were performed in order to evaluate the general effecta of the initial step size, the factor by which step sizes are changed, and the initial guess of C,. With regard to the initial step size, empirical observations suggest that the initial step size can vary from 5% to 10% of the analysis range. Initial step size less then 5% of the ranges required approximately twice the number of evaluations to locate the global optimum. Initial step sizes greater than 10% of the range tended to produce similar results. That is, if the initial step size is too large, the algorithm soon finds itself on the boundaries invoking boundary violations. Eventually,VSGSA moves away from the boundary but not after an excessively large number of evaluations. Investigations into the effects of the factor to increase or decrease step sizes generated expected results. If the factor is too small, on the order of 1.2-5, a considerable number of evaluations become necessary. If the factor is too large, the algorithm shortly moves to the boundary and, generally, remains there. Clearly, the factor for increases or decreases should be such that early changes in step sizes are not greater than 50% of the range. Otherwise, VSGSA quickly progresses to the boundaries for evaluation positions. The initial guess of C, was found to be robust with respect to overestimation. Provided C, is not too severely underes-

-41

4

e

e

238s

Anal. Chem. 1881, 63,2386-2390

2380

= 35.84 (13). VSGSA located parameter levels of a = -0.0022, b = -0.0041, c = -0.0209,d = 0.0020, e = 2.3226,and f = -5.1236with OC = 9.444. Figure 4 displays the residuals from the simplex and VSGSA fit. In the simplex case, the residuals are all positive while the VSGSA residuals are randomly scattered about the zero line. The large values of OC for both simplex and VSGSA are probably due to the fact that only nine data points were used to estimate six parameters. A better fit would be obtained if more data points were utilized. These applications of VSGSA express the ability of the algorithm to converge onto global or near-global conditions.

Table 11. Experimental Data for Model I1

experiment no.

7% ethanol

PH

selectivity

1 2 3 4

10 20 30 10 20 30 10 20 30

5.0 5.0 5.0 5.5 5.5 5.5 6.0

6.08

5 6 7 8 9

2.42 2.10 7.31 3.00 3.13 7.06 3.72 3.37

6.0 6.0

LITERATURE CITED

A 2

3

4

Edgar, T. F.; Hlmmelblau. D. M. Optknlzetbn of Chemical Recesses; McGraw Hill: New York, 1988. Bayne, Charles K.; Rubln. Ira 0. Practical Expertmental DesW and Optimization Methods for Chemlst; VCH Deerfield Beach, FL, 1986. Box, George, E. P.; Draper, Norman R. Ernpkical Modd-BuRdhg and Response Surfaces; Wlley: New York, 1987. Kallvas, John H.; Roberts, Nancy; Sutter, Jon M. Anal. Chem. 1989, 6 1 , 2024. Bohechevsky, Ihor 0.;Johnson. Mark E.; Steln, Myron L. TechnomeMcs 1986, 28, 209. van Laarhoven. P. J. M.; Aarts, E. H. L. Sknuleted Annsehhg: Theory and Applications; D. Reidel Publishing: Dordrecht, The Netherlands, 1988. Vanderbllt, DavM; Louie, Steven 0. J . -ut. phvs. 1984, 36, 259. Johnson, Mark E., Ed. Am. J . Math. Msnag. Scl. 1988, 8 (3, 4). Guell. Oscar A.; Holcombe, James A. Anal. chem. 1990, 82, 529A. Kirkpatrick, S.; Watt, C. D., Jr.; Vecchi, M. P. Science 1983. 220. 671. Kalhras, John H. J che"etrlcs 1991, 5, 37. Ratzlatl. Kenneth L. I n t " to Computer-AssbPted Ekp&mntatbn; Wlley: New York, 1987; Chapter 2. Zupan, J.; Rius, F. X. Anal. Chlm. Acta 1090, 239, 311. Jws, Peter C. Compuw Software Appllcetbns in chemkby; Wiley: New Ywk, 1988; Chapter 2.

A

5

6

7

8

I

9

experiment number

Flgwe 4. Model I1 residual plot for simplex fit (0)and VSOSA fit (A).

in the VSGSA residual plot indicates that a different model could also be in order. The second model (model 11) fitted corresponded to y = ax2 + bz2 + cxz + dx + ez + f, where a, b, c, d , e, and f define the parameters to be estimated. This model was used to establish the relationship between the chromatographic selectivity (y) and the composition of the mobile phase, percent ethanol content ( x ) , and pH (2). Table I1 contains the experimental results used to estimate the parameters. The simplex located parameter values at a = 0.0179,b = -1.4202, c = 0.0145,d = -0.9951,e = 16.159,and f = -33.4635 with OC

Jon M. Sutter John H. Kalivas* Department of Chemistry Idaho State University Pocatello, Idaho 83209

RECEIVED for review April 5,1991. Accepted July 17,1991. This work was supported in part by a grant from the Idaho State Board of Education.

TECHNICAL NOTES Adjustable Open-Spilt Interface for Gas Chromatography/Mass Spectrometry Providing Solvent Diversion and Invariant Ion Source Pressure Woodfin V. Ligon, Jr.* and Hans Grade General Electric Company, Corporate Research and Development, Schenectady, New York 12301

I NTRODUCT10 N Various approaches to the interfacing of gas chromatographs to mass spectrometers have been described. Of those approaches which utilize fused-silica capillary columns, the most common is the direct connection of the capillary to the ion source (I, 2). This approach has the advantage that no interface exists to possibly reduce chromatographic resolution. There are, however, a wide variety of disadvantages to this method. The most important of these is the fact that all of the column effluent necessarily reaches the mass spectrometer. As a result, mass spectrometers that are designed to accept 0003-2700/81/0363-2386$02.50/0

nanoarams of sample are routinely exposed to milligrams of solveks. In the lase of high-voltage- magnetic mass spectrometers, it is often necessary to turn off the high voltage while large solvent peaks elute. This unavoidable exposure to large amounts of eluents can cause substantially increased maintenance due to the ion source, analyzer, and vacuum system contamination and to reduced filament lifetime. The direct connection is also nonideal with respect to its effect on GC separation. The mass spectrometer's vacuum invariably causes changes in column flow rates that result in shifts in GC retention times. Accordingly, experiments that have been carefully designed with a flame ionization detector (6

IBBl Amerlcan Chemical Society