UNIPLEX. Single-factor optimization of response in the presence of error

sponsored by theProcter and Gamble Company. UNIPLEX: Single-Factor ... easily automated and converges in the presence of error. (11-15). Single-factor...
0 downloads 3 Views 611KB Size
latches could easily be multiplexed by using three-state latches tied via a common bus to the counter data inputs. The feasibility and reliability of this new digital current control system has been tested and proved over a period of one year.

RECEIVEDfor review January 17, 1974. Accepted May 15, 1974. This work was supported in part by NSF Grant G P 18910, and also in part by an American Chemical Society, Division of Analytical Chemistry Fellowship (to J.D.D.) sponsored by the Procter and Gamble Company.

UNIPLEX: Single-Factor Optimization of Response in the Presence of Error Paul G. King' and Stanley N. Deming2 Department of Chemistry, Emory University, Atlanta, Ga. 30322

A univariate sequential unequal-interval optimization strategy is described and its applicability in analytical chemistry is illustrated. The strategy offers a rapid approach to the optimum, converges in the presence of error, and never excludes any region of the factor domain from investigation. The algorithm is easily implemented using either manual or automated calculations.

A common problem in analytical chemistry is the optimization of response by varying only one controllable factor. Examples include the pen damping control on a strip chart recorder and the pH dependence of a reaction. Recent advances in computer-controlled instrumentation, as well as research on the automated development of analytical chemical methods (1-IO), have suggested a need for an efficient one-dimensional optimization strategy that can be easily automated and converges in the presence of error ( 11-1 5 ) . Single-factor optimization strategies can be either sequential or nonsequential (often called "preplanned" or "simultaneous") and the search intervals employed can be either equal or unequal. Nonsequential equal-interval searches are often used to determine the response over a P r e s e n t address, D e p a r t m e n t of C h e m i s t r y , U n i v e r s i t y of Georgia, Athens, Ga. 30601. A u t h o r t o w h o m correspondence s h o u l d b e directed. P r e s e n t address, D e p a r t m e n t of C h e m i s t r y , U n i v e r s i t y of H o u s t o n , Houston, Texas 77004. (1) T. E. Hewittand H. L. Pardue, Clin. Chern., 19, 1128 (1973). (2) E. C. Toren, Jr.. S. A. Mohr, M. G. Busby, and G. S. Cembrowski, Clin. Cbem., 19, 1122 (1973). (3) E. C. Toren, Jr.. R. N. Carey, G. S. Cembrowski. and J. A . Schirmer, Clin. Cbern., 19, 1114 (1973). (4) L. A. Currie, J. J. Fitliben. and J. R. DeVoe, Anal. Cbern., 44, 4972 (1972). (5) R. S.Swingle and L. B. Rogers, Anal. Cbem., 43, 810 (1971). (6) A . A. Eggert, G. P. Hicks, and J. E. Davis, Anal. Chern., 43, 736 (1971). (7) S. N. Deming and H. L. Pardue. Anal. Cbern., 43, 192 (1971). (8) R. G. Thurman, K. A. Mueiier, and M. F. Burke, J. Chrornatogr. Sci., 9, 77 (1971). (9) G. P. Hicks, A. A. Eggert, and E. C. Toren, Jr., Anal. Cbern., 42, 729 (1970). ( I O ) S. P. Perone. D. 0. Jones, and W. F. Gutknecht, Anal. Chern., 41, 1154 ( 1969). (1 1) D. J. Wilde, "Optimum Seeking Methods," Prentice-Hall, Inc.. Englewood Cliffs, N.J., 1964. (12) A . Dvoretzky, "Proceedings of the 3rd Berkeley Symposium on Mathematical Statistics and Probability," J. Neyman, Ed.. University of California Press, Berkeley, Calif., 1956, p 39. (13) J. R. Blum, Ann. Math. Statist., 25, 382 (1954). (14) J. Kiefer and J. Wolfowitz, Ann. Math. Statist., 23, 462 (1952). (15) H. Robbins and S. Monro, Ann. Math. Statist.. 22, 400 (1951).

1476

given domain of the controllable factor. They are appropriate for mapping the region of a known optimum but are inefficient when used to search for an optimum with a given degree of precision. For example, if a nonsequential equalinterval search were used to locate a p H optimum to fO.l p H unit over the pH range 6.00 to 8.00 inclusive, 21 experiments would be required. The optimum region of the response curve will be well defined a t the expense of overdetermining regions of secondary interest (7). Regression analysis is often employed with this method to minimize the effect of random experimental error. The sequential equal-interval search, in general, requires fewer experiments to reach the optimum but is still inefficient. The search begins a t one boundary and proceeds by equal steps toward the other boundary. When the optimum has been passed, the search is halted. This procedure does not provide information about the shape of the response curve between the optimum and the other boundary. Moreover, experimental error can indicate a false optimum if the search is carried only a short distance beyond the suspected optimum. Nonsequential unequal-interval searches are seldom employed. They might be useful if a rough outline of the response surface were desired and previous information suggested that the optimum might be in one particular region. The intervals would then be large away from the suspected optimum and narrow near it. The nonsequential unequal-interval search might also be useful in reducing the estimate of error in a desired region of the curve. A series of nonsequential equal-interval searches in which the size of the interval is decreased from series to series (in effect a combination of the three methods previously described) is often employed (7). Each series provides a rough idea of the location of the optimum. Each follqwing series more closely defines the exact location of the optimum. If random error is associated with the response, care must be taken that the change in the size of the interval and the change in the domain studied do not preclude the possibility of exploring in future series regions that might contain the optimum ( 1 1 ) . The Fibonacci or "golden section" search (16-18) is the most efficient single-factor search in the absence of random error. It is a sequential unequal-interval search. Each evaluation of response contracts the size of the search interval by a factor 2/(1 + 4 3 ) and, in so doing, eliminates regions (16) G. S. G. Beveridge and R. S. Schechter, "Optimization: Theory and Practice." McGraw-Hill Book Co., New York, N . Y . , 1970, p 180. (17) J. Kiefer. Proc. Arner. Math. Soc., 4, 502 (1953). (18) S. M. Johnson, "RAND Corp. Rep. p-856," Santa Monica, Calif., 1956.

A N A L Y T I C A L C H E M I S T R Y , V O L . 46, NO. 11, SEPTEMBER 1974

of the factor domain from further study. Thus, in the presence of random error, the Fibonacci search will eventually fail. Toward the latter stages of the search, when the differences between responses approach the same magnitude as the random error, the presence of the error will cause the region containing the optimum to be excluded from further search. The problem of optimization in the presence of random error has been studied by Robbins and Monro (15), Kiefer and Wolfowitz (14), and Dvoretzky (12) and has been discussed by Wilde (11). These stochastic sequential unequalinterval searches guarantee mean square convergence with probability one but because the interval is contracted by l / n (where n is the iteration number), convergence upon the optimum is slow. This paper presents a sequential unequal-interval strategy that is capable of expanding as well as contracting the size of the interval. I t never excludes any region of the factor domain from further investigation, it offers a rapid approach to the optimum, and it converges in the presence of error. The logic and arithmetic are simple. The algorithm is, therefore, easily implemented using either manual or automated calculations. No search algorithm guarantees convergence to a global (overall) optimum (16).Response surfaces that are convex over the region searched will exhibit only one optimum, but, in general, the form of the response surface will not be known before the search is begun. Confidence that the local optimum found is global can be increased if the search algorithm is begun from different regions of the factor being studied and the results of the searches agree.

STRATEGY UNIPLEX is based on the sequential simplex method of Spendley et al. (19),as modified by Nelder and Mead (20). A simplex is a geometric figure defined by a number of points equal to one more than the number of dimensions of the factor space. Thus, a simplex in one dimension is a line segment. The simplex moves toward the optimum by reflections, expansions, and contractions (21). In the initial simplex B W shown in Figure 1, vertex B was measured and found to have the best response. Vertex Wwas measured and found to have the worst response. Reflection is accomplished by extending the line segment WB beyond B to generate the new vertex R:

R = B + (B - W ) (11 Two possibilities exist for the measured response a t R: a) The response a t R is more desirable than the response a t B. An attempted expansion is indicated and the new vertex E i s generated: E = B +y(B-WW)

(2 )

where y (20, 22) is the expansion coefficient (y > 1, usually 2 ) which determines how much farther than R the segment BR is to be extended. If the response a t E is better than the response a t R, it is retained as the new simplex BE. If the response a t E is not better than the response a t R, the expansion is said to have failed and BR is taken as the new simplex. The next iteration of the strategy ranks the retained vertices as a new B a n d a new W. b) If the response a t R is not more desirable than the response a t B, a step in the wrong direction has been made (19) W. Spendley, G. R. Hext, and F. R. Himsworth, Tecbnometrics, 4, 441 (1962). (20) J. A. Nelder and R. Mead, Computer J., 7, 308 (1965). (21) S. N. Deming and S. L. Morgan, Anal. Cbem., 45, 278A (1973). (22) D. M. Himmelblau, "Applied Nonlinear Programming," McGraw-Hill Book Co.. New York, N.Y.. 1972, p 154.

A

ch

d

_ ?

Figure 1. Possible moves for onedimensional simplex

Figure 2. Flow diagram for onedimensional simplex algorithm

and the simplex should be contracted. One of two possible vertices must be generated: 1) If the response a t R is better than the response a t W, the new vertex should lie closer to R than to W:

C, = B

+

P(B - W )

(20, 22) is the contraction coefficient (0

(3)

B, then R of the present simplex becomes R of the new. In the presence of experimental error, it is possible that a persistent vertex could have a spuriously high response. T o correct for such a situation, the " h 1 rule" recommends that if a vertex has been the best vertex in a number of simplexes equal to one more than the dimension of the factor space (h; in UNIPLEX, h = l),the response a t that vertex should be reevaluated (19). The new response, the average of the responses, or some other function of the responses is substituted for the response a t that vertex. Search algorithms are normally terminated when a predetermined number of iterations has been performed, when the differences in response become less than some predetermined value AY, when the range of the factor !evels becomes less than some predetermined value A X , or when the standard deviation of the responses becomes roughly equal to the experimental error. Termination criteria for UNIPLEX are considered in the discussion section. A programming algorithm is given in Figure 2. BASIC,

+

A N A L Y T I C A L C H E M I S T R Y , VOL. 46, N O . 11, SEPTEMBER 1974

1477

Table I. Results of Automated Optimization Vertex

Chromate pump speed, steps sec-'''

Generated byb

Retainedb

Net Absorbance, 510 nm

1

100.

S

0.1918

2 3 4

200, 300. 400,

S R E

0.3666 0.5254 0.6742

5

600. 800. 500,

R E C W

E

6 7

W

0,7625 0.4299 0.8078

8

550.

C,v

W

0.8370

9

525.

C W

W

0.8402

10

537.5

C W

W

0.8519

11

531.25

C W

W

0.8524

12 13 14 15

534.375 528.125 532.8125 531.25

Cw R C

16 17

529.6875 532,03125

R W

0.8521 0.8488 0 . 851SC 0.8524 0 . 8524d 0.8459 0 . 8524e

18 19 20 21 22 23 24

531.64063 532.42188 532.8125 532.22656 532.61719 532,32422 532.42188

C R E C R C,

A19

25 26

532.51913 532,4707

R C;

R

i"

All C W

10'

0 . 852OC 0.8551f 0.8512 0,8E112~ 0.8503 0.8503? 0,8505 0 . 8528d 0.8504 0 . 85Olc

,I

E

\\

Simplex

I

1211

'I Rounded by system to integer values. I' S = starting simplex, W = worst vertex, R = reflection, E = expansion, C, = C,, contraction, C, = C , contraction, AXX = repeat evaluation of vertex X X . Response worse than previous worst. Average of measured values. e A Y convergence. f Small air bubble passed through flow cell.

FOCAL, and FORTRAN programs are available from the

authors.

Table 11. Simplex Centroids and Derivatives

Simplex

Simplex centroid steps sec -1 a

1 2 3 4 5 6 7 8 9 10 10' 11 12 13 14 15 15' 16

150.00 300.00 500.00 550.00 525.00 537.50 531.25 534.38 532.81 532.03 532.03 531.64 531.84 532.23 532.32 532.37 532.37 532.45

EXPERIMENTAL Chemical system. The formation of dichromate ion from chromate and hydrogen ion (23-26) possesses a well-defined optimum within the domain of the variable factor.

Cr04'2HCr04-

+

H'

Z Z

HCr04-

(5)

+

(6)

Crz07'-

HzO

Equal volumes of equimolar solutions of Cr042- and H+have been shown to produce a maximum net absorbance due to dichromate between 470 and 555 nm ( 2 4 ) . Reagents. Hydrochloric acid, 0.1000M (certified, Lot No. 725512, Fisher Scientific Co., Atlanta, Ga. 30324) was used as a source of standard acid. Potassium chromate (100.3% assay, Lot No. 726273, Fisher) was used without further purification. K2Cr04 was added to a 2.000-1 Class A volumetric flask and diluted to volume with distilled water to give a solution 0.100M in chromate. Distilled water was used to produce constant total flow when determining base-line response. Apparatus. The study was performed by an instrumental system designed for the automated development of analytical chemical methods. The system is controlled by a PDP-9 computer (Digital Equipment Corp., Maynard, Mass. 01754) with 18-bit word (23) J. D. Neuss and W. Riernan Ill, J. Amer. Chem. SOC.,56, 2242 (1934). (24) W. C. Vosburgh and G. R. Cooper, J. Amer. Chem. Soc., 63, 437 (1941). (25) J. Y . Tong and E. L. King, J. Amer. Chem. Soc., 75, 6180 (1953). (26) E. A. Johnson, Nat. Bur. Stand. (U.S.) Photoelec. Spectrom. Bull., 17, 505 (1967).

1478

Derivative 1 0 3 ~ . ~ . sec step-'

x

1.75 1.54 0.442 -0.453 0.584 -0.126 0.936 -0 . 0 8 1 2 -0,0815 -0,409 -0.392 0.0578 1.04 6.82 20.1 49.3 25.8 -54.3

Simplex centroid is defmed in one dimension as the average of the factor levels of the vertices in the simplex.

length, 8 K core memory, hardware clock, DEC tape unit, teletype, and an 18-bit bidirectional parallel-transfer interface. A modified FOCAL LSL (27) is used as the major programming language. A major feature of the automated system is its capability of con(27) L. v . East. 1973.

ANALYTICAL CHEMISTRY, VOL. 46, NO. 11, SEPTEMBER 1974

"FOCAL LSL",

DECUS

Prog. Lib. Cat., No. 9-79a-c, March

10

08

5z

06

< a LL

sia