Ind. Eng. Chem. Process Des. Dev. 1886, 25, 1042-1044
1042
CORRESPONDENCE Comments on “Sequentlal Experimental Design for Preclse Parameter Estlmatlon. 1. Use of Reparameterlzatlon” parameters K by the same constant factor IGl. An obvious consequence is that the optimum variable settings for a proposed new experiment to minimize the expected confidence region volume are independent of any parameter transformations on the model. We suggest that the deviations from this condition in the reported results of Agarwal and Brisk may be due to inaccurate estimation of the best parameter values in the regression calculation or inaccurate determination of the optimum experimental variable settings or a combination of these effects. Accurate parameter estimation is made more difficult if the parameters are highly correlated as in one of the models in Agarwal and Brisk’s example. It is well-known that the optimization problem to determine experimental variable settings is often made more difficult by the occurrence of local minima (Rippin et al., 1980). Calculations similar to those of Agarwal and Brisk were made to check if these difficulties could be overcome. The calculations could not be reproduced exactly since the values of the model response, the reaction rate, were not reported. The same procedure was used of generating response values for experimental variable settings reported in Table I1 and adding Gawian noise at levels of 10% and 40%. It sould be noted that the “true” parameter values given in Table I of Agarwal and Brisk for the non-RP model and the RP model are not exactly consistent with the transformation procedure reported. In what follows we shall use the values from Table I for the non-RP model. Reconstructed measurement values for the first eight or nine experiments are shown here in Table I. Regression calculations were carried out for the initial series of six experiments to estimate parameter values for models 1 and 2 with 10% and 40% noise. The regression program (Klaus and Rippin, 1979) used a modified Marquardt procedure, and the criterion for convergence was that the where G is the square matrix of the partial derivatives of relative changes in the estimated parameter values in the transformations as shown. successive iterations should all be less than When Agarwal and Brisk’s definition of the matrix D For regression with the same data the results of indeis used of which the elements are the partial derivatives pendent regression calculations using models 1and 2 were of the model response with respect to the parameters K virtually identical. The individual residuals agreed to a at each measurement point relative accuracy of better than 10“ and the objective function, the sum of squares of residuals, to a relative V* = GT(DTD)-’Ga2 accuracy of better than When the parameter values The confidence region volume for the parameters K is for model 2 were transformed back to the form of model proportional to 1,the resulting parameter values agreed to a relative acl(DTD)-111/2 curacy of 2 x 10” or better. When the variance covariance matrix for the parameters of model 2 is transformed by Similarly, the confidence region volume for the parameters the G matrix given above, the resulting variance covariance K* is proportional to elements agree with those of model 1to a relative accuracy IGT(DTD)-1GJ1/2 = IGT11/21(DTD)-111/21G(1/2of 2 x or better. = IGll(DTD)-111/2 For the parameter transformation from model 2 to model 1, it can be shown that IGl = A&. The confidence The transformation matrix G is a function only of the region volumes for the parameters of the two models parameter values K. Thus, for given values of K , IGl will should differ by this constant factor. Using the true values be a constant for a given set of parameter transformations. of the parameters from Table I of Agarwal and Brisk For any choice of experimental variables, the confidence IGl = A,$ = 4629 X 11600 log IGI = 7.730 region volume for parameters K* will differ from that for
Sir: Agarwal and Brisk (1985) describe how a reparameterization can be used to improve the precision of parameter estimates and the effectiveness of a sequential experimental design procedure. The sequential experimental design procedure selects for the next experiment those values of the experimental variable settings at which the expected volume of the confidence region of the parameters after the next experiment is minimized. However, it is well-known that the volume of the confidence region of the parameters is invariant against parameter transformations (Federov, 1972). Thus, it is expected that the confidence region volume is a function of the experimental variable settings and that this function is independent of what variable transformations have been used in the model. This is contrary to the experience of Agarwal and Brisk as reported in their case study. Their Table I1 shows that after a few initial experiments different experimental variable settings are called for by their models 1and 2 to minimize the expected confidence region volume. A group of parameters K has a variance covariance matrix V. We wish to determine the variance covariance matrix V* of another group of parameters K* equal in number to the components of K and related by the transformation K*i = gi(K) I t is well-known that
Ol96-4305f86f 1125-IQ42$Ol.50fO
0 1986 American Chemical Society
Ind. Eng. Chem. Process Des. Dev., Vol. 25, No. 4, 1986
1043
Table I. Reconstructed Measurement Values for the First Nine Experiments with 10% Noise and the First Eight Experiments with 40% Noise r + 40%, expt temp, K XCZH4 XHZ rate, 1.0 X lo6 random no. r + lo%, 1.0 X lo6 1.0 X lo8 1 2 3 5 6
318.0 326.0 333.0 341.0 348.0 358.0
0.35 0.30 0.28 0.38 0.25 0.15
0.65 0.70 0.72 0.45 0.55 0.60
22.648 36.372 53.285 55.574 83.478 122.93
0.0991 0.3571 -0.3970 -0.5971 0.2825 0.3228
22.8725 37.6709 51.1698 52.2557 85.8365 126.8977
7 8 9
363.0 358.8 340.43
0.10 0.10 0.10
0.40 0.80 0.80
89.74 148.69 61.80
0.5934 0.2753 -0.2280
95.0652 152.7864 60.3912
7 8
355.09 363.0
0.40 0.40
0.60 0.40
148.66 141.80
0.1219 -0.3877
4
To the accuracy to which the figure can be read, this difference corresponds exactly to the constant difference in confidence region volumes between models 1 and 2 shown in Figure 5 of Agarwal and Brisk. I t is important to note that the smaller confidence region for model 2 is simply a result of the form of the parameter transformation and is in no sense an indication that model 2 is uperior to model 1. In the experiments reported by Agarwal and Brisk, the designed experiments are identical for models 1 and 2 up to experiment 8 with 40% noise and to experiment 9 with 10% noise. We examined the dependence of predicted confidence region volume on the choice of experimental variables by using a general purpose nonlinear experimental design program (Klaus, 1981). Calculations of the predicted confidence region volume were made for the 64 points of a 4 X 4 X 4 grid over the permitted experimental region for experiment 9 with 40% noise and experiment 10 with 10% noise. The dependencies on the experimental variables were virtually identical for models 1and 2, differing only by the constant factor A J . The best choice of experimental variables among the 64 points was the same for both designed experiments and both models at the highest values of all three variables T = 90 "C XCzH, = 0.4 XH2 = 0.8 The location of the minimum was confirmed for the case with 10% error by a search using the simply bounded optimization subroutine BCQNDF (NPL 1978) which is also part of the above experimental design program. The calculations reported by Agarwal and Brisk incorporated two additional constraints on the experimental region: an explicit constraint on the mole fractions XCPH + Xy2< 1.0 and an implicit constraint that the predicted reaction rate at an experimental point should not exceed 1.5 X kg-mol/(s kg of catalyst). Since we found that the reaction rate predictions and the confidence region volumes calculated for models 1and 2 were virtually identical over the whole experimental region, the locations of the constrained optima for the predicted confidence region volume for a further experiment are also identical. For experiment 10 with 10% noise, constrained optima were located at T = 90 "C XCzH4 = 0.4 XH2 = 0.409 global optimum
T = 90 "C XczH4= 0.188 XHz = 0.522 local optimum and for experiment 9 with 40% noise at T = 90 "C XCzH4= 0.4 XH2 = 0.4 local optimum T = 90 "C XCzHl = 0.1 XH2= 0.569 global optimum
23.5459 41.5675 44.8242 42.3007 92.9119 138.8007
155.9115 119.8075
The difference in the objective function value log (volume of confidence region) between the local and the global optima in each case is approximately 0.2. The local optima arise because of the form of the constraints. Contours of the predicted confidence region volume are of similar shape to contours of equal reaction rate. Thus, small changes in parameter values can change the relative positions of these contours and the location and relative magnitude of the local minima. Optimal experimental designs for minimizing the confidence region volume tend to cycle around a restricted number of points. It is of interest to note that the position of the local optimum for experiment 9 with 40% noise corresponds closely to the global optimum for experiment 10 with 10% noise and vice versa. Furthermore the point identified as the global optimum for experiment 10 with 10% noise recurs in Agarwal and Brisk's experimental series for both models 1 and 2 with this noise level. The design points selected by Agarwal and Brisk for experiment 10 with 10% noise for models 1 and 2 both have objective function values worse than those at our global optimum design point by approximately 0.35 when recalculated with the parameter values from our regression fit. For experiment 9 with 40% noise the design point selected by Agarwal and Brisk for model 2 corresponds to our local optimum but is inferior to the global optimum. With the parameter values from our regression, at Agarwal and Brisk's design point for model 1 the reaction rate constraint is violated (2.1 X Our global optimum recurs in Agarwal and Brisk's experimental series for this noise level for both models. For this example optimum design points seem to lie on at least one of the constraints, and the objective function value is not very sensitive to displacements along the constraints. There is no difference between experimental design points predicted by the two models, also in a constrained experimental region. The claim that experimental design based on predicted confidence region volume can be improved by reparameterization is unfounded. The differences shown in Figure 5 between the two models are the result of a numerical scaling factor that has been explained above. We claim that the other differences resulting from the use of the different models are random effects arising from lack of convergence of the regression routines used by Agarwal and Brisk or from incomplete convergence or possibly convergence to subsidiary minima of the routine used for the optimal design. The comparison with the work of Agarwal and Brisk is not exact since we did not have access to the response values which they used, but we believe that the results are sufficiently clear to be conclusive.
1044
Ind. Eng. Chem. Process Des. Dev. iS86, 25, 1044
Literature Cited
Rippin, D. W. T.; Rose, L. M.;Schifferii, C. Chem. Eng. Sci. 1980, 35, 356.
Agarwai, A. K.; Brisk, M. L. Ind. Eng. Chem. Process D e s . Dev. 1985, 2 4 , 203. Federov. V. V. Theorv of O ~ t i ~ l E x ~ e r i t ~Academic: ~ n t s : New York, 1972; p 80. Klaus, R. Doctoral Dlssertation 6866, ETH, Zurich, Switzerland. 1981. Klaus, R.; Rlppin, D. W. T. Comp. Chem. €ng. 1979, 3 , 105. N R Numerical Optimization Software Library, National Physical Laboratory, Teddington. England, 1978.
Technisch-Chemisches Laboratorium E.T.H. Zurich CH-8092 Zurich, Switzerland
T. Rimensberger D. W. T. Rippin*
Received for review July 8, 1985 Revised manuscript received March 3, 1986
Response to Comments on “Sequential Experlmental Design for Precise Parameter Estimation. 1. Use of Reparameterizatlon” Sir: Rimensberger and Rippin correctly point out that the volume of the confidence region of transformed parameters will theoretically differ from that for untransformed parameters by a constant factor. Hence, successive sequentially designed experiments will be placed at the same locations for both “normal” and reparameterized models-in theory. The difficulty of course is that the factor /GI, the determinant of the matrix of partial derivatives of the transformed function with respect to the parameters, will depend on the accuracy with which the best parameter values are estimated. If the parameters are not estimated sufficiently accurately, IGl will not be constant. In our case, the first block of experiments gave the same parameter estimates for each model, as Rimensberger and Rippin found. These estimates were used to design the next experiment, giving identical experimental conditions for each model. These experiments were used (with the first block) to obtain the next estimate of parameters, but now the two sets were not identical. From this point on, the two models diverge. Theory and Rimensberger and Rippin’s evidence indicate otherwise. Why? We believe the explanation is that the search routine (optimizer)used in our case failed to converge to the %rue” values of the parameters in the estimation stage. The reparameterized model produces a better response surface, so the routine was far more likely to approach the best estimates. The non reparameterized model gives rise to the well-known elongated valley surface characteristic of highly correlated parameters. Such a surface will cause problems for some optimizers, and these problems increase
0196-4305/86/1125-1044$01.50/0
as the true values are approached in successive experiments. Whilst it can be argued with hindsight that we should have used a better optimizer, we believe our original claims still have some merit. The use of reparameterization, because it gives one a better chance of success in the parameter estimation stages, leads ultimately to a more effective sequential design. In cases of high experimental noise, or inadequate convergence of the parameter fitting stage, reparameterization does help. It does not and cannot affect the design stages as such, but our empirical evidence is that it provides a more robust approach to the combined design-parameter estimation problem when one does not have total success with optimization routines. We agree that the reparameterized model is not “better” than the original model, per se. Its smaller volume confidence region arises from the scale factor /GI which is approximately constant. Finally, in response to the comment that the values in Table I of our paper %e not exactly consistent with the transformation procedure reported”, we note that the values were calculated using R = 1.987 19 and T (K) = t (“C) + 273.16 and are somewhat sensitive to small variations in these constants. School of Chemical Engineering & Industrial Chemistry University of New South Wales Kensington, N S W 2033, Australia ICI Australia Operations Pty. Ltd. Matraville, NSW 2036, Australia
0 1986 American
Chemical Society
Ani1 K. Agarwal*
Michael L. Brisk