CORRESPONDENCE
On the Violation of Assumptions in Nonlinear Least Squares by Interchange of Response and Predictor Variables
SIR: In a recent article, Mezaki et al. (1973) presented an analysis of the effect of the choice of the difference to be utilized in the evaluation of constants by least squares. As they correctly point out, the statistical justification for the use of least squares includes the presumption that the errors in the dependent variable are independent and random in absolute value. Their analysis is based on an empirical equation which is implicit in the dependent variable. They use simulated data in which random errors are added to the exact values of the dependent variable as calculated from the empirical equation. They demonstrate that the values of the constants determined by least squares from the simulated data agree better with the true constants when the differences are expressed in terms of the dependent variable than when they are expressed in terms of one of the independent variables. This result was to be expected. However, had the random errors been applied to the independent variable, the alternative result would have been obtained. Their conclusion is therefore useful in the treatment of real data only if the errors in the dependent variable are known to be independent and random in absolute value and the errors in the independent variable are known to be negligible. This is rarely the case, and their proposed restriction on the form of the empirical equation to be used in the application of least squares is hence not generally justifiable. Indeed, rearrangement of the equation may yield a form which more closely conforms to the presumptions on error. In general, the use of least squares with data in which the characteristics of the error are complex or unknown can be justified only as a reproducible, standardized procedure rather than on statistical grounds.
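The point at issue can be illustrated with a minimal simulation sketch. It is not the calculation of Mezaki et al. (1973): their model is nonlinear, whereas the sketch below assumes a straight line y = a + b*x purely for simplicity, and the names `fit_line`, `slope_estimates`, and `simulate`, the noise level, and the sample size are all illustrative assumptions. The slope is estimated twice, once by minimizing deviations in y and once by minimizing deviations in x and inverting, with the random error placed on either y or x.

```python
import random


def fit_line(u, v):
    """Ordinary least squares of v on u; returns (intercept, slope)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = s_uv / s_uu
    return mean_v - slope * mean_u, slope


def slope_estimates(x_obs, y_obs):
    """Estimate b in y = a + b*x by two choices of the minimized difference."""
    _, b_min_y = fit_line(x_obs, y_obs)   # deviations measured in y
    _, d = fit_line(y_obs, x_obs)         # deviations measured in x: x = c + d*y
    b_min_x = 1.0 / d                     # rearrange back to a slope in y = a + b*x
    return b_min_y, b_min_x


def simulate(error_in, n=2000, a=1.0, b=2.0, sigma=2.0, seed=0):
    """Generate exact data on an assumed line, then perturb only y or only x."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [a + b * xi for xi in x]
    if error_in == "y":
        y = [yi + rng.gauss(0.0, sigma) for yi in y]
    else:
        x = [xi + rng.gauss(0.0, sigma) for xi in x]
    return slope_estimates(x, y)


if __name__ == "__main__":
    for where in ("y", "x"):
        b_min_y, b_min_x = simulate(where)
        print(f"error in {where}: minimizing in y gives b = {b_min_y:.3f}, "
              f"minimizing in x gives b = {b_min_x:.3f} (true b = 2)")
```

Under these assumptions, the estimate that minimizes deviations in the variable carrying the error lies closer to the true slope in both cases, which is the symmetry described above.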
SIR: The questions raised by Professor Churchill concerning the appropriateness of the method of analysis presented in Mezaki et al. (1973) seem somewhat to cloud the main issue. Consequently, it is important to reaffirm the proper attitude toward the results contained in that paper. Professor Churchill’s letter contains, in our opinion, correct, doubtful, and incorrect points. We wish to comment on them specifically as follows.
1. He is correct that our assumptions are that a model of the form $y_u = f(x_u, \theta) + \varepsilon_u$ is tentatively adopted where only $y$ (and not $x$) is subject to error, and we minimize the sum of squares of deviations of the $f(x_u, \theta)$ from the $y_u$.
2. He may or may not be correct when he says that “had the random errors been applied to the independent variable the alternative result would have been obtained.” If he wishes to assume that, in our (6), only $x_{ju}$ is subject to error and $y_u$ and the $x_u^*$ are not, he would be correct only if the appropriate estimation procedure was to minimize the sum of squares of deviations (7). He must, however, provide suitable justification that the error structure which generated the experimental data is of the appropriate form to lead to such an estimation procedure.
3. On a very important point, he is incorrect in stating that “their conclusion is therefore useful in the treatment of real data only if the errors in the dependent variable are known to be independent and random in absolute value and the errors in the independent variable known to be negligible.” We disagree with him because, by its basic nature, all of model building is an iterative procedure and one has to start somewhere. After a model has been fitted, it is usual to examine the patterns in the residuals to see if the assumptions made appear to be correct. If the residuals show that they are not, the model is reexamined and revised.
In these circumstances, it would usually be most appropriate to first choose a model to explain the response variable, and not one of the predictor variables, as he appears to advocate. The approach we presented would usually be a sensible first step in the right direction, unless specific information about the error structure indicated a better method of attack.
4. He is basically correct when he says that “In general the use of least squares with data in which the characteristics of the error are complex or unknown can be justified as a reproducible, standardized procedure rather than on statistical grounds.” However, he also leaves the false impression that we disagree and that we advocate always using least squares. We do not. In the circumstances we described, the technique we suggested is correct. If, however, in a given experimental situation, some other specific distributional assumptions are given or made concerning both the errors in the response $y$ and the independent variables $x$, an appropriate estimation procedure could often be worked out via the method of maximum likelihood. This will usually involve minimizing a function of distances other than those corresponding to (2) or (7) in the paper. They would probably be some kind of “diagonal distances” between predicted value and response, the exact meaning of “diagonal” being determined by the assumptions made concerning the experimental errors. Where this is not simple or where the proper distributional assumptions are completely unknown, a least-squares analysis on the response variable would usually be the best starting point, with development as described in our (3) above.
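One familiar instance of such “diagonal distances” can be sketched as follows. The sketch assumes a straight-line model, independent normal errors in both x and y, and a known ratio `delta` of the y-error variance to the x-error variance; under those assumptions the maximum likelihood slope has a closed form, commonly called Deming regression (orthogonal regression when `delta` is 1). This is offered only as an illustration of the idea, not as the procedure of the paper, and the function names and simulated data are assumptions made for the example.

```python
import math
import random


def _moments(x, y):
    """Return the means and the centered sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def ols_line(x, y):
    """Least squares on the response only (vertical distances)."""
    mx, my, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return my - slope * mx, slope


def deming_line(x, y, delta=1.0):
    """Errors-in-variables fit; delta = var(y error) / var(x error), assumed known.

    Requires a nonzero cross product sxy.
    """
    mx, my, sxx, syy, sxy = _moments(x, y)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return my - slope * mx, slope


if __name__ == "__main__":
    # Assumed test case: true line y = 1 + 2x, equal error variance in x and y.
    rng = random.Random(1)
    x_true = [rng.uniform(0.0, 10.0) for _ in range(2000)]
    x_obs = [xi + rng.gauss(0.0, 1.0) for xi in x_true]
    y_obs = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x_true]
    print("response-only slope:", round(ols_line(x_obs, y_obs)[1], 3))
    print("errors-in-both slope:", round(deming_line(x_obs, y_obs)[1], 3))
```

As `delta` grows large the formula tends to the ordinary least-squares slope of y on x, and as `delta` tends to zero it tends to the slope obtained by regressing x on y and inverting, so the two one-sided analyses appear as limiting cases of the assumed error structure.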
Literature Cited
Mezaki, R., Draper, N. R., Johnson, R. A., Ind. Eng. Chem., Fundam., 12, 251 (1973).
Stuart W. Churchill
School of Chemical Engineering
University of Pennsylvania
Philadelphia, Pa. 19174
Reiji Mezaki*
Chemical Engineering Department
New York University
Bronx, N.Y. 10453
Norman R. Draper
Richard A. Johnson
Statistics Department
University of Wisconsin
Madison, Wis. 53706
Ind. Eng. Chem. Fundam., Vol. 12, No. 4, 1973