Regression through the origin. Reply to comments

Anal. Chem. 1980, 52, 1152-1153


y = a + bx + e    (1)

and

y = bx + e    (2)

where a and b are constants and e is a random error. The models will be referred to as I and II, respectively. It can be shown that the standard error of the least squares estimator of model I is greater than or equal to the standard error of the least squares estimator of model II, with equality occurring at not more than one x value. That is, fitting model II will not result in a decrease in precision as noted in (1), but most likely an increase in precision. This result holds no matter which model is assumed to be correct. If model I is the true model, the resulting least squares estimator of model II will be biased, which (depending on the magnitude of the bias) could decrease the accuracy of model II compared to model I. If model II is the correct model, then fitting model II results in increased accuracy and precision. The standard error of estimate is a measure of how well a given model fits the data. Occasionally, this quantity is used as a measure of precision. This may lead to a misuse of the term precision. For example, if model II is estimated from the data when the true model is I, then the standard error of estimate is influenced by the resulting bias and hence is not a true measure of precision. The formula for the standard

Sir: I am pleased that Ellerton found my note (1) of sufficient interest to make comments about it. I agree with some of his comments and disagree with others. Though Ellerton feels that chemists do not fully understand the meanings of precision and accuracy in a statistical context, I believe this is largely a matter of problems of communication. Chemists have a fundamental feeling for their significance, but the language of statisticians has become so sophisticated that discussions with them are difficult. Conversely, statisticians do not use the language of chemists. One standard statistics text does not even have precision or accuracy in its index (2). I agree thoroughly with Ellerton's definitions of precision and accuracy, which apply, with no difficulty, to a chemist's repetitions of determinations on the same sample. The difficulty arises with application of statistics to analysis of a series of standard samples over a range of composition to establish a standard curve (regression analysis). While numerous repetitions on each standard would yield means and individual estimations of precision, it is not common to do this. Generally, one or two determinations on each standard are used to establish a regression line that is assumed to be linear. Deviation of points from the best straight line by least squares is taken as a measure of precision and is calculated as the standard error of estimation (1). It was on this basis that I stated that the precision of model II will be less than that of model I, contrary to the statement of Ellerton. In model I, a and b are chosen by least squares to give the smallest possible value of Σ(y_i - ŷ_i)² (using the standard notation ŷ_i, the values of y calculated from the model equation). Any other line, including one constrained to pass through the origin, will have a larger value of Σ(y_i - ŷ_i)².
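The least-squares argument in the preceding paragraph can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the concentration and absorbance values are invented for illustration and are not the data of the original note (1).

```python
import numpy as np

def residual_sum_of_squares(y, y_fit):
    """Return the sum of squared deviations between observed and fitted y."""
    return float(np.sum((y - y_fit) ** 2))

# Hypothetical standards: concentration (x) and absorbance (y).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([0.21, 0.39, 0.62, 0.80, 0.99])

# Model I: y = a + bx. np.polyfit returns the slope first, then the intercept.
b_model_1, a_model_1 = np.polyfit(x, y, 1)
sse_model_1 = residual_sum_of_squares(y, a_model_1 + b_model_1 * x)

# Model II: y = bx, with the slope from b = sum(x*y) / sum(x**2).
b_model_2 = np.sum(x * y) / np.sum(x ** 2)
sse_model_2 = residual_sum_of_squares(y, b_model_2 * x)

print(f"model I : a = {a_model_1:.4f}, b = {b_model_1:.4f}, SSE = {sse_model_1:.6f}")
print(f"model II: b = {b_model_2:.4f}, SSE = {sse_model_2:.6f}")
```

For any data set the model I sum of squares cannot exceed the model II value, because the through-origin line is one of the candidates over which model I minimizes. Whether that sum of squares measures precision is the separate question debated in these letters.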
Regarding my use of n - 2 to calculate the standard error of estimate, s, rather than n - 1 as recommended by Ellerton, I felt that requiring the best straight line to pass through the origin was a constraint on the system and therefore constituted a reduction in the number of degrees of freedom. I have found support for this point of view in the text by Bennett and Franklin (3), who state: "In general, any linear restriction which we impose on the observations, or any linear function of the observations which is used in determining the deviations from which s² is


error of estimate given in (1) is incorrect. To correct the formula, replace n - 2 with n - 1. Finally, the results presented in (1) are not new and appear in a number of texts. References 4, 5, and 6 discuss the topic "Regression through the Origin" in more detail, including tests of hypotheses, confidence intervals, prediction intervals, and models with two independent variables.
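The claim about the standard errors of the two models can be illustrated numerically. The sketch below rests on an assumption: it reads "the least squares estimator of model I (or II)" as the fitted value at a given x, whose variance is σ²[1/n + (x - x̄)²/Σ(x_i - x̄)²] under model I and σ²x²/Σx_i² under model II. The design points are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def fitted_value_variance_factors(x_design, x_new):
    """Variance of the fitted value at x_new, in units of sigma^2.

    Returns a pair (model I, model II), assuming independent errors with
    common variance sigma^2 (an assumption of this sketch).
    """
    x = np.asarray(x_design, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    n = x.size
    x_bar = x.mean()
    s_xx = np.sum((x - x_bar) ** 2)   # sum of squares about the mean
    sum_x2 = np.sum(x ** 2)           # sum of squares about the origin
    var_model_1 = 1.0 / n + (x_new - x_bar) ** 2 / s_xx
    var_model_2 = x_new ** 2 / sum_x2
    return var_model_1, var_model_2

# Hypothetical design points (for example, standard concentrations).
x_design = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
grid = np.linspace(0.0, 10.0, 201)

v1, v2 = fitted_value_variance_factors(x_design, grid)
print("model I variance >= model II variance everywhere on the grid:",
      bool(np.all(v1 >= v2 - 1e-12)))

# The single x at which the two variances coincide (valid when the mean of x is not zero).
x_equal = np.sum(x_design ** 2) / np.sum(x_design)
v1_eq, v2_eq = fitted_value_variance_factors(x_design, x_equal)
print(f"variances coincide at x = {x_equal:.4f}: {v1_eq:.6f} vs {v2_eq:.6f}")
```

Under this reading the inequality follows from the Cauchy-Schwarz inequality, which is also why equality can occur at no more than one x value. The sketch concerns variance only; it says nothing about the bias that arises when model II is fitted although model I is true.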

LITERATURE CITED
(1) Strong, F. C. Anal. Chem. 1979, 51, 298-299.
(2) Kish, L. "Survey Sampling"; J. Wiley and Sons: New York, 1967; pp 509-512.
(3) Kendall, M. G.; Buckland, W. R. "A Dictionary of Statistical Terms"; Oliver and Boyd: Edinburgh, 1957.
(4) Box, G. E. P.; Hunter, W. G.; Hunter, J. S. "Statistics for Experimenters"; J. Wiley and Sons: New York, 1978; pp 453-472.
(5) Neter, J.; Wasserman, W. "Applied Linear Statistical Models"; Irwin: Homewood, Ill., 1974; pp 156-159.
(6) Bennett, C. A.; Franklin, N. L. "Statistical Analysis in Chemistry and the Chemical Industry"; J. Wiley and Sons: New York, 1954; pp 232-234.

Roger R. W. Ellerton
Department of Mathematics and Statistics
University of New Brunswick
Post Office Box 4400
Fredericton, N.B. E3B 5A3
Canada

RECEIVED for review October 1, 1979. Accepted February 4, 1980.

to be computed, will result in the loss of a degree of freedom". Use of n - 1 was suggested to me earlier by Endrenyi (4). Ellerton speaks of the question of which is the "true" or "correct" model, I or II, corresponding to my Equations 1 and 2. This is a very helpful point to raise because it has bearing on whether to use n - 1 and on the definitions of variance and precision. I felt that my claim of greater accuracy for model II was supported by my calculations. It was a striking outcome that constraining the line to pass through the origin for both sets of data resulted in the same slope, whereas use of model I gave very different slopes and intercepts [(1), Table I]. I would say that this showed model II to be the true model since the slope of a spectrophotometric series is absorptivity, which is a characteristic property. One way of seeing why this result occurred is by examining Figure 1 (1). The concentration-absorbance points are located at some distance from the origin. This is generally true in spectrophotometric analyses. When the "best straight line" is extrapolated to the region of the origin, the random errors in the determinations often cause it to pass at a considerable distance from the 0,0 point. There are two other reasons for opting for model II as the true model. One is that measurements of differences between cell absorbances with the same liquid in each are not subject to the errors possible in making up solutions of known concentration. The other is the universality of Beer's law, which requires that absorbance of monochromatic radiation be a linear function of concentration and/or thickness, and be zero when concentration or thickness is zero. Most so-called deviations from Beer's law are only apparent deviations resulting from its application to systems where concentration does not vary linearly with dilution (5), or from inadequate correction for the blank.
If model II is chosen as being the true model, rather than as a restriction placed on model I, then Endrenyi's and Ellerton's recommendation of n - 1 must be accepted as the correct divisor in calculating variance and standard error of estimate. Then too, my statement regarding the lower precision of model II becomes incorrect. The lower value of Σ(y_i - ŷ_i)² for model I becomes immaterial because it does not measure precision if model I is not the true model. I find I must accept Ellerton's criticism that my suggestion of a least squares calculation for a line passing through the

© 1980 American Chemical Society


Anal. Chem. 1980, 52, 1153-1154

origin can be found in texts on statistics and is not new. I was unable to consult the two references cited by him, but found the derivation in Bennett and Franklin (6). I do apologize for this oversight; I have always considered inadequate survey of the literature to be inexcusable. To the best of my knowledge, my originality consisted of the application of b = Σx_i y_i / Σx_i²

(3) "Statistical Analysis in Chemistry and the Chemical Industry", Carl A. Bennett and Norman L. Franklin, John Wiley & Sons, New York, 1954, p 28.
(4) L. Endrenyi, Department of Pharmacology, University of Toronto, Toronto M5S 1A8, Canada, private communication.
(5) K. Buijs and M. J. Maurice, Anal. Chim. Acta, 47, 469-74 (1969).
(6) Ref. 3, p 232.

Frederick C. Strong III
Faculdade de Engenharia de Alimentos e Agricola
Universidade Estadual de Campinas
Caixa Postal No. 1170
13100 Campinas, S.P., Brasil

for the first time to a series of spectrophotometric measurements.
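The through-origin slope b = Σx_i y_i / Σx_i² and the two divisors debated in this exchange (n - 2 versus n - 1) can be sketched as follows. This is only an illustration in plain Python: the function name `fit_through_origin`, its `divisor_offset` parameter, and the concentration and absorbance values are all hypothetical and do not come from the original note (1).

```python
import math

def fit_through_origin(x, y, divisor_offset):
    """Fit y = b*x by least squares and return (b, s).

    b is sum(x_i * y_i) / sum(x_i ** 2).
    s is the standard error of estimate, computed with the divisor
    n - divisor_offset (1 for the n - 1 divisor, 2 for n - 2).
    """
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = sum_xy / sum_x2
    sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (len(x) - divisor_offset))
    return b, s

# Hypothetical spectrophotometric standards: concentration and absorbance.
concentration = [2.0, 4.0, 6.0, 8.0, 10.0]
absorbance = [0.21, 0.39, 0.62, 0.80, 0.99]

b, s_n_minus_1 = fit_through_origin(concentration, absorbance, divisor_offset=1)
_, s_n_minus_2 = fit_through_origin(concentration, absorbance, divisor_offset=2)

print(f"slope b = {b:.4f}")
print(f"s with divisor n - 1: {s_n_minus_1:.5f}")
print(f"s with divisor n - 2: {s_n_minus_2:.5f}")
```

The slope is the same in both calls; only the divisor changes, so the n - 1 form always gives the smaller standard error of estimate for the same residuals.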

LITERATURE CITED
(1) Frederick C. Strong III, Anal. Chem., 51, 298-9 (1979).
(2) "Introduction to Statistical Methods", D. L. Harnett, Addison-Wesley Publishing Co., Reading, Mass., 1970.

RECEIVED for review October 1, 1979. Accepted February 4, 1980.

Exchange of Comments on Neutron Activation Analyses for Simultaneous Determination of Trace Elements in Ambient Air Collected on Glass-Fiber Filters

Sir: Lambert and Wilshire (1) are to be commended for attempting the difficult task of determining trace element concentrations in ambient air using what appear to be careful experimental techniques and methods of data reduction. However, the reliability and significance of trace element analysis of air samples collected on glass-fiber filters are questionable. For example, only 3 elements out of the 26 reported by Lambert and Wilshire were above their discrimination limit in 80% or more of the measurements. Dams et al. (2) have stated that glass-filters must not be used for nondestructive neutron activation analysis (NAA) because they contain high concentrations of trace metals. Our work on cotton dusts supports the contention of Dams et al. We sampled air in a cotton processing area on vinyl chloride filter media. The sample and filter were irradiated in the nuclear reactor at North Carolina State University (the same facility employed by Lambert and Wilshire). We were concerned with elements of both short and long half-lives. As expected, chlorine dominated the background, which showed considerable variation. Similar to the results of Lambert and Wilshire, our data were obscured by the background. There was no consistent pattern even for elements of long half-lives which were expected to be present in high quantities, and we were unable to differentiate between actual differences in dusts and sampling errors. Similar background problems were encountered using X-ray fluorescence (XRF) spectrometry when filter media containing high trace element concentrations were employed. However, there is little problem with background interference using cellulose acetate filter media (3). Teflon filter media may be a better choice as it is less sensitive to water when gravimetric measurements are desired. While Lambert and Wilshire have developed a procedure that is adequate for a few elements with long half-lives and adaptable to the EPA requirement that fiber-glass filters be employed, we conclude that the NAA method is unreliable for samples collected on filters with a high trace element content. We suggest that, in general, dusts on glass-fiber filters cannot be properly analyzed by NAA or by any other purely instrumental multielement analytical method (e.g., XRF).

Sir: Fornes and Gilbert have apparently had an unfortunate experience. However, it is not clear how that can lead reliably to their rather sweeping nonquantitative generalization. First, Fornes and Gilbert seem to deplore the use of glass-fiber filters. Yet nowhere do they mention their experiences with such filters, ambient particulate, or the application of a discrimination limit. We did not intend to recommend glass-fiber filters and would relish a low background filter as would any analyst. However, all legitimate aspects of any situation need consideration, and tradeoffs which produce less than ideal conditions are possible. We encountered a non-ideal situation and had, we believe, some success. It was necessary to move beyond refusal to do the analyses or making unshaded assertions of unsuitability. The alternative was the implementation of a practical procedure which yielded a quantitative estimate of the smallest quantity of any element in the sample which could be distinguished from the filter contribution, i.e., the discrimination limit. This quantity could be determined before sampling is begun, and the approach is applicable to other filter media and analytical techniques. Next, Fornes and Gilbert question the reliability of our data. In an article by Walling et al. (1), arsenic data acquired by us using this technique on ambient samples from glass-fiber filters were compared to those from the same filters using an acid extraction and flameless atomic absorption spectrometry.

This article not subject to U.S. Copyright

LITERATURE CITED
(1) Lambert, J. P. E.; Wilshire, F. W. Anal. Chem. 1979, 51, 1346-1350.
(2) Dams, R.; Robbins, J. A.; Rahn, K. A.; Winchester, J. W. Anal. Chem. 1970, 42, 861.
(3) Fornes, R. E.; Gilbert, R. D.; Hersh, S. P.; Dzubay, T. G. Textile Res. J., in press.

R. D. Gilbert
R. E. Fornes*
School of Textiles
North Carolina State University
Box 5006
Raleigh, North Carolina 27650

RECEIVED for review September 16, 1979. Accepted March 20, 1980.

Published 1980 by the American Chemical Society