
Jun 1, 1970 - LETTERS - "Would You Believe 99.9969% Explained?" and "Pitfalls of Stepwise Regression Analysis". Douglas C. Crocker. Ind. Eng. Chem...
LETTER

Sir: The May and June 1969 issues of Industrial and Engineering Chemistry presented a pair of articles by Robert A. Stowe and Raymond P. Mayer titled, respectively, “Would You Believe 99.9969% Explained?” and “Pitfalls of Stepwise Regression Analysis.” Some of the discussion presented by Messrs. Mayer and Stowe supports the “little knowledge is a dangerous thing” warning of the easy misuse of regression techniques, a thesis deserving support. However, their cry of alarm is somewhat analogous to a nonswimmer plunging into treacherous, unknown waters, and, upon rescue, denouncing water as manifestly evil and malign. The very virtues and powers of regression analysis which can be, in competent hands, of great assistance in a case such as the authors describe, are renounced. They have been tricked and the whole fraud must be exposed, as if no one else had ever been swimming before. A number of their points are valid, though naive. That they were beguiled by their anthropomorphic computer and crank-turning approach to problem solving is no surprise. They admonish the reader against imitating their folly but, in their zeal, overgeneralize their condemnation. The articles suffer from technical and interpretive misrepresentations and careless presentation. For the sake of brevity, I shall skip a large number of relatively minor technical points and concentrate on a few major points.

From “Would You Believe . . . ?”

(A) While there is no particular objection to using (with understanding) a cut-off index of F = 2 during such a fishing expedition, the authors are remiss in alluding to a statistical significance associated with this use of the F ratio. It can hardly be regarded as a proper test when the conditionally best-fitting variable from a practically infinite set of variables is tested as if it were, under the null hypothesis, a random selection from among a set of variables which have only chance relationship to the response. (Relative to the sample size of 22, 64 predictors is a very large number and is effectively made much larger than the number 64 by the introduction of senseless, highly intercorrelated transformations of the original predictors.) It is as if the authors had stood overlooking a crowd, had picked the tallest person in that crowd, and had then performed a test to see if that person were unusually tall.

(B) The principal point of the random number demonstration has been missed. The danger exposed is the use of automatic, model-building, variable-choosing algorithms. The authors wisely counsel (p 46) that: “A better approach is to specify the physical laws governing the problem, select the appropriate equations expected, and test the data for fit to these prespecified equations.” This is a good prescription for model validation but few problem situations are so well understood that the final model can be specified a priori. What is needed is a search mode that falls in between the total rigidity of using only prespecified models, and the total abandonment of blind model-building algorithms such as the stepwise procedure employed in this case. This search mode would be based on several prerequisites:

1. Reasonably specific goals
2. An understanding of the statistical procedures and measures
3. Reasonable familiarity with the system represented
4. Restraint in using only those transformations of variables which make sense
5. Facility for a complete diagnostic examination of the data and the interrelationships
6. A cyclical approach which recognizes the question-asking as well as the question-answering aspects of regression analysis
7. A willingness to adequately validate the model

It is mainly in items 2 and 4 that this case study floundered. The last page of the discussion (p 46) is replete with examples showing that this experience did not cure the authors of their willingness to venture in over their statistical heads. They call for orthogonal design. This is a noble goal but one which ignores the 99% of the problems which lie beyond the pale of laboratory control. They make a claim regarding the proper relative size of correlations which is pure nonsense (under the heading, “Many small correlations . . . .”). This is followed by an equally ridiculous charge that: “The problem should have been rejected at Step 1, when the initial correlation was so small.” The crown of ignorance is added with the assertion that variation which is comprised of many small components cannot (or should not) be analyzed at all.

(C) The next paragraph headed, “Adjusting to ‘discover hidden effects,’ ” is ludicrous. It is based on a total lack of familiarity with or understanding of the spectrum of intercorrelation phenomena. The discussion is mainly concerned with the propriety of entertaining in the model a predictor variable whose zero-order partial correlation with the response is lower than would be anticipated (considering the physics) because of inherent intercorrelation between such predictor and some other predictor(s). (The other(s) may be conversely “hidden” by the first.) The fact that truth is temporarily hidden from view by the nature of the system, a rather common occurrence, is no reason to reject that truth. We should rejoice that, used with understanding, multivariate analysis techniques allow us to see through this camouflage.

(D) The authors include in this same paragraph, as if it were more of the same (“hidden effects”), the case of “confounding” which is also due to intercorrelation. Their advice for this is to get “better data” but they fail to quantify or even qualify what is meant by “better.” They offer no insight into the nature of this problem or how to cope with it.

The readers of Industrial and Engineering Chemistry may wish to explore further some of these points. Reference (1) deals generally with the treatment of “nonexperimental” data and lists 14 additional references. Reference (2) contains a section (pp 121-3) which deals specifically with the quantification of the intercorrelation phenomenon, a general theorem, and a classification scheme. That work contains 36 additional references. Reprints of those two papers can be obtained by writing to this correspondent at his home.

(1) Crocker, D. C., “Intercorrelation and the Utility of Multiple Regression in Industrial Engineering,” J. Ind. Engrg., 18 (1), January 1967, pp 79-85.
(2) Crocker, D. C., “Linear Programming Techniques in Regression Analysis: The Hidden Danger,” AIIE Transactions, 1 (2), June 1969, pp 112-26.

From “Pitfalls . . .”



(A) Throughout this discussion, the authors misinterpret tests of statistical significance. There is great distaste for their “alarm” at a t = 2.125 for the strongest noneffect among 15 known noneffects simultaneously tested (with no residual degrees of freedom) and their conclusion that this shows such tests to be unreliable. Of more concern are their irresponsible remarks concerning the associated probabilities. They claim (p 12) that, “Statistical techniques assist . . . by showing the probabilities of the results having occurred by chance.” They offer (p 13) 100 (1 - α) as the “. . . percent confidence that the effect did not arise by chance. . . .” Such statements are terribly misleading to the statistical novice whose instruction is the avowed purpose of the authors.

These statements combine two misconceptions. The concepts involved seem simple but this is a deception. These concepts are tricky and are frequently confused even by experienced analysts. One misconception is that α and 1 - β are complementary. The other involves attaching probabilities to facts rather than to random variables. Both of these arise from failure to recognize the conditional nature of the test-associated probability statements. The table below may help to make this clear.

Truth is not a random variable. In any given case, a particular hypothesis is (unknown to the analyst) either true or false. To make a decision, a test statistic is obtained. The decision to act as if the hypothesis were true or to act as if it were false is based on the test statistic. The test statistic is a random variable. Its behavior can be described using probability statements. Hence, the decision process (based on this random variable) can be evaluated probabilistically.

                                  Truth
Decision based on test            Effect does not exist          Effect is real
statistic and test rule           (hypothesized as               (hypothesized as
                                  “null hypothesis”)             “alternate hypothesis”)

“Accept” null (no-effect)         Correct decision,              Incorrect decision,
hypothesis. Claim no effect       prob. = 1 - α                  prob. = β
and act accordingly

Reject null hypothesis.           Incorrect decision,            Correct decision,
Claim that effect is real         prob. = α                      prob. = 1 - β
and act accordingly

Total probability                 1.0                            1.0

Following a particular test procedure, α is the probability that the procedure will erroneously claim an effect, i.e., claim an effect when there is none. (The clause beginning “when” is the conditional part of the probability statement.) The complement, 1 - α, is the probability that the decision process will have the good sense to claim that there is no effect when there is none. These two possible results provide a total probability of unity for the condition, “effect does not exist.”

The wording, “percent confidence that the effect did not arise by chance,” carelessly implies that the probability is associated with truth rather than with a declaration about truth made by the decision process. But that is not the main trouble. Worse by far is the twisted meaning imparted. By disregarding the conditional aspect, the tabled values of 100 (1 - α) in the article, such as 75%, are taken as the degree of belief that the effect is real, when in fact there is no effect and the value, 1 - α, is the conditional probability of correctly concluding that there is no effect, which is quite a different interpretation.

The authors apparently want to evaluate the procedure for both conditions of truth simultaneously. They confess “great temptation to accept” the reality of the first four (1 - α ranging down to 0.75). This temptation no doubt arises from their anxiety that this test procedure will disregard some real effects from time to time. But if we are to discuss this aspect and evaluate this risk, we must introduce this condition. How big is the effect? How important is an effect that big, i.e., what is its practical significance? Given a test statistic and a critical value (associated with a prespecified risk, α), the risk, β, can be evaluated only with respect to some particular size of effect, that is, some particular alternative hypothesis. For that real effect, 1 - β becomes the probability that such an effect will be recognized by the test procedure. Rewording this: 100 (1 - β) is the percent confidence that this effect, which did not arise by chance, will be detected. Compare this to the authors’ statement: 100 (1 - α) is the “. . . percent confidence that the effect did not arise by chance. . . .” That conditional “which” cannot be omitted. The complement of α is not 1 - β.

Now all this may seem like semantic quibbling. Not at all. Truth cannot be established by statistical procedures; they can only serve to guide our search. The very foundations of the scientific process are eroded by careless use of these methods of search. “Wisdom is the principal thing; therefore get wisdom; and with all thy getting, get understanding.” (Bible)

(B) In the summary (p 16), the unsupported claim is made that: “The method of internal pooling of variance for estimating the error leads to even higher risk when applied to nonorthogonal data.” Although this is highly ambiguous, there is no general property of nonorthogonal data that could render this statement correct. There may be special cases but none has been demonstrated by the authors.

(C) The authors close with a recommendation that their random-number-demonstration method be widely employed “for checking the validity of results obtained from plant and laboratory experiments.” Presumably, they mean that an understanding of the behavior of statistical models is prerequisite to their use. The task of extending our understanding beyond this trivial display (for which very adequate models already exist) lies far beyond the exhibited competence of these authors.
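The crowd analogy in point (A) of the comments on “Would You Believe . . . ?” and the “strongest noneffect among 15” in point (A) of the comments on “Pitfalls . . .” describe the same selection effect, and a short random-number run can illustrate it. The sketch below is an illustration only and is not the procedure used by Stowe and Mayer. It assumes independent normal noise for the response and for every candidate predictor, looks only at the first selection step, and uses hypothetical function names, an arbitrary seed, and an arbitrary number of trials. The sample size of 22, the 64 predictors, and the cut-off F = 2 come from the letter; 4.35 is the tabulated upper 5% point of F with 1 and 20 degrees of freedom.

```python
import random


def f_ratio_single(x, y):
    """F ratio (1 and n-2 d.f.) for regressing y on the single predictor x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r2 = sxy * sxy / (sxx * syy)  # squared simple correlation
    return r2 * (n - 2) / (1.0 - r2)


def best_f_among_noise(n_obs, n_pred, rng):
    """Largest F ratio among n_pred pure-noise predictors (the 'tallest person')."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    return max(
        f_ratio_single([rng.gauss(0.0, 1.0) for _ in range(n_obs)], y)
        for _ in range(n_pred)
    )


def exceed_rates(n_trials=300, n_obs=22, n_pred=64, seed=1970):
    """Fraction of trials in which the best noise predictor passes each threshold.

    Returns (rate at the cut-off F = 2, rate at the nominal 5% point).
    The seed and n_trials are arbitrary choices made for this sketch.
    """
    rng = random.Random(seed)
    cutoff = 2.0          # cut-off index quoted in the letter
    f_crit_05 = 4.35      # tabulated upper 5% point of F(1, 20); valid for n_obs = 22
    best = [best_f_among_noise(n_obs, n_pred, rng) for _ in range(n_trials)]
    rate_cutoff = sum(f >= cutoff for f in best) / n_trials
    rate_nominal = sum(f >= f_crit_05 for f in best) / n_trials
    return rate_cutoff, rate_nominal


if __name__ == "__main__":
    print("best of 64 noise predictors:", exceed_rates())
    print("one prechosen noise predictor:", exceed_rates(n_pred=1))
```

Under these assumptions the 64 F ratios are independent, so the best of them should pass the nominal 5% point in about 1 - 0.95^64, or roughly 96%, of runs, whereas a single predictor chosen in advance (`n_pred=1`) should pass it in about 5%. The test is sound; applying it to the selected variable as if it had been chosen at random is not.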

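The conditional structure of the table in point (A) of the comments on “Pitfalls . . .” can also be put in numbers. The following sketch assumes, for simplicity, a two-sided test on a normally distributed statistic with known unit standard error (a z test, not the t tests discussed in the articles). The names `prob_claim_effect` and `decision_table` and the standardized effect size `delta` are hypothetical. It computes the probability of claiming an effect for a given α and a given true effect, so that α appears as the special case of no effect and 1 - β as the case of one particular real effect.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def critical_value(alpha):
    """Two-sided critical value for a z test run at risk alpha."""
    return _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


def prob_claim_effect(alpha, delta):
    """P(test rejects the null | true standardized effect = delta).

    With delta = 0 this is alpha; with delta != 0 it is the power, 1 - beta.
    """
    z = critical_value(alpha)
    upper_tail = 1.0 - _STD_NORMAL.cdf(z - delta)
    lower_tail = _STD_NORMAL.cdf(-z - delta)
    return upper_tail + lower_tail


def decision_table(alpha, delta):
    """Conditional probabilities of each decision, one column per state of truth."""
    power = prob_claim_effect(alpha, delta)
    return {
        "effect does not exist": {"accept": 1.0 - alpha, "reject": alpha},
        "effect is real": {"accept": 1.0 - power, "reject": power},
    }


if __name__ == "__main__":
    alpha = 0.25  # corresponds to the "75%" entries mentioned in the letter
    for delta in (0.5, 1.0, 2.0, 3.0):
        real = decision_table(alpha, delta)["effect is real"]
        print(f"delta={delta}: beta={real['accept']:.3f}, 1-beta={real['reject']:.3f}")
```

With α fixed at 0.25, the column for “effect does not exist” reads 0.75 and 0.25 no matter what the truth may be, while the entries in the other column change with every choice of `delta`. The value 1 - α therefore says nothing about whether an effect is real, and β cannot be stated until a particular size of effect has been named.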

Douglas C. Crocker
10 Ironwood Dr.
Rochester, N.Y. 14616

VOL. 62 NO. 6 JUNE 1970