An Essay on "Significance"
"Significance" as applied to scientific data has two meanings. One of them refers to the magnitude of an effect relative to the experimental error. Well-developed statistical methods are available for assessing the significance of this difference. Another and equally important aspect of the problem deals with the question, "Is an effect sufficiently large to warrant publication and further study?" This essay is limited to the second and more subjective facet of significance. Since scientists continually judge the importance of their own results as well as the results of others, the topic deserves considerable attention. Opinion of significance can spell success or failure for papers and grant applications and can even influence decisions to enact major social programs.

Consider the effect of salt on the absorption of light by a dye dissolved in water. (This example, like all others herein, is based on work published in the literature; actual journal references are irrelevant.) It is found that 1 M salt shifts the absorption wavelength of the dye by 8 Å. Since the spectrophotometer determines wavelength accurately to within 0.2 Å, there exists a meaningful difference between the behavior of the dye with and without the salt. Now it might have happened that the wavelength shifted only 1 Å upon adding salt. Although this difference is much less than 8 Å, the scientist could conclude equally well that salt perturbs absorption by the dye. Assume next that the 1 M salt induces a 0.4 Å shift. The effect still exceeds the experimental error, but the change "seems" small. In order to accentuate the effect, the scientist could raise the salt concentration to 5 M (the solubility limit of the salt in water). In this manner he could achieve, for example, a psychologically more satisfying 0.8 Å. Clearly, as the magnitude of an effect gravitates toward the experimental error, some point will be reached at which the scientist no longer considers his effect "large."
He may even fail to report his results and abandon the project. On the other hand, a scientist of different temperament may submit these same data for publication. The journal editor may or may not reject the attempt; much depends on the skill with which the article is written and on the philosophy of the editor. What determines the point at which an effect is no longer considered "interesting"? When is a factor of 2 worthy of note, and when is it trivial? Such questions are not easily answered but are nonetheless worthy of examination.

A scientist obtained a value of 1.251 grams per liter for the density of nitrogen prepared from ammonia. He then prepared nitrogen by removing oxygen, carbon dioxide, and water vapor from air. The density of this atmospheric nitrogen was 1.257. The experiment was repeated several times with no variation in the results. But the laws of chemistry dictate that nitrogen should have the same density no matter what its source. Thus, the 0.006 gram per liter difference must be significant. Two reasonable hypotheses (among others) account for the discrepancy: (1) The ammonia-derived nitrogen was contaminated by hydrogen, which is lighter than nitrogen. (2) The nitrogen from air was contaminated by some gas heavier than nitrogen. The first possibility was ruled out by various negative tests for hydrogen. The alternative was tested by reacting a sample of atmospheric nitrogen with red-hot magnesium metal to form magnesium nitride. After the reaction was over, there remained behind a small amount of gas heavier than nitrogen, thereby confirming the second hypothesis. This is in fact the way Lord Rayleigh and William Ramsay discovered
argon. The point is that these scientists knew that their density effect was interesting because it violated certain expectations (namely that nitrogen from any source should have the same density). The word "significant" meant to Rayleigh and Ramsay any density difference greater than the experimental error.

Often, however, the criteria for significance are less obvious. For example, when an organic chemist publishes a reaction, he includes a % yield. A yield of 60% means that 40% of the starting material did not react or else was diverted to some undesired product. Suppose an investigator discovers how to improve a reaction yield from 60% to 70%. Does this finding merit the main theme of an article? Most chemists would reply negatively. Although the work is sound and an improvement over the past, one senses a lack of impact. After all, a chemist could secure all the product he needs by scaling up the quantities (albeit at greater expense). Moreover, many chemists would claim that the work provides no new insights into chemical behavior and that it manifests little creativity. It is unlikely, therefore, that the work could be published in a major chemical journal. Note that the failure stems neither from faulty data nor incorrect conclusions but rather from subjective opinions. Success in science clearly depends on "non-scientific" value judgements not unlike those received by a novelist from newspaper critics. Science and the humanities are much more alike in this regard than many scientists admit.

How could the chemist with a 10% yield improvement assist others in evaluating his work? Obviously, he could succinctly describe the reasons for carrying out the experiment. If his original goal had been to elevate the yield by 10% (or 20% or whatever), then this should be so stated and justified. If the 10% increase was an unexpected finding (which does not necessarily impair its potential value), then this would also be brought out in the discussion.
But presenting a 10% change in the absence of any baseline, any motivated expectation, leaves the reader with no sense of import. This uncovers a basic rule which should if possible be applied to the discovery of all new effects: relate data to a carefully specified definition of significance. Reviewers of the work may disagree over the particular definition, but at least there would be a means for judging success.

The next example describes the work of a scientist who followed the rule. It was found that 50, 75, and 100 grams of a certain chemical increased the average production of plum trees by 5%, 6%, and 6%, respectively. The scientist estimated that application of the nutrient at the 50-100 gram level would become economically feasible only if production were increased by 2-3% or more. In this context the nutrient has a significant effect on plum trees.

Some time ago a leading journal of chemistry published an article which illustrates the problems in assessing significance when realistic criteria are not specified. The article was selected rather arbitrarily as there are many like it. It was demonstrated that high concentrations of an additive "significantly inhibit" a certain reaction. Thus, 0.40 M additive reduces the reaction rate from 112 to 82. Nothing is revealed about the cause of the inhibition. The paper does mention, however, that the additive also exists in biological systems and might cause inhibitions there as well. Since no biological experiments were actually carried out, the significance of the
Volume 57, Number 5, May 1980 / 351
work rests solely on the chemical data indicated. Did this paper merit special attention? Apparently, the authors and two reviewers thought it did. In truth, one cannot judge the significance because there is no basis for comparison. The published data are trivial relative to a change from 112 to 0.82. Yet one could imagine circumstances under which a decrease from 112 to 82 would attract great interest. Unfortunately, the authors claim significance without defining the word. Why is a change from 112 to 82 worthy of publication? What change did the authors expect or hope for? Would they have similarly published a change from 112 to 102? The readers need answers to these questions in order to grasp the impact of the results.

Biological systems are controlled homeostatically and can tolerate only small variations. A 10% temperature change in general damages a mammal much more than it does an engine. Yet the narrow limits of biological data do not preclude difficulties assessing their significance. The limits serve only to decrease the percent change worthy of acclaim. For example, a new drug that lowers blood cholesterol 2-fold would be considered highly effective (whereas a drug that lowers the level only 5% would probably be ignored). The point at which the region of interest begins depends on the objectives of the investigator (i.e., his concept of significance). A pharmacologist who discovers a compound that reduces the number of leukemia cells in the blood by 80% could conceivably report a "large" effect of the drug on cancer cell populations. Alternatively, he could define a "significant drug" as one which reduces cancer cells below the level of detection. Accordingly, he would regard this drug as useless. It is easy to understand the temptation to ignore the rule to define significance; by doing so, research results do not get cast out of existence.
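The questions raised above about the 112 to 82 result can be made concrete with a little arithmetic. The following is a minimal Python sketch, not a method from any of the papers discussed: the function names and the threshold values are hypothetical, chosen only to illustrate how the same number passes or fails depending on the definition of significance that is stated in advance.

```python
def percent_change(before: float, after: float) -> float:
    """Return the signed percent change from `before` to `after`."""
    if before == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return 100.0 * (after - before) / before


def is_significant(before: float, after: float, threshold_percent: float) -> bool:
    """Return True if the size of the change meets a stated threshold.

    `threshold_percent` stands for the investigator's own definition of
    significance. It is an assumption supplied by the caller, not a
    value taken from the essay.
    """
    return abs(percent_change(before, after)) >= threshold_percent


# Reaction rates quoted in the essay (units were not given).
for after in (82, 102, 0.82):
    print(f"112 -> {after}: {percent_change(112, after):.1f}%")

# Hypothetical criteria applied to an 80% drop in leukemia cells:
# a 50% threshold accepts the drug, while a near-total-elimination
# threshold (99.9%, standing in for "below detection") rejects it.
print(is_significant(100, 20, threshold_percent=50.0))
print(is_significant(100, 20, threshold_percent=99.9))
```

The calculation itself settles nothing. It only shows that a verdict of "significant" becomes reproducible once the threshold is written down, which is the point of the rule.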
To further drive home the need for framing research data, I now cite a piece of work, chosen arbitrarily from the recent biochemistry literature, in which significance lies in a contextual void. Metal ions are known to assist the conversion of ATP into phosphate by a liver enzyme. A group of investigators discovered that the effect of metal ions depends on their size (the larger the metal, the greater the effect). Thus, the relative production of phosphate per unit time under standard conditions is 1.5, 1.8, and 2.2 for Li+, Na+, and Cs+. No one can claim that this effect is not new, real, and publishable; its illuminative value is another matter. Since the reader is not told what values were expected and why, he benefits only superficially from the information.

The above metal ion data were presented graphically as shown in the figure where Y = enzyme activity and X = metal ion radius. Since the experiments were conducted carefully, there is no doubt that Cs+ increases enzyme activity more than Na+, and Na+ more than Li+. Although enzyme activity depends on metal ion size, it is also seen that enzyme activity does not depend very much on metal ion size (i.e., the data seem rather insignificant). In order to make this last statement, I asked three key questions about the figure:

(a) By what factor does Y vary over the experimental range of X? 2-fold? 10-fold? 10^4-fold? This question immediately focuses attention on the scale of the graph. In the figure, the Y scale has been expanded enormously to accommodate an overall change in the data of only 50%.

(b) How is Y affected by variations in parameters other than X? The answer to this question provides a "feel" for the sensitivity of enzyme activity to change.
352 / Journal of Chemical Education
[Figure: The effect on enzyme activity (Y) of the radius of added metal ions (X).]
Now it is known that pH, temperature, substrate structure, etc., can perturb enzyme activity by orders of magnitude; in contrast, the 50% metal effect in the figure appears small indeed.

(c) Would the explanation for the figure have to be modified if Y were to experience a much larger change than that actually found (e.g., 500% rather than 50%)? A negative answer reflects adversely on the quality of the information provided by the graph. The data in the figure can be "explained," for example, by favorable metal binding to the enzyme prior to the reaction; the large ions bind more effectively than small ones. Clearly, this rationale would remain unaltered even if the enzyme activity changed 10 times as much as that seen in the figure. In the absence of a theoretical framework within which to judge significance, the figure loses much of its meaning.

Aflatoxin B1 ultimately gives cancer to 50% of animals fed dosages of 1 × 10^-6 g/kg animal. Saccharin has the same effect at dosages of 1.0 g/kg animal. If a significant carcinogen is defined as one which causes cancer in any dose, whatever its size, then saccharin should be banned from general use. If, on the other hand, the danger point is set at dosages of 10^-[illegible] g/kg animal or less, then the saccharin data do not indicate significant carcinogenicity. From the point of view of a scientist, either definition is acceptable. It is not satisfactory, however, to claim that a compound induces (or does not induce) significant occurrence of cancer without defining the meaning of significance. Of course, in this instance the definition is rather arbitrary; such is not the case, for example, with the powerful experiments performed by Rayleigh and Ramsay.

One final point about significance should be made. In the arts and humanities a person can create a brilliant piece of work and still be totally ignored. Nothing forces anyone to take note of a novelist's work. This is less true in science.
A scientist ignores significant discoveries of others only at his own peril. Professional success depends on digesting information from all laboratories, both prominent and obscure. And with a little effort, the significance of this information can be defined clearly and judged fairly.

F. M. Menger
Emory University
Atlanta, GA 30322