Reproducible bad data for instruction in statistical methods - Journal of

Reproducible bad data for instruction in statistical methods. Thomas H. Richardson. J. Chem. Educ., 1991, 68 (4), p 310. DOI: 10.1021/ed068p310. Publ...
Reproducible Bad Data for Instruction in Statistical Methods

Thomas H. Richardson
The Citadel, Charleston, SC 29409

An important part of the undergraduate quantitative analysis course is an introduction to the statistical analysis of experimental data. Students readily understand basic statistical procedures and apply these to the work done in the laboratory. Unfortunately, some of the finer details of statistical theory are considered but briefly or are overlooked entirely. One such topic is the treatment of spurious data: the identification and removal of outlying replicates. The treatment of spurious data is generally limited to a discussion of the "q-test" (1-4) as a mathematical method of identifying and removing those replicates that do not belong to the small sample that purports to represent the population being considered. Chauvenet's criterion (5-7) as applied to the Gaussian error function is also suggested as an evaluation technique. However, a comment is generally made that commonsense arguments for rejecting doubtful data should be given priority over mathematical tests.

To the student, textbook exercises for data rejection are just collections of numbers. Attitudes include "anyone can dream up bad numbers" and "it's obviously bad, why test it?" The pedagogical problem is to provide controlled outlier data for student instruction and practice that does not appear to be artificial. The goal is that this data will have been measured by the student him- or herself so he or she will "know" it is "good" data consisting of "real" numbers, not something that the instructor has created. In addition, the sample that the student measures must not seem to be "fixed"; it must yield reproducible "doubtful" results when the student repeats measurements. A laboratory exercise has been developed that meets these requirements and gives data suitable for illustrating data rejection procedures and basic statistical calculations.

A typical laboratory assignment at the start of the quantitative analysis course is instruction in the use of the analytical balance.
This work involves weighing an assortment of items and using the data to support the lecture room presentations on statistical techniques and data analysis methods. A variety of populations have been used for this purpose; one of the more successful is U.S. one-cent pieces: the masses of these coins vary over a reasonably narrow range, there is minimal expense involved in preparing and performing the experiment, students are exposed to the concept that commonplace items can be "chemically interesting", and there is a reasonable expectation of a normally distributed population. These samples of pennies allow students to practice the calculation of an average and standard deviation as well as the confidence interval and other statistical quantities on "real" data. When the class data is pooled, the behavior of a larger sample can be studied, while the incorporation of historical data allows the investigation of a very large sample that can better approximate the true population. In recent years, however, events have occurred that allow pennies to be not only a good example for the illustration of basic statistical procedures but also permit the demonstration of methods for handling questionable data. In 1982, the United States Mint changed the composition of one-cent


310

Journal of Chemical Education


pieces from a brass alloy of 95% copper and 5% zinc to a composite consisting of a zinc core with a copper overlayer. Aside from the date on the coin, there is no visual distinction between the brass and zinc populations; the older brass pieces continue to circulate in significant numbers. There is, however, an easily measured mass difference between these two populations; the brass variety has a nominal mass of 3.1 g as compared to 2.5 g for the zinc coins. This difference in mass between the brass and zinc populations is used as the basis of the laboratory exercise, which illustrates the handling of spurious data. Furthermore, there is a general lack of student awareness of this change in composition of one-cent pieces; consequently, there is an inherent assumption that pennies constitute a single population.¹
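As a rough illustration of why a single brass coin stands out among zinc coins, the q-test can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the ten masses are hypothetical values chosen near the nominal 2.5 g and 3.1 g figures (they are not measurements from this work), the function name `q_test` is invented for the example, and the critical values are the commonly tabulated 90% confidence figures for Dixon's Q.

```python
# Dixon Q-test sketch. The masses are HYPOTHETICAL illustration values,
# not data from the article.

# Commonly tabulated critical Q values at 90% confidence, keyed by sample size.
Q90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
       7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}


def q_test(values, table=Q90):
    """Return (suspect, Q, reject) for the most isolated extreme value."""
    data = sorted(values)
    spread = data[-1] - data[0]
    if spread == 0:
        return data[0], 0.0, False
    gap_low = data[1] - data[0]      # gap between the two smallest values
    gap_high = data[-1] - data[-2]   # gap between the two largest values
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low
    q = gap / spread
    return suspect, q, q > table[len(data)]


# Nine coins near 2.5 g plus one near 3.1 g (assumed values, in grams).
sample = [2.51, 2.49, 2.53, 2.50, 2.48, 2.52, 2.54, 2.47, 2.51, 3.09]
suspect, q, reject = q_test(sample)
print(f"suspect = {suspect} g, Q = {q:.3f}, reject = {reject}")
```

With a 0.6 g gap against a spread of a few hundredths of a gram among the other coins, the test statistic is far above the critical value, and it stays there when the coin is reweighed, which is the reproducibility the exercise relies on.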


Experimental Procedure

Samples of 10 pennies consisting of nine coins taken from one population and one from the other are prepared in advance of the laboratory period. Each coin should have a different date and mint mark combination so the student can identify a specific coin for checking a doubtful measurement. The date distinction among coins is ignored once any measurement problems are resolved. If as many samples are enriched in brass coins as are enriched in zinc coins, the subsequent analysis of pooled data will have comparable numbers of members of each population.

Students are directed to determine the combined mass of the set of coins and to calculate a group average by dividing this total mass by the count. The mass of each individual coin is measured and recorded together with its date and mint mark. The median is evaluated. Suspicious measurements can be examined with the q-test or Chauvenet's criterion and failures reweighed; if the coin still fails, the student omits this replicate from his subsequent statistical calculations for the mean, standard deviation, and confidence interval. The class data are now pooled and used in subsequent classroom presentations and discussions.

Interpretation and Discussion

As the work progresses, doubts about strange values are voiced. The students are instructed to reweigh any suspicious coins; if there is no change in the measurement, they are told to accept the seemingly spurious result: "that's what you get, that's what you record." The significant learning aspect from this discussion is that the data obtained are to be the data recorded, even if (or especially if) the data appear to be wrong. The student can easily calculate and compare three convenient measures of the central tendency of a sample (the group average, the median, and the mean) and understand the advantages and disadvantages of each. The various measures of precision and accuracy are also easily calculated.
Standard methods for the comparison of averages and variances are applied (1, 2, 4); students immediately identify the problems of comparing dissimilar items. As
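The calculations in the procedure can be sketched as follows. This is a minimal sketch, not the worksheet used in the course: the masses are hypothetical values near the nominal 2.5 g and 3.1 g figures, the helper name `chauvenet_flags` is invented here, Chauvenet's criterion is applied in its usual form (reject when the expected number of equally extreme readings falls below one half), and the Student's t values are standard 95% table entries.

```python
import math
import statistics

# HYPOTHETICAL masses in grams (nine "zinc" coins, one "brass" coin);
# illustration values only, not data from the article.
masses = [2.51, 2.49, 2.53, 2.50, 2.48, 2.52, 2.54, 2.47, 2.51, 3.09]


def chauvenet_flags(values):
    """Flag values whose expected count of equally extreme readings is < 0.5."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    flags = []
    for x in values:
        z = abs(x - mean) / s
        # Two-tailed Gaussian probability of a deviation at least this large.
        p = math.erfc(z / math.sqrt(2))
        flags.append(n * p < 0.5)
    return flags


# Group average: combined mass of the set divided by the count.
group_average = sum(masses) / len(masses)
median = statistics.median(masses)

# Mean, standard deviation, and confidence interval after omitting failures.
flags = chauvenet_flags(masses)
kept = [x for x, bad in zip(masses, flags) if not bad]
mean_kept = statistics.mean(kept)
s_kept = statistics.stdev(kept)

# Student's t at 95% confidence, keyed by degrees of freedom (table values).
T95 = {7: 2.365, 8: 2.306, 9: 2.262}
half_width = T95[len(kept) - 1] * s_kept / math.sqrt(len(kept))

print(f"group average = {group_average:.4f} g, median = {median:.2f} g")
print(f"mean of kept  = {mean_kept:.4f} +/- {half_width:.4f} g (95%)")
```

In this assumed sample the group average is pulled toward the odd coin while the median is not, which is the kind of contrast among the three measures of central tendency that the discussion draws on.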

Presented in part at the April 1989 meeting of the Southeast Association of Analytical Chemists, Charlotte, NC.

¹ This peculiar property of U.S. one-cent pieces has been utilized as the basis of an "isotope" experiment. Wade Freeman at the Ninth Biennial Conference on Chemical Education, Bozeman, 1986.

Statistics for Zinc and Brass One-Cent Pieces (a)

                                        zinc       brass
replicates kept                         255        254
rejected by "3-s"                       7          8
average mass, grams                     2.5102     3.0808
standard deviation                      0.0271     0.0328
relative deviation                      0.0108     0.0106

chi-square calculation (after "3-s" test) for Gaussian behavior
0.5 sigma (b), 9 dof (c)                6.458      4.604
1.0 sigma (b), 3 dof (c)                14.163     3.269

(a) Final data set; available from the author, see footnote in text.
(b) Test brackets cover the range from -3 sigma to +3 sigma about the mean.
(c) Degrees of freedom; a chi-square result that is greater than the degrees of freedom can be used as a basis for failure of the test (7). The corresponding percentage points range from 25 to 90% probability of failure.
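The tabulated quantities can be cross-checked with a short sketch. The numbers are transcribed from the table; the assignment of the first chi-square value in each row to zinc and the second to brass is an assumption about the column layout, and the function names are invented for this example. The failure rule is the one stated in the table note: a chi-square result greater than the degrees of freedom.

```python
# Summary statistics transcribed from the table (masses in grams).
summary = {
    "zinc":  {"mean": 2.5102, "s": 0.0271, "relative": 0.0108},
    "brass": {"mean": 3.0808, "s": 0.0328, "relative": 0.0106},
}

# Chi-square results keyed by (bin width in sigma, degrees of freedom).
# ASSUMPTION: first value in each table row is zinc, second is brass.
chi_square = {
    (0.5, 9): {"zinc": 6.458, "brass": 4.604},
    (1.0, 3): {"zinc": 14.163, "brass": 3.269},
}


def relative_deviation(mean, s):
    """Standard deviation expressed as a fraction of the mean."""
    return s / mean


def fails_gaussian_check(chi2, dof):
    """Rule from the table note: fail when chi-square exceeds the dof."""
    return chi2 > dof


recomputed = {
    coin: round(relative_deviation(row["mean"], row["s"]), 4)
    for coin, row in summary.items()
}

verdicts = {
    (width, dof, coin): fails_gaussian_check(value, dof)
    for (width, dof), row in chi_square.items()
    for coin, value in row.items()
}

for coin, value in recomputed.items():
    print(f"{coin}: relative deviation = {value}")
for (width, dof, coin), failed in verdicts.items():
    print(f"{coin}, {width} sigma bins, {dof} dof: fails = {failed}")
```

The recomputed relative deviations agree with the tabulated ones, and under this rule both populations pass with the 0.5 sigma brackets and fail with the 1.0 sigma brackets.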

Frequency plot of brass cent pieces compared to a Gaussian calculated for the experimental average and standard deviation. The histogram is for all experimental data, while the calculated curve is for data that were retained after a "3-s" test.
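The two steps named in the caption, a "3-s" screen followed by a Gaussian calculated from the retained data, can be sketched as below. This is an illustration under assumptions, not the author's calculation: the masses are hypothetical, the screen is applied in a single pass, the function names are invented, and the twelve brackets of 0.5 standard deviation spanning -3 to +3 standard deviations are chosen to mirror the brackets described in the table notes.

```python
import statistics


def three_s_filter(values):
    """Single-pass "3-s" screen: split values at 3 standard deviations."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    kept = [x for x in values if abs(x - mean) <= 3 * s]
    rejected = [x for x in values if abs(x - mean) > 3 * s]
    return kept, rejected


def expected_counts(kept, edges):
    """Counts a Gaussian with the sample mean and s predicts for each bin."""
    dist = statistics.NormalDist(statistics.mean(kept), statistics.stdev(kept))
    n = len(kept)
    return [n * (dist.cdf(hi) - dist.cdf(lo))
            for lo, hi in zip(edges, edges[1:])]


# HYPOTHETICAL brass-coin masses in grams: sixteen values clustered
# around 3.08 g plus one assumed natural outlier at 3.40 g.
offsets = [-3, -2, -2, -1, -1, -1, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
masses = [round(3.08 + 0.01 * k, 2) for k in offsets] + [3.40]

kept, rejected = three_s_filter(masses)
mean = statistics.mean(kept)
s = statistics.stdev(kept)

# Twelve brackets of 0.5 s covering -3 s to +3 s about the mean.
edges = [mean + 0.5 * k * s for k in range(-6, 7)]
expected = expected_counts(kept, edges)

print(f"rejected: {rejected}")
print(f"retained mean = {mean:.4f} g, s = {s:.4f} g")
print("expected counts per bracket:", [round(e, 2) for e in expected])
```

Comparing these expected counts with the observed histogram counts bracket by bracket is what a chi-square calculation then summarizes in a single number.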

appropriate, a student performance ranking is established from these comparison tests and the individual statistical calculations.

With the pooling of the student data comes the realization that the spurious replicate that was excluded from one student's calculations appears to "belong" to another student's sample. Better students quickly realize that one-cent pieces, which were assumed to be a single population, are in fact a mixture of two distinct populations; a more valid nonmathematical procedure replaces the q-test as a discriminator of "bad" replicates. This point becomes much more evident when the pooled class data is combined with collected historical data² and a histogram prepared; the bimodal distribution of the two populations is immediately apparent. The student readily comprehends data rejection by reason of the failure of the replicate to belong to the population being studied.

The instruction in statistical methods continues with an analysis of this large data set. Averages and standard deviations are calculated for the brass and zinc populations. The "3-s" test (8) or other suitable criteria (9) are then used to identify and reject natural outliers and improve the calculated averages and standard deviations. The two population

distributions are compared to Gaussian distributions calculated from the experimental statistical parameters (8); this is shown for the brass pieces in the figure and summarized for both populations in the table. A visual comparison of the histogram and the calculated Gaussian curve suggests a moderate but probably acceptable mismatch, while chi-square comparison calculations suggest that this agreement is marginal (7, 8). The important point to be brought to the students' attention is that an enormously large sample is needed to demonstrate conclusively that a population exhibits a Gaussian or any other distribution. Despite this possible doubt, a final experimental exercise is to weigh several fresh pennies to test the predictive properties of the Gaussian distribution.

Student response to this combined laboratory and lecture instruction in statistical methods has been positive. Although the historic data are given to the students, the students do understand how these numbers were obtained. An enhanced level of student confidence and comprehension enters into the discussion of experimental statistical analysis and other aspects of data measurements and reliability.

Literature Cited
1. Anderson, R. L. Practical Statistics for Analytical Chemists; Van Nostrand Reinhold: New York, 1987.
2. Bauer, E. L. A Statistical Manual for Chemists; Academic: New York, 1971.
3. Dean, R. B.; Dixon, W. J. Anal. Chem. 1951, 23, 636-638.
4. Youmans, H. L. Statistics for Chemistry; Merrill: Columbus, OH, 1973.
5. Beers, Y. Introduction to the Theory of Error; Addison-Wesley: Reading, MA, 1957.
6. Blaedel, W. J.; Meloche, V. W.; Ramsay, J. A. J. Chem. Educ. 1951, 28, 643-647.
7. Taylor, J. R. An Introduction to Error Analysis; University Science: Mill Valley, CA.

² For a copy of this data set, send a postage-paid self-addressed envelope and an IBM PC formatted 3.5- or 5.25-in. disk to the author.

Modern Industrial Spectroscopy

The 36th annual program in Modern Industrial Spectroscopy to be offered by Arizona State University, August 5-16, 1991, is designed for chemists and others from industrial laboratories that make use of spectrographic equipment (photographic, direct reading, and plasma). This intensive course of lectures and practical laboratory work serves to train personnel to staff these installations. The program includes basic theoretical considerations, hands-on instrumental training, and the interpretation of results. Four hours of lecture each morning will serve to present the theory, instrumentation, and applications of a variety of optical emission techniques including arc, spark, and plasma. Each student will spend every afternoon working in the laboratory under the direct guidance and supervision of experienced technical personnel. The instructional staff includes members of the Department of Chemistry at Arizona State University augmented by guest lecturers from industrial laboratories. Enrollment in the course is limited, and sufficient equipment is available to insure each student adequate time for personal operation of the instruments. The cost for the program is $1400. For complete information, including descriptive brochure, please write Jacob Fuchs, Director, Modern Industrial Spectroscopy, Department of Chemistry, Arizona State University, Tempe, AZ 85287-1604.

Volume 68 Number 4

April 1991

311