Statistical Evaluation of Class Data for Two Buret Readings

Adon A. Gordus
The University of Michigan, Ann Arbor, MI 48109

In recent years, we have combined instruction in reading the meniscus levels in sealed burets with statistical evaluation of the resultant class data. Two portions of a discarded buret, each about 10 cm in length, were sealed at one end, more than half filled with distilled water, and then sealed at the top end (1). These "burets" were mounted on a ring stand using a buret holder, and cards with red markings were attached behind the burets to assist in reading the meniscus (2). Freshman honors students in a quantitative analysis course were individually shown how to use the meniscus illuminator and were asked to read each of the burets by estimating the volume to 1/100 mL. The individual values as well as the difference in the two readings were recorded for each student. These data serve to illustrate both random and systematic errors in the measurements and are characteristic of data obtained in titration measurements, where the volume of titrant is determined by the difference in two buret readings.

The data from two classes of students are shown in the figure; the means and standard deviations are given in the table. The histogram distributions of these data are approximately Gaussian in shape and illustrate typical random error. The standard deviations for the individual buret readings are each 0.022 mL, although values from other classes of students have ranged from 0.017 to 0.027 mL.

It is possible, however, to illustrate a number of additional statistical functions besides the simple arithmetic mean and standard deviation. Propagation-of-error calculations can be performed, based on the data for burets A and B, and compared with the data for the differences in the buret readings to show the presence of systematic errors.
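The propagation-of-error comparison can be sketched in a few lines of Python. This is only an illustration using the standard deviations reported for these classes (0.022 mL for each buret, 0.023 mL for the differences); the function name is an assumption introduced here, and the formula assumes the two readings carry independent random errors.

```python
import math

def predicted_sd_of_difference(sd_a: float, sd_b: float) -> float:
    """Standard deviation expected for A - B when the errors in A and B are
    independent: the square root of the sum of the squared deviations."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2)

# Standard deviations reported for the class data (mL).
predicted = predicted_sd_of_difference(0.022, 0.022)
observed = 0.023

print(f"predicted: {predicted:.3f} mL, observed: {observed:.3f} mL")
```

An observed value well below the predicted one indicates that part of the scatter in the individual readings is shared between the two readings and cancels in the difference.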
If the scatter in the difference data were a direct reflection of the scatter (uncertainties) in the individual buret readings, then, by propagation of error (3), the predicted standard deviation for the difference data would be the square root of the sum of the squares of the individual standard deviations: $(0.022^2 + 0.022^2)^{1/2} = 0.031$ mL. The standard deviation of the student data, 0.023 mL, is clearly much smaller, and this can be easily explained as due to systematic errors.

Students learning to use a buret (or any new apparatus) often introduce systematic error in the measurements. Some students will tend to view both burets from an angle above the meniscus (or below the meniscus) rather than level with the meniscus. Thus, both readings will be high (or low). However, the error and increased scatter caused by these viewing errors will tend to cancel when the differences in two such readings are considered.

In addition, there is the human tendency to estimate last digits preferentially as zero or five, not realizing that all 10 digits are equally probable (4). This is seen in the data for buret A, where readings of 44.05 and 44.10 are enhanced and readings of 44.06 and 44.11 are less than expected. A similar, although not as pronounced, skewness in data occurs for
4. To be precise, infinite-sized data samples should be used so that σ standard deviation values are obtained. However, very little error is introduced when sample sizes are greater than about 50-100.
Journal of Chemical Education
Histogram distribution of readings of buret A and Gaussian curve fitted to the same mean and standard deviation: 44.092 ± 0.022 mL.
Histogram distribution of readings of buret B and Gaussian curve fitted to the same mean and standard deviation: 13.733 ± 0.022 mL.
Histogram distribution of difference in individual readings, buret A - buret B, and Gaussian curve fitted to the same mean and standard deviation: 30.360 ± 0.023 mL.
Statistical Data for Buret Readings

                                  Buret A    Buret B    A - B
Number of measurements, N         161        161        161
Arithmetic mean, x̄ (mL)           44.092     13.733     30.360
Standard deviation, σ (mL)        0.022      0.022      0.023
Chi-square statistic, χ²          31.7       8.2        3.9
Chi-square probability, P(χ²)     99.99%     58.6%      13.4%
buret B, centered around the readings 13.70 and 13.75. These biases in the individual buret readings become minimized in the difference data and, therefore, also contribute to making the standard deviation of the difference data less than the predicted propagation-of-error value.

These data also allow construction of Gaussian curves fitted to the individual means and standard deviations and allow chi-square statistical comparisons of the fitted curves with the histogram data to determine if the differences are statistically significant. The Gaussian curves were calculated in the usual way (5). For example, there were 12 student readings of 44.11 mL (buret A); this value spans the interval 44.105-44.115. In units of standard deviations relative to the mean, this interval represents the range (44.105 - 44.092)/0.022 = 0.591 to (44.115 - 44.092)/0.022 = 1.046. From a table of the area of the Gaussian (normal) curve, it is found that 0.1295 of the total area is between 0.591 and 1.046 standard deviations. Thus (0.1295)(161) = 20.85 data readings would occur at 44.11 mL if the distribution of the data were perfectly Gaussian. Similar calculations were made for all other readings and resulted in the Gaussian curves shown in the figure, each having the mean and standard deviation of the corresponding histogram.

The chi-square statistic allows determining the significance of the difference between the observed histogram data and the theoretical Gaussian distribution (5, 6). Since $\chi^2 = \sum [(F - f)^2/F]$, where F is the theoretical frequency and f is the observed frequency, we need only make use of the data already calculated for the fitted Gaussian curve. The 44.11 mL data of buret A, for example, result in a $\chi^2$ contribution of $(20.85 - 12)^2/20.85 = 3.76$. Statisticians advise that the data for the tails of the histogram be grouped together in calculating the $\chi^2$ statistic so as not to contribute too heavily to the overall $\chi^2$ value. Accordingly, the data for the figure were divided into nine increments, of which seven were 0.01 mL increments in the center of the diagram and the remaining two were the tails. For buret A, this represents the groupings: ≤44.05, the seven readings between 44.06 and 44.12, and ≥44.13.

The differences between two data sets are usually considered statistically significant if there are fewer than five chances in 100 (the 95.0% confidence level) that such a difference would arise by chance. However, for the Gaussian comparisons of the type considered here, the 99.0% or even 99.9% confidence level is more appropriate (6). As is seen in the table, the data for buret A result in a $\chi^2$ value that corresponds to the 99.99% confidence level and, therefore, show a statistically significant difference. The other two values are considerably less: 58.6% and 13.4%.

The statistically significant difference between the histogram and the Gaussian curve for the data for buret A is primarily a result of an excessive number of last-digit estimates of zeros and fives. Excess zeros and fives are less of a problem in the data for buret B, mainly because the average value is centered at an intermediate value, 13.73, and the normal population data for the 13.70 and 13.75 readings of buret B are much greater than the normal population data for the 44.05 and 44.15 readings of buret A.
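The expected-frequency and chi-square steps described above can be sketched in Python using only the standard library. This is a minimal illustration, not the procedure actually used for the figure: the function names are assumptions introduced here, the normal-curve area comes from `math.erf` rather than a printed table, and the 8 degrees of freedom (nine groups minus one) is an assumption, since the article does not state the value, although it does reproduce the tabulated percentages.

```python
import math

def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_count(lower: float, upper: float, mean: float, sd: float, n: int) -> float:
    """Number of readings a perfectly Gaussian sample of size n would place
    in the interval [lower, upper]."""
    z_lo = (lower - mean) / sd
    z_hi = (upper - mean) / sd
    return n * (normal_cdf(z_hi) - normal_cdf(z_lo))

def chi_square_term(expected: float, observed: float) -> float:
    """Contribution (F - f)^2 / F of one histogram group to chi-square."""
    return (expected - observed) ** 2 / expected

def chi_square_cdf_even_df(x: float, df: int) -> float:
    """Cumulative chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-half) * total

# Buret A, reading 44.11 mL: interval 44.105-44.115, 12 observed readings.
F = expected_count(44.105, 44.115, mean=44.092, sd=0.022, n=161)
term = chi_square_term(F, observed=12)
print(f"expected count: {F:.2f}, chi-square contribution: {term:.2f}")

# Tabulated chi-square values, with the ASSUMED 8 degrees of freedom.
for label, chi2 in [("Buret A", 31.7), ("Buret B", 8.2), ("A - B", 3.9)]:
    print(f"{label}: P = {100 * chi_square_cdf_even_df(chi2, 8):.2f}%")
```

Because the area is computed directly instead of being read from a rounded table, the expected count and its chi-square contribution can differ from the quoted 20.85 and 3.76 in the last digit or two.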
Another statistical comparison can also be made. For instance, N = 103 freshman chemistry students had a distribution of $\bar{x} \pm s_x = 9.561 \pm 0.023$ in reading another sealed buret, whereas M = 103 honors freshmen had a distribution of $\bar{y} \pm s_y = 9.570 \pm 0.018$ in reading the same buret. The comparison statistic (7) is $t = [(\bar{x} - \bar{y})/s_p][MN/(M + N)]^{1/2}$, where $s_p$ is the "pooled standard deviation" and $s_p^2 = [s_x^2(N - 1) + s_y^2(M - 1)]/(N + M - 2)$. For these data, $s_p = 0.0205$ and t = 2.1. For $\nu = 206 - 2 = 204$ degrees of freedom, the t value for the 95.0% confidence level is 1.99. Therefore, these two sets of data (just barely) show a statistically significant difference, since the calculated t value exceeds the table value. In this particular case, the increased scatter in the data for the non-honors freshmen is a result primarily of a larger number of zero and five last-digit readings.

This is only one of the comparisons that could be made. Others could involve juniors vs. freshmen, engineering vs. science students, tall students vs. short students (to evaluate viewing angles), data from the beginning of the course vs. data from the end of the course (to evaluate the effect of experience), etc.
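A minimal Python sketch of the pooled-standard-deviation t comparison follows. The function names are assumptions introduced here, and the inputs are the rounded summary statistics quoted above rather than the raw class readings. The absolute value of the difference in means is taken so that t is reported as a positive number.

```python
import math

def pooled_sd(sx: float, n: int, sy: float, m: int) -> float:
    """Pooled standard deviation of two samples with sizes n and m."""
    return math.sqrt((sx ** 2 * (n - 1) + sy ** 2 * (m - 1)) / (n + m - 2))

def two_sample_t(mean_x: float, sx: float, n: int,
                 mean_y: float, sy: float, m: int) -> float:
    """t = (|x - y| / s_p) * sqrt(M N / (M + N)), with n + m - 2 degrees of freedom."""
    sp = pooled_sd(sx, n, sy, m)
    return abs(mean_x - mean_y) / sp * math.sqrt(m * n / (m + n))

# Rounded summary values quoted in the text (mL).
sp = pooled_sd(0.023, 103, 0.018, 103)
t = two_sample_t(9.561, 0.023, 103, 9.570, 0.018, 103)
t_table = 1.99  # 95.0% confidence level value quoted in the text

print(f"s_p = {sp:.4f}, t = {t:.2f}, significant: {t > t_table}")
```

Run on these rounded values, the sketch gives a pooled standard deviation near 0.021 and a t value above the 1.99 table value, so it reaches the same verdict. It need not reproduce the quoted s_p and t exactly, since those were presumably computed from the unrounded class data.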
Literature Cited
1. Harris, W. E.; Kratochvil, B. Teaching Introductory Analytical Chemistry; Saunders: Philadelphia, 1913; p 26.
2. Harris, W. E.; Kratochvil, B. Chemical Separations and Measurements: Background and Procedures for Modern Analysis; Saunders: Philadelphia, 1974; p 22.
3. Mandel, J. In Treatise on Analytical Chemistry, 2nd ed.; Kolthoff, I. M.; Elving, P. J., Eds.; Wiley: New York, 1978; Vol. 1, Part I, Chapter 5, p 279.
4. Laitinen, H. A. Chemical Analysis; McGraw-Hill: New York, 1960; p 554.
5. Nelson, L. S. J. Chem. Educ. 1956, 33, 126.
Hunt: Dubuque, IA, 1983; p 233.
Volume 64 Number 4 April 1987