Population versus Sampling Statistics: A Spreadsheet Exercise

Population versus Sampling Statistics: A Spreadsheet Exercise. Ken Overway. Department of Chemistry, Bridgewater College, Bridgewater, VA 22812. J. Ch...
On the Web
JCE WebWare: Web-Based Learning Aids

edited by
William F. Coleman, Wellesley College, Wellesley, MA 02481
Edward W. Fedosky, University of Wisconsin–Madison, Madison, WI 53715

Population versus Sampling Statistics: A Spreadsheet Exercise
by Ken Overway, Bridgewater College, Bridgewater, VA 22812

Keywords: Audience: Upper-Division Undergraduate. Domain: Analytical Chemistry. Pedagogy: Computer-Based Learning. Topics: Mathematics/Symbolic Mathematics, Quantitative Analysis.
Requires Microsoft Excel

When scientists draw random samples to be measured, they expect their results to be accurate, assuming all systematic errors have been removed from the experiment. Unlike systematic errors, random errors cannot be removed from the experiment, only reduced. If several replicates are measured for each sample, random errors are mathematically minimized and are relegated to affecting precision, not accuracy. For some students, the difference between accuracy and precision is not clear enough for this to make sense.

The solution is for students to interact with the statistics, which requires the laborious generation of multiple sets of random numbers, numerical comparisons, and graphical presentations of the data. The purpose of the spreadsheet exercise presented here is to remove the hurdle of constructing and generating such an interaction. The spreadsheet provides students with a self-led exercise that reinforces the statistics of sample and population distributions.

The sample distribution is the link between the measured random samples and the hidden probability that governs the process. For any given homogeneous sample, the true value of the target parameter is measurable; the analytical chemist's task is to determine it accurately in the most cost-effective way, i.e., with the fewest samples. Further, when experimental results do not agree with the theoretical model or with a quality control sample, chemists need a nonarbitrary way of determining whether the mismatch is an artifact of low precision or whether there is indeed a systematic difference between the two samples, methods, or instruments. This is done through hypothesis testing, which involves the sample standard deviation, a Student's t value, and the number of replicates, N, measured according to the equation

$$ t = \frac{\bar{x} - \mu}{s / \sqrt{N}} $$

where $\bar{x}$ is the sample mean, $\mu$ is the expected value, and $s$ is the sample standard deviation.

This exercise allows students to set up a numerical example in which the parameters of the population distribution are defined, and then to measure the proximity of the sample distribution to the population distribution via a statistical analysis. Because no systematic error is present by design, the two distributions come from the same source, yet they often do not appear to. By interacting directly with the data, numerically and visually through graphs, students can see how the number of replicates profoundly shapes the conclusion of the experiment.

Supporting JCE Online Material

http://www.jce.divched.org/Journal/Issues/2008/May/abs749.html
Full text (HTML and PDF) with links to cited URLs
Supplement: Find “Population versus Sampling Statistics: A Spreadsheet Exercise” in the JCE Digital Library at http://www.JCE.DivCHED.org/JCEDLib/WebWare/collection/reviewed/JCE2008p0749WW/index.html.

Figure 1. Screen shot for the main page of the spreadsheet.
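
The spreadsheet itself is not reproduced here. As a rough stand-in, the kind of experiment the exercise describes can be sketched in Python: define a population, draw many sets of N replicates from it, and apply the one-sample t test to each set. This is only an illustrative sketch, not the author's spreadsheet. The function names, the population parameters (`mu`, `sigma`), the number of trials, and the seed are all assumptions chosen for the example; the critical values are standard two-tailed 95% Student's t table entries.

```python
import math
import random
import statistics

# Two-tailed 95% critical values of Student's t, keyed by replicate count N
# (degrees of freedom = N - 1). Standard table values; the choice of N is arbitrary.
T_CRIT_95 = {3: 4.303, 5: 2.776, 10: 2.262, 30: 2.045}


def t_statistic(sample, mu):
    """Return t = (mean - mu) / (s / sqrt(N)) for one set of replicates."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (N - 1 denominator)
    return (mean - mu) * math.sqrt(n) / s


def simulate(n, mu=10.0, sigma=1.0, trials=2000, seed=0):
    """Draw `trials` sets of n replicates from a normal population and summarize.

    mu, sigma, trials, and seed are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    means = []
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        # No systematic error exists by construction, so every rejection
        # here is a false alarm caused by random error alone.
        if abs(t_statistic(sample, mu)) > T_CRIT_95[n]:
            rejections += 1
    return {
        "spread_of_means": statistics.stdev(means),
        "false_alarm_rate": rejections / trials,
    }


if __name__ == "__main__":
    for n in sorted(T_CRIT_95):
        result = simulate(n)
        print(n, round(result["spread_of_means"], 3),
              round(result["false_alarm_rate"], 3))
```

Under these assumptions, the spread of the sample means should shrink as N grows, while the fraction of sets flagged as different from the population should stay near 5% at every N, since the t test already accounts for the number of replicates.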

© Division of Chemical Education • www.JCE.DivCHED.org • Vol. 85 No. 5 May 2008 • Journal of Chemical Education 749