Analysis of Variance in Analytical Chemistry - ACS Publications

Roland F. Hirsch

Report

Chemistry Department Seton Hall University South Orange, N.J. 07079

Analytical chemists usually work with techniques and methods that contain many sources of error. If the total variability in a particular case is higher than desired, then the significant sources of error must be identified and controlled. Analysis of variance is a statistical technique for estimating the importance of one or more factors suspected of contributing significantly to the total uncertainty in a given situation.

Although analysis of variance (anova) is a well-established technique [nearly half of Youden's 1951 text (1) is devoted to it], it seems to be used quite rarely by chemists. This article is intended to encourage more regular application of anova by showing how it works using specific cases in analytical chemistry. The actual computations will not be described since they are presented in many statistics texts, several of which are listed at the end of this article. Rather, the emphasis will be on choice of the proper model for the situation, interpretation of the anova, and the advantages and limitations of the technique.

In anova the primary purpose is to test the hypothesis that a factor does not contribute added variability to a set of data beyond that caused by all other factors (the residual error). If this hypothesis is found invalid, then the size of the contribution from this factor can be estimated and appropriate steps can be taken in succeeding experiments to keep it under control.

The usual procedure for anova involves identifying the factor(s) to be studied, designing and carrying out experiments in which data are collected for at least two levels of each factor, apportioning the variance of the entire set of data among the sources of variation, and interpreting the results of these computations.

Single-Factor Analysis of Variance

How anova works can best be understood through an example. In the development of a rapid, fully automated atomic absorption analysis system using the Delves Cup technique of sample introduction (2), it was realized that the potentially high precision of automated sampling and data acquisition could not be attained if one could not count on obtaining reproducible results from one sample cup to the next. A test was designed to estimate the contribution of between-cup variability to the overall precision. Ten cups were selected at random, and aliquots of a standard solution of lead were run until nine replications had been made with each cup. The signal observed consisted of an absorption peak caused by the lead in the sample being volatilized into the light beam. The raw data and anova table are shown in Table I.

Each of the items in the anova table will now be described. The sources of variation are the specific factor or factors being studied (here the cups) and all other sources, pooled and called the residual (or replication or measurement) error. The degrees of freedom (df) are, as customary, one less than the number of groups, here $10 - 1 = 9$, and the sum of (one less than the number of data within each group), here $8 \times 10 = 80$. The total sum of squares (SS) is the total of the squared deviations of the individual values from the grand mean, $\sum_i \sum_j (Y_{ij} - \bar{Y})^2$. The $SS_{\text{among cups}}$ compares the mean value for each cup with the grand mean, $\sum_i \sum_j (\bar{Y}_i - \bar{Y})^2$, and $SS_{\text{residual}}$ compares the individual readings for a cup with the mean value for that cup, $\sum_i \sum_j (Y_{ij} - \bar{Y}_i)^2$. The mean squares (MS) are the ratios of sums of squares to degrees of freedom. The experimental F ratio ($F_s$) is $MS_{\text{among cups}} / MS_{\text{residual}}$. Finally, the expected values of the mean squares [E(MS)] are given in the last column, with n

Table I. Automated Delves Cup Determination of Lead by Use of Peak Areas (2)

Cup no.: 1 2 3 4 5 6 7 8 9 10

3.104 3.055 2.908 3.053 2.893 2.864 2.919
3.126 2.823 2.758 2.809 2.667
3.084 2.953 2.896 2.811 2.915 2.896
2.760 2.992
3.196
3.120 3.077 2.926 2.944
2.886 2.794 2.719 2.677
3.060 2.983 2.940 2.782 2.844 2.888 3.010 2.831 2.823 2.857 2.843 2.919 2.974 3.019 3.031 2.952
2.785 2.902 2.958 2.935 2.825 2.863 2.893 2.890
3.031 2.752 2.899 2.889 2.928 3.034 2.770 2.846 2.880 2.850
2.982 3.252 3.110 2.937 2.933 2.933 2.909 2.944 2.984 2.781 2.943 3.073 2.736 2.944 2.836 2.859
2.855
3.099 3.016 3.020 2.972 3.036 2.880 3.013 2.901 2.975 2.971

Anova table

Source of variation    df    SS          MS           Fs          E(MS)
Among cups              9    0.192227    0.0213585    1.827 ns    $\sigma^2 + 9$
Residual error         80    0.935060
Total                  89
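The arithmetic behind the anova table can be retraced from the printed sums of squares and degrees of freedom. The following Python sketch is illustrative only and is not taken from the article: the function names `one_way_anova` and `anova_from_sums` are hypothetical, the residual mean square is computed here rather than quoted from the table, and the final variance-component estimate assumes the usual random-effects model, in which E(MS among cups) = σ² + n·σ_A² with n = 9 replicates per cup (the symbol σ_A² is notation chosen for this sketch).

```python
from statistics import mean


def anova_from_sums(ss_among, df_among, ss_resid, df_resid):
    """Build the MS and F entries of a single-factor anova table."""
    ms_among = ss_among / df_among   # mean square = SS / df
    ms_resid = ss_resid / df_resid
    return {
        "df": (df_among, df_resid),
        "SS": (ss_among, ss_resid),
        "MS": (ms_among, ms_resid),
        "F": ms_among / ms_resid,    # experimental F ratio
    }


def one_way_anova(groups):
    """Single-factor anova for a list of groups (each a list of readings)."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    # Among-groups SS: each group mean is compared with the grand mean,
    # counted once for every reading in that group.
    ss_among = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual SS: each reading is compared with its own group mean.
    ss_resid = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_among = len(groups) - 1
    df_resid = sum(len(g) - 1 for g in groups)
    return anova_from_sums(ss_among, df_among, ss_resid, df_resid)


# Sums of squares and degrees of freedom as printed in the anova table.
table = anova_from_sums(0.192227, 9, 0.935060, 80)
print(f"MS among cups = {table['MS'][0]:.7f}")
print(f"MS residual   = {table['MS'][1]:.7f}")
print(f"F             = {table['F']:.3f}")

# Assumption (not from the article): random-effects model with n replicates,
# so the among-cups variance component is (MS_among - MS_resid) / n.
n = 9
var_among = max(0.0, (table["MS"][0] - table["MS"][1]) / n)
print(f"estimated among-cups variance component = {var_among:.6f}")
```

With the printed sums of squares, the F ratio comes out at about 1.827, in agreement with the table. `one_way_anova` takes the raw readings grouped by cup, which is how the sums of squares themselves would be obtained.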