
Anal. Chem. 1980, 52, 2310-2315


RECEIVED for review July 25, 1980. Accepted September 16, 1980. This work was supported by the National Science Foundation under Grant No. CHE77-04405 and through the Cornell Materials Science Center and by the Office of Naval Research.

Nonconstant Variance Regression Techniques for Calibration-Curve-Based Analysis

John S. Garden, Douglas G. Mitchell, and Wayne N. Mills

Division of Laboratories and Research, New York State Department of Health, Albany, New York 12201

Many calibration curves are calculated by a least-squares curve-of-best-fit procedure, which assumes that variance is independent of concentration. This assumption is not valid for many analytical methods, and inappropriate use of the least-squares procedure may degrade precision. Such loss of precision increases with increasing variability of variance. Error from this source can be reduced by using a weighted least-squares procedure. Equations are presented for calculating the regression equation, flagging potential outliers, and calculating confidence bands around predicted sample concentrations. For several analyses the weighted least-squares procedure yielded useful improvements in precision compared with the conventional least-squares calibration.

With the widespread introduction of computers, manual procedures to describe calibration curves of best fit have been rapidly replaced by least-squares computations. The conventional least-squares procedure works well with uniformly precise straight-line data. It is much less satisfactory when the assumptions on which the least-squares procedure is based are not valid. Specifically:

(a) The mathematical model is assumed to be appropriate to the data, but this is often untrue. For example, many calibration curves are straight lines at low concentrations and curve toward the concentration axis at higher levels. Simple first and higher order regression equations often do not satisfactorily fit these types of data. The problem can be solved by limiting the dynamic range to straight-line concentrations, by using a multiple-curve technique (1), by manual plotting, or by developing an appropriate mathematical model (2).

(b) Variance is assumed constant over the entire dynamic range. This assumption too is often incorrect. Replicate measurements at various standard concentrations often show increasing range with increasing concentration. Nonconstant variance is implied by the fact that for many analytical methods, relative standard deviations are reasonably constant over a considerable dynamic range.

In these cases direct application of the conventional least-squares technique can produce gross errors. For example, suppose that the true calibration equation is signal (Y) = concentration (X) and that the variance of measured signals is proportional to the concentration of analyte. A single calibration run may result in the signals listed in Table I, all of which are within 2 standard deviations of their true values. A

Table I. Comparison of Conventional and Weighted Least-Squares Procedures with Hypothetical Data(a)

concn | true signal (mean ± s) | hypothetical signal | rel error of predicted concn,(b) %: conventional | weighted
1     | 1 ± 0.80               | 1                   | 124                                              | 11
3     | 3 ± 1.38               | 3                   | 35                                               | 0
6     | 6 ± 1.96               | 5                   | 13                                               | 4
10    | 10 ± 2.54              | 11                  | 3                                                | 4
30    | 30 ± 4.38              | 34                  | 5                                                | 5
60    | 60 ± 6.20              | 54                  | 7                                                | 5
100   | 100 ± 8.00             | 92                  | 8                                                | 5

(a) True relationships: signal = concentration; variance ∝ concentration. Computed relationships: (conventional least squares) signal = 1.22 + (0.91 × concentration); (weighted least squares) signal = 0.15 + (0.95 × concentration). s = standard deviation. (b) [predicted value − true value] × 100/true value.

conventional least-squares computation results in the equation Y = 1.22 + 0.91X, whereas the weighted least-squares procedure described in this paper produces the equation Y = 0.15 + 0.95X. The latter is much closer to the true equation (Y = X), and it is much more appropriate at low concentrations. For example, a sample yielding a signal of 1 has a predicted concentration of −0.24 and a relative error of 124% by conventional least squares, and of 0.89 and 11%, respectively, by weighted least-squares computations.

It is reasonable to expect that much analytical data will not show constant variance, nor would we expect variance to be a simple function of concentration. Random error is caused by noise, and noise sources may be a function of signal or concentration or other factors. Shot noise, arising from photomultiplier detectors, is proportional to the square root of the signal. Noise due to fluctuations in the output of the light source is directly proportional to the signal. Noise produced by the instrument electronics may exhibit constant variance, or variance may change with signal in various ways, depending on the nature of the circuit. Noise arising from sample cell turbulence or flame noise is proportional to the concentration of the analyte. Many analytical methods have additional noise sources of their own. Noise introduced at the read-out stage, e.g., from the recorder deadband or the width of the tracing, is commonly small but need not be insignificant. If a recorder or other analog device is used, noise from this source is constant within each range, but between

0003-2700/80/0352-2310$01.00/0 © 1980 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 52, NO. 14, DECEMBER 1980

Table II. Algebraic Equations for First-Order Regression Calculations(a)

parameter | variable-variance data | constant-variance data
model | $Y = b_0 + b_1 X + \varepsilon$; $\varepsilon \sim N(0, \sigma^2/w)$ | $Y = b_0 + b_1 X + \varepsilon$; $\varepsilon \sim N(0, \sigma^2)$
mean, $\bar{X}$ | $\sum wX / \sum w$ | $\sum X / n$
intercept, $b_0$ | $(\sum wY \sum wX^2 - \sum wX \sum wXY)/[\sum w \sum wX^2 - (\sum wX)^2]$ | $(\sum Y \sum X^2 - \sum X \sum XY)/[n \sum X^2 - (\sum X)^2]$
slope, $b_1$ | $(\sum w \sum wXY - \sum wX \sum wY)/[\sum w \sum wX^2 - (\sum wX)^2]$ | $(n \sum XY - \sum X \sum Y)/[n \sum X^2 - (\sum X)^2]$
signal band, $\Delta Y$ | see eq 5 | see eq 5
regression band | $(2F)^{1/2} s \left[ \dfrac{1}{\sum w} + \dfrac{(X_0 - \bar{X})^2}{\sum w_i (X_i - \bar{X})^2} \right]^{1/2}$ | $(2F)^{1/2} s \left[ \dfrac{1}{n} + \dfrac{(X_0 - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]^{1/2}$

(a) Weights: $w_i = 1/s_i^2$; $s_i$ = standard deviation at concentration $i$. Standard error of estimate: $s^2 = SS_{\text{residual}}/(n - 2)$. Sum of squares of residuals: $\sum wY^2 - b_0 \sum wY - b_1 \sum wXY$ (variable variance); $\sum Y^2 - b_0 \sum Y - b_1 \sum XY$ (constant variance). Band around predicted concentration: substitute Y values (mean signal ± $\Delta Y$) in $Y = b_0 + b_1 X \pm$ band around regression. Solve for $X_0$.

ranges it varies directly with the full-scale reading. If the readout is digital, the noise is half of the least-significant digit or bit; as with analog devices it is constant within ranges. When signals are converted to absorbances, the relationship between noise and concentration becomes even more complicated.

There has been recognition in the chemical literature that variance is usually not constant in chemical analysis (3-6). Franke et al. (3) have employed a standard deviation function directly proportional to concentration, and Smith and Mathews (4) assumed a standard deviation directly proportional to signal. Although either of these is often an improvement over constant variance, standard deviation as a function of concentration or of signal rarely passes through the origin and is often not even a straight line. Schwartz (5) described procedures for calculating regression equations and confidence limits by using data with nonconstant variance. He concluded that an analyst who ignores nonconstant variance will not sacrifice much statistical reliability in the midrange of the curve. However, confidence limits near the extremities of the curve are likely to be severely in error.

The solution to this problem is to use a weighted least-squares procedure. In this paper, we present (i) criteria for selecting use of weighted least-squares procedures, (ii) procedures for calculating weighted regression equations and confidence intervals around predicted concentrations, and (iii) results for several practical applications of weighted least-squares procedures.
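The hypothetical example of Table I can be checked numerically. The sketch below is not the authors' program: it fits the seven hypothetical signals with equal weights and with weights 1/concentration (the stated assumption that variance is proportional to concentration), then inverts each line for a sample signal of 1. The function names `fit_line` and `predicted_conc` are illustrative.

```python
# Hypothetical calibration run from Table I (true relationship: signal = concentration).
conc = [1.0, 3.0, 6.0, 10.0, 30.0, 60.0, 100.0]
signal = [1.0, 3.0, 5.0, 11.0, 34.0, 54.0, 92.0]


def fit_line(x, y, w):
    """First-order weighted least squares; returns (intercept, slope)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy * swxx - swx * swxy) / denom
    return intercept, slope


def predicted_conc(y, intercept, slope):
    """Invert the calibration line to estimate concentration from a signal."""
    return (y - intercept) / slope


# Conventional fit: every point carries the same weight.
b0_conv, b1_conv = fit_line(conc, signal, [1.0] * len(conc))
# Weighted fit: variance proportional to concentration, so w = 1/concentration.
b0_wtd, b1_wtd = fit_line(conc, signal, [1.0 / c for c in conc])

for label, b0, b1 in (("conventional", b0_conv, b1_conv),
                      ("weighted", b0_wtd, b1_wtd)):
    x_hat = predicted_conc(1.0, b0, b1)
    print(f"{label}: Y = {b0:.2f} + {b1:.2f}X; signal 1 -> concn {x_hat:.2f}")
```

The two fitted lines agree with the equations quoted in the text to the two decimals given there, and the predicted concentrations for a signal of 1 agree with the quoted −0.24 and 0.89.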

CRITERIA FOR SELECTING WEIGHTED LEAST SQUARES

Before introducing an analytical method into routine analysis, the analyst should carry out replicate analyses of standards covering the concentration range of interest. Variations in variance with concentration can be most readily observed by plotting residuals against concentration. Figure 1 shows such a plot for the determination of lead in blood (7). A conventional least-squares curve of best fit is calculated and used to calculate residuals $(Y_i - \hat{Y}_i)$, where $Y_i$ = signal at concentration $X_i$, and $\hat{Y}_i$ is the signal at $X_i$ predicted from the regression equation. Results shown in Figure 1 are typical for many analytical methods; residual range and standard deviation increase with increasing concentration. This suggests that precision will be improved by using a weighted least-squares procedure.

Note also that the same plot can be used to detect an inappropriate model. For example, the means of the residuals shown in Figure 1 are −0.75, 0.34, 0.51, and −0.44 at concentrations of 0, 20, 65, and 105, respectively. This "curved" pattern suggests that a nonlinear model would be more appropriate to these data.

Before proceeding with use of weighted least squares, the analyst should consider the possibility that large variations in variance could be due to nonoptimum instrument operating

Figure 1. Residuals plotted against concentration for the determination of lead in whole blood. (Abscissa: blood lead concentration, μg Pb/dL; the signal standard deviation at each concentration is given in parentheses.)

conditions. Correction of problems such as a noisy lamp in fluorometry will usually yield greater improvements than better statistical techniques.

A weighted least-squares curve-of-best-fit computation requires estimates of the ratio of standard deviations at each concentration plus the acquisition of appropriate computer programs. This will increase analysis costs and will only be worthwhile if the additional data quality is required. Specifically, the following conditions would favor use of a weighted least-squares procedure: (1) The precision of the analytical measurement is a limiting factor in overall data quality. This is likely to be true for the analysis of major components in well-defined, stable samples. (2) Overall data quality is barely adequate for the application. (3) Standard deviations vary considerably with concentration. (4) Unit measurement costs are low, as with many automated instruments. (5) Accurate data are required near the lower extremity of the calibration curve.
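The residual check described in this section can be sketched as follows. The replicate signals are invented for illustration (they are not the blood lead data of Figure 1), and the helper names `ols` and `residual_summary` are hypothetical. The code fits a conventional least-squares line and reports the mean and standard deviation of the residuals at each standard concentration; a standard deviation that grows with concentration points toward weighting, and residual means that change sign in a curved pattern point toward an inadequate model.

```python
from statistics import mean, stdev

# Hypothetical replicate signals at four standard concentrations (illustrative only).
replicates = {
    0.0: [0.2, -0.1, 0.4, 0.1],
    20.0: [10.8, 9.6, 11.5, 10.1],
    65.0: [33.9, 30.2, 35.8, 32.1],
    105.0: [55.9, 49.0, 58.3, 51.2],
}


def ols(x, y):
    """Conventional (unweighted) first-order least squares; returns (b0, b1)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b0 = (sy - b1 * sx) / n
    return b0, b1


def residual_summary(data, b0, b1):
    """Mean and standard deviation of the residuals at each concentration."""
    summary = {}
    for c, ys in data.items():
        res = [v - (b0 + b1 * c) for v in ys]
        summary[c] = (mean(res), stdev(res))
    return summary


# Flatten the replicates into paired lists for the fit.
x = [c for c, ys in replicates.items() for _ in ys]
y = [v for ys in replicates.values() for v in ys]
b0, b1 = ols(x, y)
summary = residual_summary(replicates, b0, b1)

for c, (m, s) in summary.items():
    print(f"concn {c:6.1f}: mean residual {m:+.2f}, residual sd {s:.2f}")
```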

CALCULATION OF REGRESSION EQUATIONS AND CONFIDENCE INTERVALS

In this section, procedures for calculating first-order regression equations and confidence intervals are presented. Second and higher order equations require use of the matrix algebra presented in the Appendix. First-order equations are summarized in Table II.

Least-Squares Curve of Best Fit. Standards of concentration $X_1, X_2, \ldots, X_n$ are analyzed, producing signals $Y_1, Y_2, \ldots, Y_n$. A linear equation with two parameters having (unknown) true values $\beta_0$ and $\beta_1$ will be fitted. With this first-order model, the set of measurements can be expressed as n equations of the form

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \tag{1}$$

where $\varepsilon_i$ is the residual, i.e., the difference between the measured and predicted values of Y at concentration i. (This and other equations developed in this paper are summarized in Table II.)


The least-squares procedure obtains estimates of parameters $\beta_0$ and $\beta_1$ by minimizing the sum of squares of residuals, $\sum \varepsilon_i^2$. These estimates, $b_0$ and $b_1$, are unbiased; i.e., the expected value of residuals, $E(\varepsilon_i)$, is zero. For the most precise estimates of the $\beta$'s and to enable the use of t and F significance tests, the residuals must be normally distributed and have constant variance. If variance is not constant, this approach can still be used, but the data must first be weighted so that weighted residuals have a constant variance, $\sigma^2$, as follows. Let the variance of residuals equal $\sigma_1^2$ at $X_1$, $\sigma_2^2$ at $X_2$, ..., $\sigma_n^2$ at $X_n$; i.e., $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$. Weight each set of replicate measurements with a weighting factor, $w_i^{1/2}$, where $w_i = \sigma^2/\sigma_i^2$. (To simplify later equations, $w_i^{1/2}$ is used rather than $w_i$.) Thus the first-order weighted model is now

$$w_i^{1/2} Y_i = w_i^{1/2} \beta_0 + w_i^{1/2} \beta_1 X_i + w_i^{1/2} \varepsilon_i \tag{2}$$

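The variance of the weighted residuals $w_i^{1/2} \varepsilon_i$ can be shown to be constant by using the formula $\mathrm{Var}(aY) = a^2\,\mathrm{Var}(Y)$, where a is a constant. Thus

$$\mathrm{Var}(w_i^{1/2} \varepsilon_i) = w_i\,\mathrm{Var}(\varepsilon_i) = (\sigma^2/\sigma_i^2)\,\sigma_i^2 = \sigma^2$$

The weighted least-squares estimates of $\beta_0$ and $\beta_1$ are

$$b_0 = \frac{\sum wY \sum wX^2 - \sum wX \sum wXY}{\sum w \sum wX^2 - (\sum wX)^2} \qquad b_1 = \frac{\sum w \sum wXY - \sum wX \sum wY}{\sum w \sum wX^2 - (\sum wX)^2} \tag{3}$$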

Alternatively, the following formulas are more convenient for computing purposes:

$$b_1 = \frac{\sum w (X - \bar{X})(Y - \bar{Y})}{\sum w (X - \bar{X})^2} \qquad b_0 = \bar{Y} - b_1 \bar{X} \tag{4}$$

where $\bar{X} = \sum wX / \sum w$ is the weighted mean value of the standard concentrations and $\bar{Y} = \sum wY / \sum w$ is the corresponding weighted mean signal.

Determination of Weighting Factors. The weighting factor, w, at each concentration is inversely proportional to the variance at that concentration. It is only necessary to determine the ratio of the variances at the different concentrations (and to assume that this ratio remains constant during subsequent analyses). Incorrect estimates of variances will not degrade precision providing their ratios are correct. If the sources of noise in the analytical method are well-known, it may be possible to predict variations in variance with concentration. However, in most cases, this information must be obtained experimentally. At least ten replicate measurements should be made at each standard concentration and used to calculate standard deviations. These will usually increase irregularly with increasing concentration and can be smoothed by using a least-squares procedure. The logarithm of these standard deviations should be approximately normally distributed, so a log (standard deviation) vs. concentration curve is obtained by using a second-order (unweighted) least-squares procedure. The weighting factors $w_i^{1/2}$ are the reciprocals of the smoothed standard deviation at each concentration.

Flagging Potential Outliers. The weighted residuals, $w_i^{1/2} \varepsilon_i$, should be normally distributed with constant variance and hence can be used to test for outlying measurements. For example, a calibration point whose weighted residual exceeds, say, 3 standard deviations (or any other selected confidence interval) can be flagged as a potential outlier. Alternatively, weighted residuals can be plotted against concentration. Potential outliers are identified by inspection after considering the number of measurements available. The decision to retain or reject such points should be based on the analyst's knowledge of the analytical method. Note that potential outliers cannot be identified by a self-consistent technique without a weighting procedure. If variance increases with increasing concentration, the analyst must set a more tolerant quality criterion for high concentration standards. The weighting procedure does this automatically.

Confidence Bands. Confidence bands around predicted sample concentrations are excellent measures of analytical precision. For a 90% confidence band, we can be 90% confident that the "true" value lies within the band, subject to the assumptions underlying the calculation. There are several types of confidence bands, from which a choice must be made according to the application. Since we are concerned with calibration from standards of known concentration, whose signals are measured, the independent variable for our purpose is concentration, and the dependent variable is signal. The regression equation gives signal as a function of concentration. We can use this equation in the forward direction to predict the signal which will be produced by some future sample of known concentration. Use of the regression equation in the reverse direction, that is, to estimate concentration given a value for signal, is technically called discrimination, but the word prediction is commonly used for this application also. We shall use it so in this paper. We wish to analyze an essentially unlimited number of unknown samples, although in practice the number will be restricted by such factors as instrumental drift, which can reduce the validity of a calibration curve over time. These limiting factors are not considered directly in the regression statistics.
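The weighted fit of eq 3 and the outlier flag described above can be sketched together. The data and weights are those of the three-standard demonstration in Table III (standard deviations of 0.1, 0.3, and 1.0); the function names and the default 3-standard-deviation threshold parameter `k` are illustrative choices, not part of the paper.

```python
import math

# Three standards in triplicate (Table III); sd assumed proportional to concentration.
x = [1.0] * 3 + [3.0] * 3 + [10.0] * 3
y = [0.9, 1.0, 1.1, 2.9, 3.2, 3.5, 8.5, 9.5, 10.5]
sd = [0.1] * 3 + [0.3] * 3 + [1.0] * 3
w = [1.0 / s ** 2 for s in sd]  # w_i = 1/s_i^2


def weighted_fit(x, y, w):
    """Weighted least-squares estimates (b0, b1) from eq 3."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * swxx - swx ** 2
    b0 = (swy * swxx - swx * swxy) / denom
    b1 = (sw * swxy - swx * swy) / denom
    return b0, b1


def weighted_residuals(x, y, w, b0, b1):
    """Residuals scaled by w_i^(1/2) so that they share a common variance."""
    return [math.sqrt(wi) * (yi - b0 - b1 * xi) for wi, xi, yi in zip(w, x, y)]


def standard_error(res, n_params=2):
    """Standard error of estimate, s, from the weighted residuals."""
    return math.sqrt(sum(r * r for r in res) / (len(res) - n_params))


def flag_outliers(res, k=3.0):
    """Indices of points whose weighted residual exceeds k standard errors."""
    s = standard_error(res)
    return [i for i, r in enumerate(res) if abs(r) > k * s]


b0, b1 = weighted_fit(x, y, w)
res = weighted_residuals(x, y, w, b0, b1)
s = standard_error(res)
print(f"Y = {b0:.4f} + {b1:.4f}X, s = {s:.3f}, flagged = {flag_outliers(res)}")
```

For these data the fitted line agrees with the nonconstant-variance equation listed in Table III, and no point exceeds the 3-standard-deviation threshold.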
Having selected the appropriate case (unlimited discrimination; ref 8, p 125), we proceed to develop the confidence band about the predicted concentration, allowing for two contributions to error: uncertainty in the sample signal reading and uncertainty in the regression line. If the true local variance $\sigma_0^2$ at the level of a sample signal $Y_0$ were known, the confidence interval on the signal would be $Y_0 \pm Z\sigma_0/m^{1/2}$, where Z is the 100(1 − α/2)% value of the normal distribution and m is the number of sample replicates. The true value $\sigma_0$ is not known, so the interval about $Y_0$ must be estimated from available data. If a large number m of sample replicates is analyzed, the interval may be estimated directly from the sample as $Y_0 \pm ts/m^{1/2}$, where $Y_0$ is the average of the m signal replicates, s is the standard deviation of the replicates, and t is the 100(1 − α/2)% value of the t distribution for m − 1 degrees of freedom. For reasonable numbers of sample replicates, a better estimate of the band can be obtained from the calibration data by the formula $Y_0 \pm Z\sigma_0/m^{1/2}$. Although $\sigma_0$ is not known, its upper limit, providing the calibration curve does not exhibit significant lack of fit to the assumed model, is

$$s_0 \left[ (n - p)/\chi^2 \right]^{1/2}$$

with 100(1 − α/2)% confidence, where n is the number of standards and p is the number of predictors (two for the first-order case). The factor $s_0 = s\,w_0^{-1/2}$, where s is the standard error of estimate and $1/w_0^{1/2}$ is the local standard deviation as estimated when establishing the weights, making $s_0$ the local standard deviation estimated from the calibration curve. $\chi^2$ is the lower 100(α/2)% value of the $\chi^2$ distribution for (n − p) degrees of freedom. Thus the band on the sample signal is

$$Y_0 \pm Z s_0 \left[ \frac{n - p}{m \chi^2} \right]^{1/2} \tag{5}$$


Note that this is a very conservative estimate of the band on the sample signal. Real analyses will be subject to other sources of error which cannot readily be treated statistically. These include errors such as bias due to variations in sample matrix and instrument drift. These unquantifiable sources of error can be at least partially compensated for by conservatively estimating quantifiable sources of error.

The confidence band about the calibration curve at sample concentration $X_0$ is given by (8):

$$Y = b_0 + b_1 X_0 \pm (2F)^{1/2} s \left[ \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]^{1/2} \tag{6}$$

for a first-order unweighted least-squares equation. F is the 100(1 − α/2)% value from the F distribution for (2, n − 2) degrees of freedom and $\bar{X}$ is the (unweighted) mean standard concentration. The corresponding equation for weighted least squares can be derived by using matrix algebra or deduced from eq 6:

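$$Y = b_0 + b_1 X_0 \pm (2F)^{1/2} s \left[ \frac{1}{\sum w} + \frac{(X_0 - \bar{X})^2}{\sum w_i (X_i - \bar{X})^2} \right]^{1/2} \tag{7}$$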

Table III. Typical First-Order Regression Calculations for Three Standards

Concentration, X: 1.0, 3.0, 10.0
Signal, Y: 0.9, 1.0, 1.1, 2.9, 3.2, 3.5, 8.5, 9.5, 10.5
α = 0.1, $Z_{0.95}$ = 1.64, $F_{0.90}$ = 3.78, $F_{0.95;\,2,7}$ = 4.74, $\chi^2_{0.05;\,7}$ = 2.17

parameter | nonconstant variance | constant variance
regression equation | Y = 0.0212 + 0.9954X | Y = 0.2134 + 0.9328X
mean concn, $\bar{X}$ | 1.278 | 4.667
signal giving X = 1 | 1.017 | 1.146
band around signal, Y, at $X_0$ = 1 | 0.315 | 0.315
band around regression at $X_0$ = 1 | 0.186 | 0.355
band around predicted concn at $X_0$ = 1 | +0.50, −0.55 | +0.64, −0.88
local standard error of estimate at $X_0$ = 1, $s_0$ | 0.107 | 0.107

where $\bar{X} = \sum wX / \sum w$, the weighted mean of the standard concentrations. The band of eq 7 is simultaneously valid for all values of $X_0$.

Now that we have produced confidence bands about the sample signal and about the calibration curve, we can combine them to yield the band about the predicted concentration. This is done by intersecting the sample band with the curve band and taking the limits of the area of intersection, projected onto the concentration axis, as the limits of the confidence band about the predicted concentration. (This procedure is illustrated in ref 1 and in ref 8, p 126.) For nonconstant variance, confidence limits on a predicted concentration are obtained by calculating the limits of the band on the signal, using eq 5, and substituting these values for Y in eq 7. Equation 7 can then be solved for the required X values by trial substitution for X or, in the first-order case, by solving the quadratic equation for X.

Comparison of Precision of Weighted and Unweighted Least-Squares Procedures. Nonconstant variance techniques are used to improve precision, which is most conveniently expressed as the width of the confidence band around each predicted sample concentration. Equations 5 and 7 yield confidence bands for the nonconstant variance case, and eq 5 and 6 yield appropriate bands if variance is constant. However, eq 6 is derived assuming constant variance and will yield incorrect results if variance is not constant. Thus the effects of weighting cannot be compared by using these two sets of equations. Johnston (9) gives appropriate equations for calculating variances of b when constant variance is incorrectly assumed. The appropriate mathematics are summarized in the Appendix.

RESULTS AND DISCUSSION

Typical Calculations. As a simple demonstration of this approach, it was applied to three arbitrary standards (Table III). The standard deviation of signals was assumed to be proportional to concentration.
Weights $w_i = 1/s_i^2$ of 100, 11.11, and 1.0, corresponding to standard deviations of 0.1, 0.3, and 1.0, respectively, were used. The sums $\sum X$, $\sum wX$, etc. were calculated and substituted in the equations listed in Table II. An α value of 0.1 was used for Z, F, and $\chi^2$. All results were obtained by simple manual calculations, except for computation of bands around the predicted concentrations. These were obtained by calculating the limits of the signal band from eq 5 and substituting these values for Y in eq 7 for variable variance or in eq 6 for constant variance.
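The nonconstant-variance column of Table III can be retraced with a short script. This is a sketch under stated assumptions, not the authors' calculation: the signal band is taken as $Z s_0[(n - p)/(m\chi^2)]^{1/2}$ with a single sample measurement (m = 1) and local weight $w_0$ = 100 at $X_0$ = 1, the statistics Z = 1.64, F = 4.74, and $\chi^2$ = 2.17 are the tabulated values quoted in Table III, and the concentration limits are found by bisection (a form of the "trial substitution" mentioned above) instead of the quadratic formula. All function names are illustrative.

```python
import math

# Table III data; weights w = 1/s^2 for s = 0.1, 0.3, 1.0.
x = [1.0] * 3 + [3.0] * 3 + [10.0] * 3
y = [0.9, 1.0, 1.1, 2.9, 3.2, 3.5, 8.5, 9.5, 10.5]
w = [100.0] * 3 + [1.0 / 0.09] * 3 + [1.0] * 3
Z, F, CHI2 = 1.64, 4.74, 2.17  # tabulated values quoted in Table III (alpha = 0.1)
M = 1                          # sample replicates (assumed: single measurement)
n = len(x)

# Weighted first-order fit (eq 3) and standard error of estimate.
sw = sum(w)
swx = sum(wi * xi for wi, xi in zip(w, x))
swy = sum(wi * yi for wi, yi in zip(w, y))
swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
denom = sw * swxx - swx ** 2
b1 = (sw * swxy - swx * swy) / denom
b0 = (swy * swxx - swx * swxy) / denom
xbar = swx / sw
sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
s = math.sqrt(sum(wi * (yi - b0 - b1 * xi) ** 2
                  for wi, xi, yi in zip(w, x, y)) / (n - 2))


def regression_band(x0):
    """Half-width of the band around the weighted line at x0 (eq 7)."""
    return math.sqrt(2.0 * F) * s * math.sqrt(1.0 / sw + (x0 - xbar) ** 2 / sxx)


def signal_band(w0):
    """Half-width of the band on a sample signal with local weight w0 (eq 5)."""
    s0 = s / math.sqrt(w0)
    return Z * s0 * math.sqrt((n - 2) / (M * CHI2))


def bisect(f, lo, hi, tol=1e-10):
    """Root of an increasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x0 = 1.0
y0 = b0 + b1 * x0          # signal corresponding to X0 = 1
dy = signal_band(100.0)    # local weight at X0 = 1
# Upper limit: lower edge of the regression band meets the top of the signal band.
upper = bisect(lambda t: b0 + b1 * t - regression_band(t) - (y0 + dy), -10.0, 20.0)
# Lower limit: upper edge of the regression band meets the bottom of the signal band.
lower = bisect(lambda t: b0 + b1 * t + regression_band(t) - (y0 - dy), -10.0, 20.0)
print(f"signal band {dy:.3f}, regression band {regression_band(x0):.3f}")
print(f"predicted concn limits: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the script agrees with the tabulated signal band (0.315), regression band (0.186), and band around the predicted concentration (+0.50, −0.55) for the nonconstant-variance case.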

For these data, the variable-variance calculation resulted in a much better fit to the true curve (X = Y) than did the constant-variance calculation. This improvement in fit was most striking at low concentrations. At $X_0$ = 1, for example, the signal which produced the required concentration was over eight times as far from the true value (Y = 1) for constant variance as it was for variable variance. Also, the confidence bands at this value of X were narrower for variable variance than for constant variance. Johnston (9) shows that the greater the departure from constant variance, the greater the benefit in using a weighted least-squares procedure. This is an obvious point, but it emphasizes the need to study the variations in variance with concentration before selecting a calibration procedure.

Improvement Obtained by Using Weighted Least Squares: Real Data. The nonconstant variance calibration approach was applied to two sets of typical analytical data: determination of copper in water by microsampling cup atomic absorption spectrometry (10) and determination of iron in water by conventional atomic absorption spectrometry with an air-acetylene flame. The following steps were carried out: (i) Calibration standards were chosen to encompass typical dynamic ranges for practical analyses. (ii) Standard deviations were estimated at each concentration as described above. (iii) Typical calibration curves were selected for each analysis, and confidence bands around several predicted concentrations were calculated. Relative confidence bandwidths, defined as (upper band − lower band) × 100/(2 × predicted concentration), were calculated. The imprecise copper determination was assumed to require triplicate sample analyses, while the precise iron determination required single measurements. In each case, two bands were calculated (Figure 2): bands obtained by erroneously assuming constant variance and bands obtained by correctly using weighted least squares.
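The relative confidence bandwidth defined in step (iii) is a one-line calculation. As a small illustration, the sketch below applies it to the bands around the predicted concentration listed in Table III at $X_0$ = 1 (the copper and iron data themselves are not reproduced in the text); the function name `relative_bandwidth` is illustrative.

```python
def relative_bandwidth(upper, lower, predicted):
    """(upper band - lower band) x 100 / (2 x predicted concentration), in percent."""
    return (upper - lower) * 100.0 / (2.0 * predicted)


# Table III, X0 = 1: band of +0.50/-0.55 (nonconstant) and +0.64/-0.88 (constant).
rbw_weighted = relative_bandwidth(1.0 + 0.50, 1.0 - 0.55, 1.0)
rbw_constant = relative_bandwidth(1.0 + 0.64, 1.0 - 0.88, 1.0)
print(f"weighted: {rbw_weighted:.1f}%, constant variance: {rbw_constant:.1f}%")
```

For the Table III bands this gives about 52.5% for the weighted case and 76.0% for the constant-variance case.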
Use of weighted least squares yielded a trivial improvement in precision for the iron determination, and a useful improvement for copper. In both cases, the improvement in precision was greatest at the extremities of the curve (2). The greater improvement in precision with copper is probably due to the more rapidly changing standard deviation. For iron, the standard deviation increases by a factor of 9.6 over a dynamic range of 100. Corresponding figures for copper are 4.4 and 16, respectively. As a further test, we randomly selected data for 20 calibrations of a determination of lead in blood by Delves cup atomic absorption spectrometry (7). Whole blood standards containing 20, 65, and 105 μg of lead/dL were analyzed in

duplicate and the data used to calculate a first-order calibration line using both weighted and unweighted least-squares procedures. The value of the Y-axis intercept, theoretically zero, was used as a "nonstatistical" test of precision. The weighted least-squares procedure yielded a mean intercept of 0.59 with a standard deviation of 0.80. This roughly corresponds to 1.2 ± 1.6 μg of lead/dL. The unweighted procedure yielded an intercept of 0.56 with a standard deviation of 1.11. Thus the weighting procedure improved precision by about 30% at the zero lead level.

Figure 2. Relative confidence bandwidths (constant variance and weighted least squares) and smoothed standard deviation functions for two analyses. (Abscissa: concentration, μg/mL.)

Conventional least-squares procedures are susceptible to two major sources of error: the mathematical model may be inadequate and the assumption of constant variance is usually not justified. An analyst requiring accurate routine analysis should consider using a weighted least-squares procedure. Precision will always be at least a little better, particularly at low concentrations. This improvement is achieved at the cost of a more complex computer program and additional method development. Precision can also be improved by other means, for example, by improving the analytical procedure or by replicate analysis. For high-volume high-quality routine analysis, weighted least squares will often yield a useful improvement in precision.

ACKNOWLEDGMENT

We are indebted to J. F. Gentleman of the Statistics Department, University of Waterloo, Ontario, Canada, and to M. S. Zdeb of the Office of Biostatistics, New York State Department of Health, for helpful discussions during this work.

APPENDIX

Matrix Equations for Weighted Least-Squares Calculations. These equations may be used to derive the algebraic equations cited in the text and should be used for second and higher order equations.
Calculation of valid confidence bands around calibration curves, where data are erroneously assumed to have constant variance, does not appear to be described in the statistical literature, though Johnston (9) begins to develop the argument. Draper and Smith (11) have an excellent chapter on matrix methods in regression analysis.

The least-squares model in matrix form is $Y = X\beta + \varepsilon$, where $Y$ is an n × 1 matrix of n signals, $X$ is called the design matrix and is an n × p matrix of the n standard concentrations transformed according to the p terms of the model. $\beta$ is a p × 1 matrix of p parameters ($\beta_0$, $\beta_1$, etc.) which are the coefficients of the p terms of the model. The variance matrix for residuals is defined by

$$\mathrm{Var}(\varepsilon) = \begin{pmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{pmatrix} = \begin{pmatrix} 1/w_1 & & 0 \\ & \ddots & \\ 0 & & 1/w_n \end{pmatrix} \sigma^2$$

The latter matrix is the weight matrix $V$; thus $\mathrm{Var}(\varepsilon) = V\sigma^2$. A matrix $P$ is defined such that $P'P = V$, where the prime signifies the transpose of the matrix. To meet this condition, $P$ has the form

$$P = \begin{pmatrix} 1/w_1^{1/2} & & 0 \\ & \ddots & \\ 0 & & 1/w_n^{1/2} \end{pmatrix}$$

and its inverse $P^{-1}$ is

$$P^{-1} = \begin{pmatrix} w_1^{1/2} & & 0 \\ & \ddots & \\ 0 & & w_n^{1/2} \end{pmatrix}$$

The weighted regression model has the form

$$P^{-1}Y = P^{-1}X\beta + P^{-1}\varepsilon \tag{8}$$

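The least-squares estimate of $\beta$ is given by (11)

$$b = (X'X)^{-1}X'Y$$

and the corresponding weighted least-squares estimate is

$$b = [(P^{-1}X)'(P^{-1}X)]^{-1}(P^{-1}X)'(P^{-1}Y) = (X'V^{-1}X)^{-1}X'V^{-1}Y \tag{9}$$

Substituting for $P^{-1}Y$ in eq 9 yields

$$b = (X'V^{-1}X)^{-1}X'(P^{-1})'(P^{-1}X\beta + P^{-1}\varepsilon) = (X'V^{-1}X)^{-1}X'V^{-1}X\beta + (X'V^{-1}X)^{-1}X'V^{-1}\varepsilon = \beta + (X'V^{-1}X)^{-1}X'V^{-1}\varepsilon$$

The variance of $b$ is the expected value of the square of the difference $(b - \beta)$.

$$\mathrm{Var}(b) = E\{[b - \beta][b - \beta]'\} = E\{[(X'V^{-1}X)^{-1}X'V^{-1}\varepsilon][(X'V^{-1}X)^{-1}X'V^{-1}\varepsilon]'\}$$

Using the theorem $(ABC)' = C'B'A'$ and noting that $(X'V^{-1}X)^{-1}$ and $V^{-1}$ are symmetric, this becomes

$$\mathrm{Var}(b) = E\{(X'V^{-1}X)^{-1}X'V^{-1}\varepsilon\varepsilon'V^{-1}X(X'V^{-1}X)^{-1}\} = (X'V^{-1}X)^{-1}X'V^{-1}E(\varepsilon\varepsilon')V^{-1}X(X'V^{-1}X)^{-1} \tag{10}$$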

since $X$ and $V$ are constants. The expected value of $\varepsilon\varepsilon'$ is $V_0\sigma^2$, where $V_0$ is a matrix of the same form as $V$ but containing the actual relative variances of the observations. Matrix $V$ contains the relative variances assumed by the model. Since $\sigma^2$ is a scalar multiplier,

$$\mathrm{Var}(b) = (X'V^{-1}X)^{-1}X'V^{-1}V_0V^{-1}X(X'V^{-1}X)^{-1}\sigma^2 \tag{11}$$

The variance of $b$ is minimized when $V = V_0$, that is, when we select the correct relative variances for the model. In this case eq 11 reduces to

$$\mathrm{Var}(b) = (X'V^{-1}X)^{-1}\sigma^2 \tag{12}$$

If the analyst does not recognize that his data have variable variance, he will estimate $\beta$ by a constant-variance least-squares procedure; that is, he will set all weights at unity ($V = V^{-1} = I$), where $I$ is the identity matrix. The estimates $b$ of the parameters $\beta$ produced in this way will be unbiased, but the variance $\mathrm{Var}(b)$ will be higher than if the correct relative variances were assumed. Since $V^{-1} = I$ but $V_0$ is not the identity matrix, eq 11 becomes in this case

$$\mathrm{Var}(b) = (X'X)^{-1}X'V_0X(X'X)^{-1}\sigma^2 \tag{13}$$
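Equations 12 and 13 can be compared numerically for a first-order model. The sketch below is illustrative only: it assumes three standards at concentrations 1, 3, and 10 with true relative variances $V_0$ = diag(0.01, 0.09, 1.0) (standard deviation proportional to concentration) and $\sigma^2$ = 1, and it uses small hand-written 2 × 2 matrix helpers (`mat2_inv`, `mat2_mul`, `moment`), which are hypothetical names, so that no external library is needed.

```python
def mat2_inv(a):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def mat2_mul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def moment(x, d):
    """X' D X for the design matrix with columns [1, X] and diagonal D = diag(d)."""
    s0 = sum(d)
    s1 = sum(di * xi for di, xi in zip(d, x))
    s2 = sum(di * xi * xi for di, xi in zip(d, x))
    return [[s0, s1], [s1, s2]]


x = [1.0, 3.0, 10.0]
v0 = [0.01, 0.09, 1.0]  # assumed true relative variances (sd proportional to concn)

# Eq 12 with V = V0: Var(b) = (X' V0^-1 X)^-1, taking sigma^2 = 1.
var_weighted = mat2_inv(moment(x, [1.0 / v for v in v0]))

# Eq 13 with V = I: Var(b) = (X'X)^-1 X' V0 X (X'X)^-1, taking sigma^2 = 1.
xtx_inv = mat2_inv(moment(x, [1.0] * len(x)))
var_unweighted = mat2_mul(mat2_mul(xtx_inv, moment(x, v0)), xtx_inv)

print("Var(b0): weighted %.4f, unweighted %.4f"
      % (var_weighted[0][0], var_unweighted[0][0]))
print("Var(b1): weighted %.4f, unweighted %.4f"
      % (var_weighted[1][1], var_unweighted[1][1]))
```

For this assumed variance pattern both diagonal elements are larger under eq 13 than under eq 12, in line with the statement that ignoring variable variance leaves the estimates unbiased but less precise.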