I/EC STATISTICAL DESIGN
A Workbook Feature
by W. J. Youden, National Bureau of Standards
What Is a Measurement? A hidden complex of circumstances operates on every recorded observation
ALMOST EVERY DISCUSSION about measurements starts out with an actual or hypothetical collection of repeated measurements of the same quantity. The distribution of these measurements about their average usually can be well described by a simple function of the standard deviation of the measurements. The function assumes that only random errors are operating in the system. Generally, some care has been exercised to find a collection that does illustrate the well-known normal distribution. In any real situation, there are usually only a few measurements of the same quantity, often only one or two.
If larger numbers of repeat measurements of a single quantity are available, they are frequently the result of an interlaboratory study or a sequence extending over a lengthy period of time. The consequence of this broadening of the base for the measurements is the introduction of various disturbing elements that also operate to influence the observed results. A casual examination may show that most of the data do cluster near the average with only a few showing large deviations. There is a general attitude that rather large collections, 100 or more, may be required to detect departures from the normal distribution, so most observers do not attempt even a casual examination.

Detecting Complexities in Small Amounts of Data
The most revealing way that your columnist has found to detect complexities in a measurement depends upon using measurements made on two different quantities. The use of two quantities would seem only to make the problem more complicated; however, the second material greatly simplifies the study of the situation. Consider two quantities, X and Y, on which paired measurements, x_i and y_i, are made under a range of circumstances.

Circumstance    Quantity X    Quantity Y
C_1             x_1           y_1
C_2             x_2           y_2
C_3             x_3           y_3
. . .           . . .         . . .
C_n             x_n           y_n
The several circumstances may refer to different laboratories, different operators, successive weeks, or different sets of equipment. Each pair of values (x_i, y_i) determines a point on a graph with the same scale for X and Y. The two quantities, X and Y, should be similar in nature and not different in magnitude by more than 10 or 20%. This limitation excludes, for the moment, still other possible complexities such as responses that depend on the magnitude of the quantity or the nature of the material. These matters will be taken up again later.

The number of points need not be large. As few as eight points will often reveal some striking complexities in the data. Consider two extreme situations. Imagine first that the various circumstances differ in no way in their influence on the measurements and that the only source of variation is the random error associated with each measurement. The point corresponding to average x and average y is plotted, and the graph paper is divided into four quadrants by a horizontal and vertical line through this point. The random errors are just as likely to be positive as negative, so the four pairs of random deviations, + +, - +, - -, + -, corresponding to positions in the four quadrants, are all equally likely. Under the supposition that only random errors are present, the points should be divided about equally among the four quadrants in a circular pattern centered on the centroid of the collection of points.

The second extreme situation is not usually contemplated. In this case one imagines that each of the circumstances, C_1, C_2, . . . , C_n, has associated with it a particular bias or systematic error that is added (or subtracted) to both of the quantities X and Y. Further suppose that the random errors are all zero, so that if a second measurement were made on X (or Y) under any chosen circumstances, C_i, then the two values would agree exactly. The net result of these assumptions is a linear relationship between Y and X. All the points must lie on a straight line of unit slope passing through the centroid of the collection. The slope must be unity because, by definition, any particular circumstance, C_i, transferred the same systematic error to both X and Y.

These are contrasting patterns indeed, and they provide a basis for anticipating what will be observed in any collection of real data where there are random errors and where the various circumstances are also associated with individual systematic errors. The circle becomes an ellipse with the line of unit slope serving as the major axis. This unit slope line is in fact the limiting case as the ellipse gets longer and narrower. A continuous spectrum of possibilities can now be imagined, ranging from the circular form to
the pronounced linear relationship between Y and X. Observe that, in the course of the transition from the circular to linear form, the points migrate so that an ever-increasing proportion of them are found in the upper right and lower left quadrants. The proportion of points in these two quadrants is related to the relative magnitudes of the systematic and random components in the measurements. For example, if the systematic components contribute as much, on the average, as the random component, we may expect two thirds of the points to fall in the upper right and lower left quadrants. This proportion rises to about four fifths if the systematic errors average twice as large as the random errors.

The August and October columns [I/EC 50, No. 8, 83 A, No. 10, 91 A (1958)] exhibit three charts, each of which had about 30 points. Percentage of points in the two quadrants was 80% for CaO determinations in cement, 68% for compressive strength tests, and 90% for loss on ignition on cement. This shows the dominant part played by systematic errors in some of the most common types of measurement. The reader must remember that the centroid itself is no doubt more or less displaced from its correct position because there is usually no way to detect any general tendency to plus or minus systematic errors.

[Figure. Diabase: square roots of per cent FeO. When plotted by per cent only, these data originally showed a marked departure from unit slope; however, when square roots of the percentage are used, there is conformance to unit slope.]

Further Complexities in Data
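The proportions quoted earlier for the upper right and lower left quadrants (two thirds, about four fifths) can be checked with a short calculation. What follows is only a sketch under assumptions the column does not state: the systematic and random errors are taken to be independent and normally distributed, "contribute as much" is read as equal standard deviations, and deviations are measured from the true center rather than the observed centroid. The function names are illustrative, not the columnist's.

```python
import math


def expected_diagonal_fraction(ratio):
    """Expected share of points in the upper right and lower left quadrants.

    ratio: standard deviation of the systematic errors divided by the
    standard deviation of the random errors (assumed normal, independent).
    """
    # Both x and y carry the same systematic error, so their deviations
    # are correlated with coefficient rho.
    rho = ratio ** 2 / (1.0 + ratio ** 2)
    # Probability that two correlated normal deviations share a sign.
    return 0.5 + math.asin(rho) / math.pi


def observed_diagonal_fraction(points):
    """Share of plotted (x, y) pairs in the same two quadrants about the centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # A positive product means both deviations have the same sign.
    products = [(x - cx) * (y - cy) for x, y in points]
    counted = [p for p in products if p != 0]  # skip points on a dividing line
    if not counted:
        raise ValueError("no points fall strictly inside a quadrant")
    return sum(1 for p in counted if p > 0) / len(counted)


if __name__ == "__main__":
    print(round(expected_diagonal_fraction(1.0), 3))  # equal components
    print(round(expected_diagonal_fraction(2.0), 3))  # systematic twice random
```

Under these assumptions a ratio of 1 gives two thirds and a ratio of 2 gives a little under four fifths, in line with the figures in the text. The second function simply tallies the plotted points, which is all the graphical method itself requires.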
The concept of a measurement presented above postulates that repeat measurements made under a given set of conditions are composite in character. Each of the measurements of a given quantity contains within it the unknown quantity being measured. Each measurement has a second component or bias, the same for all, and this component depends upon the circumstances under which the measurements were made. Finally there is a random component. Apparently all real measurements are at least this complicated.

The presence of bias is effectively demonstrated by the graphical technique for plotting the pairs of measurements obtained by measuring two different quantities. The best estimate of the magnitude of any one of these biases is given by the interval, along the line of unit slope, between the centroid and the foot of the perpendicular from the point to the line. The points are displaced off the line of unit slope by reason of random errors present in the measurements x_i and y_i. The displacement is just as apt to be above this unit slope line as below it. Therefore, if a ruler is placed perpendicular to this line, and slowly pushed along the line, it should encounter points above and below the line in the same sort of random sequence as heads and tails fall when tossing a coin. Just as long runs of heads (or tails) would arouse suspicion in matching coins, so will unduly long runs of points on one side of the line provide grounds for suspecting some additional complications in the data.

There are two chief ways in which such complications arise, and either of them requires additional caution in interpreting the data. Perhaps the most common problem arises when the random errors get larger as the magnitude measured increases. It is, of course, desirable to know if this state of affairs exists. If the two magnitudes differ greatly and the random errors are different for the two quantities, the slope of the best line through the points
departs from unity. In consequence, there will be long runs of points below and above the unit slope line. The usual remedy for this is to take the logarithms, or the square roots, or some other transformation of the data. The logarithms are used when the standard deviations are proportional to the magnitudes measured, and the square roots when the squares of the standard deviations are proportional to the magnitudes.

In Geological Survey Bulletin 980, complete rock analyses on a granite and a diabase rock are given for 30 laboratories. The ferrous oxide content of the diabase rock was around 9%, while just about 1% was present in the granite. A line through the plotted points showed a marked departure from unit slope. The figure shows a reasonable conformance to unit slope when the square roots of the percentages are used.

This particular example is of special interest because another type of complication also appears to be present. The points apparently consist of two families. This separation into two groups is caused by the results on the granite. No separation appears in the diabase results. Possibly, two different analytical procedures were used which gave similar results on the diabase but, for some reason, gave different results on the granite. Statisticians refer to this situation as an interaction between rocks and methods. Whenever the points do appear to form colonies, there is a possible interpretation that the difference between the materials depends upon the method of measurement. Sometimes this means that different methods are used or that the laboratories have evolved different interpretations of one method.

All the above explains why a good description of a method of measurement includes a careful statement of the materials, the range of magnitudes, and the circumstances under which the method will give satisfactory results.
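The bookkeeping behind the graphical procedure in this column (the interval along the unit slope line, and the ruler pushed along it to look for runs) can be sketched in a few lines. This is a minimal illustration, not the columnist's own computation: the function names are hypothetical, and what counts as an "unduly long" run is left to the reader's judgment, as in the text.

```python
import math


def centroid(points):
    """Average x and average y of the plotted (x, y) pairs."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def decompose(points):
    """Resolve each point relative to the unit slope line through the centroid.

    Returns (along, across) pairs:
      along  - signed interval along the line from the centroid to the foot
               of the perpendicular (geometrically, sqrt(2) times the mean
               of the x and y deviations)
      across - signed perpendicular distance, positive above the line
    """
    cx, cy = centroid(points)
    resolved = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        resolved.append(((dx + dy) / math.sqrt(2), (dy - dx) / math.sqrt(2)))
    return resolved


def longest_run(points):
    """Longest run of points on one side of the line, met in order along it."""
    ordered = sorted(decompose(points))  # the ruler moving along the line
    sides = [across > 0 for _, across in ordered if across != 0]
    longest = current = 0
    previous = None
    for side in sides:
        current = current + 1 if side == previous else 1
        previous = side
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    # Made-up pairs for illustration only; not data from the column.
    pairs = [(0, 0.5), (1, 1.5), (2, 2.5), (3, 2.0), (4, 3.0), (5, 4.0)]
    print(longest_run(pairs))
```

To try the remedy described above, the same functions can be applied after replacing each x and y by its square root or logarithm and comparing the longest run before and after.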
Our authors like to hear from readers. If you have questions or comments, or both, send them via The Editor, I/EC, 1155 16th Street N.W., Washington 6, D.C. Letters will be forwarded and answered promptly.