Anal. Chem. 1986, 58, 1593-1595
Data Compression Technique for Tables of Measurements

Lawrence D. Fields and Stephen J. Hawkes*

Department of Chemistry, Oregon State University, Corvallis, Oregon 97331

A lack of effective communication can present serious difficulties in any discipline. An example of this problem in the sciences is our double standard about the reporting of measurement data. We say that we should always include an estimate of the uncertainty when we report a measurement. (For our purposes, measurement uncertainty is the sample standard error of the mean.) Yet we tend to forget this when we publish our data in tabular form. Often the uncertainty of a measurement is not given directly and must be inferred from the number of significant figures used to represent the quantity. This oversight is understandable because tables could become rather cluttered if we treated the measurement and its uncertainty as separate entries.

In some circumstances, the present practice is tolerable. A chromatographer reading a table of GC retention indexes will probably have some feeling for the limitations of the measurements, even if the uncertainties are not stated explicitly. However, organic chemists may be using these same tables to assist in the identification of newly synthesized compounds. They will want to compare their calculated retention indexes with the literature values. In this case, the question is: “How close is close enough?”

We propose a solution to this problem in the form of a new notation, significant figure code, which is the middle ground between stand-alone significant figures and the direct reporting of uncertainty. Significant figure code combines a measurement and its uncertainty into a single quantity. Significant figure coded quantities look like and read much like ordinary base 10 numbers.
SIGNIFICANT FIGURE CODE

Before we describe the actual significant figure coding technique, we will consider stand-alone significant figures: quantities in which the uncertainties are not given directly and must be inferred from the number of significant figures used to represent them. Stand-alone significant figures are very compact and easy to read, but they are not very specific about the estimated uncertainty of a reported measurement, because the base 10 number system allows the uncertainty on the last significant digit to be anywhere within a 10-fold range. In order to retain the maximum amount of useful information, we will use a ±2 to ±20 rounding convention; i.e., a quantity is rounded such that the magnitude of the uncertainty on the last reported digit of the number is greater than or equal to 2 and less than 20. We should mention that there is no consensus about the best rounding convention for stand-alone significant figures. For example, Skoog and West (1) recommend ±0.5 to ±5. Significant figure code could be modified to accommodate this convention. However, this would create some ambiguities. (See Appendix.)

The significant figure code algorithm has two steps. First the mean value is rounded according to the ±2 to ±20 rounding convention and expressed in scientific notation (if there would be any ambiguity about the location of the last “significant” digit without scientific notation). Then a one-digit number is tacked onto the end of the mantissa. This final digit is the uncertainty category, and it has absolutely nothing to do with the value of the measurement itself. The rounding step guarantees that the uncertainty on the last significant digit is between ±2 and ±20. With significant figure code, the uncertainty interval from ±2 to ±20 is partitioned into ten uncertainty categories, each represented by a single digit. See Table I. Table I uses a semilogarithmic scale. This allows us to resolve the estimated uncertainty within a $10^{0.1} = 1.26$-fold range (not a confidence interval), regardless of the uncertainty category. This is not possible if we use a linear scale.

Table II is an analysis of the significant figure coded number “8.39”. According to Table I, the uncertainty on the “3” (in “8.3”) is between ±15.9 and ±20. Therefore “8.39” in significant figure code translates to “8.3 with an uncertainty which lies between ±1.59 and ±2”. Since the “8.3” is a rounded value, the actual sample mean could be anywhere in the open interval (8.25, 8.35).

Significant figure code may be confused with ordinary base 10 notation, since the two look alike. When a table is constructed using significant figure code, it is best to state this explicitly at the outset. If a significant figure coded quantity is mistaken for an ordinary base 10 number, then the information about the uncertainty cannot be accessed. However, the example in Table II also demonstrates that the magnitude of the significant figure coded number does not suffer appreciably from the misinterpretation. The misinterpretation error is 0.09, as compared with the measurement uncertainty, whose absolute value is at least 1.59. In this case, the misinterpretation error is at most 0.09/1.59 ≈ 6% of the uncertainty. Therefore this notation is very forgiving to a reader of a data table who has never heard of significant figure code and mistakes it for ordinary base 10 notation.

Example. Suppose that the mean value of a series of measurements is 36.28 mV and that the standard error of the mean is 0.7 mV. The uncertainty on the “8” in “36.28” is ±70. Since this is outside of the ±2 to ±20 uncertainty range allowed on the last significant digit, we must round off the “8”, leaving “36.3”.
The uncertainty on the “3” in “36.3” is ±7, which corresponds to uncertainty category 5 in Table I. Therefore the significant figure code representation for the series of measurements is “36.35 mV”. Table III gives more examples of how the mean and uncertainty from a series of measurements are expressed in significant figure code.

Our simple version of significant figure code has two boundary conditions. First, this notation cannot be used for an exact number (having zero uncertainty). However, we could adopt the convention that a “.” at the end of a number denotes exactness. For example, 1 in. = 2.54. cm. This notation is currently used by some to indicate the significance of any terminal zeros in a number. For example, “300.” has three significant figures. However, there can be no ambiguity about the significance of any zero in a significant figure coded number. Therefore the exactness convention could easily replace the terminal zero convention for a period at the end of a significant figure coded number. The second boundary condition for simple significant figure code is that it cannot be used to represent a nonzero quantity whose uncertainty is much greater than the absolute value of the mean.

To code the result of a calculation involving several significant figure coded numbers, four distinct operations are required: (1) decomposing each significant figure coded number into the mean and its standard error (decoding); (2) the calculation itself; (3) a propagation-of-uncertainty analysis; (4) combining the results of (2) and (3) into a single significant
0003-2700/86/0358-1593$01.50/0 © 1986 American Chemical Society
ANALYTICAL CHEMISTRY, VOL. 58, NO. 7, JUNE 1986
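Table I. Relationship between Uncertainty Category and the Estimated Measurement Uncertainty for the ±2 to ±20 Rounding Convention

Estimated measurement uncertainty on the last significant digit:

uncertainty category    lower limit (exact)      upper limit (exact)      approx range
0                       2                        $2 \times 10^{0.1}$      2 to 2.52
1                       $2 \times 10^{0.1}$      $2 \times 10^{0.2}$      2.52 to 3.17
2                       $2 \times 10^{0.2}$      $2 \times 10^{0.3}$      3.17 to 3.99
3                       $2 \times 10^{0.3}$      $2 \times 10^{0.4}$      3.99 to 5.02
4                       $2 \times 10^{0.4}$      $2 \times 10^{0.5}$      5.02 to 6.32
5                       $2 \times 10^{0.5}$      $2 \times 10^{0.6}$      6.32 to 7.96
6                       $2 \times 10^{0.6}$      $2 \times 10^{0.7}$      7.96 to 10.0
7                       $2 \times 10^{0.7}$      $2 \times 10^{0.8}$      10.0 to 12.6
8                       $2 \times 10^{0.8}$      $2 \times 10^{0.9}$      12.6 to 15.9
9                       $2 \times 10^{0.9}$      20                       15.9 to 20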
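The limits in Table I follow one rule: category k starts at 2 × 10^(k/10), in units of the last significant digit. The following is a minimal sketch of that rule, not code from the paper. The function name `category_limits` and its parameters `low` and `n_categories` are illustrative assumptions.

```python
def category_limits(k, low=2.0, n_categories=10):
    """Return the (lower, upper) limits of uncertainty category k.

    The limits are in units of the last significant digit and are spaced
    evenly on a logarithmic scale across one decade starting at `low`.
    """
    lower = low * 10 ** (k / n_categories)
    upper = low * 10 ** ((k + 1) / n_categories)
    return lower, upper


# Print the ten categories of the ±2 to ±20 convention.
for k in range(10):
    lower, upper = category_limits(k)
    print(f"category {k}: {lower:5.2f} to {upper:5.2f}")
```

Every category spans the same ratio, upper/lower = 10^(1/n_categories), which is about 1.26 for ten categories. Changing `n_categories` gives a finer or coarser partition of the same decade.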
should be adequate, considering that for a confidence interval having a 95% probability of bracketing the standard error of the mean within a 1.26-fold range, about 150 measurements are needed (3), which is more than the usual sample size in analytical work. If either greater resolution of the uncertainty or more compactness is needed for a given application, then customized techniques can be constructed by partitioning the ±2 to ±20 uncertainty interval into either more or fewer uncertainty categories. Thus significant figure code expands into an array of techniques.

There is a more sophisticated technique, decimal coded binaries (4) (not the same as binary coded decimals), which compresses measurement data with greater efficiency than the simple version of significant figure code presented in this paper. Decimal coded binaries can be programmed into the HP-15C scientific calculator and into 8-bit laboratory microcomputers (e.g., the AIM 65), which are used to interface with chemical instruments commonly utilized in quantitative analysis.

Ordinarily, when a measurement is reported, the uncertainty of the measurement should also be reported. However, those who omit uncertainties from data tables have valid reasons for doing so, and no amount of pontification is likely to correct this practice. With significant figure code, the “omitters” can have brevity and readability in data tables without leaving measurement uncertainties in limbo. However, significant figure coding is primarily a form of data compression; it requires less space in memory devices and less time for telecommunicating than would be required to store or transmit a measurement and its uncertainty separately. Moreover, conventional data compression methods can be used
Table II. Anatomy of the Significant Figure Coded Number “8.39”

numerical value         8.3
uncertainty category    9
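The decoding of “8.39” in Table II can be reproduced with a short sketch. This is an illustration under stated assumptions rather than the authors' implementation: the function name `decode_sfc` is hypothetical, the input is taken as a plain decimal string, and the ±2 to ±20 convention with ten categories (Table I) is assumed.

```python
from decimal import Decimal


def decode_sfc(text):
    """Split an SFC string into (value, u_min, u_max).

    Assumes the ±2 to ±20 rounding convention with ten categories.
    """
    sign, digits, exp = Decimal(text).as_tuple()
    category = digits[-1]                      # final digit carries only the category
    value = Decimal((sign, digits[:-1], exp + 1))
    place = 10.0 ** (exp + 1)                  # weight of the last significant digit
    u_min = 2 * 10 ** (category / 10) * place
    u_max = 2 * 10 ** ((category + 1) / 10) * place
    return value, u_min, u_max


value, u_min, u_max = decode_sfc("8.39")
print(value, round(u_min, 2), round(u_max, 2))   # 8.3 1.59 2.0
print(Decimal("8.39") - value)                   # error if misread as base 10: 0.09
```

The value is kept as a `Decimal` so that the rounded mean comes back exactly as written. The uncertainty limits are ordinary floats, since only their approximate size matters.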
Table III. More Examples

mean          uncertainty    significant figure code
87 654 321             11    87654321.7
87 654 321             56    87654324
87 654 321            280    8.765431 × 10^7
87 654 321          1 400    8.765438 × 10^7
87 654 321          7 000    8.76545 × 10^7
87 654 321         36 000    8.7652 × 10^7
87 654 321        180 000    8.7659 × 10^7
87 654 321        900 000    8.776 × 10^7
87 654 321      4 500 000    8.83 × 10^7
87 654 321     22 000 000    9.0 × 10^7
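The two-step coding procedure (round to the ±2 to ±20 convention, then append the category digit) can be sketched as follows and checked against Table III. This is a hedged illustration: the function name `encode_sfc` is hypothetical, inputs are passed as decimal strings, and rounding ties are broken half-up, which the paper does not specify. Python's `Decimal` prints powers of ten as `E+7` rather than `× 10^7`.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def encode_sfc(mean, uncertainty):
    """Return the SFC string for a mean and its standard error (decimal strings)."""
    m, u = Decimal(mean), Decimal(uncertainty)
    if u <= 0:
        raise ValueError("SFC cannot represent an exact number")
    # Power of ten p of the last significant digit: 2 <= u / 10**p < 20.
    p = (u / 2).adjusted()
    # Assumption: ties are rounded half-up.
    rounded = m.quantize(Decimal(1).scaleb(p), rounding=ROUND_HALF_UP)
    # Uncertainty in units of the last digit, then its category (0 to 9).
    ratio = float(u.scaleb(-p))
    category = min(9, max(0, math.floor(10 * math.log10(ratio / 2))))
    # Append the category digit one decimal place below the last digit.
    sign, digits, exp = rounded.as_tuple()
    return str(Decimal((sign, digits + (category,), exp - 1)))


print(encode_sfc("36.28", "0.7"))       # 36.35
print(encode_sfc("87654321", "11"))     # 87654321.7
print(encode_sfc("87654321", "900000")) # 8.776E+7
```

Working in `Decimal` avoids binary floating-point surprises when the mean is rounded. Only the category lookup uses a floating-point logarithm, so an uncertainty lying exactly on a category boundary could in principle land in the neighboring category.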
figure coded number (coding). This same approach is the basis for two “vest-pocket” algorithms, which are employed in determining the correct number of digits to use in expressing the result of a calculation whose components are represented by stand-alone significant figures (2). With significant figure code, the estimated uncertainty can be localized within a $10^{0.1} = 1.26$-fold range at worst. This
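Table IV. Relationship between Uncertainty Code and Estimated Measurement Uncertainty ($U_m$) with the ±0.5 to ±5 Rounding Convention

Estimated measurement uncertainty:

uncertainty code    lower limit (exact)    upper limit (exact)    approx range
0                   0.5                    $10^{0.1}/2$           0.5 to 0.63
1                   $10^{0.1}/2$           $10^{0.2}/2$           0.63 to 0.79
2                   $10^{0.2}/2$           $10^{0.3}/2$           0.79 to 1.00
3                   $10^{0.3}/2$           $10^{0.4}/2$           1.00 to 1.26
4                   $10^{0.4}/2$           $10^{0.5}/2$           1.26 to 1.58
5                   $10^{0.5}/2$           $10^{0.6}/2$           1.58 to 1.99
6                   $10^{0.6}/2$           $10^{0.7}/2$           1.99 to 2.51
7                   $10^{0.7}/2$           $10^{0.8}/2$           2.51 to 3.15
8                   $10^{0.8}/2$           $10^{0.9}/2$           3.15 to 3.97
9                   $10^{0.9}/2$           5                      3.97 to 5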
Table V. Relationship between Uncertainty Code and Estimated Total Uncertainty ($U_T$) with the ±0.5 to ±5 Rounding Convention

uncertainty code    estimated total uncertainty (approx)
0                   0.58 to 0.69
1                   0.69 to 0.84
2                   0.84 to 1.04
3                   1.04 to 1.29
4                   1.29 to 1.61
5                   1.61 to 2.01
6                   2.01 to 2.52
7                   2.52 to 3.17
8                   3.17 to 3.98
9                   3.98 to 5.01
Anal. Chem. 1986, 58, 1595-1596
in tandem with significant figure code. Significant figure code has the potential to improve the written and electronic communication of tables of measurement data among scientists and engineers, especially across disciplinary boundaries.
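The four operations listed earlier for calculating with significant figure coded numbers (decoding, the calculation, propagation of uncertainty, coding) can be sketched for the simple case of a sum. Several things here are assumptions and not part of the paper: the function names `decode`, `encode`, and `add_sfc`; the choice of the geometric midpoint of a category as its representative uncertainty; half-up rounding of ties; and the treatment of the two measurements as having independent errors.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def decode(text):
    """SFC string -> (value as Decimal, representative uncertainty as float)."""
    sign, digits, exp = Decimal(text).as_tuple()
    value = Decimal((sign, digits[:-1], exp + 1))
    # Assumption: use the geometric midpoint of the category's range.
    u = 2 * 10 ** ((digits[-1] + 0.5) / 10) * 10.0 ** (exp + 1)
    return value, u


def encode(mean, u):
    """(Decimal mean, float uncertainty) -> SFC string, ±2 to ±20 convention."""
    u = Decimal(repr(u))
    p = (u / 2).adjusted()                      # 2 <= u / 10**p < 20
    rounded = mean.quantize(Decimal(1).scaleb(p), rounding=ROUND_HALF_UP)
    ratio = float(u.scaleb(-p))
    category = min(9, max(0, math.floor(10 * math.log10(ratio / 2))))
    sign, digits, exp = rounded.as_tuple()
    return str(Decimal((sign, digits + (category,), exp - 1)))


def add_sfc(a, b):
    """Add two SFC numbers, assuming independent errors."""
    (va, ua), (vb, ub) = decode(a), decode(b)   # (1) decoding
    total = va + vb                             # (2) the calculation
    u_total = math.hypot(ua, ub)                # (3) propagation of uncertainty
    return encode(total, u_total)               # (4) coding


print(add_sfc("36.35", "8.39"))                 # 44.69
```

Other calculations would change only steps (2) and (3). Decoding and coding stay the same.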
APPENDIX

Thus far we have considered only those uncertainties which stem from random errors of measurement. Rounding error also contributes to the uncertainty. An example of a pure rounding error is that which arises when π is rounded to 3.14159. If we restrict ourselves to a very liberal rounding convention (like ±2 to ±20), then this oversight will not cause serious problems. However, with the more popular ±0.5 to ±5 convention, the rounding error can inflate the uncertainty appreciably. The estimated total uncertainty $U_T$ of a reported measurement is calculated as (5)

$$U_T = (U_m^2 + R^2/3)^{1/2} \qquad (1)$$

where $U_m$ is the estimated measurement uncertainty (which is the standard error of the mean for a large number of measurements) and $R$ is the maximum possible rounding error (±0.5 on the last significant digit). The percent inflation (of uncertainty) is defined as

$$\%\ \text{inflation} = 100\,(U_T - U_m)/U_m \qquad (2)$$

Suppose that the measurement uncertainty on the last reported digit of the mean value is ±0.5. By eq 1 and 2, the inflation would be approximately 15%; i.e., the total estimated uncertainty would be 15% more than the estimated measurement uncertainty. Therein lies the ambiguity of the popular ±0.5 to ±5 rounding convention. Is the reported uncertainty just a measurement uncertainty, or is it the total uncertainty? With the ±0.5 to ±5 convention, there can be an appreciable discrepancy between the two. With the ±2 to ±20 rounding convention, the maximum inflation is just over 1%; the measurement uncertainty is essentially the same as the total uncertainty.

Tables IV and V illustrate the ambiguities of the ±0.5 to ±5 rounding convention vis-à-vis significant figure code. Table IV shows the relationship between the estimated measurement uncertainty and uncertainty code for the ±0.5 to ±5 rounding convention. Table V is for the estimated total uncertainty. In Tables IV and V, note the large discrepancies between the estimated measurement uncertainty and the estimated total uncertainty in the lower uncertainty codes. If one formats data using significant figure code in conjunction with the ±0.5 to ±5 rounding convention, then one must also be explicit about what kind of uncertainty (measurement uncertainty or total uncertainty) is contained within the SFC numbers. (We recommend using measurement uncertainty, since the absolute limits (6) for the uncertainty codes can be expressed compactly, as in Tables I and IV.) This distinction is not crucial for the ±2 to ±20 rounding convention.
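The inflation figures quoted in the Appendix (about 15% and just over 1%) and the limits of Table V can be checked numerically from eq 1 and 2. The sketch below is a minimal illustration. The function names `total_uncertainty` and `percent_inflation` are hypothetical, and all uncertainties are expressed in units of the last significant digit with R = 0.5.

```python
import math


def total_uncertainty(u_m, r=0.5):
    """Eq 1: combine measurement uncertainty u_m with rounding error r."""
    return math.sqrt(u_m ** 2 + r ** 2 / 3)


def percent_inflation(u_m, r=0.5):
    """Eq 2: percent by which the total uncertainty exceeds u_m."""
    return 100 * (total_uncertainty(u_m, r) - u_m) / u_m


# Worst case for each convention is its smallest allowed uncertainty.
print(round(percent_inflation(0.5), 1))   # ±0.5 to ±5 convention: 15.5
print(round(percent_inflation(2.0), 1))   # ±2 to ±20 convention: 1.0

# Table IV limits (measurement) next to Table V limits (total).
for k in range(11):
    u_m = 0.5 * 10 ** (k / 10)
    print(k, round(u_m, 2), round(total_uncertainty(u_m), 2))
```

Because the rounding term R²/3 is fixed, its relative effect shrinks as the measurement uncertainty grows. This is why the discrepancy between Tables IV and V is largest in the lower uncertainty codes.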
LITERATURE CITED

(1) Skoog, Douglas A.; West, Donald M. Fundamentals of Analytical Chemistry, 3rd ed.; Holt, Rinehart, and Winston: New York, 1976; pp 78-80.
(2) Fields, Lawrence D.; Hawkes, Stephen J. J. Coll. Sci. Teach., in press.
(3) Natrella, Mary G. Experimental Statistics; National Bureau of Standards Handbook 91; U.S. Government Printing Office: Washington, DC, 1966; pp 2-12.
(4) Fields, Lawrence D. M.S. Thesis, Oregon State University, 1985.
(5) Sheppard, W. F. Proc. London Mathematical Soc. 1898, 29, 11, 369.
(6) ASTM E29-67, 1980.
RECEIVED for review December 19, 1985. Accepted February 18, 1986.
Vortex Cooling for Subambient Temperature Gas Chromatography

Thomas J. Bruno

Thermophysics Division, National Bureau of Standards, Boulder, Colorado 80303

Subambient temperature gas chromatography, which has been used since the mid 1950s, is receiving renewed attention because of the advantages it provides in the analysis of very volatile species. Much of this interest stems from the need to determine trace quantities of priority pollutants in air samples. Subambient temperature gas chromatography generally refers to column operation at temperatures between -100 and 0 °C; however, many separations are greatly enhanced at temperatures no lower than -40 °C (1). In this short note, an extremely simple yet very effective approach to subambient column temperature operation is described.

Most of the studies done with subambient column temperatures involve the use of liquefied gases (cryogens), such as liquid nitrogen, as the cooling medium (2). Other approaches, such as the use of the Peltier effect or the Joule-Thomson effect, have received far less attention. In practice, the cryogenic fluid is introduced into the column oven through a microprocessor-controlled solenoid valve. This valve regulates the flow of cryogen into the oven and thus provides some degree of temperature control. There are many disadvantages associated with the use of liquefied gases to produce low temperatures in chromatographic equipment. The large volumes of coolant typically required necessitate the use of large Dewar containers that are bulky, heavy, and expensive. The low temperature and high vapor pressure of most of the liquid gas cryogens pose potential explosion and burn hazards (3).
DISCUSSION

In the author's laboratory, chromatographic column temperatures as low as -40 °C are obtained routinely using an arrangement based upon the Ranque-Hilsch vortex tube. A detailed discussion of the operation of the vortex tube is provided elsewhere (4); thus only a brief description will be given here. The vortex tube (a commercial unit) is shown schematically in Figure 1. A source of compressed air (at 0.7 MPa, 100 psi pressure, with a flow rate of 0.34-0.42 m3/min, 12-15 standard ft3/min) is applied to the inlet nozzle of the tube, whereupon it is discharged tangentially through the tube by the vortex generator (see inset in Figure 1). This discharge will produce a high-frequency air vortex along the inside circumference of the tube, leaving the center of the tube virtually empty. At the end of the tube (left-hand side in Figure 1), a needle control valve allows some of the air to
This article not subject to U.S. Copyright. Published 1986 by the American Chemical Society