Practical Considerations for Digitizing Analog Signals

P. C. Kelly¹ and Gary Horlick

Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2G2
Quantitative measures for the quality of digital data can be obtained by attempting to regenerate an analog signal from digital data. The resulting signal can be compared to the original analog signal. Regeneration may be carried out by use of simple operations on the Fourier transform of a continuous representation of digital data. The effects of sampling interval, sampling duration, quantization, digitization time, aperture time, and random variations in sampling interval are examined. The maximum sampling interval and minimum number of samples needed to digitize triangular, exponential, Lorentzian, and Gaussian peaks for given values of maximum absolute error are tabulated.
RAW DATA OBTAINED from most analytical instruments are in the form of analog electrical signals. It is frequently necessary to convert these data into a form that is more convenient for qualitative and quantitative interpretation. The conversion of raw data into more useful forms typically involves mathematical operations, and these are often so complex that a digital computer is an essential part of the process. Consequently, the raw data must be digitized; that is, the continuous analog signal must be converted into a set of numbers. This conversion is most often carried out with some type of electronic digitizing system such as a digital voltmeter or a small computer data acquisition system. However, it is important to note that the considerations presented in this paper are equally valid for manually digitized data as read off of strip chart recorder tracings or even meters. The set of numbers resulting from the digitization process must be an accurate and precise representation of the original analog signal. Without a careful consideration of the effects of digitization, a digitizer can easily be the weakest link in a complete instrumental analysis system. This paper is concerned with parameters of a digitizer that determine the accuracy and precision of digital data.

Digital data should contain as much of the relevant information in the original signal as practically possible. Whatever definition of information is used, if the digital data can be converted back into an analog signal that can be exactly superimposed on the original signal, then the digitized signal is an accurate representation of the original signal. Any difference between the two signals may be called digitization error. It is important to note that the conditions for accurate digitization depend on the information that is known about the form of the original signal and the type of information to be gained from the signal. For example, a signal that is known to be a straight line can be characterized by only two points, perhaps a few more if noise is present. To prove that a signal is a straight line, however, is quite another problem. The same holds true for peak shapes. Prior knowledge that a peak shape is Gaussian will reduce the number of points necessary to characterize the peak, as the points can be fitted to this known functional form. The accuracy and precision criteria developed in this paper are based on the more difficult problem of digitizing a signal about which there is little prior information. Usually something is known

¹ Present address, Department of Chemistry, University of Georgia, Athens, Ga. 30601.
about the form of the signal, so the criteria developed here may be slightly conservative.

Digitization involves two basic steps. The analog signal must be sampled at finite intervals of time and each sample must be quantized into a finite number of digits. For example, a digitizer using a sample-and-hold input samples the signal and then the stored sample is quantized. In practice, quantization may occur before sampling. The signal may be continuously quantized by a voltage-to-frequency converter, for example, and then sampled by counting the frequency for a finite period of time. The sequence of the two steps is unimportant; they are mathematically commutable. The parameters of a digitizer discussed below include sampling interval and duration, quantization level and range, digitization and aperture times, and jitter. For consistency, the signal will have units of volts and will be expressed as a function of time in seconds. Errors inherent to the process of digitization will be treated here. Electronic accuracy, precision, and stability of particular types of digitizers are treated in References 1 and 2.

SAMPLING INTERVAL

The effect of sampling can be interpreted most clearly when the sampling process is represented by a linear mathematical operation. Bracewell (3) shows that a reasonable representation of a sample can be obtained by multiplying the signal by a delta function having infinitesimal width and unit area. The resulting delta function, although infinite in height, has an area equal to the value of the signal at the sampled point. The process of sampling is depicted in Figure 1. The signal in Figure 1A is multiplied by the series of delta functions spaced Δt seconds apart in Figure 1B. The resulting series of delta functions, shown in Figure 1C, may be called the digitized signal. The height of each arrow in Figure 1C is proportional to the area of the corresponding delta function. Normally, samples are taken at constant intervals of time. Some signals, an isothermal gas chromatogram for example, might be sampled more efficiently at constantly increasing intervals. The effect of a varying interval may be handled as a constant interval problem by suitably transforming the time axis.

Having a mathematically amenable representation of digital data, we are now able to consider methods by which a digitized signal can be converted into an analog signal comparable to the original signal. The method used to regenerate an analog signal from a digitized signal can affect digitization errors. For example, the analog signal could be approximated by horizontal straight lines drawn from one sample to the next. A better method would use sloping straight lines to join the sample points; better still, use a quadratic curve, then a cubic, and so on.

(1) D. F. Hoeschele, Jr., "Analog-to-Digital and Digital-to-Analog Conversion Techniques," John Wiley and Sons, New York, N.Y., 1968.
(2) H. Schmid, Electron. Design, 16 (25), 49 (1968).
(3) R. Bracewell, "The Fourier Transform and Its Applications," McGraw-Hill, New York, N.Y., 1965.
Figure 1. Schematic representations of sampling in the time (seconds) and frequency (hertz) domains. The symbol × represents multiplication and * convolution

Each of these methods will give different sampling errors depending on the properties of the signal being sampled. These polynomial interpolation methods are suitable for signals with an impulsive first, second, third, etc., derivative. An analog signal, on the other hand, has smooth derivatives up to the order of infinity. Whatever method is used to reconstruct a signal, sampling errors should decrease as the sampling interval is made smaller. As the sampling interval decreases, the signal varies less and less over a period of one sampling interval. The interpolating function, therefore, can more easily form an accurate representation of the signal from more closely spaced samples.

Consideration of the amount of variation in a signal over periods of time leads to the representation of signals by sine and cosine functions with different periods or frequencies. The sine and cosine representation of a signal is given by the Fourier transform, the ordinate giving amplitude density (volts/hertz) and the abscissa giving the frequency (hertz) of the cosinusoidal components of the signal. An important property of the Fourier transform is that the inverse Fourier transform will give the original signal exactly. The Fourier transform of the signal in Figure 1A is represented in Figure 1D. (In Figure 1 and following figures, the negative frequency and the imaginary parts of Fourier transformed functions have been omitted for clarity of presentation.) The Fourier transform of the sampling function is shown in Figure 1E. This function has the same form as its representation in the time domain but the spacing of the delta functions is now 1/Δt. In the Fourier domain the sampling operation is a convolution of the respective functions (represented by the symbol *). The result of this convolution (Figure 1F) is the replication of the Fourier transform of the original signal at intervals of 1/Δt. This function (Figure 1F) is exactly equivalent to the sampled set of data in the time domain (Figure 1C) as they are directly related through Fourier transformation.

Consider now how the original signal can be recovered from the sampled set of data. Since the sampled set of data (Figure 1C) and its Fourier transform (Figure 1F) are exactly equivalent, the recovery operation can start from either function. Recovery is quite simple and is depicted in Figure 2. The Fourier transform of the sampled set of data (Figure 2A) is simply multiplied by a box-like truncation function (Figure 2B) that abruptly truncates at the point 1/(2Δt). This generates a function (Figure 2C) that should be exactly like the Fourier transform of the original signal. Thus the original signal (Figure 1A) can be easily obtained by inverse Fourier transformation of the function in Figure 2C to yield the function shown in Figure 2F. In the time domain, the recovery operation is a convolution of the digitized signal (Figure 2D) with the function sin(πt/Δt)/(πt/Δt) (Figure 2E). After all these operations, the signals in Figures 1A and 2F should be superimposable.

Figure 2. Schematic representation of recovery of a signal from samples

Figure 3. Error due to undersampling depicted in the frequency domain

However, it is apparent from Figures 1 and 2 that, if the sampling interval, Δt, is too large, the recovery will be inaccurate. Convolution entails the addition of the ordinates of the replicates in Figure 1F. If the Fourier transform of the original signal is non-zero at frequencies greater than 1/(2Δt), the replicates will overlap and add as in Figure 3A. Replicates are shown as dashed lines in Figure 3A. The regeneration operation (top part of Figure 2) applied to Figure 3A results in a function (Figure 3B) that cannot accurately be transformed back into the original signal. Therefore, only signals having Fourier transforms with zero density at frequencies greater than 1/(2Δt) can be sampled without error at a sampling interval Δt. This is the Nyquist sampling theorem; the critical frequency, 1/(2Δt), is called the Nyquist sampling frequency.
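In present-day practice this recovery operation is easy to try numerically. The following is a minimal sketch, assuming a unit-height, unit-FWHM Gaussian test peak sampled well below the Nyquist limit; all function names are illustrative, and np.sinc(x) computes sin(πx)/(πx):

```python
import numpy as np

def sinc_reconstruct(samples, t_k, t):
    """Recover a signal from its samples by convolving the sampled values
    with sin(pi*t/dt)/(pi*t/dt), evaluated on a fine time grid t."""
    dt = t_k[1] - t_k[0]
    return sum(s * np.sinc((t - tk) / dt) for s, tk in zip(samples, t_k))

peak = lambda t: np.exp(-4 * np.log(2) * t**2)   # unit height, unit FWHM
dt = 0.2                                          # sampling interval, sec
t_k = np.arange(-5, 5 + dt / 2, dt)               # sampling instants
t = np.linspace(-3, 3, 3001)                      # fine grid for comparison
err = sinc_reconstruct(peak(t_k), t_k, t) - peak(t)
print(f"maximum |recovered - original| = {np.max(np.abs(err)):.2e}")
```

At this sampling interval the Gaussian's transform is already negligible at the Nyquist frequency, so the recovery error is very small; widening dt makes the error grow in the manner analyzed below.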
Figure 4. Sampling error curves (B, C, D) for a typical peak-like signal (A). Sampling intervals are 0.075, 0.093, 0.32, and 0.69 second for A, B, C, and D

From the Nyquist sampling theorem then, one can be certain that sampled data can accurately represent an analog signal provided the sampling interval, Δt, is made small enough so that the Fourier transform of the signal is zero above the Nyquist frequency, 1/(2Δt). However, the Fourier transform of many common signals, a Gaussian peak for example, approaches zero only at impractically high frequencies. The rigorous application of the above sampling rule results in an impractical sampling interval. It is useful, therefore, to consider the type and extent of errors that can result when the above rule is relaxed and the signal is "undersampled."

It is important to realize that the regeneration method arising from consideration of the Fourier transform, a convolution with sin(πt/Δt)/(πt/Δt), is not the only method of regeneration. Interpolation by means of polynomials, cubic or spline functions in particular, has received much attention in the field of computing science. Morrey (4) used a quartic equation to find the maximum of spectrographic peaks. Although the Nyquist condition applies whatever the regeneration method, digitization errors due to undersampling are bound to be different for different regeneration methods. Nevertheless, since the difference between errors produced by reasonable methods is unlikely to be great, only errors produced by convolution with sin(πt/Δt)/(πt/Δt) will be considered in this paper.

An example of the errors that can result from undersampling is shown in Figure 4. Curve A is a signal regenerated from samples obtained by scanning the Mg 2850 Å line emitted by a hollow cathode lamp with a Heath Model EU-700 spectrophotometer. The scale has been adjusted so that the peak has unit height (1 volt) and unit width at half height (1 sec) and is sampled at intervals corresponding to 0.075 sec. The same curve was regenerated using larger sampling intervals and an error curve was obtained by taking the difference between the regenerated curve and the "original" curve (regenerated from the most closely spaced samples). Error curves B, C, and D correspond to sampling intervals of 0.093, 0.32, and 0.69 sec.

The difference between the Fourier transforms of the regenerated signal (Figure 3B, solid line) and the original signal (Figure 3B, dashed line), defined here as the digitization error due to undersampling, E(f), is depicted in Figure 3C. The inverse Fourier transform of E(f) will give the time domain version of the error, e(t), which would be analogous to the error curves in Figure 4. Two quantitative indications of the magnitude of e(t) can be calculated from a knowledge of E(f). It can be shown (3) that the maximum absolute difference between the original signal and that regenerated from the sampled signal is less than or equal to the area of the absolute value of E(f):

Maximum |e(t)| ≤ e_MAE   (1)

where

e_MAE = 2 ∫_0^∞ |E(f)| df   (2)

and e_MAE stands for maximum absolute error. A second useful error indicator is the integral of squared differences between the original signal and the regenerated signal. This error indicator can also be calculated from the error curve, E(f), by use of the power theorem of Fourier transforms (3):

∫_−∞^∞ e(t)² dt = ∫_−∞^∞ |E(f)|² df   (3)

To relate the magnitude of the integral-square error to the magnitude of the maximum absolute error, it is convenient to multiply Equation 3 by 2/Δt and take the square root:

e_RIS = [(2/Δt) ∫_−∞^∞ |E(f)|² df]^(1/2)   (4)

where e_RIS stands for root-integral-square error. Papoulis (5) has shown that e_RIS, defined by Equation 4, like e_MAE, is an upper bound of e(t). Therefore, the error parameters e_MAE and e_RIS are both measures of the accuracy of the sampling process. Values of the error indicators e_MAE (Equation 2) and e_RIS (Equation 4) arising from undersampling Curve A of Figure 4 were calculated and are plotted in Figure 5. All values in Figure 5 are expressed as a percentage of the peak maximum. The upper solid line is e_MAE and the dashed line is e_RIS.

Figure 5. Sampling error indicators for typical signal (sampling interval axis in seconds)

(4) J. R. Morrey, Anal. Chem., 40, 905 (1968).
(5) A. Papoulis, Proc. IEEE, 54, 947 (1966).
Table I. Common Forms of Peaks

Function      Form                       Fourier Transform
Triangle      1 − |t|, |t| ≤ 1           [sin(πf)/(πf)]²
Exponential   exp[−2(ln 2)|t|]           {ln 2[1 + (πf/ln 2)²]}⁻¹
Lorentz       (1 + 4t²)⁻¹                (π/2) exp(−π|f|)
Gaussian      exp[−4(ln 2)t²]            [π/(4 ln 2)]^(1/2) exp[−π²f²/(4 ln 2)]
Table II. Maximum Sampling Interval Needed for a Given Accuracy

Maximum error,               Sampling interval, sec
(% of peak height)   Triangle    Exponential   Lorentz   Gaussian
10                   0.18        0.17          0.54      0.65
1                    0.025       0.020         0.28      0.46
0.1                  0.0028      0.0022        0.21      0.38
0.01                 0.00032     0.00026       0.16      0.33
0.001                ...         ...           0.12      0.30
The values of sampling interval giving Curves B, C, and D in Figure 4 are indicated along the sampling interval axis of Figure 5. The observed maximum absolute value of the difference between the regenerated curves and the original curve is also plotted in Figure 5 (lower solid line). Note that this observed maximum error is always less than either e_MAE or e_RIS, confirming that these are indeed upper limits. The data in Figure 5 indicate that a sampling interval of less than 0.6 sec would ensure a maximum sampling error (as defined by Equation 2) of less than 10% for a curve similar to Curve A in Figure 4. For a maximum sampling error of less than 1%, a sampling interval of 0.06 sec (17 points per width at half height) would be required.

It is important to emphasize at this point that the seriousness of this type of digitization error can depend on the type of information to be extracted from the data. It must be remembered that the error indicators discussed above (e_MAE, e_RIS, and the observed maximum) are just that: error indicators. They were developed under the constraint that the accuracy of digitization would be determined by the accuracy with which the digital data could be converted back to an analog signal that could be exactly superimposed on the original signal. As such they indicate how much the regenerated peak could deviate from the original peak and they are expressed as a percentage of the peak height. However, this does not necessarily mean that these are the percentage errors to be expected in measuring peak parameters such as position, half-width, or area. The percentage error, for example, to be expected in determining the area under the peak in Figure 4 for each sampling interval is certainly much less than the percentage errors stated above.

In order to facilitate the choice of a sampling interval for a particular signal, the magnitudes of the above sampling error indicators (e_MAE and e_RIS) have been calculated for commonly occurring triangular, exponential, Lorentzian (Cauchy), and Gaussian peak shapes. The mathematical forms of these curves in both the time and frequency domains are given in Table I. The heights (in volts) and widths at half height (in seconds) are normalized to unity. The sampling errors calculated by use of Equations 2 and 4 are plotted in Figure 6 as a function of the sampling interval. As in Figure 5, the upper solid lines are e_MAE and the dashed lines are e_RIS.
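Equations 2 and 4 can be evaluated numerically for any of the peak shapes of Table I. The sketch below does this for the Gaussian, under the assumption that |E(f)| can be approximated by the replicate tails folded into the Nyquist band plus the tail lost outside it; the function name and grid parameters are illustrative, and the printed values may differ somewhat depending on the phase convention assumed for the replicates:

```python
import numpy as np

a = 4 * np.log(2)
Y = lambda f: np.sqrt(np.pi / a) * np.exp(-np.pi**2 * f**2 / a)  # Gaussian FT (Table I)

def error_indicators(dt, n_rep=20, fmax=60.0, nf=60001):
    """e_MAE (Eq 2) and e_RIS (Eq 4) for sampling interval dt, taking |E(f)|
    as the aliased replicates inside the band plus the tail lost outside it."""
    f = np.linspace(0, fmax, nf)
    f_nyq = 1 / (2 * dt)
    E = np.where(f < f_nyq, 0.0, Y(f))            # tail removed by the box recovery
    for k in range(1, n_rep + 1):                  # replicates folded into the band
        E += np.where(f < f_nyq, Y(f - k / dt) + Y(f + k / dt), 0.0)
    e_mae = 2 * np.trapz(E, f)                     # Equation 2
    e_ris = np.sqrt((2 / dt) * 2 * np.trapz(E**2, f))  # Equation 4
    return e_mae, e_ris

for dt in (0.65, 0.46, 0.38, 0.33):                # Gaussian column of Table II
    e_mae, e_ris = error_indicators(dt)
    print(f"dt = {dt:.2f} sec: e_MAE = {100*e_mae:.3f}%, e_RIS = {100*e_ris:.3f}%")
```

The four intervals chosen are the Gaussian entries of Table II, so the printed errors should land near 10, 1, 0.1, and 0.01% of the peak maximum.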
Figure 6. Errors due to undersampling of common peaks, expressed as a percentage of the peak maximum (logarithmic error scale; sampling interval axis in seconds)
The sampling intervals needed to ensure a maximum error (e_MAE, expressed as a percentage of peak height) of less than a specified value are given in Table II. If a signal contains several peaks, the sampling interval required for the narrowest peak generally must be chosen. Also, if the experimental peak is a combination of the above peaks, the sampling interval needed for the required accuracy will be between those for the separate types.

The above sampling error indicators have been calculated with the assumption that a sample has been taken exactly at the peak maximum. However, if the sampling function consisting of the set of delta functions in Figure 1B is shifted by a fraction of a sampling interval, the samples, and consequently the sampling errors, will be different. It is convenient here to introduce the symbol φ to represent shift in the sampling function as a fraction of a sampling interval, φ = 0 giving a sample at the peak maximum. If the sampling interval, Δt, is chosen so that the amplitude of the Fourier transform of the signal is relatively negligible at 1/(2Δt), then the amplitude should certainly be negligible at twice this frequency. Assuming this, and that the signal is symmetrical about its maximum as is true for the functions listed in Table I, it can be shown that for φ = 0

Maximum |e(t)| ≤ e_RIS/2   (5)

and that for φ = 1/2

Maximum |e(t)| ≤ e_RIS   (6)

where e_RIS is defined by Equation 4. Equations 5 and 6 are applications of an inequality given by Papoulis (5). Although the values of the maximum error indicators, e_MAE and e_RIS, are independent of φ, the value of the actual error, e(t), is a minimum for all values of t when φ = 0. When φ = 1/2 (sampled points spaced equally on each side of the peak maximum), it is easily shown that the equality in Equation 1 holds. That is,

Maximum |e(t)| = e_MAE   for φ = 1/2   (7)

Also for φ = 1/2, the maximum error occurs at the maximum of the peak. When φ = 0, the maximum error occurs down along the side of the peak. The effect of shifting the origin of the sampling function on the maximum absolute sampling error for moderate undersampling was simulated on a computer for the functions listed in Table I. For φ = 0, the actual maximum absolute sampling error was approximately equal to ½e_MAE and for φ = 1/2, it was approximately equal to e_MAE. At other values of φ, the error was between these two limits. That shift can cause the magnitude of the maximum error to vary by a factor of two is suggested also by Equations 5 and 6. Thus, shifting the origin of the sampling function with respect to the maximum of a symmetrical peak can cause the maximum absolute sampling error to vary by a factor of approximately two. The pairs of solid lines in Figure 6 indicate the range of maximum absolute sampling errors that can be expected from any value of shift. The upper solid line in Figure 6 is e_MAE and the lower solid line is ½e_MAE. For asymmetric peaks, it can be expected that the range of maximum errors will be less than a factor of two and that the extrema will occur at shifts other than 0 and 1/2.

It was mentioned in the introduction that the sampling criteria developed here would be slightly conservative because a minimum of prior information about the shape of the signal is assumed. Kishimoto and Musha (6) have shown that, if only the height of a Gaussian peak is needed, the sampling interval must be less than 0.15 of the width at half height for an error of 0.1%. However, by use of the method described here, the required sampling interval is between 0.36 and 0.42 of the width at half height. The apparent contradiction arises because in Reference 6 the peak height is calculated by fitting a quadratic curve to only three samples nearest the maximum while we use all the samples. Their use of fewer samples necessitates the smaller sampling interval.
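The computer simulation described above is easy to reproduce. A minimal sketch, assuming the unit-FWHM Gaussian of Table I and sinc regeneration (the function name and grid span are illustrative); the factor of about two between φ = 0 and φ = 1/2 should be visible in the output:

```python
import numpy as np

def max_sampling_error(phi, dt, span=8.0):
    """Maximum |e(t)| when the sample grid is offset from the peak maximum
    by phi*dt; regeneration by convolution with sin(pi*t/dt)/(pi*t/dt)."""
    peak = lambda t: np.exp(-4 * np.log(2) * t**2)
    t_k = np.arange(-span, span, dt) + phi * dt      # shifted sampling instants
    t = np.linspace(-2, 2, 4001)
    regen = sum(peak(tk) * np.sinc((t - tk) / dt) for tk in t_k)
    return np.max(np.abs(regen - peak(t)))

dt = 0.5                                             # moderate undersampling
for phi in (0.0, 0.25, 0.5):
    print(f"phi = {phi:.2f}: max |e(t)| = {max_sampling_error(phi, dt):.4f}")
```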
SAMPLING DURATION

Figure 7. Effect of truncation on recovery of a signal from samples (schematic). A: the set of samples and the peak envelope between −T/2 and +T/2; B: expansion of A near the right truncation point

Figure 8. Comparative shapes of common peaks normalized to unit height and unit width at half height (log-log scales; time axis in peak-width units)
In the above discussion of sampling, it was assumed that the signal was sampled from the infinite past to the infinite future. The effect of taking samples over a finite period of time will now be examined. The obvious effect is that the information in the parts of a signal not sampled will be lost. Sampling, therefore, should be continued until the information loss is insignificant. Such considerations are commonly trivial. Normally, one qualitatively establishes where the significant portion of a signal begins and ends, and then takes a few more samples to be sure. A more quantitative indication of where to begin and end sampling may be useful for some systems. It may be difficult to locate the points beyond which the signal is no longer significant in systems giving long trailing signals or oscillatory signals such as interferograms or free-induction decays as measured in Fourier transform spectrometry.

The effect of taking only a finite number of samples of a peak-like signal and then attempting to reconstruct the signal from this finite set is shown in Figure 7. The signal is sampled at a correct sampling interval but only from −T/2 to +T/2. The signal is assumed to be zero beyond these points. The set of samples and the envelope of the peak are shown in Figure 7A. Figure 7B is an expansion of Figure 7A near the right truncation point. The dotted line is the original signal, the solid line is the truncated signal, and the dashed line is the regenerated signal obtained by convolving the samples with sin(πt/Δt)/(πt/Δt). It has been assumed that the signal is nearly constant in the vicinity of the truncation point. Two properties of the error due to truncation are evident from Figure 7B. (a) Without any prior information about the signal, the truncation operation will be reversible only if the signal is zero outside the sampling range. (b) The maximum error occurs near the truncation point and decreases in both directions away from the truncation point. Also, the maximum absolute error between the original signal and the reconstructed signal is 1.032 times the value of the signal at the truncation point and is independent of the sampling interval. In general, truncation error due to a finite sampling duration is localized to the vicinity of the truncation in contrast to the more widespread errors caused by undersampling.

The value of the signal at the truncation points should be a good approximation of the maximum error due to truncation for the peaks listed in Table I. An idea of where these points occur can be obtained from Figure 8, which is a log-log plot of the positive time portion of some of the peaks listed in Table I. A maximum truncation error of less than 0.1%, for example, would require sampling to be continued 1.6, 5, and 16 peak widths at half height on each side of the peak maximum for Gaussian, exponential, and Lorentzian peaks. Error bounds for more general signals that are asymmetric or do not approach a relatively constant value near the truncation point are given by Papoulis (5). These bounds depend on sums of the samples of the signal outside the sampling range. They are time domain analogs of the bounds obtained in the frequency domain for errors due to undersampling.

The importance of the effects of truncation depends on the final use to be made of the digitized signal. For example, in Fourier transform spectrometry, the Fourier transform of the signal is the final desired form. In this case, truncation of the original signal (interferogram) can strongly affect the line shape and resolution of the final optical spectrum (7).

The criteria for minimal undersampling and truncation errors for peak-like signals can be combined to calculate the total number of samples required for a given accuracy by reasoning as follows. For peak-like signals, the maximum errors due to undersampling occur near the maximum of the peak, whereas the maximum errors due to truncation occur near the truncation points. The two different types of errors occur at different times in the signal and are practically independent. With the viewpoint that all parts of a signal are equally important, it is reasonable to set the same limits for truncation errors as for sampling errors. Therefore, the least number of samples required for a given overall maximum error can be calculated from the information presented in Tables I and II. These are reported in Table III. The numbers in Table III apply only where it is desired to regenerate the entire signal within a certain error bound. Generally, however, large errors in the tail of a peak, where truncation error occurs, are more acceptable than large errors near the maximum of a peak, where undersampling error is concentrated. The data in Tables II and III can be used to calculate the number of samples needed for various combinations of error. For example, suppose that it was desirable to sample a Lorentzian peak to 0.01% but that a 0.1% truncation error was acceptable. The required number of samples is 150 × 0.21/0.16 or about 200, compared to the 630 needed for 0.01% in both undersampling and truncation errors. If only the height or position of a peak is needed, the numbers in Tables II and III are probably too large. On the other hand, if the signal is corrupted by highly correlated noise, these numbers may be too small. Kelly and Harris (8) describe a method for calculating limits based on the amount of information lost by undersampling and truncation. In addition to accounting for the effect of correlated noise, their method includes the effect of the type of information to be gained from the signal.

(6) K. Kishimoto and S. Musha, J. Chromatogr. Sci., 9, 608 (1971).
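The arithmetic behind the table below (Table III) and the combined-error example can be sketched numerically; the sketch assumes the unit-FWHM Lorentzian of Table I, with the sampling intervals taken from Table II, and the function name is illustrative:

```python
import numpy as np

def lorentz_sample_count(trunc_err, dt):
    """Least number of samples of a unit-FWHM Lorentzian peak: truncate
    where (1 + 4t^2)^-1 falls to trunc_err, then divide the duration by dt."""
    t_trunc = 0.5 * np.sqrt(1.0 / trunc_err - 1.0)   # solve (1+4t^2)^-1 = err
    return int(np.ceil(2.0 * t_trunc / dt)) + 1

print(lorentz_sample_count(1e-4, 0.16))  # 0.01% in both errors: ~630 (Table III)
print(lorentz_sample_count(1e-3, 0.16))  # 0.1% truncation, 0.01% sampling: ~200
```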
Table III. Minimum Number of Samples Required for a Given Accuracy

Maximum error,
(% of peak height)   Triangle   Exponential   Lorentz   Gaussian
10                   6          20            6         3
1                    40         330           36        6
0.1                  360        4500          150       9
0.01                 3200       51000         630       11
0.001                ...        ...           2600      14

QUANTIZATION

Figure 9. Quantization as an operation on the probability density function of a signal

The second basic step of digitization is quantization. The two main variables in quantization are the quantization level, q, and the dynamic range, R.
In quantization the y-axis is broken into quantization levels of width q volts. For any input voltage within the dynamic range of a digitizer, the output will be a representation of the input voltage to the nearest integral multiple of q. The error in each sample due to quantization can be expressed as

e_q(nΔt) = y_q(nΔt) − y(nΔt)   (8)
where e_q(nΔt) is the error due to digitization at the sample point nΔt, y_q is the digitized signal, and y is the original signal. If the digitizer rounds off to the nearest quantization level, |e_q| will be less than or equal to ½q volts. If it rounds off to the next lower quantization level, e_q will be greater than or equal to zero but less than q volts. Since the effects of nearest or next-lower round-off are essentially the same, only round-off to the nearest quantization level will be considered here.

It is important to note that the errors introduced into the digitized signal by quantization depend upon the relative intensity of the noise in the original signal with respect to the size of the quantization interval. If, for example, the signal is a constant, noise-free, dc level, the quantization error will be determinate and the quantized value of this signal could be in error by as much as ½q volts. Fortunately, but in a sense somewhat ironically, real signals always have a random or noise component that can prevent the occurrence of the determinate error mentioned above. If the intensity of the noise is great enough relative to the size of the quantization interval, noise can prevent sequences of determinate error contributions of the same sign and consequently greatly reduce the overall effects of quantization errors. The study of quantization errors, therefore, reduces to the study of the effects of signal noise on quantization errors. The presence of noise allows an accurate determination of a constant voltage to less than ½q. It will be shown below that the usual effect of quantization is the addition of a small amount of noise to a signal.

The essential properties of noise are best described by a probability density function (pdf) from which the probability that a signal voltage lies within a given range can be obtained. A pdf may be thought of as arising from a graph of the relative number of times that a signal voltage lies within specified ranges. A pdf obtained from a signal such as that in Figure 9A is illustrated in Figure 9B. Quantization may be viewed as an operation on the pdf of noise (9). The operation consists of sampling with delta functions at intervals of q, but because of round-off, the delta functions will not give the value of the pdf at the sampling point but the area of the pdf within ±½q (or within 0 to q) of the sampling point. Figure 9C illustrates the result.

(7) G. Horlick, Anal. Chem., 43 (8), 61A (1971).
(8) P. C. Kelly and W. E. Harris, ibid., 43, 1170 (1971).
(9) B. Widrow, Trans. Amer. Inst. Elect. Engrs., 79, pt. II, 555 (1960).
The quantization operation may be described then as a convolution with a box function of width q followed by multiplication with a sampling function consisting of a series of delta functions spaced at intervals of q. As in sampling, we now attempt to make a reconstruction. Unlike sampling, however, since random noise is being considered, it is necessary to reconstruct only the pdf of the noise rather than the signal itself. We again make the reconstruction with cosinusoids. The quantization equivalent of the Nyquist sampling theorem may be stated as follows. The pdf of noise may be completely recovered without the help of any prior information provided the Fourier transform of the pdf of the noise convolved with a box function of width q is zero outside the frequency range ±1/(2q). The Fourier transform here is carried out along the y-axis so the frequency is actually a spatial or voltage frequency expressed in units of reciprocal volts. A pdf can be characterized by its first few moments well enough for most practical purposes. Moments may be calculated from derivatives of the Fourier transform of the pdf evaluated at zero frequency. Therefore, the Nyquist criterion need only be half-satisfied to recover moments. That is, the Fourier transform of the pdf must be zero outside the range ±1/q.

In many experimental situations noise has a normal or Gaussian pdf. As shown above in the section on sampling, however, a Gaussian curve cannot satisfy the Nyquist criterion because its Fourier transform approaches zero only at infinity. Nevertheless, moments of noise can be calculated from quantized data with prior information that the noise is normal by essentially inverting the process of quantization. If it is assumed also that the noise is stationary so that the moments of the noise do not change in time, the pdf of the noise will be completely characterized by its mean and its autocovariance function (10). The mean may be assumed to be zero with little loss of generality. An autocovariance function, expressed as c_y(τ) in units of volts squared, gives the correlation or degree of dependence between points separated by τ seconds. The separation, τ, is often called lag. The autocovariance at zero lag is the variance, σ², of the noise.

It can be shown that the mean of quantized normal and stationary noise is zero. The mean of non-normal noise may not be zero. If the quantizer rounds off to the next lower quantization level, the mean for normal noise will be −q/2. The autocovariance function of quantized normal and stationary noise is given by

c_q(τ) = c_y(τ) + 2ρ(τ)σ² Σ′_n (−1)^n exp[−2(πnσ/q)²] − (q²/4π²) Σ′_n Σ′_m [(−1)^(n+m)/(nm)] exp{−2(πσ/q)²[n² + 2ρ(τ)nm + m²]}   (9)
where the sums run over all integers and the prime on Σ′ means omission of the value at n = 0 (and likewise m = 0). In Equation 9, ρ(τ), called the autocorrelation function, is a normalized form of the autocovariance defined by

ρ(τ) = c_y(τ)/σ²   (10)
The first term on the right side of Equation 9 is the autocovariance function of the signal noise. The second and third may be considered as noise introduced by quantization. Two important facts are apparent from Equation 9. One is that quantization noise adds to signal noise. The other fact is that the relative intensity of quantization noise depends primarily on the ratio of the standard deviation of the signal noise to the quantization level. Using only the most significant terms in Equation 9, the following approximate formulas for the autocovariance function of quantization noise, c_Q(τ), can be obtained for given values of the autocorrelation function of the noise, ρ(τ):

c_Q(τ) ≈ q²/12 − (4σ² + q²/π²)e^(−w),   ρ(τ) = 1
c_Q(τ) ≈ −2σ²e^(−w) + (q²/2π²)(e^(−w) − e^(−3w)),   ρ(τ) = 1/2
c_Q(τ) ≈ 0,   ρ(τ) = 0
c_Q(τ) ≈ 2σ²e^(−w) − (q²/2π²)(e^(−w) − e^(−3w)),   ρ(τ) = −1/2
c_Q(τ) ≈ −q²/12 + (4σ² + q²/π²)e^(−w),   ρ(τ) = −1   (11)

where w = 2(πσ/q)². It can be shown that the derivative of c_Q(τ) with respect to ρ(τ) is equal to ±∞ when ρ(τ) is equal to ±1. Therefore, the noise-independent parts of quantization noise (±q²/12 in Equation 11) occur only when the autocorrelation function is identically equal to ±1. In most practical situations ρ(τ) is identically equal to 1 when lag is zero and is seldom equal to ±1 when lag is any other value. Therefore, c_Q(τ) has a component at zero lag that is in effect a delta function. The power-density spectrum of quantization noise (the Fourier transform of c_Q(τ) with respect to τ), consequently, has a white or flat component that extends to infinite frequency. When the data are sampled, however, only the portion between the negative and positive Nyquist sampling frequencies is observed. (There is no overlap or aliasing error because the autocovariance function, being a delta function, is not affected by sampling.) Moreover, unless the sampling interval is so small that adjacent samples are highly correlated [ρ(τ) ≈ 1], the effect of quantization will be independent of the sampling interval.

Apart from the noise-independent terms (±q²/12), the terms in Equation 11 do not vary rapidly with small changes in the value of ρ(τ); hence the equation can be used to sketch in the approximate autocovariance function of the quantization noise when the autocovariance function of the signal noise is known. When the standard deviation of the signal noise is large enough, the terms other than q²/12 can be neglected. If the standard deviation of the signal noise is one-half of the quantization interval, the largest of these terms is about 9% of the q²/12 term. Therefore, under most experimental conditions the effect of quantization is to add white or uncorrelated noise to the signal noise. The variance of quantization noise is q²/12 (volt²), where q is the quantization interval in volts.
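The q²/12 result and its near-whiteness are easy to confirm by simulation. The following is a minimal sketch, assuming normal noise with σ = q/2 and round-off to the nearest level:

```python
import numpy as np

rng = np.random.default_rng(0)
q, sigma = 1.0, 0.5                        # quantization level; noise std deviation
y = sigma * rng.standard_normal(1_000_000)
e_q = q * np.round(y / q) - y              # Equation 8 for each sample
# The variance should sit near q^2/12, with a residual of a few per cent at
# this sigma/q ratio, as the e^-w terms of Equation 11 predict.
print(f"var(e_q) = {e_q.var():.5f},  q^2/12 = {q**2 / 12:.5f}")
# Adjacent-sample correlation of the quantization noise is essentially zero.
print(f"lag-1 correlation = {np.corrcoef(e_q[:-1], e_q[1:])[0, 1]:+.5f}")
```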
Dynamic Range. The dynamic range of a quantizer is simply the range of signal amplitude that can be accurately quantized. It is convenient to describe the dynamic range, R, in terms of the number of quantization levels, R/q. The number of quantization levels is often expressed as a number of bits, B, where

B = log₂(R/q)   (12)

because the cost of a quantizer is closely related to the number of bits. The effect of a limited range is, of course, trivial; the bottom or top of a signal will simply be cut off. But in considering the effect of a limited range there are two factors which are
often overlooked. One is that if the top of a peak-like signal is cut off, there may be enough information in the observed part to enable an adequate reconstruction of the whole peak (11).

(10) C. W. Helstrom, "Statistical Theory of Signal Detection," Pergamon Press, London, 1968.
(11) B. A. Rudenko, J. Chromatogr. Sci., 10, 230 (1972).
The other factor arises from the fact that, as seen in the last section, the effects of quantization depend on the intensity of the signal noise relative to the size of the quantization interval. If the standard deviation of signal noise increases with signal level, a quantization level adequate for low level signals may be unnecessarily fine for higher level signals. For example, consider a system such as the output of a photomultiplier tube where the noise increases as the square root of the signal level. Suppose that a dynamic range of 0 to 1000 mV is required and that the minimum standard deviation of the noise is 0.5 mV (minimum signal of 0.25 mV). A reasonable quantization level is 1 mV (twice the standard deviation of the noise). This means that a 10-bit quantizer is required. However, if the square root of the signal level is digitized, the range is reduced to 0 to 31.6 mV^(1/2). The minimum standard deviation of the square root of the signal is 0.5 mV^(1/2) and thus only a 5-bit digitizer is needed for the square-root signal compared to 10 bits needed for the original signal. Thus if a constant quantization level with respect to minimum standard deviation is acceptable, 5 bits may be saved by a square-root transformation of the signal. Also, if desired or necessary, the 5 bits saved in resolution can be traded off for an increase in the sampling rate as the digitization time of most analog-to-digital converters is directly related to the number of bits.

DIGITIZATION TIME
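The bit counts in this example follow directly from Equation 12. A sketch of the arithmetic, assuming, as above, noise growing as the square root of the signal and a quantization level of twice the minimum standard deviation:

```python
import numpy as np

full_scale = 1000.0                    # dynamic range, mV
sigma_min = 0.5                        # minimum noise std deviation, mV
q = 2 * sigma_min                      # quantization level: twice sigma_min

bits_direct = int(np.ceil(np.log2(full_scale / q)))        # Equation 12
# After a square-root transform the range shrinks to sqrt(1000) ~ 31.6 mV^(1/2)
# while the minimum std deviation of sqrt(signal) remains about 0.5 mV^(1/2),
# since d(sqrt(s)) = ds / (2 sqrt(s)) = sqrt(s) / (2 sqrt(s)) = 0.5 here.
bits_sqrt = int(np.ceil(np.log2(np.sqrt(full_scale) / q)))
print(f"direct: {bits_direct} bits,  square-root signal: {bits_sqrt} bits")
```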
In the discussion of digitization up to this point, it has been assumed that a digitizer can sample and quantize the signal instantaneously. Real digitizers cannot do so, and in this section the effects of a finite digitization time on the resulting digitized signal are discussed. It is useful at this point to have a qualitative feeling for the operation of several modern digitizers. The operation and properties of common digitizers are discussed by Malmstadt and Enke (12). Hoeschele (1) and Schmid (2) also discuss the accuracy and the design of digitizers.

Digitization time is the time from the start of sampling to the appearance of the digitized signal at display or output terminals of the digitizer. Digitization time determines the minimum sampling interval of a digitizer. Conceptually, digitization time can be broken down into two parts: a sampling or aperture time, which is the time during which the signal is "observed" as explained below, and a quantization time, which is the actual time required to quantize a sample. In a digitizer consisting of a sample-and-hold amplifier followed by a successive approximation analog-to-digital (A-to-D) converter, these two times are distinctly separate. However, in many digitizers they are not. For example, in a digitizer utilizing a voltage-to-frequency (V-to-F) converter, the aperture and quantization times can be considered to be equal. With a dual slope integrating A-to-D converter, the aperture time is constant but the quantization time depends on the magnitude of the signal. For a successive approximation A-to-D converter, which compares the input voltage with a reference voltage obtained by successively dividing the range by two, the aperture time is difficult to define but the quantization time is constant.

(12) H. V. Malmstadt and C. G. Enke, "Digital Electronics for Scientists," W. A. Benjamin, New York, N.Y., 1969.
The duration of the aperture time of a digitizer can markedly affect the accuracy of a digitized signal. Thus it is important to consider the effects of a finite aperture time on the digitization process. The sampling operation for an ideal digitizer was assumed to be equivalent to multiplying the signal by a sampling function consisting of evenly spaced, infinitely narrow, delta functions. However, in order to take a sample of a continuous signal, a real digitizer must "observe" the signal for a finite period of time. This real sampling function can be called an aperture function and its width the aperture time. The effect of an aperture function on a signal is analogous to the effect of a slit function on a spectrum. As the aperture time increases, the signal as viewed by the digitizer is smoothed or averaged over the aperture time. More generally, sampling with a finite aperture time may be represented by convolving the input signal with the aperture function and then sampling the smoothed signal with an ideal digitizer. Signal distortion caused by the smoothing effect of an aperture function can be decreased by decreasing the aperture time. Some degree of smoothing may be desirable, though, if the signal contains undesirable high frequency components such as noise. Smoothing effectively attenuates high frequency components that would otherwise cause aliasing errors. Digitizing with a voltage-to-frequency or any integrating type A-to-D converter with an aperture time of 0.5 second, for example, will attenuate frequency components near 60 Hz by a factor of 100. If the aperture function is known and the aperture time is constant, distortion caused by the use of a wide aperture function may be corrected by deconvolution (13).

One of the most versatile digitizing systems is a fast sample-and-hold amplifier followed by a successive approximation A-to-D converter. This combination provides a short aperture time that minimizes signal distortion and a short quantization time enabling relatively rapid sampling. The smallest aperture time at the present state of the art in sample-and-hold amplifiers is about 100 nsec, and many successive approximation A-to-D converters can make a 10-bit conversion in about 10 μsec although faster models are available. Thus a 10-μsec sampling interval is easily obtained with such a system. In addition, this digitizer can be used for much longer sampling intervals. The effective aperture time can be set by appropriate electronic filters before the digitizer or a specific number of fast acquisitions can be integrated to define the aperture time.

The aperture time and the effect of aperture time of a successive approximation A-to-D converter used without a sample-and-hold input amplifier is difficult to define. A typical 10-bit digitizer might take 1 μsec per bit for a total digitization time of 10 μsec. However, the result is not an average value of the input signal over 10 μsec. If the input signal varies over more than one quantization interval during digitization, errors much larger than those due to quantization may result. In the worst case, the error can be as great as the difference between the value of the signal at the start of digitization and its value at the end. For example, consider a 5-bit successive approximation digitizer with a range of 0 to 32 volts and an input signal increasing linearly at a rate of 4 volts per digitization time. Suppose digitization was begun when the input was 15 V.
The first approximation would be that the signal is less than 16 V. The final approximation would still consider the signal as less than 16 V. At that time, however, the signal would be 19 V and the digitized signal would be in error by 4 V.

(13) P. C. Kelly and W. E. Harris, Anal. Chem., 43, 1184 (1971).
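This worst case can be reproduced by simulating the successive approximation register bit by bit. A minimal sketch, assuming the one-comparison-per-time-step timing stated above (the function name is illustrative):

```python
def sar_digitize(signal, n_bits=5, full_scale=32.0):
    """Successive approximation with a moving input: one bit is decided per
    time step, each comparison seeing the signal as it is at that instant."""
    lsb = full_scale / 2**n_bits
    code = 0
    for step in range(n_bits):
        trial = code | (1 << (n_bits - 1 - step))   # propose the next bit
        if signal(step) >= trial * lsb:             # comparator decision
            code = trial
    return code * lsb

ramp = lambda step: 15.0 + 4.0 * step / 5.0   # 15 V, rising 4 V per digitization
out = sar_digitize(ramp)
print(f"digitized {out:.0f} V; input at end of conversion {ramp(5):.0f} V")
```

With these numbers the converter reports 15 V while the input has reached 19 V, reproducing the 4-V error described above.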
Figure 10. Schematic representation of the effect of jitter in sampling interval (j) and aperture time (α)
To ensure accurate digitization with a successive approximation A-to-D converter without a sample-and-hold amplifier, the ratio of sampling interval to digitization time must be large. For a sinusoidal signal, A sin(2πft), the maximum error with this digitizer is 2A sin(πft_d), where A is the amplitude and f is the frequency of the signal, and t_d is the digitization time. Thus for a signal with a Nyquist frequency of 1000 Hz, and hence sampled at 500-μsec intervals, the maximum error for the highest possible frequency component of the signal (1000 Hz) is 6% of its amplitude if a stand-alone successive approximation A-to-D is used that has a 10-μsec digitization time. This corresponds to a sampling interval to digitization time ratio of 50. This error will become progressively less for the lower frequency components that make up the signal.

Digitizers using voltage-to-frequency converters generally require relatively long aperture times in the range of 0.1 to 10 sec. The long aperture time is needed to measure the output frequency of the V-to-F converter. The output frequency is counted and the resulting digital value is an average of the input signal over the aperture time. Unless the signal is changing slowly, distortion due to averaging may be severe. Nevertheless, if large aperture times can be accepted, this digitizer has the advantage of a large dynamic range. If the V-to-F converter counts at the rate of 10⁵ Hz/V, a 10-volt input signal will give 10⁶ counts in 1 second, which is equivalent to about 20 bits. The same converter would give only 1000 counts or 10-bit resolution in 1 msec.

The above treatment of aperture and quantization time concerns the use of some common digitizers for the digitization of analog signals that are continuous in time. Several experimental systems of concern to analytical chemists are such that the parameter of interest is stepped in time. For example, some commercial monochromators use stepping-motor scanning systems. In a stepped system, the aperture time with respect to the parameter being stepped is zero. Digitizers with large aperture times can be used in stepped systems, therefore, without distorting the signal represented with respect to the stepped parameter.
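Both numbers quoted in the preceding paragraphs can be checked directly; the following is a sketch of the arithmetic only, with the stated parameter values:

```python
import numpy as np

# Stand-alone successive approximation converter on a sinusoid
A, f, t_d = 1.0, 1000.0, 10e-6       # amplitude (V), frequency (Hz), dig. time (s)
print(f"max error: {100 * 2 * A * np.sin(np.pi * f * t_d) / A:.1f}% of amplitude")

# V-to-F converter resolution vs. aperture time at 1e5 Hz/V, 10-volt input
for t_a in (1.0, 1e-3):
    counts = 1e5 * 10.0 * t_a
    print(f"aperture {t_a:g} s: {counts:.0f} counts ~ {np.log2(counts):.0f} bits")
```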
SAMPLING INTERVAL JITTER

A final practical problem associated with all types of digitizers and sampled data systems is sampling interval jitter. Sampling interval jitter is simply defined as random variation in the sampling interval (5). Normally in practice, one is forced to assume that the sampling interval is constant. Assuming a sample is taken at a precisely known time can give rise to what may be called jitter error when in fact the sample was taken at a point slightly displaced from the assumed time. If a signal is constant, jitter would have no effect. If the signal was not constant, however, an error would occur that is approximately equal to the slope of the signal times the displacement. The displacement time in units of seconds may be considered as the value of jitter at the assumed time, j(t). The standard deviation of jitter error would then be equal to the slope of the signal times the standard deviation of the jitter, σ_j. Usually σ_j is a small fraction of the sampling interval specified by the Nyquist sampling theorem. If a digitizer has a finite aperture time, however, jitter error may be proportional to the magnitude of the signal rather than the slope. It is worthwhile, therefore, to examine in detail the effect of jitter as well as the effect of random variations in aperture time.

The aperture function of most digitizers can be approximated well enough for most purposes by a box-like function. A sample at time t will then be the integral of the signal from time (t − t_a) to t, where t_a is the aperture time. This integration period is represented by line A in Figure 10. Subject to jitter in Δt, j(t), and jitter in t_a, α(t), a sample at time t will be the integral of the signal from time [t − t_a + j(t) + α(t)] to time [t + j(t)]. In Figure 10, the integration period affected by jitter is represented by line B. Provided j and α are small relative to t_a, the error due to jitter, e_j, will be

e_j(t) = {j(t)[y(t) − y(t − t_a)] − α(t)y(t − t_a)}/t_a   (13)

It is reasonable to assume that j and α have Gaussian distributions with zero means and standard deviations σ_j and σ_α. Assuming also that y, j, and α are independent of each other, jitter will result in noise added to the sampled signal. Under these conditions, the autocovariance function of jitter noise, c_j(τ), will be

c_j(τ) = (1/t_a²){σ_j²[y(t) − y(t − t_a)]² + σ_α²y²(t − t_a)}δ(τ)   (14)

where δ(τ) is a delta function. In Equation 14 it is assumed that j(t) and j(t + t_a) are uncorrelated so that the autocovariance function of j (and α as well) may be taken to be a delta function. If noise in a signal is correlated with jitter (through mechanical vibration for example), a more elaborate model than that used to obtain Equation 14 may be needed to adequately explain the effects of jitter.

A special case of correlation between j and α arises when the aperture time of an integrating digitizer is approximately equal to the sampling interval. Then jitter in aperture time is equal to the difference between jitter in sampling interval of adjacent samples. Assuming that t_a = Δt and the mean of the signal, ȳ, is relatively great (about ten times the standard deviation of the signal noise), the autocovariance function of jitter noise will be
c_j(τ) = (ȳσ_j/Δt)²[2δ(τ) − δ(τ + Δt) − δ(τ − Δt)]   (15)

The Fourier transform of Equation 15 gives the power density spectrum of jitter noise,

c_j(f) = 2(ȳσ_j/Δt)²[1 − cos(2πfΔt)]   (16)
Equation 16 provides a practical means for measuring the standard deviation of jitter, σ_j. If the aperture time and sampling interval are approximately equal, Equation 16 shows that the noise spectral density due to jitter increases from zero at zero frequency to a maximum at the Nyquist sampling frequency. Therefore, by digitizing a constant signal, σ_j² can be determined from the component of the power-density spectrum of the signal that both increases with the mean of the signal and varies with respect to frequency according to Equation 16 (13).
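The measurement recipe based on Equation 16 can be sketched in simulation. The assumptions here are an integrating digitizer with t_a = Δt, a constant input ȳ, and Gaussian jitter; the periodogram of the resulting sample errors should track the 1 − cos(2πfΔt) shape:

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt, ybar, sigma_j = 2**16, 1.0, 10.0, 0.01
j = sigma_j * rng.standard_normal(n + 1)       # jitter of each sampling instant
e = ybar * np.diff(j) / dt                     # sample errors when t_a = dt
psd = np.abs(np.fft.rfft(e))**2 / n            # periodogram of the jitter noise
f = np.fft.rfftfreq(n, dt)
model = 2 * (ybar * sigma_j / dt)**2 * (1 - np.cos(2 * np.pi * f * dt))
ratio = psd[1:].sum() / model[1:].sum()        # should be close to 1
print(f"measured/predicted jitter noise power: {ratio:.2f}")
```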
RECEIVED for review June 5, 1972. Accepted November 8, 1972. Financial support by the National Research Council of Canada and the University of Alberta is gratefully acknowledged.
Determination of Organic and Total Lead in the Atmosphere by Atomic Absorption Spectrometry

Larry J. Purdue, Richard E. Enrione, Richard J. Thompson, and Barbara A. Bonfield

Division of Atmospheric Surveillance, Environmental Protection Agency, National Environmental Research Center, Research Triangle Park, N.C.

Concentrations of total lead in the atmosphere are determined by passing air through a membrane filter for collecting particulate lead and then through an iodine monochloride solution for collecting material which contains lead, probably organic lead species. The lead collected on the membrane filter is determined directly by atomic absorption analysis of acid extracts of the filters. The lead collected in the iodine monochloride solution is determined by extraction with ammonium pyrrolidine-dithiocarbamate and methyl isobutyl ketone followed by analysis on the atomic absorption spectrophotometer. The method is applicable to lead analyses when the organic lead concentrations range from 0.1 to >4.0 μg of organic lead/m³ of air. An atmospheric sampling study for organic lead indicates that the average level of organic lead at these sites is about 0.2 μg/m³.
METHODS IN CURRENT USE for determining trace amounts of total lead in the atmosphere involve passing the atmospheric samples through a membrane or glass fiber filter for collecting particulate lead and then through a suitable absorbing reagent for collecting any lead compounds that pass through the filter. For the purposes of this paper, any lead compounds that pass through a membrane or glass fiber filter of 0.45-micron pore size are defined as organic lead. This includes volatile organic lead compounds such as tetraethyl and tetramethyl lead. Particulate lead collected by this procedure can be determined by direct analysis of acid extracts of the filters by either atomic absorption or emission spectrometry or by colorimetric dithizone procedures. The organic lead can be collected by use of solid scrubbers such as iodine crystals (1) or activated carbon (2), or by absorption in a 0.1M solution of iodine monochloride (ICl) (3). Organic lead collected on these absorbers has been determined by colorimetric dithizone procedures, which are time-consuming, technically complex, and often lack the sensitivity required for atmospheric sampling.

This paper describes a method developed for use in the Division of Atmospheric Surveillance of the Environmental Protection Agency (EPA) for analysis of total lead in the atmosphere. The organic lead collected in ICl solution is chelated

(1) "Tentative Method of Test for Lead in the Atmosphere," ASTM Book of Standards, Volume 23, American Society for Testing and Materials, Philadelphia, Pa., 1970.
(2) L. J. Snyder, Anal. Chem., 39, 591 (1967).
(3) R. Moss and E. V. Browett, Analyst (London), 91, 428 (1966).
with ammonium pyrrolidine-dithiocarbamate (APDC) and extracted into an organic solvent for analysis by atomic absorption spectrophotometry. Iodine crystals were not evaluated as a collection medium in this study because the repetitive shaking required is not practicable with the 24-hour unattended sampling procedure used in the National Air Surveillance Network (4). Activated carbon, a suitable collecting medium for 24-hour unattended sampling, could possibly be used as the collecting medium for this method with appropriate adjustments to the extraction procedure.

EXTRACTION OF LEAD FROM ICl SOLUTION

Determination of the amount of lead present in the ICl absorbing solutions involves reducing the iodine with sodium sulfite and hydroxylamine hydrochloride, and adjusting the pH to 3.6 in order to transfer the APDC-lead complex quantitatively to an organic solvent. APDC was chosen as the complexing agent and methyl isobutyl ketone (MIBK) as the organic solvent because these compounds are widely used in organic solvent methods for atomic absorption spectrometry (5). MIBK is considerably soluble in reduced ICl solutions (approximately 3% V/V). Methyl n-amyl ketone, which is less volatile and less soluble in reduced ICl solutions (approximately 0.6% V/V), may also be used as the organic solvent for this method.

EXPERIMENTAL
Reagents. IODINE MONOCHLORIDE (1.0M). To 800 ml of 25% (W/V) potassium iodide solution, add 800 ml of concentrated hydrochloric acid and cool to room temperature. Slowly add 135 grams of potassium iodate while stirring the solution vigorously and continue stirring until all free iodine has dissolved to give a clear orange-red solution. Cool to room temperature and dilute to 1800 ml.

IODINE MONOCHLORIDE (0.1M). Dilute 100 ml of 1.0M iodine monochloride to 1 liter. This reagent is stable for an indefinite period under ambient laboratory conditions.

HYDROCHLORIC ACID-NITRIC ACID MIXTURE. Add one volume of distilled constant boiling (about 19%) hydrochloric acid to four parts 40% distilled nitric acid.
(4) G. B. Morgan, C. Golden, and E. C. Tabor, JAPCA, 17, 300 (1967).
(5) W. Slavin, "Atomic Absorption Spectroscopy," Interscience Publishers, New York, N.Y., 1968, p 75.