An introduction to signal processing in chemical measurement

T. C. O'Haver

University of Maryland, College Park, MD 20742

The interfacing of analytical instrumentation to small computers for the purpose of on-line data acquisition has now become almost standard practice in the modern chemistry laboratory. Using widely available, low-cost microcomputers and off-the-shelf add-in components, it is now easier than ever to acquire large amounts of data quickly in digital form. In what ways is on-line digital data acquisition superior to the old methods such as the chart recorder? Some of the advantages are obvious, such as archival storage and retrieval of data and post-run replotting with adjustable scale expansion. Even more important, however, is the possibility of performing post-run data analysis and signal processing. Computer-based numerical methods can be used to reduce noise in data signals, improve the resolution of overlapping peaks, compensate for instrumental artifacts, test hypotheses, optimize measurement strategies, diagnose measurement difficulties, and decompose complex signals into their component parts. These techniques can often make difficult measurements easier by extracting more information from the available data. Many of these techniques are based on laborious mathematical procedures that were not practical before the advent of computerized instrumentation. It is important for all persons making measurements on chemical systems to appreciate the capabilities and the limitations of these modern signal processing techniques.

The purpose of this paper is to give a brief introduction to some of the most widely used signal processing techniques and to give illustrations of their applications in chemical measurement. In the chemistry curriculum, signal processing may be covered as part of a course on instrumental analysis (1, 2), electronics for chemists (3), laboratory interfacing (4), chemometrics (5), or computer applications (6).


Smoothing

In many experiments in physical science, the true signal amplitudes (y-axis values) change rather smoothly as a function of the x-axis values, whereas many kinds of noise are seen as rapid, random changes in amplitude from point to point within the signal (7). In the latter situation it is common practice to attempt to reduce the noise by a process called smoothing. In smoothing, the data points of a signal are modified so that individual points that are higher than the immediately adjacent points (presumably because of noise) are reduced, and points that are lower than the adjacent points are increased. This naturally leads to a smoother signal. As long as the true underlying signal is actually smooth, the true signal will not be much distorted by smoothing, but the noise will be reduced.


The simplest smoothing algorithm is the rectangular or unweighted sliding-average smooth; it simply replaces each point in the signal with the average of m adjacent points, where m is a positive integer called the smooth width. For a three-point smooth (m = 3):

$$S_j = \frac{Y_{j-1} + Y_j + Y_{j+1}}{3} \qquad (1)$$

for j = 2 to n - 1, where $S_j$ is the jth point in the smoothed signal, $Y_j$ is the jth point in the original signal, and n is the total number of points in the signal. Usually m is odd. Random noise is reduced by a factor of approximately the square root of m. The triangular smooth is like the rectangular smooth, above, except that it implements a weighted smoothing function. The smooth width m is the half-width of the triangle. For a five-point smooth (m = 5):

$$S_j = \frac{Y_{j-2} + 2Y_{j-1} + 3Y_j + 2Y_{j+1} + Y_{j+2}}{9} \qquad (2)$$

for j = 3 to n - 2. Equations are similar for other smooth widths. This is equivalent to two passes of a three-point rectangular smooth.

It is left to the user to select the smooth width that gives the best trade-off between signal-to-noise improvement and signal distortion. The optimum choice depends upon the width and shape of the signal and the digitization interval. For peak-type signals, the critical factor is the smoothing ratio, the ratio between the smooth width m and the number of points in the half-width of the peak (8). In general, increasing the smoothing ratio improves the signal-to-noise ratio but causes a reduction in amplitude and an increase in the bandwidth of the peak. If it is important to retain the true peak height and width, then smoothing ratios below 0.3 should be used. However, in quantitative applications, the amplitude reduction is not so important, because calibration is based on the signals of standard solutions or of standard addition solutions, and it is naturally assumed that the same signal processing operations will be applied to the samples and to the standards. In those cases smoothing ratios from 0.5 to 1.0 can be used to improve the signal-to-noise ratio further.

An example of smoothing is shown in Figure 1. The left half of this signal is a noisy peak. The right half is the same peak after undergoing a triangular smoothing algorithm. The noise is greatly reduced, while the peak itself is hardly changed. Smoothing increases the signal-to-noise ratio and allows the signal characteristics (peak position, height, width, area, etc.) to be measured more precisely. There are many algorithms for smoothing signals; they differ primarily in their tendency to distort the underlying signal and in their computation speed.
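The two smooths described in this section can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the implementation used in SPECTRUM: the function names, the synthetic Gaussian peak, and the noise level are assumptions made for the example, and the end points that lack a full window are simply left unsmoothed.

```python
import numpy as np

def rectangular_smooth(y, m=3):
    """Unweighted sliding average of odd width m.

    The first and last m // 2 points have no full window and are left unchanged.
    """
    y = np.asarray(y, dtype=float)
    if m < 1 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    s = y.copy()
    half = m // 2
    if len(y) >= m:
        s[half:len(y) - half] = np.convolve(y, np.ones(m) / m, mode="valid")
    return s

def triangular_smooth5(y):
    """Five-point triangular smooth with weights 1, 2, 3, 2, 1 (sum 9)."""
    y = np.asarray(y, dtype=float)
    s = y.copy()
    if len(y) >= 5:
        weights = np.array([1.0, 2.0, 3.0, 2.0, 1.0]) / 9.0
        s[2:-2] = np.convolve(y, weights, mode="valid")
    return s

# Synthetic data (hypothetical): a Gaussian peak plus white noise.
rng = np.random.default_rng(0)
x = np.arange(200)
peak = np.exp(-0.5 * ((x - 100) / 12.0) ** 2)
noisy = peak + 0.05 * rng.standard_normal(x.size)

# Two passes of the three-point rectangular smooth versus one triangular pass.
two_pass = rectangular_smooth(rectangular_smooth(noisy, 3), 3)
tri = triangular_smooth5(noisy)
print("two 3-point passes match the triangular smooth on interior points:",
      np.allclose(two_pass[2:-2], tri[2:-2]))

# Noise reduction for pure white noise, compared with sqrt(m).
noise = rng.standard_normal(100_000)
m = 9
ratio = noise.std() / rectangular_smooth(noise, m).std()
print(f"noise reduced by a factor of {ratio:.2f}; sqrt(m) = {np.sqrt(m):.2f}")
```

The square-root rule holds here because the simulated noise is white (uncorrelated from point to point); noise with other frequency distributions would be reduced by a different factor.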


Differentiation

The symbolic differentiation of functions is a topic that is introduced in all elementary calculus courses. The numerical differentiation of digitized signals is an application of this concept that has many uses in analytical signal processing. The first derivative of a signal is the rate of change of y with x, that is, $dy/dx$, which is interpreted as the slope of the tangent to the signal at each point. The second derivative, $d^2y/dx^2$, is a measure of the curvature of the signal, that is, the rate of change of the slope of the signal. It can be calculated by taking the first derivative of the first derivative. It is commonly observed that differentiation degrades signal-to-noise ratio unless the differentiation algorithm is carefully optimized for each application. Numerical algorithms for differentiation are as numerous as for smoothing and must be carefully chosen to control signal-to-noise degradation (9).

A classic use of second differentiation in chemical analysis is in the location of endpoints in potentiometric titrations. Figure 2 shows a pH titration curve of 20 mL of … M solution of a very weak acid ($K_a$ = 2 × …) with … M NaOH (with volume in milliliters on the x-axis and pH on the y-axis).
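As a rough illustration of numerical differentiation, the sketch below uses simple central differences, which is only one of the many possible algorithms and is not necessarily the one behind the figures in this paper. The titration curve is a hypothetical tanh-shaped sigmoid with its inflection placed at 20 mL, not a real acid-base model, and the function names are assumptions. The endpoint is taken as the inflection point, where the second derivative crosses zero.

```python
import numpy as np

def first_derivative(x, y):
    """Central-difference estimate of dy/dx; each end point copies its neighbor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = np.empty_like(y)
    d[1:-1] = (y[2:] - y[:-2]) / (x[2:] - x[:-2])
    d[0], d[-1] = d[1], d[-2]
    return d

def second_derivative(x, y):
    """Three-point estimate of d2y/dx2; assumes evenly spaced x values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    h = x[1] - x[0]
    d2 = np.empty_like(y)
    d2[1:-1] = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / h**2
    d2[0], d2[-1] = d2[1], d2[-2]
    return d2

def inflection_point(x, y):
    """x at which the second derivative crosses from + to -, nearest the steepest slope."""
    x = np.asarray(x, dtype=float)
    d1 = first_derivative(x, y)
    d2 = second_derivative(x, y)
    crossings = np.where((d2[:-1] > 0) & (d2[1:] <= 0))[0]
    if crossings.size == 0:
        raise ValueError("no positive-to-negative zero crossing found")
    i = crossings[np.argmin(np.abs(crossings - np.argmax(d1)))]
    # Linear interpolation between the two points that bracket zero.
    return x[i] + (x[i + 1] - x[i]) * d2[i] / (d2[i] - d2[i + 1])

# Hypothetical sigmoid "titration curve" with its inflection at 20 mL.
volume = np.linspace(0.0, 40.0, 401)
pH = 7.0 + 3.0 * np.tanh((volume - 20.0) / 1.5)
endpoint = inflection_point(volume, pH)
print(f"estimated endpoint: {endpoint:.2f} mL")
```

With noisy data the second derivative would normally need smoothing first, because each differentiation step amplifies point-to-point noise.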


The endpoint is usually assumed to be the point of greatest slope; this is also an inflection point, where the curvature of the signal is zero. With a dilute weak acid such as this, it is difficult to locate this point precisely from the original titration curve. The second derivative of the curve is shown in Window 2 on the right in Figure 2. The zero crossing of the second derivative, which corresponds to the endpoint, falls at 19.3 mL; the "theoretical" endpoint is 20 mL.

In spectroscopy, particularly in infrared, UV-visible absorption, fluorescence, and reflectance spectrophotometry, differentiation of spectra is a widely used technique, referred to as derivative spectroscopy (10-12). Derivative methods have been used in analytical spectroscopy for three main purposes: (a) spectral discrimination, as a qualitative fingerprinting technique to accentuate small structural differences between nearly identical spectra; (b) spectral resolution enhancement, as a technique for increasing the apparent resolution of overlapping spectral bands in order to determine the number of bands and their wavelengths more easily; (c) quantitative analysis, as a technique for the correction for irrelevant background absorption and as a way to facilitate multicomponent analysis.

A very useful property of the derivatives of peak-type signals is that the amplitude of the nth derivative of a peak is directly proportional to the amplitude of the peak and inversely proportional to the nth power of its width (13). It follows that differentiation may be employed as a general way to discriminate against broad spectral features in favor of narrow-peak components and that the discrimination will increase with derivative order. This is the basis for the application of differentiation as a method of correction for background signals in quantitative spectrophotometric analysis.
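The width dependence stated above can be checked by hand for one particular peak shape. The worked example below assumes a Gaussian peak of height $H$, center $x_0$, and width parameter $\sigma$; the Gaussian shape and these symbols are assumptions made for illustration, since the text states the property for peak-type signals in general.

```latex
\begin{align*}
y(x)   &= H \exp\!\left[-\frac{(x - x_0)^2}{2\sigma^2}\right] \\
y'(x)  &= -\frac{x - x_0}{\sigma^2}\, y(x),
  \qquad \max |y'| = \frac{H e^{-1/2}}{\sigma} \approx 0.607\,\frac{H}{\sigma}
  \quad \text{at } x = x_0 \pm \sigma \\
y''(x) &= \left[\frac{(x - x_0)^2}{\sigma^4} - \frac{1}{\sigma^2}\right] y(x),
  \qquad y''(x_0) = -\frac{H}{\sigma^2}
\end{align*}
```

For this shape the first-derivative amplitude scales as $H/\sigma$ and the second-derivative amplitude as $H/\sigma^2$, so doubling the width of a band at constant height halves its first derivative and quarters its second derivative.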
It can be expected that differentiation will in general help to discriminate relevant absorption from these sources of baseline shift. An obvious benefit of the suppression of broad background by differentiation is that variations in the background amplitude from sample to sample are also reduced. This can result in improved precision of measurement in many instances, especially when the analyte signal is small compared to the background and if there is a lot of uncontrolled variability in the background. An example of the improved ability to detect a trace component in the presence of strong background interference is shown in Figure 3, in which absorbance ($A \times 10^{-1}$) is plotted versus wavelength (nm $\times 10^{2}$). In the figure, the spectrum on the left is the optical absorption spectrum of an extract of a sample of oil shale, a kind of rock that is a source of petroleum. Samples of this type usually exhibit two absorption bands, at about 515 nm and 550 nm, that are due to a class of molecular fossils of chlorophyll called porphyrins. (Porphyrins are used as geomarkers in oil exploration.) These bands seldom stand out because they are superimposed on a background absorption caused by the extracting solvents and by other compounds extracted from the shale. The broad, sloping background obscures the peak and makes quantitative measurement very difficult.

To obtain the spectrum of the shale extract without the background, one could simply subtract the spectrum of an extract of a similar but non-porphyrin-bearing shale if such a sample were available and if the background spectra of the porphyrin-bearing shale samples were reproducible in intensity. An alternative approach, which depends less on the stability of the background spectrum, is based on derivative spectroscopy (14). The fourth derivative of this spectrum is shown on the right in Figure 3. The sloping background, because it is much broader than the analyte peaks, has been almost completely suppressed and the analyte peak now stands out clearly, facilitating measurement. Smaller peaks at 515 and 580 nm are now visible in the derivative spectrum.

This use of signal differentiation has become widely used in quantitative spectroscopy, particularly for quality control in the pharmaceutical industry (15). In that application the analyte would typically be the active ingredient in a pharmaceutical preparation, and the background interferences might arise from the presence of fillers, emulsifiers, flavoring or coloring agents, buffers, stabilizers, or other excipients.

Figure 1. The left half of this signal is a noisy peak. The right half is the same peak after undergoing a smoothing algorithm.

Figure 2. The signal on the left is the pH titration curve of a solution of a very weak acid with a strong base, with volume in milliliters on the x-axis and pH on the y-axis. The endpoint is more easily located as the zero crossing in the second derivative, $d^2\mathrm{pH}/dV^2$, shown on the right.
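The suppression of a broad background by differentiation can be illustrated numerically. The sketch below is not a model of the oil shale spectrum: it assumes a hypothetical narrow Gaussian "analyte" band sitting on a Gaussian background that is ten times taller and twenty times wider, and it uses the second derivative (rather than the fourth) computed with `numpy.gradient`. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian(x, height, center, sigma):
    """Gaussian band of the given height, center, and width parameter."""
    return height * np.exp(-0.5 * ((x - center) / sigma) ** 2)

x = np.linspace(-300.0, 300.0, 6001)   # arbitrary x units, step 0.1
H_PEAK, S_PEAK = 0.1, 5.0              # narrow analyte band (hypothetical)
H_BG, S_BG = 1.0, 100.0                # broad background (hypothetical)
analyte = gaussian(x, H_PEAK, 0.0, S_PEAK)
background = gaussian(x, H_BG, 0.0, S_BG)

def second_derivative(y):
    """Second derivative obtained by differentiating twice with numpy.gradient."""
    return np.gradient(np.gradient(y, x), x)

center = x.size // 2                   # index of x = 0
ratio_before = analyte[center] / background[center]
ratio_after = (second_derivative(analyte)[center]
               / second_derivative(background)[center])
expected = (H_PEAK / H_BG) * (S_BG / S_PEAK) ** 2

print(f"analyte/background in the original signal:  {ratio_before:.3f}")
print(f"analyte/background in the second derivative: {ratio_after:.1f}")
print(f"height ratio times width ratio squared:      {expected:.1f}")
```

In this example the analyte goes from one tenth of the background amplitude to many times larger than it after differentiation, which is the width-squared discrimination expected for a second derivative.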

Curve Fitting

The objective of curve fitting is to find a mathematical equation that describes the signal and that is minimally influenced by the presence of noise. The simplest and most common approach is the linear least-squares method, which is capable of finding the coefficients of polynomial equations that are a "best fit" to the data. A polynomial equation expresses Y as a polynomial in X, for example as a straight line ($Y = c_1 + c_2 X$, where $c_1$ is the intercept and $c_2$ is the slope), a quadratic ($Y = c_1 + c_2 X + c_3 X^2$), or a higher order polynomial. Although it is possible to estimate slope and intercept by visual estimation and a straight edge, the least-squares method is more objective and more general.

In some cases a nonlinear relationship can be transformed into a linear one by means of a coordinate transformation (e.g., taking the log or the reciprocal of the data), and the least-squares method can be applied to the resulting linear equation. For example, the points in Figure 4 are from a computer-generated exponential decay with proportional random noise added (X = time, Y = signal intensity). This signal has the expected mathematical form $Y = c_1 \exp(c_2 X)$, where $c_1$ is the Y-value at X = 0 and $c_2$ is the decay constant. (In this example the "true" values are $c_1 = 1.0$ and $c_2 = -1.0$, the negative sign corresponding to a decay.) By taking the natural log of both sides of the equation, we obtain $\ln(Y) = \ln(c_1) + c_2 X$. This linear equation can be fit by the least-squares method in order to estimate $c_1$ and $c_2$ (but only approximately, because the log transformation affects the weighting of the errors due to random noise). The best fit equation, shown by the solid line in the figure, is $Y = 0.9897 \exp(-0.98896 X)$.

Figure 4. … data set (points) in order to estimate the decay constant. (Window 4: 26 points; X from 1.0e-1 to 2.6e+0; Y from 6.8e-2 to 1.0e+0.)
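The log-transform fit of an exponential decay can be sketched with an ordinary straight-line least-squares fit. The data below are simulated under assumptions chosen to resemble the description of Figure 4 (26 points, X from 0.1 to 2.6, 5% proportional noise, a fixed random seed), so the fitted numbers will not reproduce the values quoted in the text.

```python
import numpy as np

# Simulated exponential decay with proportional random noise (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(0.1, 2.6, 26)
y_true = 1.0 * np.exp(-1.0 * x)
y = y_true * (1.0 + 0.05 * rng.standard_normal(x.size))

# Straight-line least-squares fit of ln(Y) versus X.
# np.polyfit returns coefficients from the highest power down: [slope, intercept].
slope, intercept = np.polyfit(x, np.log(y), 1)
c1 = np.exp(intercept)   # pre-exponential factor, the Y-value at X = 0
c2 = slope               # exponent coefficient (negative for a decay)

print(f"best fit: Y = {c1:.4f} exp({c2:.4f} X)")
```

Because the fit is unweighted in ln(Y), small Y values influence the result more than they would in a direct nonlinear fit of Y; with proportional noise, as here, that happens to be a reasonable weighting.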

Volume 66 Number 6 June 1991


Fourier Filter

The Fourier filter is a type of filtering or smoothing function that is based on the frequency components of a signal. It works by taking the Fourier transform (18, 19) of the signal, then cutting off all frequencies above a user-specified limit, then inverse-transforming the result. This is a special case of the more general class of techniques based on correlation (7, 16, 17). The assumption is made here that the frequency components of the signal fall predominantly at low frequencies and those of the noise fall predominantly at high frequencies. The user tries to find a cut-off frequency that will allow most of the noise to be eliminated while not distorting the signal significantly. An example of the application of the Fourier filter is given in Figure 5.
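The three steps named above (transform, cut off, inverse-transform) can be sketched with NumPy's real-input FFT routines. This is a minimal sketch with an abrupt cut-off; the function name, the test signal (two Gaussian bands in 512 points), the noise level, and the cut-off at the 20th harmonic are illustrative assumptions rather than the data behind Figure 5.

```python
import numpy as np

def fourier_filter(y, cutoff):
    """Keep harmonics 0..cutoff of a real signal, zero the rest, and invert."""
    if cutoff < 0:
        raise ValueError("cutoff must be non-negative")
    y = np.asarray(y, dtype=float)
    spectrum = np.fft.rfft(y)          # harmonics 0 .. n // 2
    spectrum[cutoff + 1:] = 0.0        # delete everything above the cut-off
    return np.fft.irfft(spectrum, n=len(y))

def rms(values):
    """Root-mean-square value of an array."""
    return np.sqrt(np.mean(np.square(values)))

# Hypothetical test signal: two broad bands buried in white noise.
rng = np.random.default_rng(1)
n = 512
x = np.arange(n)
clean = (np.exp(-0.5 * ((x - 200) / 15.0) ** 2)
         + np.exp(-0.5 * ((x - 300) / 15.0) ** 2))
noisy = clean + 1.0 * rng.standard_normal(n)

filtered = fourier_filter(noisy, cutoff=20)
print(f"rms error before filtering: {rms(noisy - clean):.3f}")
print(f"rms error after filtering:  {rms(filtered - clean):.3f}")
```

An abrupt cut-off is the simplest choice; if the signal itself has components above the cut-off, they are removed along with the noise and the reconstructed peaks are distorted.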


Acknowledgment

The figures in this paper are screen images from SPECTRUM (Signal Processing for Experimental Chemistry Teaching and Research/University of Maryland), a computer program developed to teach signal processing to chemistry students. The program includes many other signal processing operations in addition to the ones described in this paper. SPECTRUM presently is available in a Macintosh version (Mac II preferred) from Office of Technology Liaison, Lee Bldg., Room 2114, University of Maryland, College Park, MD 20742, 301-405-4209, FAX 301-314-9569. A version for IBM-PC and compatibles running Windows 3.0 will be available in 1991.

Literature Cited

1. …; pp 13-76.
2. Christian, Gary D.; O'Reilly, James E. Instrumental Analysis, 2nd ed.; Allyn and Bacon: Boston, 1986; pp 846-851.
3. Malmstadt, Howard V.; Enke, Christie G.; Horlick, Gary. Electronic Measurements for Scientists; Benjamin: Menlo Park, 1974; pp 616-870.
4. Gates, Stephen C.; Becker, Jordan. Laboratory Automation Using the IBM PC; Prentice-Hall: Englewood Cliffs, NJ, 1989.
5. Sharaf, Muhammad A.; Illman, Deborah L.; Kowalski, Bruce R. Chemometrics; Wiley: New York, 1986.
6. Kroutil, Robert T.; Ditillo, John T.; Small, Gary W. "Signal Processing Techniques for Remote Infrared Chemical Sensing"; in Computer-Enhanced Analytical Spectroscopy; Meuzelaar, Henk L. C., Ed.; Plenum: New York, 1990; Vol. 2.
7. Hieftje, G. M. Anal. Chem. 1972, 44, ….
8. Enke, C. G.; Nieman, T. A. Anal. Chem. 1976, 48, 705A.
9. O'Haver, T. C.; Begley, T. Anal. Chem. 1981, 53, 1876-1878.
10. O'Haver, T. C.; Green, G. L. Am. Lab. 1975, 7, 15.
11. O'Haver, T. C. Anal. Chem. 1979, 51, 91A.
12. O'Haver, T. C. Clin. Chem. 1979, 25, 1548.
13. O'Haver, T. C. Proc. Analyt. Div. Chem. Soc. 1983, 19, 2?-26.
14. Freeman, David H.; O'Haver, T. C. Energy Fuels 1990, 4, 688-694.
15. Hargis, L. G.; Howell, J. A. Anal. Chem. 1988, 60, 131R-146R.
16. Hieftje, G. M. Anal. Chem. 1972, 44, 81A.
17. Horlick, Gary. Anal. Chem. 1973, 45, 319.
18. Horlick, Gary. Anal. Chem. 1972, 44, 943.
19. Griffiths, Peter R. Transform Techniques in Chemistry; Plenum: New York, ….


Figure 5. The signal at the top left seems to be only random noise, but its power spectrum (top right) shows that high-frequency components dominate the signal. The power spectrum is expanded in the x and y directions (bottom left) to show more clearly the low-frequency region. Working on the hypothesis that the components above the 20th harmonic are noise, the Fourier filter function can be used to delete the higher harmonics and to reconstruct the signal from the first 20 harmonics. The result (bottom right) shows the signal contains two bands at about x = 200 and x = 300 that are totally obscured by noise in the original signal.

Journal of Chemical Education