Critical evaluation of curve fitting in infrared ... - ACS Publications


Critical Evaluation of Curve Fitting in Infrared Spectrometry

B. G. M. Vandeginste¹ and L. De Galan²

Laboratorium voor Instrumentele Analyse, Technische Hogeschool Delft, The Netherlands

An investigation is undertaken to obtain quantitative results from infrared absorption bands through curve fitting. The influence of several critical parameters, such as the degree of overlap, the number of nonresolved bands in the profile, and the determination of the base-line position, is evaluated. Results of theoretical as well as experimental spectra indicate that a good fit does not guarantee a correct recovery of quantitative data. The peak-find procedure described gives the initial values for the parameters to be used in the curve fit program. The theoretical limit to detect overlapping peaks by the peak-find program is calculated. Conditions which are to be fulfilled in order to obtain reliable results from the fit of infrared data are formulated.

At present, curve fitting methods are widely applied for the determination of the area or the height of overlapping bands in a composite profile. With the development of computers, extensive calculation facilities came within reach and digital curve fitting methods became a standard method for data handling in chromatography, UV-VIS spectrometry, IR spectrometry and the like. Because curve fitting requires a mathematical model to describe the shape of the bands, a number of publications have appeared that give more or less useful models for various methods of analysis including IR spectrometry (1-7). Also, computer programs were published to perform the calculations, and the most complete publications in this field are the computer programs of Jones (8-10) for the processing of infrared data. Simultaneously, however, some investigations were undertaken into the possible sources of errors in curve fitting. Anderson et al. (11-13) analyzed the influence of the degree of overlap of Gaussian bands and concluded that accurate results (~1%) are obtained only if the distance between the two peaks is greater than twice the standard deviation. Smaller separations gave rise to errors due to small differences between the actual shape and the mathematical model used. Audo et al. (14, 15) investigated the influence of a shifted base line and the influence of small interfering peaks upon curve fitting results. Perram (16) and Davis (17) drew attention to the fact that the numerical decomposition of a structureless contour is not unique. Therefore the results of such a process provide no evidence as to the number or nature of the component contours present. Beacham and Andrew (18) concluded that if there are more than three bands in a profile with only two inflection points, many equivalent solutions can be expected from the curve fitting procedure. Pitha and Jones (19) warned against attaching a physical meaning to the parameters obtained by curve fitting because many sets of parameters can give a close fit of the profile.

These conclusions, important as they are, are all qualitative in nature, and a true quantitative examination of the influence of possible sources of errors and an analysis of the limits of curve fitting have never been made to the authors' knowledge. The present investigation tries to bridge this gap, starting from simulated as well as experimental spectra. One of the important sources of errors is the number of bands in a profile with only two inflection points. When this number is too great, ambiguous results are obtained, even when the exact number of bands is known. Another aspect is the influence of the accuracy of the initial values of the parameters in the model upon the curve fitting results. Schwartz (20) investigated the influence of badly chosen initial values upon the progress of the calculation (divergence or convergence of the calculations), but not upon the results of those calculations. If those initial values cannot be formulated beforehand from theoretical knowledge, they must be derived from the experimental spectrum through peak-find procedures. It is then important to know the limiting ability of the peak-find program to detect overlapping bands in a complex system or to formulate a procedure deriving the exact number of peaks from the curve fitting results themselves.

¹ Present address, Department of Analytical Chemistry, Faculty of Science, Catholic University Nijmegen, The Netherlands.
² Author to whom correspondence should be directed.

ANALYTICAL CHEMISTRY, VOL. 47, NO. 13, NOVEMBER 1975

Another possible error in curve fitting is an ignorance of the base-line position. Here we examined the results for two cases: a) the base-line parameters are included in the curve fitting procedure, and b) the base line is independently derived from a base-line detection procedure. Finally, the differences of the results obtained between curve fitting of simulated and real spectra give an indication of the influence of the difference between the actual shape of an IR band and the mathematical model used.

CURVE FITTING

Mathematical Model. Collisions between molecules in liquids give rise to a Lorentzian shape infrared absorption band. However, other effects, such as rotational fine structure, hydrogen bridging, and the possible existence of multiple conformations, affect the shape of the absorption band, so that a true Lorentzian shape is not always encountered (2). The mathematical model in curve fitting can be given in the form of an analytical expression or function, or in the form of a digital function. A digital function is used in the case where the analytical function becomes too complex for the description of the bands, e.g., asymmetric bands in chromatography (12). In this case, assuming that over a small region of the chromatogram all peaks have the same shape, good results were obtained with a single digital function to describe closely spaced peaks. In infrared spectrometry, however, adjacent absorption bands can differ considerably in shape so that a single digital function is not useful. Over the years, a number of analytical expressions have been proposed for the description of infrared bands. Most of them are symmetric, such as a Lorentzian function or combinations of Gaussian and Lorentzian functions. Asymmetric functions have been proposed only occasionally (1). Jones (21) made a thorough comparison of several functions for fitting an infrared absorption band. His conclusion was that the product of a Gaussian and Lorentzian function or the sum of a Gaussian and Lorentzian function were the most appropriate expressions to describe these simulated bands. For the sum function the expression is:

$$T(\nu) = \exp\{-2.3\,(A_L[1 + b_L^2(\nu - \nu_{max})^2]^{-1} + A_G \exp[-b_G^2(\nu - \nu_{max})^2])\} = \exp[-2.3\,A(\nu)] \tag{1}$$

where T(ν) = transmittance, A_L = peak height of the Lorentzian function, ν_max = peak position, b_L = 2 divided by the full width at half height of the Lorentzian function, b_G = 2√(ln 2) divided by the full width at half height of the Gaussian function, A_G = peak height of the Gaussian function, and A(ν) = absorbance. Pitha has shown (19) that, without significant deterioration of the results, a parameter reduction from five to four parameters in the above function can be obtained by taking a fixed ratio between the width at half height of the Gaussian and Lorentzian function, e.g., b_G = 0.8 b_L. For the fit of the base line, functions with one (a straight horizontal line (7)) to three parameters (a parabola) (14, 22, 23) have been proposed. We incorporated a function with two parameters in our model, so that an infrared spectrum with N bands is described by:

$$A(\nu) = \sum_{i=1}^{N} A_i(\nu) + \alpha + \beta(\nu - \nu_0) \tag{2}$$

where α = the base-line displacement in absorbance, β = the regression coefficient of the base line, and ν_0 = the wavenumber at the beginning of the spectrum. If A_i is a sum of a Gaussian and Lorentzian function, with b_G = k b_L, then an infrared spectrum with N bands has 4N + 2 parameters, i.e., for each band, the parameters A_L,i, ν_max,i, b_L,i, A_G,i, and finally the two base-line parameters.

Calculations. Curve fitting procedures are based on the least squares method. A function is fitted through a set of data points. The parameters of the function are adjusted, so that the criterion

$$\sum_i (T_i - T_{i,calcd})^2 = \text{minimal} \tag{3}$$

is fulfilled. For the complex functional formulation in Equation 2, Equation 3 leads to a set of nonlinear equations, which cannot be solved analytically (23, 24). Therefore, the usual approach is to apply an iterative damped least squares method, which requires initial values for the parameters of the function involved. A calculation method developed by Jurs (25), in which the functions are transformed by adequate substitutions into functions which give a set of linear equations by applying Equation 3, is not useful in the case of overlapping bands. In the present investigation, the iteration method developed by Meiron (26) is used and the curve fitting program is taken from Jones (7, 9), who investigated several mathematical methods for curve fitting. This program is based upon the iteration formula:

$$x_{m+1} = x_m - (B_m + pC_m)^{-1} \times G_m \tag{4}$$

where x_{m+1} = value of the parameter x of the model, calculated in the (m + 1)th iteration; B_m = matrix of the partial derivatives of the function using parameter values from the mth iteration; C_m = as matrix B_m, but with the off-diagonal elements zero; p = damping constant; and G_m = matrix of the residual differences between observed and calculated data points from the mth iteration.
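The damped iteration of Equation 4, applied to the model of Equations 1 and 2, can be sketched as follows. This is a minimal illustration, not the Jones program: it assumes that B_m is the normal matrix JᵀJ built from a forward-difference Jacobian J, that G_m is Jᵀ times the residual vector (calculated minus observed), and that the damping constant is adapted after each trial step. The names `transmittance`, `solve`, `damped_fit`, and `K_WIDTH`, and the synthetic two-band spectrum, are hypothetical.

```python
import math

LN10 = 2.303    # factor 2.3 of Equation 1 (absorbance to natural-log units)
K_WIDTH = 0.8   # assumed fixed ratio b_G / b_L, the value quoted from Pitha

def transmittance(p, nu, nu0):
    """Equations 1 and 2; p = [A_L, nu_max, b_L, A_G] per band + [alpha, beta]."""
    absorbance = p[-2] + p[-1] * (nu - nu0)
    for i in range(0, len(p) - 2, 4):
        a_l, nu_max, b_l, a_g = p[i:i + 4]
        d2 = (nu - nu_max) ** 2
        absorbance += a_l / (1.0 + b_l ** 2 * d2)
        absorbance += a_g * math.exp(-(K_WIDTH * b_l) ** 2 * d2)
    return math.exp(-LN10 * absorbance)

def residuals(p, nus, t_obs, nu0):
    return [transmittance(p, nu, nu0) - t for nu, t in zip(nus, t_obs)]

def solve(a, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def damped_fit(nus, t_obs, start, nu0, damping=0.01, n_iter=80):
    x = list(start)
    res = residuals(x, nus, t_obs, nu0)
    sse = sum(r * r for r in res)
    n = len(x)
    for _ in range(n_iter):
        # Forward-difference Jacobian, one column per parameter.
        cols = []
        for j in range(n):
            h = 1e-6 * max(abs(x[j]), 1.0)
            shifted = x[:j] + [x[j] + h] + x[j + 1:]
            res_h = residuals(shifted, nus, t_obs, nu0)
            cols.append([(a - b) / h for a, b in zip(res_h, res)])
        b_mat = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
        g = [sum(u * r for u, r in zip(ci, res)) for ci in cols]
        for i in range(n):
            b_mat[i][i] *= 1.0 + damping          # B + p*C with C = diag(B)
        try:
            step = solve(b_mat, g)
            trial = [xi - si for xi, si in zip(x, step)]
            trial_res = residuals(trial, nus, t_obs, nu0)
            trial_sse = sum(r * r for r in trial_res)
        except (OverflowError, ZeroDivisionError):
            trial_sse = float("inf")
        if trial_sse < sse:                       # accept step, relax damping
            x, res, sse = trial, trial_res, trial_sse
            damping /= 10.0
        else:                                     # reject step, damp harder
            damping *= 10.0
    return x, math.sqrt(sse / len(nus))

# Synthetic, noise-free two-band spectrum (hypothetical parameter values).
nu0 = 1650.0
grid = [nu0 + 0.5 * i for i in range(201)]        # 1650 ... 1750 cm^-1
true = [0.5, 1690.0, 0.20, 0.10, 0.3, 1712.0, 0.25, 0.05, 0.01, 1e-4]
spectrum = [transmittance(true, nu, nu0) for nu in grid]
start = [0.55, 1689.0, 0.22, 0.0, 0.33, 1713.0, 0.22, 0.0, 0.0, 0.0]
fitted, rms = damped_fit(grid, spectrum, start, nu0)
print([round(v, 3) for v in fitted], rms)
```

The starting values mimic what a peak-find step would supply: approximate positions, heights and widths, with A_G set to zero. Adapting the damping constant is a design choice of this sketch; a fixed p, as Equation 4 is written, would also work but may need more iterations.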

PEAK-FINDING

Theory. In order to start the curve fit iteration, estimates are needed of the following parameters: peak position, peak height, ratio of Gauss to Lorentz height, and the halfband width. Obviously, the initial values can be taken from a peak-find procedure. Different methods have been proposed to detect peaks and valleys in a spectrum. Jones (8) uses a method based upon the first derivative of the spectrum. Others (27-29) propose peak-finding methods based upon the second or third derivative. Usually, however, in infrared spectra, the signal-to-noise ratio, which deteriorates upon differentiation of the signal, precludes the use of even the second derivative. Smoothed derivatives, calculated with the method described by Savitzky and Golay (30), can be used to advantage. Also, improvement of the signal-to-noise ratio can be achieved by Fourier Transformation of the spectrum. But even when Fast Fourier Transformation (FFT) methods are used, computer time becomes very long. We preferred a peak-find procedure based upon the location of the first moment of the negative area of the second derivative. In comparison with the minimum of the second derivative, this first moment is less influenced by noise. Moreover, from the first derivative, no estimate can be made of the halfband width, and the possibility of detecting overlapping bands is less than when the peak-find procedure is based upon the second derivative. The first moment of the negative area of the second derivative is given by:

$$\nu_m = \frac{\sum_{i=j}^{k} \nu_i\,(d^2T/d\nu^2)_i}{\sum_{i=j}^{k} (d^2T/d\nu^2)_i} \tag{5}$$


where ν_m = position of the first moment of the negative area, ν_i = wavenumber of point i, T_i = transmittance of point i, and j-k = wavenumber region where the second derivative is negative. For symmetric peaks, the peak position coincides with the first moment of the negative area of the second derivative, i.e., with the center of gravity of this area. For asymmetric peaks, this is no longer true, but it is questionable whether the peak maximum forms a good indication of the position of such a peak. The known relation between the halfband width and the measured distance between the two inflection points (2σ) of a symmetric band enables us to estimate the halfband width of a band, e.g., for a Lorentzian function Δν½ = 1.73 × 2σ. The absorbance of the band maximum is estimated by the conversion of the transmittance at the peak position found by the peak-find procedure. The Gauss-Lorentz ratio cannot be found from the peak-find procedure, but it will be shown that this ratio is not important, so that the peak absorbance can be taken as A_L, while A_G can arbitrarily be taken as zero (Equation 1). In the present procedure, based upon the second derivative, each profile with only two inflection points is considered as a single band. We can define a detection limit for the peak-find procedure as the smallest distance between two overlapping bands that will still be separately detected. Westerberg (29) calculated detection limits for two equally wide overlapping Gaussian functions, based upon the second as well as the first derivative. The latter is also the shoulder limit, which is defined as that separation where two zero points of the first derivative of two approaching bands coincide (no minimum between both peak maxima). In order to appreciate the present procedure, Westerberg's calculations of the detection limit and the shoulder limit were extended to the case of two Gaussians or Lorentzians of unequal bandwidth.

Calculations. Shoulder Limit. 1) Calculation of the shoulder limit for two overlapping Gaussian functions with unequal halfband widths. Two overlapping Gaussian functions can be represented as:

$$y = A_{G1}\exp[-b_{G1}^2(\ln 2)(x - \nu_1)^2] + A_{G2}\exp[-b_{G2}^2(\ln 2)(x - \nu_2)^2] \tag{6}$$

where A_G, b_G, and ν denote, respectively, the height, twice the reciprocal width, and the position of a band. Applying the following substitutions: η = y/A_G1, R = A_G2/A_G1, φ = b_G2/b_G1, δ = [(ν_2 - ν_1)(b_G1 + b_G2)]/2, ζ = b_G1(x - ν_1), Equation 6 becomes:

$$\eta = \exp[-(\ln 2)\zeta^2] + R\exp\{-(\ln 2)[\varphi\zeta - 2\delta\varphi/(1 + \varphi)]^2\} \tag{7}$$

In the shoulder limit, an inflection point of the profile and the maximum of the minor band coincide. So:

$$y' = y'' = 0 \quad \text{or} \quad \eta' = \eta'' = 0 \tag{8}$$

Also the equation η' = 0 has only two solutions. After elimination of R in Equation 8, an equation in ζ is obtained, which is difficult to solve analytically. Therefore, a simulation procedure is used to determine, for given values of R, φ, and δ, the number of solutions of the equation η' = 0. For a set of (R, φ) values, δ is varied step by step, starting with a high value that gives three solutions for η' = 0, until δ reaches a value giving just one solution for η' = 0. Within the interval of δ where the number of solutions for η' = 0 changes from three to one, smaller steps are taken in order to determine the δ giving just two solutions with a precision of ±0.01. The results of those calculations are shown in Figure 1. The lines in Figure 1 define the area where two overlapping Gaussian bands are separately detected. For instance, the separate detection, based upon the first derivative, of two overlapping Gaussian bands with a width ratio φ = 3 and a peak height ratio R = 0.3 is possible only when δ > 3 or 1.56 < δ < 1.7. Increasing the peak height ratio to 0.4, a separate detection becomes possible when δ > 1.64. Thus, for a given peak width ratio, the detectability of the separate bands decreases rapidly when the peak height ratio falls beneath a certain value (e.g., for φ = 3, when R < 0.35).

Figure 1. Shoulder limit for two overlapping Gaussian functions with unequal halfband widths

2) Calculation of the shoulder limit for two overlapping Lorentzian functions with unequal halfband widths. The model for a system of two overlapping Lorentzian functions is given as:

$$y = \frac{A_{L1}}{1 + b_{L1}^2(x - \nu_1)^2} + \frac{A_{L2}}{1 + b_{L2}^2(x - \nu_2)^2} \tag{9}$$

Applying the same substitutions as in the case of Gaussian functions, we find:

$$\eta = \frac{1}{1 + \zeta^2} + \frac{R}{1 + [\varphi\zeta - 2\delta\varphi/(1 + \varphi)]^2} \tag{10}$$

Applying the condition η' = η'' = 0 as above, and after elimination of R, we obtain a fourth degree equation in ζ that can be solved with Bairstow's iterative method (31). Figure 2 shows the results of those calculations, which can be interpreted in the same way as in the previous case.

Figure 2. Shoulder limit for two overlapping Lorentzian functions with unequal halfband widths

Detection Limit. The detection limit is obtained from the condition η''' = η'' = 0. This leads to an equation in ζ of the sixth degree, for overlapping Gaussian as well as for overlapping Lorentzian bands, which is difficult to solve. Instead of this, a simulation procedure is used, based upon the fact that in the detection limit the equation η'' = 0 has three solutions. For a given set R, φ, and δ, a simulation procedure calculates the second derivative and gives the number of solutions of the equation η'' = 0. The two bands are shifted with respect to each other (changing δ step by step) until the value of δ is obtained that yields only two solutions instead of four for the equation η'' = 0. Reducing the step size, δ is calculated to give just three solutions, within a precision of ±0.01. The results are presented in Figures 3 and 4. In comparison with Figures 1 and 2, we see that the second derivative gives an increased detectability. For instance, for two overlapping Gaussian bands with φ = 2 and R = 0.4, the second derivative enables a separate detection when δ > 1.4 (Figure 3); the first derivative, however, only when δ > 2.5 (Figure 1). A considerable loss of detection capability of the second derivative is found when the peak height becomes smaller than a certain value, for a given band width ratio (e.g., compare the detection limit of two overlapping Gaussian bands with φ = 2.8 (Figure 3) for R = 0.2 and R = 0.3).
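The simulation procedure described above (count the solutions of η' = 0 or η'' = 0 while δ is stepped down) can be sketched as follows for two Gaussian bands. This is an illustrative reading, not the authors' program: the profile is Equation 6 scaled so that band 1 has unit height, b_G1 = 1 and ν_1 = 0, the solutions are counted as sign changes on a fixed grid, and the names `profile_derivative`, `count_solutions`, and `limiting_delta` are hypothetical.

```python
import math

LN2 = math.log(2.0)

def profile_derivative(zeta, ratio, phi, delta, order):
    """First (order=1) or second (order=2) derivative of Equation 6, scaled so
    that band 1 has unit height, b_G1 = 1 and position 0."""
    shift = 2.0 * delta / (1.0 + phi)          # nu_2 - nu_1 when b_G1 = 1
    total = 0.0
    for height, b, centre in ((1.0, 1.0, 0.0), (ratio, phi, shift)):
        u = b * (zeta - centre)
        g = height * math.exp(-LN2 * u * u)
        if order == 1:
            total += -2.0 * LN2 * u * b * g
        else:
            total += (4.0 * LN2 ** 2 * u * u - 2.0 * LN2) * b * b * g
    return total

def count_solutions(ratio, phi, delta, order, n_points=2001):
    """Number of sign changes of the chosen derivative along the profile."""
    shift = 2.0 * delta / (1.0 + phi)
    margin = 5.0 / min(1.0, phi)               # several widths beyond each band
    lo, hi = -margin, shift + margin
    changes, previous = 0, 0.0
    for i in range(n_points):
        zeta = lo + (hi - lo) * i / (n_points - 1)
        value = profile_derivative(zeta, ratio, phi, delta, order)
        if value == 0.0:
            continue                           # exact zero: wait for next sign
        if previous != 0.0 and (value > 0.0) != (previous > 0.0):
            changes += 1
        previous = value
    return changes

def limiting_delta(ratio, phi, order, start=3.0, step=0.01):
    """Step delta down from `start` (assumed resolved) and return the smallest
    delta that still gives three (order=1) or four (order=2) solutions.
    Only the uppermost resolved interval is found; isolated intervals at
    smaller delta, as in the 1.56 < delta < 1.7 example, are not searched."""
    resolved = 3 if order == 1 else 4
    k = 0
    while True:
        trial = start - (k + 1) * step
        if trial <= 0.0 or count_solutions(ratio, phi, trial, order) < resolved:
            return round(start - k * step, 2)
        k += 1

# Equal heights and widths: the first derivative resolves the pair only
# while delta exceeds about 1.70 in these units.
print(count_solutions(1.0, 1.0, 2.0, 1), count_solutions(1.0, 1.0, 1.5, 1))
print(limiting_delta(1.0, 1.0, 1))
```

For unequal bands the same two functions can be called with other values of R and φ; the grid density and the step of 0.01 are assumptions chosen to match the precision quoted in the text.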

Figure 3. Detection limit for two overlapping Gaussian functions with unequal halfband widths (abscissa: relative peak distance)
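The peak-find procedure of the PEAK-FINDING section can be sketched as below: a smoothed second derivative, the first moment of each negative lobe as band position, and the inflection-point distance as width estimate. This is a minimal sketch under stated assumptions, not the authors' program: the profile is supplied in absorbance on an equidistant, ascending wavenumber grid, the smoothing uses the 5-point quadratic Savitzky-Golay second-derivative weights (2, -1, -2, -1, 2)/7, and the Lorentzian factor 1.73 is applied to every band. The names `smoothed_second_derivative` and `find_bands` are hypothetical.

```python
def smoothed_second_derivative(y, step):
    """5-point quadratic Savitzky-Golay second derivative; the two points at
    each end of the record are left at zero."""
    weights = (2.0, -1.0, -2.0, -1.0, 2.0)
    d2 = [0.0] * len(y)
    for i in range(2, len(y) - 2):
        acc = sum(w * y[i + k - 2] for k, w in enumerate(weights))
        d2[i] = acc / (7.0 * step * step)
    return d2

def find_bands(nu, absorbance):
    """Return one dict per negative lobe of the second derivative."""
    step = nu[1] - nu[0]
    d2 = smoothed_second_derivative(absorbance, step)
    bands, i, n = [], 0, len(nu)
    while i < n:
        if d2[i] >= 0.0:
            i += 1
            continue
        j = i
        while j + 1 < n and d2[j + 1] < 0.0:
            j += 1
        region = range(i, j + 1)
        area = sum(d2[k] for k in region)
        # First moment (center of gravity) of the negative lobe.
        position = sum(nu[k] * d2[k] for k in region) / area
        nearest = min(region, key=lambda k: abs(nu[k] - position))
        # The inflection points lie about half a step outside the lobe.
        inflection_distance = nu[j] - nu[i] + step
        bands.append({
            "position": position,
            "height": absorbance[nearest],
            "halfwidth": 1.73 * inflection_distance,  # Lorentzian relation
        })
        i = j + 1
    return bands

# Single Lorentzian band: height 0.5, full width at half height 10 cm^-1.
grid = [1650.0 + 0.5 * k for k in range(201)]
band = [0.5 / (1.0 + 0.2 ** 2 * (v - 1700.0) ** 2) for v in grid]
bands = find_bands(grid, band)
print(bands)
```

The returned position, height and halfwidth correspond to the starting values needed for A_L, ν_max and b_L, with A_G started at zero as the text suggests. On noisy data a longer smoothing window and a threshold on the lobe area would be needed; both are left out of this sketch.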

EXPERIMENTAL

Spectral runs were made on a Perkin-Elmer Infrared Spectrophotometer, Model 521, equipped with two shaft encoders (P-E 1/100 T C encoder 415-2008, and P-E 1/1000 encoder 415-2055), digitizing the wavenumber and transmittance scales. The wavenumber encoder is directly coupled to the axis of the grating and reads units and tenths of the wavenumber. The higher digits of the wavenumber are encoded by a simple counting device. The transmittance encoder is coupled to the optical comb and its thousand positions allow the transmittance to be read to the nearest 0.1%. Both encoders are of the "single brush" type and use a "minimal switching code" which was converted to the normal BCD code for output on teletype and papertape. Synchronization of the read-out of the transmittance encoder and the wavenumber is maintained by a trigger signal. The system is shown schematically in Figure 5. Because the teletype requires one second to print the wavenumber and the corresponding transmittance, the scan speed is limited to an upper value determined by the encoding interval chosen (Table I). Because the scan speed affects the peak position, the scan speed knob of the instrument was replaced by a ten-turn precision potentiometer to increase the accuracy of the scan speed setting.

Figure 4. Detection limit for two overlapping Lorentzian functions with unequal halfband widths (abscissa: relative peak distance)

In combination with our peak-find program, a reproducibility of 0.2 cm⁻¹ was obtained for the peak position. This agrees with the measurements of Jones (21) with an identical system. For the analog system, the reproducibility of the peak position was 2 cm⁻¹. This higher value is caused by additional errors in adjusting the start-wavenumber and reading the peak position from

Figure 5. Block diagram of the digitized infrared spectrophotometer (blocks: 1000-count encoder, counter, buffer, Gray code to BCD code translator, keyboard, papertape)

Figure 6. Accuracy of the wavenumber in the 2000-800 cm⁻¹ region. The vertical lines give the confidence interval of the frequency values (abscissa: wavenumber, cm⁻¹)

Table I. Maximal and Practical Scan Speed as a Function of the Encoding Interval for a 1-Hz Printer

Encoding interval, cm⁻¹    Scan speed, cm⁻¹/min
                           Maximal    Practical
0.1                        6          4.8
0.2                        12         9.6
0.5                        30         24
1                          60         48
2                          120        96

the recorder. The accuracy of the wavenumber was determined in the 2000-800 cm⁻¹ region, through comparison with the IUPAC (32) values for peaks of water, methane, carbon monoxide, and indene (33). For standard runs, this accuracy was within 1 cm⁻¹ (Figure 6). The vertical lines in Figure 6 give the confidence interval of the frequency values found (calculated from the confidence interval of our observations and uncertainties in the calibration peaks). Because the linearity of the optical comb defines the photometric accuracy and adjustment of the 0 and 100% T is difficult in a double beam instrument, the transmittance scale was calibrated with five standard rotating sectors (RIIC, LD-52, transmittances 6.3, 12.5, 25, 50, and 71% within 0.2%). This calibration showed a linear relationship between the measured transmittance and the actual transmittance of the sectors. In practice it is, therefore, justified to perform day-to-day calibrations of the transmittance scale with only two sectors. To this end, each spectral scan is preceded by runs of the 25 and 71% transmittance sectors, the sample and reference cell being filled with solvent. The transmittance of the spectrum is then calculated as

$$T = (71 - 25)\,\frac{T_{meas} - T_{71}}{T_{71} - T_{25}} + 71 \tag{11}$$

where T_meas is the measured transmittance in the spectrum, and T_71 and T_25 are the transmittances measured for the two sectors.
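As a worked check of Equation 11, the two-sector calibration can be written as a small function. The function name `calibrate_transmittance` and the example readings (70.0 and 24.0 for the nominal 71% and 25% sectors) are hypothetical; only the linear two-point mapping comes from the text.

```python
def calibrate_transmittance(t_meas, t71_meas, t25_meas, t_high=71.0, t_low=25.0):
    """Equation 11: linear two-point correction of a measured transmittance
    (in % T) using the readings obtained for the 71% and 25% sectors."""
    scale = (t_high - t_low) / (t71_meas - t25_meas)
    return scale * (t_meas - t71_meas) + t_high

# Hypothetical sector readings of 70.0 and 24.0 % T.
for reading in (70.0, 47.0, 24.0):
    print(reading, calibrate_transmittance(reading, 70.0, 24.0))
```

By construction the sector readings themselves map back to 71 and 25% T, which is a quick way to confirm the signs in the formula.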

RESULTS AND DISCUSSION

Curve Fitting of Simulated Profiles. The curve fitting program was first tested on theoretical spectra, so that the mathematical model (Equation 2) is an exact description of the individual band shape. The main parameter tested is the degree of overlap of adjacent bands.

i) The number of inflection points is twice the number of peaks. In this case all peaks are detectable by the peak-find program, and curve fitting of those profiles always gave good results (accuracy better than 0.5% of the area or peak absorbance). The speed of convergence of the iteration procedure and, hence, the computation time was influenced by the accuracy of the initial values of the parameters, estimated by the peak-find program. The accuracy of the estimate of the peak position appeared to be the most important factor (Table II). Other parameters such as the ratio of the Lorentzian and Gaussian height had only a minor influence.

ii) The number of inflection points is less than twice the number of peaks. In this case the peak-find program is unable to detect all peaks, so that the exact number of underlying bands must either be known beforehand or derived from the curve-fit procedure. Table III shows the results for a composite system with a known number of absorption bands. Up to three bands are adequately recovered, but a four-band system introduces errors greater than 15% in the absorbance or in the area of the individual bands. It is important to note that in all cases the root mean square deviation (DIS) is less than 0.06%. Conversely, this means that a good fit of a composite band system is a necessary, but certainly not a sufficient condition for a good recovery of quantitative data. Even when the peak positions of the underlying bands are known and kept constant during the computation, misfits are found with an error of about 15%. Table III shows that this is true for band shapes that are a sum of Gaussian and Lorentzian functions as well as for pure Gaussian or Lorentzian functions. When the number of bands in the profile is not known, but less than four, it can possibly be assessed from the root-mean square deviation by curve fitting the profile with a varying number of bands. The results in Figure 7 show that the root-mean square deviation indeed diminishes as the number of bands included in the fitting procedure becomes equal to the actual number of bands in the profile. If the number of bands is increased beyond the actual number present in the profile, the root-mean square deviation

Table II. Number of Iterations (n), Computation Time (t), and Root Mean Square Deviation of the Residuals (DIS), as a Function of the Accuracy of the Initial Parameters for a Two-Band System (δ = 3, R = 2)

Columns: deviations from true values (ΔA_G in absorbance units; Δν in cm⁻¹; Δ(Δν½) in cm⁻¹), each for Peak 1 and Peak 2; n; t, sec; DIS, % T.

Cell values (column alignment lost in extraction): 0 0 0 co.1 0 -0.1; 0 0 0; 0 0 0; 0 0 0; 0 0; +5 +10; +5 +10; 6 6; -0.1; -0.2; 0; +2.5 +2.5; +5 +5; 0 0 0; 0 0 0; 5 5 9; +0.2; 0 0 +10 0; +5; 1-5; 15; 0; 0; 0; +0.2; 4.1; 4.2; 4; 11 6; 4 5; 6 6

DIS: 0.3 × 10⁻¹ % T in all six cases.

Table III. Accuracy of the Calculated Value for the Peak Absorbance (A) and Band Area (S) for Multiple Band Systems, with Gaussian, Lorentzian, and Mixed Lorentzian-Gaussian Shape, Respectively (a)

Columns: ΔA_i/A_i and ΔS_i/S_i for each band i; DIS. Rows are labeled G + L, L, or G according to band shape.

I. Two-band system: R = 0.5; δ = 1; G + L
II. Three-band system: R12 = 1; δ12 = 0.84; R13 = 1; δ23 = 1
Cell values for I and II (column alignment lost in extraction): 10.3; -1.5; 5; -1.5; +0.3; 3; 3; 0.3 × 10 (exponent illegible); -1.5; +0.2; +5; -2.5; -3; 45; 0.2 × 10⁻¹

III. Four-band system: R12 = 1.1; R13 = 1.9; R14 = 1; δ12 = 0.54; δ23 = 0.2; δ34 = 0.8
IV. Same as III, but with peak positions fixed
Cell values for III and IV (column alignment lost in extraction): -2 -1 3; +1 -1 7 -1 1; +9; -3 -3 3 -2 1; +3 +22 +13; DIS 0.3 × 10 (exponent illegible), 0.6 × 10⁻¹, 0.3 × 10 (exponent illegible); -2 5 -1 6; +6 415; DIS 0.6 × 10⁻¹, 0.2 × 10⁻¹; 4 +15 +12; 4 1; 4 +21 +5; +3 +16 +15; -2 7 -1 8; -3 4 +6; L -5 +21 -1 3 +8; G +12 -1 0; -3 7; +12 -1 5

(a) All deviations are stated in %, except for DIS, which is in % T.

maintains its low value. The limiting value observed on these synthetic, noisefree spectra (i.e., DIS = 0.03% 2') is, of course, much smaller than can be expected for real spectra with a noise of, for example, 0.2% T. Therefore, the root-mean square deviation of the curve-fit procedure can only be used to estimate the minimum number of bands required to describe the composite profile. Consequently, curve-fit procedures are neither useful as peak-find routines, nor are they applicable when the exact number of bands in the profile is unknown. However, under certain circumstances, the maximum number of bands in the complex band can be assessed by factor analysis. Bulmer and Shurvell (34, 35) indeed have shown that if solutions of various concentrations of the compounds causing the complex band are available, factor analysis supplied with the accuracy of the measurements can predict the maximum number of bands in the complex band. In this way shape analysis and factor analysis complement one another. Influence of Minor I n t e r f e r i n g Bands i n the Profile. The influence of an underlying, not detectable, small band on the results of curve fitting a two-band system, was investigated. To this order, three two-band systems were simulated and disturbed with a small band with an area of 1.5% of the total area of the profile (4.5% of the small and 2.3% of the large band). Differences between the true and calculated area and absorbance of the two bands in the profile were calculated as a function of the position of the interfering band (Figure 8 and 9). The conclusion is that unacceptable deviations (up to 15% in area) occur, especially when the interfering peak is

-3 7

+12 -1 5

situated on the wings of a strongly overlapped two-band system. Again, in all cases DIS is less than 0.3% T. These observations strongly support our earlier conclusion that there is no sense in applying curve-fit procedures unless the exact number of bands is known. The fit may be excellent, but the derived band parameters can be seriously in error if only a minor unsuspected band is overlooked.

ANALYTICAL CHEMISTRY, VOL. 47, NO. 13, NOVEMBER 1975 • 2129

Figure 7. Effect of the difference between the number of bands included in the fitting model and the actual number of bands in the profile on the root-mean-square deviation. (Abscissa: number of peaks in model.)

Figure 8. Influence of an underlying, not detectable small band on the accuracy of the calculated absorbance of the bands of a two-band profile. System 1: R = 0.5, δ = 3. System 2: R = 0.5, δ = 2. System 3: R = 0.5, δ = 1.5. (Abscissa: relative distance of the interfering peak to the first peak of the two-peak system.)

Figure 9. Influence of an underlying, not detectable small band on the accuracy of the calculated area of the bands of a two-band profile. System 1: R = 0.5, δ = 3. System 2: R = 0.5, δ = 2. System 3: R = 0.5, δ = 1.5. (Abscissa: relative distance of the interfering peak to the first peak of the two-peak system.)

Influence of the Base-Line Detection upon the Accuracy of the Calculated Band Area and Peak Absorbance. Because the transmittance scale is calibrated with sectors for sample and reference cells filled with solvent, it might be expected that the base line in the normalized spectrum is always located at 100% transmittance. This is not borne out in practice, and the true position of the base line must be determined. This can be done through the curve-fit procedure (Equation 2) or independently. In the former case, the approximate position of the base line must be derived from the recorded spectrum to start the curve-fit iterations. To this end, all data points of the spectrum are collected into a histogram with intervals of 1% transmittance. The transmittance interval with the maximum number of data points is taken as the estimated base-line position with slope zero. The data in Table IV show that this procedure (floating base line) requires that a certain minimal spectral range, known to be free of absorption bands, be sampled to obtain reliable results. In the alternative case, the base line is determined independently from a spectral run with sample and reference cells filled with solvent, after which this base line remains fixed during the curve fitting of the actual absorption spectrum. The data in Table IV demonstrate that in this case the demands on the sampled spectral region are relaxed. In fact, a correct result is still obtained for the peak absorbance of a single band if only the points extending to one-half bandwidth on either side of the maximum are sampled.

These results are even more important in the case of a composite band system that never reaches the actual base line (zero sample absorbance). If we want to use the floating base-line routine, we must sample the entire band system and a certain range next to it, even if we are only interested in one central band. This means that in many cases more bands must be curve-fitted than the ones in which we are actually interested. If, on the other hand, we wish to employ the fixed base-line routine, we must extend the curve-fitting to include extreme bands on either side of the range of interest that are free from overlap out to one-half bandwidth from their maximum. In this case, therefore, the number of additional bands to be fitted is generally less than in the previous case.

Figure 10. Effect of the degree of overlap (δ) and relative peak height (R) on the accuracy of the calculated absorbance of an experimental IR band

Curve-Fitting of Experimental Spectra. The influence of the difference between the actual band shape of an IR band and the mathematical model used for the calculation of the parameters involved was investigated for a system composed of two closely neighboring carboxyl bands. The differences between the peak absorbances and band areas of a single-band system and a two-band system were calculated as a function of the relative peak height (R) and the degree of overlap (δ). As was to be expected, Figure 10 shows that the error in the calculated absorbance increases


Table IV. Band Area, Peak Absorbance, and Base-Line Position as a Function of Sampling Interval for the Cyanide Band at 2250 cm-1 of Benzylcyanide, for a Fixed and Floating Base Line, Respectively

Sampled range,       Floating base line                         Fixed base line
ν0 ± k(Δν1/2), k     Peak         Band area,   Base-line        Peak         Band area,
                     absorbance   cm-1         position(a)      absorbance   cm-1
3.5                  0.335        6.845        +0.002           0.336        6.957
3                    0.337        6.952        (?)0.001         0.336        6.954
2.5                  0.336        6.943        (?)0.002         0.337        6.956
2                    0.337        6.953        -0.002           0.337        6.956
1.5                  0.337        6.320        +0.008           0.336        6.956
1                    0.349        7.752        -0.017           0.336        6.829
0.75                 0.271        3.641        +0.059           0.336        6.875
0.5                  0.277        3.865        +0.052           0.336        6.663

(?) Sign not legible.
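The histogram-based starting estimate for the floating base line, as described in the text, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' program: the function name `estimate_baseline`, the choice of returning the centre of the most populated interval, and the synthetic Lorentzian test band (base line at 98.4% T, band depth 40% T, abscissa in units of the bandwidth) are all assumptions introduced here.

```python
import math
from collections import Counter


def estimate_baseline(transmittance, bin_width=1.0):
    """Return the centre of the most populated transmittance interval.

    Returning the bin centre is an assumption; the text only states that
    the fullest 1% T interval is taken as the base-line estimate.
    """
    counts = Counter(math.floor(t / bin_width) for t in transmittance)
    fullest_bin = counts.most_common(1)[0][0]
    return (fullest_bin + 0.5) * bin_width


def synthetic_band(x, baseline=98.4, depth=40.0, bandwidth=1.0):
    """Hypothetical single Lorentzian band in % transmittance."""
    return baseline - depth / (1.0 + (2.0 * x / bandwidth) ** 2)


# Wide sampling range: +/- 10 bandwidths around the band centre.
wide = [synthetic_band(i * 0.01) for i in range(-1000, 1001)]
# Narrow sampling range: +/- 1 bandwidth, no band-free region included.
narrow = [synthetic_band(i * 0.01) for i in range(-100, 101)]

print("wide range estimate:  ", estimate_baseline(wide))
print("narrow range estimate:", estimate_baseline(narrow))
```

With the wide range, most points lie in the band-free wings and the fullest interval contains the base line. With the narrow range no base-line points are sampled, so the estimate falls somewhere on the band itself, which is consistent with the requirement that a band-free spectral range be included.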

Table V. Limits and Applicability of Curve Fitting
[Decision scheme. Legible entries: curve-fitting procedure; band parameters; recognition of the minimum number of bands: yes; the number of bands is ...; mathematical model is ex...; three or less bands between two inflection points; more than three bands between two inflection points.]

Figure 11. Absorbance ratios of the methylene and methyl bands vs. the number of methylene groups for n-alkanes, calculated with a fitting model including, respectively, four, five, and six bands, and compared with Zenker's results. (Abscissa: number of CH2 groups.)

with increasing degree of overlap and increasing relative peak height. An overlap of δ > 2 and relative height R < 2 caused an error of less than 5%. In the case of a greater overlap, e.g., δ < 2, the difference between the absorbance found for a single band and that found for the same band when overlapped by the other band was 10% or more. With reference to Figures 3 and 4, we may, therefore, conclude that for experimental IR spectra, curve-fitting yields good results for bands that are sufficiently separated to be detected by our peak-find program. If the two bands approach each other so closely that they cannot be separately detected, curve-fitting still yields correct results on synthetic spectra (provided we know there are two bands present), but no longer on actual, experimental spectra. The reason for this is probably a slight asymmetry of true IR bands, causing the Gauss-Lorentz ratio of the model function to change with the degree of overlap.
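How the separability of two bands depends on δ and R can be illustrated with a small synthetic profile. The sketch below is only a rough proxy and rests on several assumptions made here: pure Lorentzian band shapes of equal width (the study uses a Gauss-Lorentz model), δ taken as the peak separation in units of the bandwidth, R taken as the ratio of the peak heights, and a simple count of local maxima standing in for the peak-find program, whose details are not given. All function names are hypothetical.

```python
def lorentzian(x, center, bandwidth):
    """Unit-height Lorentzian with the given full width at half maximum."""
    return 1.0 / (1.0 + (2.0 * (x - center) / bandwidth) ** 2)


def two_band_profile(x, rel_height, overlap, bandwidth=1.0):
    """Large band at x = 0 plus a second band of height rel_height,
    displaced by overlap * bandwidth (assumed meaning of delta)."""
    return (lorentzian(x, 0.0, bandwidth)
            + rel_height * lorentzian(x, overlap * bandwidth, bandwidth))


def count_maxima(values):
    """Count local maxima in a sampled, noise-free profile."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    )


# Abscissa in units of the bandwidth, sampled every 0.01.
grid = [i * 0.01 for i in range(-1000, 1501)]

for overlap in (3.0, 2.0, 1.5, 0.3):
    profile = [two_band_profile(x, 0.5, overlap) for x in grid]
    print("delta =", overlap, "-> maxima found:", count_maxima(profile))
```

In this idealized, noise-free setting the two maxima merge into one only at very small separations. On real spectra, noise and band asymmetry would be expected to make detection fail considerably earlier.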

As another example, we selected the data from Zenker (36), who has shown that the absorbance of the antisymmetric CH2- and CH3-stretching vibrations of normal alkanes is linearly related to the number of carbon atoms. In order to derive this relationship for the individual vibration bands in the strongly overlapped composite system, Zenker used an ingenious solution. During measurement of the CH2 bands, the CH3 absorption was compensated with hexamethyldisilazane in the reference beam and, for the measurement of the CH3 bands, cyclohexane was used to compensate for the CH2 absorption. Because the bands of the symmetric and antisymmetric CH2 and CH3 stretching vibrations around 2900 cm-1 are mutually separated with δ-values between 2 and 4, this band system should be amenable to curve-fitting. However, if only four bands are used in the fitting routine, incorrect results are obtained. This is due to the presence of two overtone bands. Figure 11 shows that a correct linear relationship in good agreement with Zenker's results is obtained if the band system is fitted with five bands. Inclusion of a sixth band does not lead to further improvement.

CONCLUSIONS

The present study has shown that curve-fitting of infrared bands offers many pitfalls and should therefore not be undertaken lightly. If the bands in the system do not conform to a few basic conditions, the results can be quite inaccurate even when the fit appears to be deceptively precise. For the convenience of the reader, a simple scheme is offered in Table V from which the limits of applicability of curve-fitting can be quickly perceived. The most important requirement is that the number of bands must be known before curve-fitting is even contemplated. If this condition is fulfilled and the mathematical model gives a correct description of an individual band, then accurate results may be expected when there are no more than three bands within two inflection points. If the mathematical model is only approximate, curve-fitting can only give acceptable results for free-lying bands, i.e., the number of inflection points must be twice the number of bands. If the base line is to be calculated in the curve-fitting routine, a sufficient number of base-line points must be included in the sampled area. If the base line is determined independently, this requirement can be relaxed to the extreme bands of the system being free from overlap.
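The conditions stated in these conclusions can be written out as a small decision function. The following Python sketch is one possible reading of the text, not a transcription of Table V: the class `BandSystem`, its field names, and the function `band_parameters_reliable` are hypothetical, and the base-line requirements are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class BandSystem:
    """Hypothetical description of a band system to be curve-fitted."""
    band_count_known: bool            # is the exact number of bands known?
    model_exact: bool                 # does the model describe a single band correctly?
    band_count: int                   # number of bands in the system
    inflection_points: int            # inflection points observed in the profile
    max_bands_between_inflections: int  # largest number of bands between two inflection points


def band_parameters_reliable(system):
    """Return True if accurate band parameters may be expected."""
    if not system.band_count_known:
        # Without the exact band count, only the minimum number of
        # bands can be estimated from the fit.
        return False
    if system.model_exact:
        # Exact model: at most three bands within two inflection points.
        return system.max_bands_between_inflections <= 3
    # Approximate model: only free-lying bands, i.e. the number of
    # inflection points must be twice the number of bands.
    return system.inflection_points == 2 * system.band_count


# Two well-separated bands fitted with an approximate model.
example = BandSystem(
    band_count_known=True,
    model_exact=False,
    band_count=2,
    inflection_points=4,
    max_bands_between_inflections=1,
)
print(band_parameters_reliable(example))
```

Keeping the three conditions as separate branches mirrors the order in which the text introduces them: band count first, then model quality, then the inflection-point criterion.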

ACKNOWLEDGMENT

The authors express their gratitude to J. H. Kelderman for helpful discussions about the computer programs and for his permission to exploit his peak-find procedure before publication.

LITERATURE CITED

(1) R. D. B. Fraser and E. Suzuki, Anal. Chem., 41, 37 (1969).
(2) K. S. Seshadri and R. N. Jones, Spectrochim. Acta, Part A, 19, 1013 (1963).
(3) R. P. Young and R. N. Jones, Chem. Rev., 71, 219 (1971).
(4) A. M. Kabiel and C. H. Boutros, Appl. Spectrosc., 22, 121 (1968).
(5) F. C. Strong III, Appl. Spectrosc., 23, 593 (1969).
(6) J. Pitha and R. N. Jones, Can. Spectrosc., 11, 14 (1966).
(7) J. Pitha and R. N. Jones, Can. J. Chem., 44, 3031 (1966).
(8) R. N. Jones et al., Nat. Res. Counc. Can., Bull. No. 11 (1968).
(9) R. N. Jones et al., Nat. Res. Counc. Can., Bull. No. 12 (1968).
(10) R. N. Jones et al., Nat. Res. Counc. Can., Bull. No. 13 (1969).
(11) A. H. Anderson, T. C. Gibb, and A. B. Littlewood, J. Chromatogr. Sci., 6, 640 (1970).
(12) A. H. Anderson, T. C. Gibb, and A. B. Littlewood, Anal. Chem., 42, 434 (1970).
(13) A. H. Anderson, T. C. Gibb, and A. B. Littlewood, Chromatographia, 2, 466 (1969).
(14) D. Audo, Y. Armand, and D. Arnaud, J. Mol. Struct., 2, 287 (1968).
(15) D. Audo, Y. Armand, and D. Arnaud, J. Mol. Struct., 2, 409 (1968).
(16) J. W. Perram, J. Chem. Phys., 49, 4245 (1968).
(17) A. R. Davis et al., Appl. Spectrosc., 26, 384 (1972).
(18) J. R. Beacham and K. L. Andrew, J. Opt. Soc. Am., 61, 231 (1971).
(19) J. Pitha and R. N. Jones, Can. J. Chem., 45, 2347 (1967).
(20) L. M. Schwartz, Anal. Chem., 43, 1336 (1971).
(21) R. N. Jones, Pure Appl. Chem., 18, 303 (1969).
(22) H. V. Drushel et al., Anal. Chem., 40, 370 (1968).
(23) D. Papousek and J. Pliva, Collect. Czech. Chem. Commun., 30, 3007 (1965).
(24) H. Stone, J. Opt. Soc. Am., 52, 998 (1962).
(25) P. C. Jurs, Anal. Chem., 42, 747 (1970).
(26) J. Meiron, J. Opt. Soc. Am., 55, 1105 (1965).
(27) J. R. Morrey, Anal. Chem., 40, 905 (1968).
(28) E. Grushka and G. C. Monacelli, Anal. Chem., 44, 464 (1972).
(29) A. W. Westerberg, Anal. Chem., 41, 1770 (1969).
(30) A. Savitzky and M. J. E. Golay, Anal. Chem., 36, 1627 (1964).
(31) A. Ralston and H. S. Wilf, Ed., "Mathematical Methods for Digital Computers II", Wiley & Son, New York, N.Y., 1967, p 192.
(32) IUPAC (International Union of Pure and Applied Chemistry), Butterworths, London, 1961.
(33) R. N. Jones and A. Nadeau, Spectrochim. Acta, 20, 1175 (1964).
(34) J. T. Bulmer and H. F. Shurvell, J. Phys. Chem., 77, 256 (1973).
(35) J. T. Bulmer and H. F. Shurvell, J. Phys. Chem., 77, 2085 (1973).
(36) W. Zenker, Anal. Chem., 44, 1235 (1972).

RECEIVED for review March 18, 1975. Accepted July 1, 1975.

Analytical Lines for Long-Path Infrared Absorption Spectrometry of Air Pollutants

Bruce M. Golden and Edward S. Yeung

Ames Laboratory-ERDA and Department of Chemistry, Iowa State University, Ames, Iowa 50010

A scheme for selecting resonant frequencies for the analysis of gaseous air pollutants using long-path absorption of narrow-band infrared sources is presented. A computer search is conducted using existing spectrometric data to determine lines with minimum interference and maximum sensitivity. Results are given for the pollutants O3, N2O, CO, CH4, and the nonpolluting species, H2O and CO2.

With the availability of narrow-band infrared laser sources (1-3), there has been increased interest in using long-path infrared absorption for air pollution monitoring (3-7). Since the emission profile of the laser is very narrow relative to vibrational-rotational absorption lines at atmospheric pressure, by the proper choice of absorption frequency, determination of a given pollutant should be accomplished with high sensitivity and selectivity. Until now, however, no attempt has been made to determine absorption frequencies which are the most suitable for analysis of gaseous pollutants, i.e., those frequencies where the contributions to the total absorbance of the pollutant of interest are high and all other contributions are low. Consequently, it has been common to rely on accidental coincidences of gaseous absorption lines with fixed-frequency lasers (3, 4, 7), or on the assumption that only lines in the “atmospheric windows” are useful (5, 7). To avoid such a hit-and-miss approach, and to establish the best spectral lines for infrared absorption spectrometry for the common air pollutants, we have devised a scheme which allows a systematic search of all reasonably strong absorption lines of a given gas and determines which lines are the most suitable for use as analytical lines (AL’s). As a test of the method, AL’s for six common atmospheric constituents have been determined and the results are critically evaluated.
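The selection criterion stated above, a high absorbance contribution from the pollutant of interest and low contributions from everything else, can be illustrated with a short ranking routine. The Python sketch below is an assumption-laden illustration, not the search program referred to in the text: the score (the pollutant's fraction of the total absorbance), the strength threshold, the names `CandidateLine` and `rank_analytical_lines`, and all numerical values are hypothetical and carry no spectroscopic meaning.

```python
from dataclasses import dataclass


@dataclass
class CandidateLine:
    """Hypothetical candidate line; values are in arbitrary units."""
    frequency: float      # line position (made-up numbers)
    target: float         # absorbance due to the pollutant of interest
    interference: float   # summed absorbance of all other species


def selectivity(line):
    """Fraction of the total absorbance due to the target species
    (an assumed figure of merit, not taken from the source)."""
    total = line.target + line.interference
    return line.target / total if total > 0 else 0.0


def rank_analytical_lines(candidates, min_target_absorbance):
    """Keep reasonably strong lines and order them by selectivity."""
    strong = [c for c in candidates if c.target >= min_target_absorbance]
    return sorted(strong, key=selectivity, reverse=True)


# Made-up candidates: strong but heavily interfered, moderately strong
# and clean, and clean but too weak to be useful.
candidates = [
    CandidateLine(frequency=1000.0, target=0.30, interference=0.30),
    CandidateLine(frequency=1010.0, target=0.20, interference=0.02),
    CandidateLine(frequency=1020.0, target=0.01, interference=0.00),
]

for line in rank_analytical_lines(candidates, min_target_absorbance=0.05):
    print(line.frequency, round(selectivity(line), 3))
```

Filtering on strength before ranking on selectivity keeps a very weak but interference-free line from outranking a usable one. Any real application would replace the made-up inputs with computed spectra for the chosen concentrations.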

COMPUTATIONAL PROCEDURE

Our interest will be concentrated on the infrared spectral region from 4 to 20 μm, where all gaseous pollutants have vibrational-rotational resonances and where diode lasers and gas lasers are known to work well. Of special interest in this region are H2O and CO2, the nonpolluting atmospheric constituents, whose strong absorptions in certain parts of this region make them a major source of interference in determining pollutant concentrations. In this spectral region, reliable spectral information is available (8, 9) for each of the six molecules H2O, CO2, O3, N2O, CO, and CH4, and our calculations are based on this. The calculations can readily be extended to include other pollutants, e.g., NO2, NO, SO2, H2S, H2CO, HNO3, once the appropriate spectral information becomes available. The entire infrared spectrum of each species is computed from 4 to 20 μm using an IBM 360/65 computer. The details of the computer programs will not be presented here, but will be supplied on request (10). The main reason for computing the entire spectrum of each molecule is that, after the initial calculation of the individual spectra, “backgrounds” for any set of concentrations are easily constructed and are available for all frequencies within the spectral region. Calculation of the degree of interference at an arbitrary frequency is thus facilitated. This is important because the frequencies of the AL’s of a given pollutant are not known in general. Actual computation of a molecule’s spectrum involves
