Analysis of Flow Injection Peaks with Orthogonal Polynomials

Anal. Chem. 1994, 66, 971-982

O. Lee,† G. A. Dumont,‡ P. Tournier,‡ and A. P. Wade*,†
Departments of Chemistry and Electrical Engineering, c/o Pulp and Paper Centre, University of British Columbia, 2385 East Mall, Vancouver, BC, V6T 1Z4, Canada

Digitized transient signals such as those acquired in flow injection analysis may be decomposed by a generalized Fourier expansion into a weighted linear combination of discrete orthogonal polynomials. Together, the coefficients from such an expansion form a spectrum analogous to the magnitude spectrum of a discrete Fourier transform and provide a useful alternative means of signal identification. This flexible method of representing peak shapes in flow injection (and elsewhere) is not reliant upon any single mathematical model. Two families of functions, the Gram and Laguerre polynomials, were investigated. Both series were found to be sensitive to changes in peak shape and able to represent important features of flow injection time domain signals. Indeed, a small number of coefficients was sufficient to accurately approximate even highly bifurcated peaks. The Laguerre spectrum has a characteristic profile similar to that of the actual peak while the Gram spectrum typically has the characteristics of an ac transient signal. The Laguerre spectrum is more computationally expensive to produce since it requires optimization of a time scale parameter; a method for this is described. The utility and robustness of these representations are evaluated on real and simulated data. About 20-25 Gram coefficients and 7-10 Laguerre coefficients were found to provide a near-optimal balance between the ability to discriminate between various peak-shaped signals and robustness to noise. Abnormal peak shapes are readily identified.

Flow injection analysis (FIA) systems produce output typically in the form of a skewed-Gaussian peak-shaped transient signal. Parameters such as height or area can readily be extracted from this signal for the purpose of quantifying the concentration of an analyte.
Measurement of broadness, in the form of full width at half-maximum (fwhm) or peak standard deviation, provides an alternative criterion for use in quantitation and manifold design.1,2 Useful as these traditional parameters are, much more information is potentially available in other parameters that describe aspects of peak shape. Studies on chromatographic peaks have shown that analysis of peak profiles can be of both fundamental3 and practical importance.4-6 In FIA, the diagnostic information contained within the shape of the peak has long been recognized and has often been used in an ad hoc fashion by the operator to detect incorrect or nonoptimal analyzer operation. More rigorous methods, particularly empirical descriptive methods, have appeared in the literature and have been reviewed by Hull et al.13 The use of empirical methods has been motivated by (i) the lack of a usable method for predicting the FIA response curve based on the geometric parameters of the manifold, physical factors (viscosity, flow rates, etc.), and the chemistry involved and (ii) the increasing complexity of response profiles deliberately produced in FIA. Indeed, it is surprising that more work has not been done in this area previously.

Although many workers have made significant contributions, a general derivation for FIA peak shapes has thus far eluded theoreticians. The complexity of the problem is due to the variety of manifolds (their components and geometry),14 modes of operation (e.g., merging zones,15 flow reversal,16 sinusoidal flow,17 and sequential injection18), and the different physical and chemical environments (e.g., concentration gradients, refractive index gradients, liquid-liquid interfaces, etc.) that can be realized (deliberately or otherwise) in FIA systems. Together, these affect critical factors such as convection, diffusion, mixing, chemical kinetics, and selectivity of reaction. Wada et al.19 demonstrated the dependence of peak profiles on mixing coil geometry and joints. Bifurcated peaks can result if (i) insufficient reagent is available for reaction, (ii) mixing is incomplete between reagent and sample zones, (iii) the sample matrix differs appreciably from the carrier,20 (iv) the sample matrix retards the desired chemical reaction (e.g., through abnormal pH (ref 8)), and/or (v) the sample is "sandwiched" between different reagents. Finally, theoretical expressions cannot account for factors or events which are stochastic and/or emanate from outside the scope covered by FIA theory (e.g., sporadic introduction of gas bubbles or particulates).

Our interest in peak shape analysis is motivated by our development of computer-controlled automated FIA systems21 and renewed interest in their application to on-line process monitoring and control.22 In the past, much effort has been expended in the development of novel manifold components, modes of operation, and optimization schemes. While these advances are valuable contributions to the establishment of FIA as a method of choice for chemical process monitoring (and the evolution of FIA in general), they cannot ensure against faulty analyzer operation. This is of practical concern in an (often harsh) industrial environment where these systems may encounter perturbations from normal operation, e.g., temperature differences between sample and reagent(s), large changes in sample pH, introduction of gases or particulates, and accelerated aging or fouling of system components. There is, therefore, a need to evaluate the quality of a determination simultaneously with the quantity of analyte, especially if the analyzer is to operate independently of human supervision. This task can be realized quite economically if an appropriate form of intelligence can be embedded with the data analysis and control algorithms. From the foregoing, schemes which involve postdetection data processing are presently the most viable. In these, problems are not anticipated, but simply recognized and accounted for. This approach can readily be handled by empirical pattern recognition methods. These techniques require a suitable set of inputs which contain the information necessary for classification.

† Department of Chemistry.
‡ Department of Electrical Engineering.
0003-2700/94/0366-0971$04.50/0 © 1994 American Chemical Society
Analytical Chemistry, Vol. 66, No. 7, April 1, 1994

(1) Ruzicka, J.; Hansen, E. H. Flow Injection Analysis, 2nd ed.; Wiley: New York, 1988.
(2) Brooks, S. H.; Leff, D. V.; Hernandez Torres, M. A.; Dorsey, J. G. Anal. Chem. 1988, 60, 2737-2744.
(3) Vidal-Madjar, C.; Guiochon, G. J. Chromatogr. 1977, 142, 61-86.
(4) Giddings, J. C. Sep. Sci. Technol. 1977, 19, 831-847.
(5) Mott, S. D.; Grushka, E. J. Chromatogr. 1976, 126, 191-204.
(6) Debets, H. J. G.; Wijnsma, A. W.; Doornbos, D. A.; Smit, H. C. Anal. Chim. Acta 1985, 171, 33-43.
(7) Van der Linden, W. E. Anal. Chim. Acta 1986, 179, 91-101.
(8) Heeschen, W. A.; Greminger, D. C.; Yalvac, E. D. Process Control Qual. 1990, 2, 215-224.
(9) Wade, A. P.; Shiundu, P. M.; Wentzell, P. D. Anal. Chim. Acta 1990, 237, 361-379.
(10) Wentzell, P. D.; Wade, A. P.; Crouch, S. R. Anal. Chem. 1988, 60, 905-911.
(11) Van Nugteren-Osinga, I. C.; Bos, M.; Van der Linden, W. E. Anal. Chim. Acta 1988, 214, 17-86.
(12) Reijn, J. M.; Poppe, H.; Van der Linden, W. E. Anal. Chem. 1984, 56, 943-948.
(13) Hull, R. D.; Malick, R. E.; Dorsey, J. G. Anal. Chim. Acta 1992, 276, 1-24.
(14) Wada, H.; Hiraoka, S.; Yuchi, A.; Nakagawa, G. Anal. Chim. Acta 1986, 179, 181-188.
(15) Bergamin, H.; Zagatto, E. A. G.; Krug, F. J.; Reis, B. F. Anal. Chim. Acta 1978, 101, 17-23.
(16) Betteridge, D.; Oates, P. B.; Wade, A. P. Anal. Chem. 1987, 59, 1236-1238.
(17) Ruzicka, J.; Marshall, G. D.; Christian, G. D. Anal. Chem. 1990, 62, 1861-1866.
(18) Ruzicka, J.; Marshall, G. D. Anal. Chim. Acta 1990, 237, 329-343.
(19) Wada, H.; Sawa, Y.; Morimoto, M.; Ishizuki, T.; Nakagawa, G. Anal. Chim. Acta 1989, 220, 293-297.
(20) Kester, M. D.; Shiundu, P. M.; Wade, A. P. Talanta 1992, 39, 299-312.
One may simply use the entire peak (typically consisting of 200-500 points taken at a rate of perhaps 5-20 Hz), but this is excessive: not all the data points collected can be expected to provide unique information, and large data records make analysis computationally expensive. The goal of this approach then is to develop a reduced set of descriptors of higher information content which are, singly or in combination, sensitive to appropriate aspects of peak shape. Ideally, these should be robust to noise, easy to compute, and generally applicable. Once developed, they may be used as inputs to expert systems, neural networks, or any one of many pattern recognition methods.

Various methods have been employed in analytical chemistry to describe transient response curves. The most popular of these treats a peak profile as a distribution function and describes it with statistical moments. This approach has found widespread use in chromatography since moments completely specify any peak,23 are easy to interpret, and have been linked to fundamental processes.24 However, the evaluation of the moments depends on the integration interval, and the higher moments are increasingly susceptible to noise if computed numerically.25 To circumvent the latter problem, some workers have chosen to fit the peak profile to a corresponding mathematical model (e.g., Gram-Charlier series3 and Edgeworth-Cramer series26,27) from which the relevant parameters are then extracted. Other models have been used and a summary of these has been given by Mott and Grushka.5 The effectiveness of this approach, however, depends on how well the model can account for the physical factors contributing to the signal. A poorly chosen model can introduce significant bias. Furthermore, for nonlinear models, starting values of model parameters must first be estimated.

A more versatile method involves decomposition of the signal into a weighted linear combination of orthogonal functions via a process known as generalized Fourier expansion.28,29 The prototype, and best known, is the Fourier series expansion in which complex exponential functions are used. However, trigonometric functions may not be the most appropriate and other functions may be more efficient.30,31 Classical orthogonal functions include the Hermite, Legendre, and Laguerre.32 Discrete analogs of these are the Krawtchouk, Gram, and Meixner polynomials, respectively. The signal can be approximated by the series to any desired degree of accuracy by increasing the number of terms used. A computational advantage afforded by orthogonal functions is that only the added terms need be evaluated, though this is not necessarily true if free parameters are present. (Free parameters can acquire any value within a defined range.) The weights subsequently form the information-bearing quantities. In fact, if each function is thought of as a "shape", then the weights quantify the contribution of that shape to the overall signal. This idea was previously recognized and exploited by Glenn33 and later by Scheeren et al.34 Though the weights may not have any direct relationship to physical parameters, the generality of this method makes it convenient and attractive for analyzing physical signals for which no exact or practical mathematical expression is available.35

Orthogonal functions have found use in approximation,36 data compression,37 filter design and filtering,38 and signal identification.39 In analytical chemistry, Gram polynomials have been used to derive general equations for Savitzky-Golay smoothing weights for one-dimensional40 and two-dimensional41 data arrays. They have also been applied extensively to aid in the spectrophotometric determination of compounds of pharmaceutical interest.42 Hassan and Loux used them to correct for spectral interferences in ICP-AES.43 Debets et al.6 used an expansion in Hermite polynomials to derive a peak separation quality criterion for chromatography based on the first two coefficients.

This paper describes the representation of flow injection response curves by a generalized Fourier expansion in discrete orthogonal polynomials. Two families are considered: Gram and Meixner. Gram polynomials are commonly used for unweighted least-squares fitting. Meixner (and Laguerre) polynomials have been used to represent process dynamics for process control purposes.44,45 From a control theoretician's point of view, flow injection peaks resemble the response of a stable dynamic system when subjected to a pulse.31 This is hardly surprising since physically this is exactly what those signals are. Hence, Laguerre and Laguerre-like polynomials are potentially an ideal choice for representing FIA signals. For consistency with previous work, the Meixner polynomials will hereafter be referred to as Laguerre polynomials. Both families of functions are evaluated, with simulated and real data, for their sensitivity to different flow injection peak shapes and their robustness to noise.

(21) Wentzell, P. D.; Hatton, M. J.; Shiundu, P. M.; Ree, R. M.; Wade, A. P.; Betteridge, D.; Sly, T. J. J. Autom. Chem. 1989, 11, 227-234.
(22) Kester, M. D.; Horner, J. A.; Nicolidakis, H.; Wade, A. P.; Wearing, J. T. Process Control Qual. 1992, 2, 305-320.
(23) Grushka, E.; Myers, M. N.; Schettler, P. D.; Giddings, J. C. Anal. Chem. 1969, 41, 889-892.
(24) McQuarrie, D. A. J. Chem. Phys. 1963, 38, 437-445.
(25) Chesler, S. N.; Cram, S. P. Anal. Chem. 1971, 43, 1922-1933.
(26) Harris, K. R. J. Solution Chem. 1991, 20, 595-606.
(27) Dondi, F.; Betti, A.; Blo, G.; Bighi, C. Anal. Chem. 1981, 53, 496-504.
(28) Lee, Y. W. Statistical Theory of Communication; Wiley: New York, 1960.
(29) Erdélyi, A. Higher Transcendental Functions; McGraw-Hill: New York, 1953; Vol. II.
(30) Deutsch, R. System Analysis Techniques; Prentice-Hall: Englewood Cliffs, NJ, 1969.
(31) Dumont, G. A. Chemom. Intell. Lab. Syst. 1990, 8, 275-279.
(32) Abramowitz, M.; Stegun, I. A. Handbook of Mathematical Functions; National Bureau of Standards: Washington, DC, 1964.
(33) Glenn, A. L. J. Pharm. Pharmacol. 1963, 15 (Suppl.), 123T-130T.
(34) Scheeren, P. J. H.; Klous, Z.; Smit, H. C.; Doornbos, D. A. Anal. Chim. Acta 1985, 171, 45-60.
(35) Young, T. Y.; Huggins, W. H. IRE Trans. Circuit Theory 1962, CT-9, 362-370.
(36) Ralston, A.; Rabinowitz, P. A First Course in Numerical Analysis, 2nd ed.; McGraw-Hill: New York, 1978.
(37) Ahmed, N.; Rao, K. R. Orthogonal Transforms for Digital Signal Processing; Springer-Verlag: New York, 1975; Chapter 9.
(38) Paraskevopoulos, P. N.; King, R. E. Int. J. Circuit Theory Appl. 1977, 5, 81-91.
(39) Clement, P. R. J. Franklin Inst. 1982, 313, 85-95.
(40) Gorry, P. A. Anal. Chem. 1990, 62, 570-573.
(41) Kuo, J. E.; Wang, H.; Pickup, S. Anal. Chem. 1991, 63, 630-635.
(42) Abdel-Hamid, M. E.; Abuirjeie, M. A. Analyst 1987, 112, 895-897.

(43) Hassan, S. M.; Loux, N. T. Spectrochim. Acta B 1992, 45, 719-729.
(44) Zervos, C. C.; Dumont, G. A. Int. J. Control 1988, 48, 2333-2359.
(45) Dumont, G. A.; Zervos, C. C.; Pageau, G. Automatica 1990, 26, 781-787.

THEORY
Definitions and Properties of Orthogonal Polynomials. A family of real functions $p_n(x)$ for n = 1, 2, ... is said to be orthonormal over the interval [a, b] if and only if

\[ \int_a^b p_m(x)\,p_n(x)\,dx = \delta_{mn} \tag{1} \]

where $\delta_{mn}$ is the Kronecker delta operator, i.e.

\[ \delta_{mn} = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{otherwise} \end{cases} \tag{2} \]

Such functions are both orthogonal and normal. This set is said to be complete in [a, b] if any function $f(t)$ that is square integrable on [a, b], i.e., for which

\[ \int_a^b f^2(t)\,dt < \infty \tag{3} \]

can be represented by

\[ f(t) = \sum_{n=1}^{\infty} c_n\,p_n(t) \tag{4} \]

where the coefficients $c_n$ are given by

\[ c_n = \int_a^b f(t)\,p_n(t)\,dt \tag{5} \]

and

\[ \lim_{N\to\infty} \int_a^b \Big[ f(t) - \sum_{n=1}^{N} c_n\,p_n(t) \Big]^2 dt = 0 \tag{6} \]

Completeness means that there is no function square integrable on [a, b] for which $c_n = 0$ for all n = 1, 2, ..., and that for any piecewise continuous function square integrable on [a, b], for an arbitrary $\varepsilon > 0$, there exists a positive integer N such that

\[ \int_a^b \Big[ f(t) - \sum_{n=1}^{N} c_n\,p_n(t) \Big]^2 dt < \varepsilon \tag{7} \]

The orthonormal series representation of $f(t)$ is such that $\sum c_n^2$ converges, and

\[ \sum_{n=1}^{\infty} c_n^2 = \int_a^b f^2(t)\,dt \]

There are several advantages to this kind of representation. As seen from eq 5, each coefficient $c_n$ is determined independently from the other coefficients. The convergence expressed by eq 6 is called convergence in the mean and is less demanding than other forms of convergence, e.g., the convergence of a Taylor series. The former requires only that the error goes to zero as $N \to \infty$, whereas the latter also requires that the derivatives of the error tend to zero and is more difficult to achieve. Thus, orthonormal representations are more versatile, converge over a large range of signals, and as a consequence are commonly used in approximation theory.

There are many orthonormal polynomials, and in theory any set could be used as long as it is complete in the range of the signals to be approximated. For the representation of FIA signals, we chose to compare the use of two very different families, the Gram polynomials and the (discrete) Laguerre polynomials. Both are discrete orthogonal polynomials; i.e., they are defined at discrete, equidistant time intervals. In such a case, the integration signs in the previous equations simply have to be replaced by summation signs.

Gram Polynomials. The Gram polynomials have been used in the past for regression analysis to fit a curve to a set of discrete data points.36 For a set of K data points, the Gram polynomials are complete in [0, K - 1]. With K = 2M + 1 points, the unnormalized expression for the nth polynomial is given explicitly by36

\[ t_n^M(k) = \sum_{r=0}^{n} \frac{(-1)^{r+n}\,(r+n)^{(2r)}\,(M+k)^{(r)}}{(r!)^2\,(2M)^{(r)}}, \qquad k = -M, ..., -1, 0, 1, ..., M \tag{8} \]

where $(c)^{(d)} = c!/(c-d)!$ is a generalized factorial function. The series is conveniently generated with the following recurrence relationship:

\[ t_{n+1}(k) = \frac{2(2n+1)}{(n+1)(2M-n)}\,k\,t_n(k) - \frac{n(2M+n+1)}{(n+1)(2M-n)}\,t_{n-1}(k), \qquad n = 0, 1, ... \tag{9} \]

with $t_0(k) = 1$ and $t_{-1}(k) = 0$. As $M \to \infty$, the Gram polynomials become the better known Legendre polynomials and thus are also known as discrete Legendre polynomials. The first five Gram polynomials are shown in Figure 1a.

Laguerre Polynomials. The use of Laguerre polynomials to represent transient signals and synthesize linear dynamic systems was first proposed by Wiener and Lee.28 More recently, Laguerre functions have been used in adaptive control44,45 and system identification. The Laguerre functions form a complete set in $L_2(0, \infty)$, i.e., for functions $f(t)$ such that

\[ \int_0^{\infty} f^2(t)\,dt < \infty \quad \text{for the continuous case} \tag{10} \]

\[ \sum_{k=0}^{\infty} f^2(k) < \infty \quad \text{for the discrete case} \tag{11} \]

In the time domain, the discrete Laguerre functions are given by

\[ \ldots \qquad n, k \ge 1 \tag{12} \]

Figure 1b depicts the first five functions. Those functions are bounded by a decaying exponential whose rate of decay is fixed by the free parameter $\xi$. From their shape, it can be seen why those functions are appealing to describe FIA signals. The nth Laguerre function can simply be generated as the pulse response of a network consisting of a first-order low-pass filter followed by n - 1 identical all-pass filters.44,45 This provides an easy and elegant way of generating those polynomials.

Figure 1. First five (a) Gram polynomials (unnormalized) and (b) Laguerre polynomials (time scale parameter = 0.975), plotted against data point. Note that only the first (K - 1) points of an infinite number are plotted for the Laguerre polynomials. Circled numbers show the sequence of functions.

Computation of the Orthonormal Series Expansion via Least-Squares Fitting. The direct way to compute the coefficients $c_n$, as given by eq 5, could be used. With the Laguerre polynomials, because we are dealing with a finite data set and a truncated series, it is more attractive to compute the coefficients using a least-squares fit of the original signal. Assuming that the FIA signal is $f(k)$ for k = 1, ..., K, then the N-dimensional vector c containing the coefficients $c_n$ that minimizes

\[ \mathrm{SSE} = \sum_{k=1}^{K} \Big( f(k) - \sum_{n=1}^{N} c_n\,p_n(k) \Big)^2 \tag{13} \]

is given by

\[ \mathbf{c} = (\mathbf{M}^T \mathbf{M})^{-1} \mathbf{M}^T \mathbf{G} \tag{14} \]

where G is the K-dimensional data vector containing the values of the signal $f(k)$. The ith column of the K × N matrix M is a K-dimensional vector containing the values of the ith polynomial. Because M is orthonormal, $\mathbf{M}^T\mathbf{M}$ is diagonal.

Provided that c is estimated from a large population of samples of the same noise-corrupted signal, the expected value of the vector c is unaffected by noise. However, given that only one such sample is available, the estimated vector c will be affected. This will become more evident as the dimension of c increases, in which case overfitting tends to occur; i.e., the orthonormal expansion starts to approximate the noise as well as the signal. It is thus important to use a parsimonious model, i.e., to use only as many polynomials as are required to achieve a satisfactory accuracy. For instance, a 257-point normalized signal acquired through a 12-bit analog to digital converter (A/D) will have a sum of squared errors (SSE) of 1.53 × 10^-5 (i.e., 257 × (2^-12)^2) arising solely from quantization. Similarly, a 0.1% of full scale measurement noise will result in an SSE of 2.57 × 10^-4 (i.e., 257 × (0.001)^2), independent of A/D resolution. The futility of trying to achieve an SSE less than those values on a real signal is thus obvious. A tool commonly used by system identification practitioners is the Akaike information criterion (AIC) defined as

\[ \mathrm{AIC} = \log\Big( \frac{1}{K} \sum_{k=1}^{K} \varepsilon^2(k) \Big) + \frac{2N}{K} \tag{15} \]

where $\varepsilon(k) = f(k) - \sum_{n=1}^{N} c_n\,p_n(k)$. It is seen that the AIC weighs the benefit of achieving good signal description through a small $\varepsilon$ versus the cost of increasing the number of coefficients needed to reach it, through N. The idea is to choose the N that minimizes the AIC. This, plus the previous considerations on realistic values for the SSE, should provide good guidelines for choosing N.

The other free parameter when the Laguerre polynomials are used is the time scale parameter $\xi$, which determines their rate of decay. To accurately describe a signal with N polynomials, it is important that the Nth polynomial does not reach zero before the signal to be approximated does. Furthermore, to preserve orthonormality of the N polynomials over the range of interest, it is necessary that they all reach zero by the end of that range. The first condition puts a lower bound on $\xi$, while the second one gives an upper bound. While the lower bound is signal dependent, the upper bound is not; it depends only on the number of points K contained in the signals. When orthonormality is preserved, the matrix $\mathbf{M}^T\mathbf{M}$ is the identity matrix. Now, the reciprocal condition number of a matrix is defined as the ratio between the smallest and the largest eigenvalues. It gives an indication of the accuracy of the results from matrix inversion, and of the sensitivity of the solution of a system of linear equations involving that matrix to errors in the data. The reciprocal condition number ranges from unity for a perfectly conditioned matrix like the identity matrix to zero for a poorly conditioned matrix. Furthermore, for a given number of polynomials N, there is an optimal time scale parameter $\xi$ that will minimize the SSE given by eq 13. This optimal time scale can be found through numerical optimization, as discussed later.
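The Gram recurrence (eq 9), the least-squares fit (eqs 13 and 14), and the AIC can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation (their software was written in BASIC): the function names are hypothetical, NumPy is assumed, each polynomial is scaled to unit norm so that the basis matrix is orthonormal, and the AIC is taken in the common form log(SSE/K) + 2N/K, which is an assumption about the exact expression intended by eq 15.

```python
import numpy as np


def gram_basis(K, N):
    """Return a K x N matrix whose columns are the Gram polynomials of
    order 0..N-1 on k = -M..M, generated with the three-term recurrence
    and scaled to unit Euclidean norm (hypothetical helper)."""
    if K % 2 == 0 or N > K - 1:
        raise ValueError("need odd K = 2M + 1 and N <= 2M")
    M = (K - 1) // 2
    k = np.arange(-M, M + 1, dtype=float)
    t_prev, t_curr = np.zeros(K), np.ones(K)  # t_{-1}(k) = 0, t_0(k) = 1
    columns = []
    for n in range(N):
        columns.append(t_curr / np.linalg.norm(t_curr))
        denom = (n + 1) * (2 * M - n)
        t_next = (2 * (2 * n + 1) * k * t_curr
                  - n * (2 * M + n + 1) * t_prev) / denom
        t_prev, t_curr = t_curr, t_next
    return np.column_stack(columns)


def fit_least_squares(basis, signal):
    """Least-squares coefficients c and the resulting SSE."""
    c, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    residual = signal - basis @ c
    return c, float(residual @ residual)


def aic(sse, K, N):
    """Assumed AIC form: log of the mean squared residual plus 2N/K."""
    return np.log(sse / K) + 2.0 * N / K


if __name__ == "__main__":
    K = 257
    t = np.arange(K) * 0.2
    # Illustrative Gaussian peak; center and width are arbitrary choices.
    peak = np.exp(-np.log(2.0) * (2.0 * (t - 20.0) / 8.0) ** 2)
    for N in (5, 10, 20):
        _, sse = fit_least_squares(gram_basis(K, N), peak)
        print(N, sse, aic(sse, K, N))
```

Because the columns are orthonormal, the least-squares solution reduces to the inner products of the signal with each column; `lstsq` is used here only so that the same helper also works for bases that are not exactly orthonormal, such as truncated Laguerre functions.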

EXPERIMENTAL SECTION
Simulation of Flow Injection Peaks. The performance of the Gram and Laguerre polynomials in representing different peak shapes is most readily evaluated initially with simulated data since their characteristics are known. A limited set of six different shapes was synthesized to demonstrate the method. They are described below. A Gaussian peak profile, obtained in FIA under conditions of high dispersion, may be computed as

\[ G(t) = A \exp\Big\{ -\ln(2) \Big[ \frac{2(t-\tau)}{\delta t_{1/2}} \Big]^2 \Big\} \tag{16} \]

where A is the amplitude, t is the independent variable, $\tau$ is the peak center position, and $\delta t_{1/2}$ is the width at half-height. The tanks-in-series model has been used for describing the shape of more common flow injection peaks and is given by1

\[ \ldots \]

where $T_i$ is the mean residence time of an element of fluid in any one mixing tank and N is the number of tanks. This model was used to synthesize two typical flow injection peaks using N = 2 and N = 5. A Gaussian may also be synthesized in this way by setting N = ∞ (i.e., very large). Bifurcated peaks were simulated by a linear combination of the tanks-in-series model (N = 2) with either an exponentially scaled Cauchy function or with the Fraser-Suzuki asymmetric peak function. These simulated peaks may represent the case of incomplete reaction. The Cauchy function is given by46

\[ C(t) = \frac{A}{1 + [\ldots]^2} \]

but has been modified for this work as follows:

\[ \ldots \]

The Fraser-Suzuki function is given by46

\[ \ldots \]

where b is the asymmetry factor. The simulated peaks are displayed in Figure 2. Table 1 lists each case and the corresponding parameter values used. We have chosen to use data lengths typically found in practice for FIA. A sampling interval of 0.2 s was used to generate data lengths of 257 points. Each peak was corrected for baseline and scaled to a "peak-to-peak" range of 1. Experimental noise was simulated by the addition of normally distributed random values which were then scaled to the desired noise standard deviation (with respect to peak range). For ease of discussion, each peak will be referenced by its Roman numeral label as shown in Table 1 and Figure 2.

Figure 2. The six simulated peak shapes plotted against time (s): (I) Gaussian, (II) tanks in series (two tanks), (III) tanks in series (five tanks), (IV) as with (II) with dip prior to peak maximum, (V) as with (II) with dip after peak maximum, and (VI) severely bifurcated peak. Each peak is composed of 257 points and has an ordinate range of 1 AU. Refer to Table 1 for model equations.

Table 1. Equations Used for Simulating Peak Shapes
peak   case(a)                                                   model   parameter values   ref eq
I      Gaussian                                                  G       …                  …
II     highly skewed tanks in series                             S       …                  …
III    moderately skewed tanks in series                         S       …                  …
IV     peak II bifurcated to the left of peak maximum            …       …                  …
V      peak II bifurcated to the right of peak maximum           …       …                  …
VI     severely bifurcated peak                                  …       …                  …
(a) G, Gaussian; S, tanks in series; CM, modified Cauchy; F, Fraser-Suzuki.

Experimental Data. The efficacy of orthogonal polynomial identification was also evaluated on experimental data from the reaction between Fe(II) and 1,10-phenanthroline. The manifold and the data obtained were reported previously. In addition to reagent concentration, the extent of reaction was also found to be sensitive to carrier pH under the conditions used.19 Certain combinations of these two factors result in bifurcated peaks. The flow rate of 1,10-phenanthroline (reagent) was varied from 0.10 to 1.00 mL/min in steps of about 0.225 mL/min, and that of sodium acetate (pH modifier) was varied from 0.00 to 1.00 mL/min in steps of 0.25 mL/min. The reaction was monitored at 508 nm, and peaks (consisting of 108-214 points) were collected for each flow rate combination.

Data Processing. Since the approximation is only valid over the duration of the signal, the peak must be extracted from the data record. Simulated data were used without further treatment. Leading baseline points were stripped from the Fe(II)-1,10-phenanthroline reaction data; a trailing baseline was precluded by the data acquisition routine. The subsequent data length was made odd if necessary (by dropping the last point) to account for the computational requirement in the implementation of the Gram functions, eq 9.

For this work, principal components analysis (PCA) was chosen as the pattern recognition tool because of its ability to effectively reduce the dimensionality of complex data sets to more manageable proportions and allow the results to be presented visually.47 Of course, other approaches could equally well have been used. Prior to using PCA on the data, the coefficients from the approximations were normalized to the sum of the absolute value of the coefficients to account for differences in both peak magnitude and data lengths. The coefficient array was then autoscaled.

Computational Aspects. A golden section search routine48 was used to determine the optimal time scale parameter. Normally distributed random number sequences were generated by applications of the Box-Muller method to a portable, uniform, pseudo-random-number generator. Discrete Fourier analyses were performed with MATLAB (version 3.5, The Mathworks Inc., South Natick, MA). Principal components were calculated via a Householder tridiagonalization-implicit QL iterations decomposition. Software developed for this work was written in Microsoft BASIC (version 7.00, Microsoft Corp., Redmond, WA). All programs were run on a 33-MHz Intel 80486-based microcomputer with the exception of MATLAB, which was run on a Sun workstation (Model SPARC 11, Sun Microsystems Inc., Mountain View, CA). Double-precision arithmetic was used throughout.

(46) Fraser, R. D. B.; Suzuki, E. Anal. Chem. 1969, 41, 37-39.
(47) Wold, S.; Esbensen, K.; Geladi, P. Chemom. Intell. Lab. Syst. 1987, 2, 37-52.
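The simulation and preprocessing steps described in this section can be illustrated with a short sketch. It is offered under several stated assumptions rather than as the authors' procedure: the parameter values of Table 1 are not available here, so the values below are arbitrary; the tanks-in-series expression is the commonly quoted form (1/T_i)(t/T_i)^(N-1) exp(-t/T_i)/(N-1)!, assumed rather than taken from this text; baseline correction is approximated by subtracting the minimum; and PCA is computed by singular value decomposition rather than the Householder/QL route mentioned above. All function names are hypothetical.

```python
import math
import numpy as np


def gaussian_peak(t, amplitude, center, fwhm):
    """Gaussian profile parameterized by its full width at half-height."""
    return amplitude * np.exp(-math.log(2.0) * (2.0 * (t - center) / fwhm) ** 2)


def tanks_in_series(t, n_tanks, t_i):
    """Assumed tanks-in-series response for n_tanks tanks with mean
    residence time t_i per tank."""
    return ((1.0 / t_i) * (t / t_i) ** (n_tanks - 1)
            * np.exp(-t / t_i) / math.factorial(n_tanks - 1))


def scale_to_unit_range(y):
    """Crude baseline correction (subtract minimum) and scaling to range 1."""
    y = y - y.min()
    return y / y.max()


def add_noise(y, noise_sd, rng):
    """Add normally distributed noise with the given standard deviation,
    expressed relative to a peak range of 1."""
    return y + rng.normal(0.0, noise_sd, size=y.shape)


def normalize_and_autoscale(coeffs):
    """Rows are peaks, columns are coefficients. Each row is divided by the
    sum of its absolute values; columns are then mean-centered and scaled
    to unit standard deviation."""
    scaled = coeffs / np.abs(coeffs).sum(axis=1, keepdims=True)
    centered = scaled - scaled.mean(axis=0)
    sd = scaled.std(axis=0, ddof=1)
    sd[sd == 0.0] = 1.0  # leave constant columns at zero
    return centered / sd


def pca_scores(X, n_components=2):
    """Principal component scores of an already centered matrix via SVD."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(257) * 0.2  # 257 points at a 0.2 s sampling interval
    peaks = [
        gaussian_peak(t, 1.0, 25.0, 8.0),  # illustrative parameters only
        tanks_in_series(t, 2, 5.0),
        tanks_in_series(t, 5, 4.0),
    ]
    noisy = [add_noise(scale_to_unit_range(p), 0.001, rng) for p in peaks]
    print([float(p.max() - p.min()) for p in noisy])
```

In use, each scaled peak would be expanded in Gram or Laguerre polynomials, the coefficient vectors stacked row-wise, passed through `normalize_and_autoscale`, and the first two columns of `pca_scores` plotted against each other.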

RESULTS AND DISCUSSION Peak Approximation. The process of approximation, for which these orthogonal polynomials were initially developed, cannot be entirely divorced from that of identification. A certain level of approximation accuracy must be met to ensure integrity of representation and, to some extent, to increase the likelihood for spectral uniqueness. Hence, a study into the efficacy of approximation for the two polynomial families under consideration is appropriate. The performance of an approximation can be assessed by evaluating the SSE, eq 13, as a function of polynomial order. The results can then be used to define practical limits. Figure 3 shows results from approximating the noise-free simulated peaks. With Gram polynomials, the number of terms required for a given accuracy increases with complexity in peak shape (i.e., peaks IV-VI), as expected. In contrast, Laguerre polynomials only show this trend over moderate expansion orders; e.g., the SSE for peak VI is less than that of peak V over 12-45 expansion terms but the reverse holds true outside this range. Laguerre polynomials are more adept (48) Press, W. H.; Flannery, B. P.; Teukolsky, S.A,; Vetterling, W. T. Numerical Recipies: The Art of Scientific Computing, Cambridge University Press: Cambridge, MA, 1986; Chapter 10.

976

Analytical Chemistry, Vol. 66,No. 7, April 1, 1994

0 10 20 30 40 50

0 10 20 30 40 50

0 10 20 30 40 50

expansion order Flgure 3. SSE against expansion order for simulated peaks using Gram (solid line) and Laguerre(dashed line) polynomialapproximation. Horizontal line indicates SSE value for 12-bit quantization error. The plot shown in each panel was generated from the peak In the corresponding panel of Figure 2.

at approximating appreciably bifurcated peaks than Gram polynomials for the range of expansion orders studied. For both sets of functions, the SSE profile for peak I11 lies somewhere between that of peaks I and 11, an observation in harmony with the tanks-in-series model. Overall, the numerical accuracy of approximation with Gram polynomials is greater than that with Laguerre polynomials for the peaks used. For example, given that these peaks consisted of 257 points, the former levels off at about while the latter appears to approach an SSE of about as the expansion goes to an infinite number of terms. In practice, the accuracy required for adequate representation is set by the error introduced by the data acquisition system. Present acquisition hardware typically uses 12-bit A/D. Given this and the corresponding quantization SSE stated above, less than 43 Gram polynomials and 28 Laguerre polynomials are required to approximate all simulated peaks presented to within experimental error. No practical advantage is gained by computing more terms than these. These figures are, within computational error, independent of the number of data points. Since the profile for peak VI is quite drastic (ignoring profiles with discontinuities), they also represent the practical upper limit for most FIA applications (under conditions of very high signal-to-noise ratio). Of course,

Figure 5. Reciprocal condition number of MᵀM as a function of the number of Laguerre functions and the time scale parameter ζ. Values of the reciprocal condition number of 1 and 0, respectively, indicate a perfectly and a poorly conditioned matrix. (Axes: expansion order, time scale parameter.)

not required (e.g., the first 20 Gram coefficients can distinguish the simulated peaks adequately).
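The approximation test described under Peak Approximation can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gram polynomials are obtained here by QR-orthonormalizing a polynomial basis on a uniform grid (standing in for whatever recurrence the paper uses), the test peak is a hypothetical tanks-in-series-like shape rather than one of peaks I-VI, and the 12-bit quantization SSE assumes rounding error uniform over one A/D step on a peak scaled to unit height. All function names (`gram_basis`, `gram_spectrum`, `sse_by_order`) are made up for this sketch.

```python
import numpy as np

def gram_basis(n_points, n_terms):
    """Orthonormal discrete polynomials (degrees 0..n_terms-1) on a uniform grid."""
    t = np.linspace(-1.0, 1.0, n_points)
    # A Legendre-Vandermonde matrix is better conditioned than raw powers of t;
    # QR then orthonormalizes its columns degree by degree.
    start = np.polynomial.legendre.legvander(t, n_terms - 1)
    q, r = np.linalg.qr(start)
    return q * np.sign(np.diag(r))  # fix the sign of each column

def gram_spectrum(signal, n_terms):
    """Expansion coefficients: inner product of the signal with each basis function."""
    return gram_basis(len(signal), n_terms).T @ signal

def sse_by_order(signal, max_terms):
    """SSE of the truncated expansion for 1..max_terms terms."""
    basis = gram_basis(len(signal), max_terms)
    coeffs = basis.T @ signal
    return np.array([np.sum((signal - basis[:, :n] @ coeffs[:n]) ** 2)
                     for n in range(1, max_terms + 1)])

# Hypothetical test peak (not one of the paper's peaks), scaled to unit height.
t = np.arange(257.0)
peak = (t / 20.0) ** 3 * np.exp(-t / 20.0)
peak /= peak.max()

sse = sse_by_order(peak, 45)

# Assumed quantization model: 12-bit A/D spanning the unit peak height,
# error uniform over one step, so the variance is step**2 / 12 per point.
step = 1.0 / 2 ** 12
sse_quant = len(peak) * step ** 2 / 12.0
enough = next((n for n, e in enumerate(sse, start=1) if e < sse_quant), None)
print("terms needed to reach the quantization SSE:", enough)

# Spectrum normalized to the absolute sum of the coefficients shown.
spectrum = gram_spectrum(peak, 21)
normalized = spectrum / np.sum(np.abs(spectrum))
```

Because the basis is orthonormal and nested, adding a term can only lower the SSE, which is why a single threshold crossing gives a usable expansion order for noise-free data.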


Laguerre Time Scale Parameter. Figure 5 shows a plot of the reciprocal condition number for the Laguerre polynomials as a function of N and ζ for K = 251 points. It is seen that the upper bound on ζ decreases linearly as N increases. The Laguerre polynomials are defined to be orthogonal over the interval [0, ∞) and so in practice truncation in time is necessary. This truncation cannot be done arbitrarily. Choosing ζ less than this upper bound provides a means to perform the approximation using truncated functions without, from a numerical perspective, violating the orthogonality property significantly. Functions of order higher than that set by the upper bound do not decay back to (or close enough to) zero within the number of points used. Consequently, they are nonorthogonal (numerically) with the other functions. If the signal to be approximated converges to zero, these high-order functions will not contribute significantly. The other problem is the search for the optimal value of the time scale. This is the best "compromise" time scale over all functions used for a given signal and leads to the most compact spectrum. This approach deserves investigation primarily because a computational procedure is available via eq 13 and its properties are less well established. Since the optimal time scale is a function of the signal itself, the Laguerre functions themselves are no longer identical for different signals having the same number of points. However, a "peak-dependent" time scale is not detrimental if the spectrum for the peak is invariant to the number of points used to represent it. This, in fact, can enhance discrimination between peaks of different shape. Although this ideal situation is not entirely fulfilled when the time scale is set near the optimum, the spectrum for a given peak is relatively invariant in comparison to the spectrum for peaks of significantly different shape. It is this last point which gives credibility to this approach and that in the previous paragraph. The dependence of the optimal time scale parameter on the number of terms used is evident from the error curves of

Figure 4. Spectral plots for peaks I, II, and VI, shown in Figure 2. (a) Gram representation (41 coefficients), (b) Laguerre representation (10 coefficients), (c) Laguerre representation (20 coefficients), and (d) Fourier representation (magnitude spectrum, 31 coefficients). All spectra were normalized to the absolute sum over the number of coefficients shown except for the Fourier coefficients, which were normalized to the sum of all 129 coefficients computed with the FFT. (Horizontal axis: coefficient.)

for digitizers with a larger word length, e.g., 16 or 20 bit, these limits should be increased accordingly.

Spectral Information. A plot of the coefficients from a generalized Fourier expansion of the peak against coefficient number produces a spectrum. The normalized Gram spectra for simulated peaks I, II, and VI are shown in Figure 4a. As the peaks become more complex, the high-order terms gain increasing prominence, reflecting the approximation results above. Corresponding normalized Laguerre spectra are shown in parts b and c of Figure 4 for 10 and 20 coefficients, respectively. The similarity of the spectra to the actual peak profiles is striking (a feature which makes the Laguerre representation ideal for compression of peak-shaped data); the match becomes progressively better as the number of coefficients used is increased. Indeed, with reference to the Laguerre polynomials themselves (Figure 1b), there appears to be some rather loose connection between the magnitude of the coefficients and values computed by direct integration of the peak over successive time intervals. For comparison, corresponding normalized DFT magnitude spectra are shown in Figure 4d. It is clear that the polynomial spectra are equally capable of distinguishing these peaks and offer different views of the data. However, while the Fourier spectrum can be interpreted on the basis of frequency, the Gram and Laguerre representations are much more abstract. For the purposes of identification or classification, this is of no consequence. Indeed, for the latter application, the complete spectrum is

Figure 7. Difference in Laguerre spectra with time scale parameter, showing the difficulty of unique identification when the breadth of the "minimum well" becomes significant. The error curve shown is that for peak III for expansions into 20 Laguerre polynomials. The spectra shown in the two insets correspond to the two positions indicated on the curve. The scale is the same on both insets. (Horizontal axis: time scale parameter.)

Figure 6. Error curves as a function of the time scale parameter ζ for (a) peak I, (b) peak II, and (c) peak VI. Each graph contains curves calculated from an expansion in 5, 10, 15, 20, 25, 30, 35, and 40 terms. The sequence is indicated by the circled numbers. (Horizontal axis: time scale parameter.)

peaks I, II, and VI shown in Figure 6. In general, a decreasing trend is observed as the number of terms increases from 5 to 40. This decrease is expected since the higher-order polynomials decay progressively more slowly and the time scale must decrease to accommodate the next function into the approximation. As predicted from theory, the "minimum well" of the error curves broadens with an increase in terms such that the time scale becomes essentially immaterial at infinite expansion order. Furthermore, these curves exhibit oscillations as the optimum time scale is approached, with the frequency of oscillation increasing with expansion order. These oscillations occur when the time scale is close to 1 and the orthogonality condition is seriously violated; i.e., the reciprocal condition number of MᵀM is very close to zero. They pose problems for minimum search algorithms, which commonly assume single-mode functions. Though not as critical for approximation, an inability to locate a unique optimum has potentially dire consequences for identification. As shown in Figure 7, the spectrum at one part of the minimum well may differ appreciably from that at another, depending on the peak. With the golden search method used here, the routine was initialized with different step sizes and bracket conditions to increase the chance of finding the global minimum. Finally, a changing time scale means that the step features in the curves for the Laguerre polynomials in Figure 3 do not necessarily imply that certain polynomials do not contribute to the peak.
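The time scale search described above can be illustrated with a short sketch. Several assumptions are made here and should be read as such: the discrete Laguerre functions are generated with the standard all-pass recursion for a pole `zeta` in (0, 1), which is taken to play the role of the time scale parameter ζ; the SSE is obtained from an ordinary least-squares fit rather than from the paper's eq 13; the peak is a hypothetical shape; and the three starting brackets are arbitrary choices mimicking the idea of restarting the search from different bracket conditions.

```python
import numpy as np

def laguerre_basis(n_points, n_terms, zeta):
    """Discrete Laguerre functions with pole `zeta`, truncated to n_points samples."""
    basis = np.zeros((n_points, n_terms))
    basis[:, 0] = np.sqrt(1.0 - zeta ** 2) * zeta ** np.arange(n_points)
    for k in range(1, n_terms):
        prev, cur = basis[:, k - 1], basis[:, k]  # cur is a view into `basis`
        cur[0] = -zeta * prev[0]
        for n in range(1, n_points):
            # All-pass recursion that maps function k-1 onto function k.
            cur[n] = zeta * cur[n - 1] - zeta * prev[n] + prev[n - 1]
    return basis

def laguerre_sse(signal, n_terms, zeta):
    """Least-squares SSE of an n_terms Laguerre fit at a given time scale."""
    basis = laguerre_basis(len(signal), n_terms, zeta)
    coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    return float(np.sum((signal - basis @ coeffs) ** 2))

def golden_section_min(func, lo, hi, tol=1e-4):
    """Golden section search; assumes func is unimodal on [lo, hi]."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while (b - a) > tol:
        if fc < fd:                       # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:                             # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)

# Hypothetical test peak, scaled to unit height.
t = np.arange(257.0)
peak = (t / 20.0) ** 3 * np.exp(-t / 20.0)
peak /= peak.max()

def objective(zeta):
    return laguerre_sse(peak, 10, zeta)

# The error curve may be multimodal, so restart from several brackets
# (arbitrary choices here) and keep the candidate with the lowest SSE.
brackets = [(0.50, 0.95), (0.70, 0.95), (0.80, 0.99)]
candidates = [golden_section_min(objective, lo, hi) for lo, hi in brackets]
best_zeta = min(candidates, key=objective)
print("time scale estimate:", round(best_zeta, 3))
```

Restarting from several brackets does not guarantee the global minimum; it only reduces the chance that a single search settles in one of the oscillations of the error curve.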


Effects of Noise. Though different peak shapes were found to result in different spectra, real signals are always corrupted by noise and thus, for pattern recognition applications, the reproducibility of spectra from noisy data is of interest. Debets et al.6 have previously demonstrated how the reproducibility of the coefficients depended on the actual noise sequence, the type of noise (e.g., white or 1/f), and the noise level. The first factor is attributable to the fact that a limited noise sequence is, to some extent, biased. The second is apparent from consideration of the DFT, whose spectrum is in fact used to distinguish the various types of noise. The third (noise level) is intuitively obvious. Employing the same procedures as Debets et al.,6 we have confirmed their observations for the Gram approximation. These observations follow from the fact that the Gram transformation, like the DFT, is linear, i.e.

$$x(t) + y(t) \;\longleftrightarrow\; X(s) + Y(s)$$

where x(t) and y(t) are the time domain sequences and X(s) and Y(s) are their respective Gram transforms (the parameter s is a variable in the Gram domain). Thus, for a given noise level, the noise sequences randomly increase or decrease the value of a coefficient independently of peak shape. An increase in the noise level will multiply this effect and lead to an overall increase in the variability of the coefficients. For a particular noise type, the variation in any coefficient can be represented mathematically by

[...]

where [...] is the expectation value of any appropriate measure of coefficient variation. When the noise is white, all coefficients will be affected to more or less the same extent. The smaller the coefficient, the more susceptible it is to noise. Hence, the noise level is critical and must be minimized to facilitate accurate identification. In other words, the signal-to-noise ratio should be made as large as possible. Figure 8 shows the effect of added noise on the error in approximation as a function

Figure 8. SSE of Gram polynomial approximation as a function of noise and expansion order for simulated peaks. Expansion up to the 40th order is shown. Curves in all graphs correspond to noise levels of 0%, 2%, 4%, 7%, 10%, and 15% standard deviation (with respect to peak height) as indicated in (I). Plot in each panel was generated from peak in corresponding panel of Figure 2. (Horizontal axis: expansion order.)

of expansion order (the same noise sequence was used). For large N, the SSE values approach the theoretical limit set by the signal-to-noise ratio. For real peaks, the resolution of the A/D would be critical here too. When noise is present, the error in approximation is no longer bounded by numerical accuracy. Another factor, overfitting, competes and the result is an optimum expansion order. Overfitting arises when the polynomial is more apt to fit the noise than the signal itself (i.e., when the polynomial has the characteristics of the noise and the signal coefficient is small). The optimum expansion order decreases and the minimum SSE increases with an increase in noise. Since the true signal characteristics are unknown, the true optimum order cannot be determined, although it can be estimated with the AIC. Note that optimization with respect to the noise-corrupted signal produces error curves (denoted by SSE*) which continually decrease, but a relatively sharp break can be observed. Unfortunately, the break occurs before the true optimum and so the optimum is underestimated; the difference tends to increase with noise level. All the above factors must be taken into account when the number of coefficients for identification is selected.

When Laguerre polynomials are used, noise not only causes varying coefficient values and overfitting, it also hinders location of a unique optimal time scale, since optimization via eq 13 requires that the function to be approximated be fairly well behaved. SSE* as a function of ζ exhibits oscillation in the minimum well. These functions also vary (in shape and magnitude) with noise sequence, and the amount of variation increases with noise level. One way of reducing the problem of multiple minima is to minimize the following criterion, which combines the SSE* and the reciprocal condition number r_c:

$$\mathrm{RCSSE} = r_c \,\log(\mathrm{SSE}^*) \qquad (23)$$
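A minimal numerical sketch of the criterion in eq 23 is given below. It rests on several assumptions that are not taken from the paper: the reciprocal condition number is computed in the 2-norm with `numpy.linalg.cond` (the authors may have used a different estimator), the logarithm is taken to base 10, the Laguerre functions come from the standard all-pass recursion with pole `zeta`, and the noisy signal is a hypothetical peak scaled to unit energy so that SSE* cannot exceed 1 and its logarithm is never positive.

```python
import numpy as np

def laguerre_basis(n_points, n_terms, zeta):
    """Discrete Laguerre functions with pole `zeta`, truncated to n_points samples."""
    basis = np.zeros((n_points, n_terms))
    basis[:, 0] = np.sqrt(1.0 - zeta ** 2) * zeta ** np.arange(n_points)
    for k in range(1, n_terms):
        prev, cur = basis[:, k - 1], basis[:, k]  # cur is a view into `basis`
        cur[0] = -zeta * prev[0]
        for n in range(1, n_points):
            cur[n] = zeta * cur[n - 1] - zeta * prev[n] + prev[n - 1]
    return basis

def rcsse(signal, n_terms, zeta):
    """Return (RCSSE, r_c, SSE*) for one value of the time scale."""
    basis = laguerre_basis(len(signal), n_terms, zeta)
    # Reciprocal 2-norm condition number of M^T M: near 1 when the truncated
    # functions are still orthonormal, near 0 when orthogonality is lost.
    r_c = 1.0 / np.linalg.cond(basis.T @ basis)
    coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    sse = float(np.sum((signal - basis @ coeffs) ** 2))
    sse = max(sse, np.finfo(float).tiny)          # keep the logarithm finite
    return r_c * np.log10(sse), r_c, sse

# Hypothetical noisy peak: 3% noise relative to peak height, then unit energy.
rng = np.random.default_rng(0)
t = np.arange(257.0)
peak = (t / 20.0) ** 3 * np.exp(-t / 20.0)
peak /= peak.max()
noisy = peak + 0.03 * rng.standard_normal(t.size)
noisy /= np.linalg.norm(noisy)

# Scan the time scale on a grid and pick the minimum of the criterion.
zetas = np.linspace(0.5, 0.99, 50)
scores = np.array([rcsse(noisy, 10, z)[0] for z in zetas])
print("time scale minimizing RCSSE:", round(float(zetas[np.argmin(scores)]), 3))
```

The weighting by r_c pulls the criterion back toward zero wherever the truncated functions stop being orthogonal, so minima of SSE* that lie in the ill-conditioned region are penalized.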

Because log(SSE*) is negative, RCSSE will increase back to zero when r_c approaches zero, while in the orthogonality region, i.e., when r_c = 1, it will not differ from log(SSE*). As an example, Figure 9 displays side-by-side three-dimensional plots of log(SSE*) and r_c log(SSE*) for peak II. If we are to minimize

Figure 9. SSE* and RCSSE criteria as a function of time scale ζ and expansion order N for peak II. (Axes: expansion order, time scale parameter.)

SSE* within the constraint of orthogonality, then one should instead minimize r_c log(SSE*), which displays a well-defined minimum for each expansion order. It is also interesting to note that the location of those minima in the (N, ζ) plane is not very dependent on peak shape. Although for a given expansion order there might still be several local minima, the situation is much improved compared to the SSE* case.

Approach for Identification. From the discussion above, the selection of an appropriate number of coefficients for representation is important in minimizing the effects of noise and, for the Laguerre polynomials, the error in finding the true optimal time scale. In addition, pattern recognition algorithms usually require that the number of coefficients used for input (i.e., the length of the pattern vector) be the same. When noise is present, there must necessarily be some trade-off between integrity of signal representation and robustness. The more coefficients used for identification, the better the representation and the greater the number of peak shapes identifiable, but the more susceptible the identification is to noise. However, peak discrimination only requires that the variation in the coefficients due to differences in peak shape be greater than that due to noise (and error in optimal time scale parameter). Also, a highly accurate representation in the Gram or Laguerre domain is not always necessary for classification of all anticipated peak shapes, and a truncated or limited (in the case of the Laguerre series) spectrum may

Figure 11. Principal components analysis on Laguerre representation of simulated peaks. Analysis was conducted with seven coefficients with 3% noise added. The data structure using the first three principal components is shown in two different views.

Figure 10. Principal components analysis of Gram representation of simulated peaks: (a) 6 coefficients and 3% noise; (b) 6 coefficients and 10% noise; (c) 21 coefficients and 3% noise; (d) 21 coefficients and 10% noise; (e) 31 coefficients and 3% noise; (f) 31 coefficients and 10% noise. Roman numerals adjacent to clusters are peak labels.

be adequate. This partially handles the noise susceptibility problems of high-order coefficients since they are not used. A difficulty which ensues is that different types of peaks, when represented to a given precision, require different expansion orders and so not all peaks are treated equally. Finally, in selecting a particular expansion order, a constraint is placed on the amount of noise that the pattern recognition procedure can tolerate. A graphical demonstration of these statements can readily be accomplished with PCA. To this end, 50 different Gaussian-distributed noise sequences were added to each simulated peak to obtain a data set of 306 peaks (including the noise-free peaks). Noise levels of 0.5%, 1%, 2%, 3%, 4%, 6%, and 10% standard deviation (with respect to peak range) were investigated over various expansion orders. For a given expansion order, PCA was first performed on the noise-free peaks to obtain the plane formed by the first two principal components, upon which the noise-corrupted peaks were projected. In this way, the variation due to noise could be visualized. Results for the Gram polynomials at 3% and 10% noise, and 6, 21, and 31 coefficients are shown in Figure 10. The entire set of coefficients was used; highly correlated coefficients were not removed prior to analysis. This omission was not found to be critical here (differences resulting from removal of highly correlated coefficients were, in fact, insignificant

over the number of coefficients studied) and does not affect the following discussion. Six clusters are observed in these figures; each corresponds to one of the six simulated peak shapes. When noise is absent, these clusters reduce to a point located near the center of each cluster. As the noise level increases (e.g., from 3% to 10%), the clusters expand outward. Excessive noise results in overlap of clusters, making positive peak identification difficult. The balance between integrity of representation and robustness to noise is quite apparent. An inadequate representation which uses just six coefficients results in peak II being erroneously similar to peaks IV and V and exceptionally dissimilar to peak III. Loss of discrimination between peaks II and IV is observed at the 3% noise level. At the other extreme, use of 31 coefficients yields peak dissimilarities which are consistent with common sense, but now noise, rather than the uniqueness of peak representation, clouds separation between peak types II-IV at the 3% noise level (note that when the third principal component is used, adequate separation exists between this group and the other shapes even at the 10% noise level). On the basis of this "training set", we suggest that 20-25 Gram polynomials would yield adequate performance for most classifications provided the noise level stays below 10%. Of course, the number of coefficients should be "fine-tuned" for a particular application, given that a suitable training set is available. With Laguerre polynomials, variability in the computed optimal time scale and the inflated effects of overfitting result in greater sensitivity to noise, and consequently the optimum number of coefficients is smaller than that for Gram polynomials. Applying the same approach as above, 7-10 coefficients were found by visual inspection to be adequate for most purposes.
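The projection procedure used for this demonstration (PCA fitted on the noise-free spectra, noisy spectra projected onto the resulting plane) can be sketched as follows. This is an illustration under assumptions rather than a reproduction of the study: the four peak shapes are hypothetical stand-ins for the six simulated peaks, the Gram basis is built by QR orthonormalization, the PCA uses mean centering only (no autoscaling), and the same 50 noise sequences are reused for every peak and every noise level so that the effect of the noise level alone is visible.

```python
import numpy as np

def gram_basis(n_points, n_terms):
    """Orthonormal discrete polynomials on a uniform grid (via QR)."""
    grid = np.linspace(-1.0, 1.0, n_points)
    q, r = np.linalg.qr(np.polynomial.legendre.legvander(grid, n_terms - 1))
    return q * np.sign(np.diag(r))

def fit_pca(spectra, n_components=2):
    """Mean and leading principal axes of the rows of `spectra`."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(spectra, mean, axes):
    """Scores of the rows of `spectra` on the fitted principal axes."""
    return (spectra - mean) @ axes.T

t = np.arange(257.0)

def toy_peak(order, tau, delay=0.0):
    """Hypothetical peak shape, scaled to unit height."""
    s = np.clip(t - delay, 0.0, None)
    y = (s / tau) ** order * np.exp(-s / tau)
    return y / y.max()

clean = np.array([
    toy_peak(1, 25.0),
    toy_peak(3, 15.0),
    toy_peak(6, 8.0),
    toy_peak(3, 8.0) + 0.7 * toy_peak(3, 8.0, delay=60.0),  # bifurcated shape
])
basis = gram_basis(t.size, 21)
clean_spectra = clean @ basis            # one 21-coefficient spectrum per row
mean, axes = fit_pca(clean_spectra)      # plane defined by noise-free peaks

def cluster_spread(noise_level, n_sequences=50, seed=0):
    """Mean standard deviation of each peak's cluster in the PC plane."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_sequences, t.size))
    spreads = []
    for peak in clean:
        peak_range = peak.max() - peak.min()
        noisy = peak + noise_level * peak_range * noise
        scores = project(noisy @ basis, mean, axes)
        spreads.append(scores.std(axis=0).mean())
    return np.array(spreads)

spread_3, spread_10 = cluster_spread(0.03), cluster_spread(0.10)
print("cluster growth from 3% to 10% noise:", spread_10 / spread_3)
```

Because both the Gram expansion and the projection are linear, the cluster size in this sketch scales in direct proportion to the noise level when the same noise sequences are reused; whether clusters overlap then depends only on how far apart the noise-free points sit in the chosen plane.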
The results of PCA for representation by seven Laguerre polynomials with a 3% noise level are shown in Figure 11, where the first three principal components are

Figure 12. Matrix of peak shapes from Fe(II)-1,10-phenanthroline reaction resulting from different combinations of sodium acetate flow rate (which varies pH) and 1,10-phenanthroline flow rate (which varies reagent concentration). Numbers in panels label peak. (Horizontal axis: NaOAc flow rate (mL/min).)

Table 2. Manual Class Assignments for Peaks Obtained from the Reaction between Fe(II) and 1,10-Phenanthroline

class                 peak numbers(a)
nonbifurcated         2, 3, 4, 5, 8, 9, 10, 14, 15, 19, 20, 25
weakly bifurcated     1, 7, 13, 18, 24
strongly bifurcated   6, 11, 12, 16, 17, 21, 22, 23

(a) Refer to Figure 12 for peak assignments.

Figure 13. Principal components analysis on peaks in Figure 12 as represented by (a) 21 Gram coefficients and (b) 10 Laguerre polynomials. The position of each peak is at the center of the number drawn, except when a pointer is used. Numbers in squares refer to strongly bifurcated peaks, circled numbers are weakly bifurcated peaks, and numbers on their own are nonbifurcated peaks.

plotted. Peaks II and III are much more sensitive to noise than the others because they are overapproximated; note that their noise-free spectra have fewer prominent coefficients. This also increases the error in time scale optimization and results in the spectrum broadening or contracting (see Figure 7). The systematic lengthening observed in the clusters for peaks II and III is indicative of this process and it is observed to be the dominant error here. Peak II is seen to broaden enough to meet peak IV, which itself is a dense cluster, thereby resulting in loss of discrimination between the two. As noted in the previous section, peaks that can be approximated by a very small number of Laguerre components are particularly sensitive to broadening with noise. No simple modification has yet been devised to overcome the compromise between robustness and data integrity. One may attempt to determine the optimum expansion order for a given peak and set all higher coefficients to zero. However, this approach is suspect since the optimum expansion order may vary over a broad range depending on the noise level (see parts IV-VI in Figure 8), given that the optimum expansion can be determined in the first place. Furthermore, by "windowing" the spectrum in this manner, an artificial, noise-dependent difference is introduced. Should the windowed spectrum be sufficiently different, this may at

best lead to the peak being classified as an unknown but at worst cause it to be identified as another type of peak. However, when signal-to-noise ratio is likely to be very poor, this method may be an option.

Performance on Real Data. While simulated data provide an ideal test bed for a systematic study of orthogonal polynomial identification, this capability must also be demonstrated on real data. The peaks obtained from the reaction of Fe(II) with 1,10-phenanthroline as a function of reagent flow rate and sodium acetate flow rate are shown in Figure 12. In general, peak magnitude increases with both reagent and sodium acetate flow rates over the ranges considered. In Figure 12, peaks have been normalized to peak area to emphasize their shape; high signal-to-noise ratio is observed. This simple example links nonoptimal reaction conditions with peak bifurcation. Clearly, a blind application of conventional quantitation routines based solely on peak height or peak area could lead to invalid analyses. The peaks were manually divided into three classes: nonbifurcated, weakly bifurcated, and strongly bifurcated. The assignments are given in Table 2. In accordance with the simulation results and the low noise level observed, each peak was expanded into 21 Gram polynomials and 10 Laguerre polynomials. The results of PCA on the Gram and Laguerre

Table 3. Optimum Time Scale Values for Laguerre Approximation of Peaks Obtained from the Reaction between Fe(II) and 1,10-Phenanthroline(a)

peak no.   optimum time scale   peak no.   optimum time scale
1          0.872                14         0.853
2          0.874                15         0.863
3          0.871                16         0.872
4          0.867                17         0.868
5          0.847                18         0.900
6          0.870                19         0.858
7          0.868                20         0.858
8          0.868                21         0.869
9          0.864                22         0.860
10         0.900                23         0.893
11         0.874                24         0.880
12         0.876                25         0.877
13         0.841

(a) Determined by Golden Search method for 10 Laguerre terms.

spectra are shown in Figure 13. For the Laguerre case, the optimum time scale values determined for the peaks are listed in Table 3. No correlation of the time scale with separation is observed. Good separation between nonbifurcated and strongly bifurcated peaks is seen in each case. The Laguerre representation shows a greater capacity, here, for differentiating some peaks which are weakly bifurcated from those which are nonbifurcated. Indeed, some semblance of a transition from nonbifurcation to strong bifurcation is observed; this is desirable for pattern recognition.


Finally, this example also highlights one weakness of peak shape analysis: the information obtained is not always unique to a particular chemical or instrumental cause. In the absence of additional information such as multiwavelength data, the bifurcation cannot be attributed specifically to low pH, to low reagent concentration, or to both conditions being in effect. However, peak shape analysis is still valuable in that problem conditions can often be flagged, and it may be used to augment approaches which use information from (or rely on) additional sensors.

ACKNOWLEDGMENT

The authors acknowledge funding for this work from the Canadian Networks of Centres of Excellence (Mechanical and Chemimechanical Wood Pulps Network). A.P.W. acknowledges additional funding from the Natural Sciences and Engineering Research Council of Canada (NSERC), Grant 5-80246. O.L. acknowledges DuPont and Paprican for scholarship support. The authors thank the Government of France, Ministry of External Affairs, for leave of absence granted to P.T. and acknowledge Dr. Ye Fu for helpful discussions.

Received for review July 2, 1993. Accepted December 22, 1993.

Abstract published in Advance ACS Abstracts, February 15, 1994.