Anal. Chem. 1996, 68, 771-778

A New Fuzzy Regression Algorithm

Horia F. Pop† and Costel Sârbu*,‡

Faculty of Mathematics and Computer Science and Faculty of Chemistry, Babeş-Bolyai University, RO-3400 Cluj-Napoca, Romania

A new fuzzy regression algorithm is described and compared with conventional ordinary and weighted least-squares and robust regression methods. The application of these different methods to relevant data sets proves that the performance of the procedure described in this paper exceeds that of the ordinary least-squares method and equals, and often exceeds, that of weighted or robust methods, including the two fuzzy methods proposed previously (Otto, M.; Bandemer, H. Chemom. Intell. Lab. Syst. 1986, 1, 71. Hu, Y.; Smeyers-Verbeke, J.; Massart, D. L. Chemom. Intell. Lab. Syst. 1990, 8, 143). Moreover, we emphasize the effectiveness and the generality of the two new criteria proposed in this paper for diagnosing the linearity of calibration lines in analytical chemistry.

Quantitative analytical chemistry is based on experimental data obtained by accurate measurements of various physical quantities. The treatment of the data is done mainly on the basis of simple stoichiometric relations and chemical equilibrium constants. In instrumental methods of analysis, the amount of a component is computed from the measurement of a physical property that is related to the mass or concentration of the component. Relating, correlating, or modeling a measured response as a function of the concentration of the analyte is known as calibration. Calibration of the instrumental response is a fundamental requirement of all instrumental analysis techniques. In statistical terms, calibration refers to the establishment of a predictive relation between the controlled or independent variable (e.g., the concentration of a standard) and the instrumental response. The common approach to this problem is the linear least-squares method.1-5

Ordinary least-squares regression (LS) is based on the assumption of an independent and normal error distribution with uniform variance (homoscedasticity). Much more common in practice, however, are heteroscedastic results, where the y-direction error depends on the concentration, and/or the presence of outliers. In practice, the actual shape of the error distribution function and its variance are usually unknown, so we must investigate the consequences when the conditions stated above are not met. Generally, the least-squares method then does not lead to the maximum likelihood estimate. Despite the fact that the least-squares method

† Faculty of Mathematics and Computer Science. ‡ Faculty of Chemistry.
(1) Massart, D. L.; Vandeginste, B. M.; Deming, S. N.; Michotte, Y.; Kaufman, L. Chemometrics: A Textbook; Elsevier: Amsterdam, 1988; p 75.
(2) Miller, J. C.; Miller, J. N. Statistics for Analytical Chemistry, 2nd ed.; Ellis Horwood: Chichester, 1988; p 101.
(3) Agterdenbos, J. Anal. Chim. Acta 1979, 108, 315.
(4) Agterdenbos, J. Anal. Chim. Acta 1981, 132, 127.
(5) Miller, J. N. Analyst 1991, 116, 3.

0003-2700/96/0368-0771$12.00/0 © 1996 American Chemical Society

is not optimal, there is a justification for using it in cases where the conditions are only approximately met. In particular, the Gauss-Markov theorem states that if the errors are random and uncorrelated, the least-squares method gives the best linear unbiased estimate of the parameters: among all estimators in which each parameter is a linear function of the data, the least-squares estimator is the one for which the variances of the parameters are smallest.6 Nevertheless, if the tails of the experimental error distribution contain a substantially larger proportion of the total area than the tails of a Gaussian distribution, the "best linear" estimate may not be very good, and there will usually be a procedure, one in which the parameters are not linear functions of the data, that gives lower variances for the parameter estimates than the least-squares method does; such procedures are the robust and resistant methods.

A procedure is said to be robust if it gives parameter estimates with variances close to the minimum variance for a wide range of error distributions. The least-squares method is very sensitive to the effect of large residuals, so the results are distorted if there are large differences between the observed data and the model predictions relative to a Gaussian distribution. The least-squares method is therefore not robust. A procedure is said to be resistant if it is insensitive to the presence or absence of any small subset of the data. In practice, this applies particularly to small numbers of data points that are wildly discrepant with respect to the body of the data: the so-called outliers. There are several reasons why data may be discrepant, a gross measurement error being the most obvious one. Another is the fact that certain data points may be particularly sensitive to some unmodeled (or inadequately modeled) parameter or, from another point of view, particularly sensitive to some systematic error that has not been accounted for in the experiment.7

In recent years, a great deal of work has been done on determining what properties a robust and resistant procedure should have.8-15 Obviously, if the error distribution is Gaussian or very close to Gaussian, the procedure should give results very close to those given by the least-squares method. This suggests that the procedure should behave much like the least-squares method for small values of the residuals. Because the weakness of the

(6) Klimov, G. Probability Theory and Mathematical Statistics; Mir Publishers: Moscow, 1986; p 272.
(7) Rajkó, R. Anal. Lett. 1994, 27, 215.
(8) Thompson, M. Analyst 1994, 119, 127N.
(9) Phillips, G. R.; Eyring, E. M. Anal. Chem. 1983, 55, 1134.
(10) Schwartz, L. M. Anal. Chem. 1977, 49, 2062.
(11) Schwartz, L. M. Anal. Chem. 1979, 51, 723.
(12) Rutan, R. C.; Carr, P. W. Anal. Chim. Acta 1988, 215, 131.
(13) Hu, Y.; Smeyers-Verbeke, J.; Massart, D. L. J. Anal. At. Spectrom. 1989, 4, 605.
(14) Wolters, R.; Kateman, G. J. Chemom. 1989, 3, 329.
(15) Vankeerberghen, P.; Vandenbosch, C.; Smeyers-Verbeke, J.; Massart, D. L. Chemom. Intell. Lab. Syst. 1991, 12, 3.
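The downweighting behavior expected of such procedures can be illustrated with a short sketch. This is not the algorithm proposed in the present paper, only a generic iteratively reweighted least-squares line fit with Huber weights; the tuning constant k = 1.345 and the MAD-based scale estimate are conventional choices assumed here for illustration.

```python
import numpy as np

def huber_irls_line(x, y, k=1.345, n_iter=50):
    """Robust straight-line fit by iteratively reweighted least squares.

    Large residuals are downweighted (Huber weights), so a few wildly
    discrepant points barely influence the final parameters.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = np.ones_like(y)
    a0 = a1 = 0.0
    for _ in range(n_iter):
        # weighted least-squares solve for intercept a0 and slope a1
        A = np.vstack([np.ones_like(x), x]).T * np.sqrt(w)[:, None]
        a0, a1 = np.linalg.lstsq(A, y * np.sqrt(w), rcond=None)[0]
        r = y - (a0 + a1 * x)
        # robust scale estimate from the median absolute deviation
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        u = np.abs(r) / (k * (s if s > 0 else 1.0))
        w = np.where(u <= 1.0, 1.0, 1.0 / u)  # downweight large residuals
    return a0, a1
```

On clean, nearly Gaussian data the weights stay close to 1 and the result is essentially the ordinary least-squares fit, which is exactly the behavior asked of a robust procedure above.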

Analytical Chemistry, Vol. 68, No. 5, March 1, 1996 771

least-squares method lies in its overemphasis on large values of the residuals, these should be deemphasized (downweighted) or perhaps even ignored. For intermediate values of the residuals, the procedure should connect the treatments of the small and the extreme residuals in a smooth, or fuzzy, fashion.16-18

The purpose of the present study is to investigate the performance of a new fuzzy regression algorithm, which appears to be better able to overcome the difficulties discussed above. The algorithm was applied to relevant calibration data discussed in the analytical literature.

THEORY

The Fuzzy 1-Lines Algorithm. Let us consider a data set X = {x1, ..., xp} ⊂ R^s. Let us suppose that the set X does not have a clear clustering structure, or that its clustering structure corresponds to a single fuzzy set. Let us admit that this fuzzy set, denoted A, may be characterized by a linear prototype, denoted L = (v,u), where v is the center of the class and u, with |u| = 1, is its main direction. We raise the problem of finding the fuzzy set that is most suitable for the given data set, and we propose to do this by minimizing a criterion function similar to those presented in refs 19-21, 23, and 24. Let us consider the scalar product in R^s given by

⟨x,y⟩ = x^T M y, for every x, y ∈ R^s   (1)

where M ∈ M_s(R) is a symmetric, positive definite square matrix. If M is the unit matrix, the scalar product is the usual one, ⟨x,y⟩ = x^T y; we used the unit matrix for all the computations performed in this paper. Let us consider the norm in R^s induced by the scalar product,

|x| = ⟨x,x⟩^(1/2), for every x ∈ R^s

and the distance d in R^s induced by this norm,

d(x,y) = |x - y|, for every x, y ∈ R^s

To obtain the criterion function, we keep in mind that we wish to determine a fuzzy partition {A, Ā}. The fuzzy set A is characterized by the prototype L. In relation to the complementary fuzzy set Ā, we will consider that the dissimilarity between its hypothetical prototype and the point xj is constant and equal to R/(1 - R), where R is a constant from [0,1], with a role to be seen later in this paper. Let us denote the dissimilarity between the point xj and the prototype L by

D(xj,L) = d^2(xj,L)

where

d(xj,L) = (|xj - v|^2 - ⟨xj - v, u⟩^2)^(1/2)

is the distance from the point xj to the line L = (v,u) in the space R^s. The inadequacy between the fuzzy set A and its prototype L will be

I(A,L) = Σ(j=1..p) [A(xj)]^2 D(xj,L)

and the inadequacy between the complementary fuzzy set Ā and its hypothetical prototype will be

Σ(j=1..p) [Ā(xj)]^2 R/(1 - R)

Thus, the criterion function J, defined on pairs of a fuzzy set of X and a linear prototype and taking nonnegative real values, becomes

J(A,L;R) = Σ(j=1..p) [A(xj)]^2 d^2(xj,L) + Σ(j=1..p) [Ā(xj)]^2 R/(1 - R)

where R ∈ [0,1] is a fixed constant. With respect to the minimization of the criterion function J, the following two results are valid.

Let us consider a fuzzy set A of X. The prototype L = (v,u) that minimizes the function J(A,·) is given by

v = Σ(j=1..p) [A(xj)]^2 xj / Σ(j=1..p) [A(xj)]^2   (2)

and u is the eigenvector corresponding to the maximal eigenvalue of the matrix

S = M Σ(j=1..p) [A(xj)]^2 (xj - v)(xj - v)^T M   (3)

where M is the matrix from the definition of the scalar product (eq 1).

Similarly, let us consider a certain prototype L. The fuzzy set A that minimizes the function J(·,L) is given by

A(xj) = [R/(1 - R)] / ([R/(1 - R)] + d^2(xj,L))   (4)

The optimal fuzzy set will be determined by an iterative method in which J is successively minimized with respect to A and L. The proposed algorithm will be called Fuzzy 1-Lines:

S1. We choose the constant R ∈ [0,1]. We initialize A(x) = 1, for every x ∈ X, and l = 0. Let us denote by A^(l) the fuzzy set A determined at the lth iteration.

(16) Otto, M.; Bandemer, H. Chemom. Intell. Lab. Syst. 1986, 1, 71.
(17) Hu, Y.; Smeyers-Verbeke, J.; Massart, D. L. Chemom. Intell. Lab. Syst. 1990, 8, 143.
(18) Shimasaka, T.; Kitamori, T.; Harata, A.; Sawada, T. Anal. Sci. 1991, 7, 1389.
(19) Dumitrescu, D.; Sârbu, C.; Pop, H. F. Anal. Lett. 1994, 27, 1031.
(20) Pop, H. F.; Dumitrescu, D.; Sârbu, C. Anal. Chim. Acta 1995, 310, 269.
(21) Dumitrescu, D.; Pop, H. F.; Sârbu, C. J. Chem. Inf. Comput. Sci. 1995, 35, 851.
(22) Dumitrescu, D. Fuzzy Sets Syst. 1992, 47, 193.
(23) Bezdek, J. C.; Coray, C.; Gunderson, R.; Watson, J. SIAM J. Appl. Math. 1981, 40, 339.
(24) Bezdek, J. C.; Coray, C.; Gunderson, R.; Watson, J. SIAM J. Appl. Math. 1981, 40, 358.
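The iterative scheme (step S1 above, together with steps S2-S4 given below and the normalized distance of relation 5) can be sketched as follows. M is taken as the unit matrix, as in all the computations of this paper; the function name and the NumPy formulation are ours, and the slope/intercept extraction in the usage note assumes two-dimensional calibration data.

```python
import numpy as np

def fuzzy_1_lines(X, r=0.05, eps=1e-5, max_iter=100):
    """Sketch of the Modified Fuzzy 1-Lines algorithm (M = identity).

    X : (p, s) array of data points; r : the membership threshold R in (0, 1).
    Returns the memberships A and the linear prototype (v, u).
    """
    X = np.asarray(X, float)
    A = np.ones(X.shape[0])              # S1: A(x) = 1 for every x in X
    c = r / (1.0 - r)                    # the constant R/(1 - R)
    for _ in range(max_iter):
        w = A ** 2
        v = (w[:, None] * X).sum(axis=0) / w.sum()                          # relation 2
        D = X - v
        S = (w[:, None, None] * D[:, :, None] * D[:, None, :]).sum(axis=0)  # relation 3
        u = np.linalg.eigh(S)[1][:, -1]  # eigenvector of the maximal eigenvalue
        # squared distance from each point to the line L = (v, u)
        d2 = np.clip((D ** 2).sum(axis=1) - (D @ u) ** 2, 0.0, None)
        dr2 = d2 / d2.max() if d2.max() > 0 else d2   # normalized distance (relation 5)
        A_new = c / (c + dr2)            # memberships (relations 4 and 5)
        done = np.abs(A_new - A).max() < eps          # S4: stop when A stabilizes
        A = A_new
        if done:
            break
    return A, v, u
```

For two-dimensional data, slope = u[1]/u[0] and intercept = v[1] - slope*v[0] recover the calibration line; a point at the maximal distance from the line receives membership exactly R.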

Table 1. First Example: Calibration Data Measured by ICP-AES

concn, ppm    Mo       Cr       Co       Pb       Ni(a)    Ni(b)
0             8.19     -23.40   -1.83    13.46    6.40     28.33
0             16.05    -19.40   -2.45    -6.40    7.47     30.56
0.25          171.90   210.90   261.30   112.00   223.70   220.80
0.25          180.60   213.70   260.10   119.20   215.60   218.60
0.5           406.00   420.40   430.80   217.00   437.90   410.20
0.5           414.50   423.20   431.40   207.70   430.70   407.90
1             810.70   843.20   860.30   419.60   897.30   828.30
1             818.20   840.60   859.60   428.70   886.80   826.10

(a) Measured at wavelength 221.6 nm. (b) Measured at wavelength 231.6 nm.

Table 2. First Example: Values of the Estimated Parameters (each entry is a0, a1)

method         Mo            Cr            Co            Pb           Ni(a)         Ni(b)
LS             -3.07, 814.5  -11.9, 858.5  17.2, 846.1   9.84, 413.0  0.47, 886.3   21.9, 798.7
LSA            8.19, 802.5   -2.4, 845.6   1.3, 859.0    13.5, 407.1  2.67, 884.1   19.0, 807.1
LMA            -6.29, 802.1  -12.8, 864.0  21.7, 862.1   8.7, 412.7   -3.37, 889.8  19.8, 797.7
IRLS6          11.1, 802.3   -11.7, 858.0  -1.45, 862.1  10.0, 412.9  0.85, 885.8   22.2, 799.1
IRLS9          -1.88, 813.6  -11.8, 858.3  15.7, 847.2   9.91, 412.9  0.63, 886.1   22.0, 798.8
MFV            9.36, 803.9   1.31, 840.8   -1.96, 862.1  10.3, 412.1  6.4, 880.4    22.9, 801.3
SM             5.25, 808.4   -9.15, 854.6  0.39, 860.2   9.65, 412.9  5.1, 880.0    26.7, 782.7
RM             2.01, 813.5   -1.55, 844.5  2.25, 857.7   10.4, 410.8  6.2, 870.92   28.9, 765.2
LMS            12.1, 802.5   2.4, 839.6    -1.34, 862.0  11.8, 407.1  8.98, 848.6   29.4, 761.1
FR (R = 0.05)  10.63, 802.8  2.06, 839.8   -1.35, 862.0  6.36, 422.1  6.31, 880.1   17.35, 809.7
XWLS           12.1, 751.7   -21.4, 894.8  -2.1, 926.6   9.9, 413.9   6.9, 863.6    29.4, 772.7
WLS            -19.6, 836.2  -7.8, 852.9   5.0, 857.0    10.3, 412.7  6.7, 872.1    22.0, 779.0

(a) Measured at wavelength 221.6 nm. (b) Measured at wavelength 231.6 nm.

Table 3. First Example: Estimated Concentrations Corresponding to Signal Values of 100 and 600

y = 100:
method         Mo     Cr     Co     Pb     Ni(a)  Ni(b)
LS             0.123  0.123  0.098  0.218  0.112  0.098
LSA            0.114  0.121  0.115  0.213  0.110  0.100
LMA            0.133  0.131  0.091  0.221  0.116  0.101
IRLS6          0.111  0.130  0.118  0.218  0.112  0.097
IRLS9          0.125  0.130  0.100  0.218  0.112  0.098
MFV            0.113  0.117  0.118  0.218  0.106  0.096
SM             0.117  0.128  0.117  0.219  0.108  0.093
RM             0.121  0.120  0.114  0.218  0.108  0.093
LMS            0.110  0.116  0.118  0.217  0.107  0.093
FR (R = 0.05)  0.111  0.116  0.118  0.222  0.106  0.114
XWLS           0.117  0.136  0.110  0.218  0.108  0.091
WLS            0.143  0.138  0.111  0.217  0.107  0.098

y = 600:
method         Mo     Cr     Co     Pb     Ni(a)  Ni(b)
LS             0.740  0.713  0.689  1.429  0.676  0.724
LSA            0.737  0.712  0.697  1.441  0.676  0.719
LMA            0.756  0.709  0.671  1.433  0.678  0.727
IRLS6          0.734  0.713  0.698  1.429  0.676  0.723
IRLS9          0.740  0.713  0.690  1.429  0.676  0.724
MFV            0.735  0.712  0.698  1.430  0.674  0.720
SM             0.736  0.713  0.697  1.435  0.676  0.733
RM             0.735  0.712  0.697  1.435  0.682  0.746
LMS            0.733  0.712  0.698  1.445  0.696  0.750
FR (R = 0.05)  0.734  0.712  0.698  1.406  0.675  0.720
XWLS           0.782  0.700  0.690  1.436  0.687  0.739
WLS            0.741  0.724  0.694  1.429  0.680  0.723

(a) Measured at wavelength 221.6 nm. (b) Measured at wavelength 231.6 nm.

To avoid the dependency of the memberships on the scale of the data, in practice we use in relation 4, instead of the distance d, the normalized distance dr given by

dr(xj,L) = d(xj,L) / max(x∈X) d(x,L)

Thus, the relation used to determine the memberships is

A(xj) = [R/(1 - R)] / ([R/(1 - R)] + dr^2(xj,L))   (5)

S2. We compute the prototype L = (v,u) of the fuzzy set A^(l) using relations 2 and 3.

S3. We determine the new fuzzy set A^(l+1) using relation 4.

S4. If the fuzzy sets A^(l+1) and A^(l) are close enough, i.e., if

|A^(l+1) - A^(l)| < ε

where ε has a predefined value, then stop; else increase l by 1 and go to step S2. We noticed that a good value of ε with respect to the similarity of A^(l) and A^(l+1) is ε = 10^-5, and we used this value for all the computations performed in this paper.

The algorithm modified to use in relation 4 the normalized distance dr of relation 5 instead of the distance d, so that the memberships do not depend on the scale of the data, will be called Modified Fuzzy 1-Lines:

S1. We choose the constant R ∈ [0,1]. We initialize A(x) = 1, for every x ∈ X, and l = 0. Let us denote by A^(l) the fuzzy set A determined at the lth iteration.

S2. We compute the prototype L = (v,u) of the fuzzy set A^(l) using relations 2 and 3.

S3. We determine the new fuzzy set A^(l+1) using relation 5.

S4. If the fuzzy sets A^(l+1) and A^(l) are close enough, i.e., if

|A^(l+1) - A^(l)| < ε

where ε has a predefined value (10^-5), then stop; else increase l by 1 and go to step S2.

Properties of the Produced Fuzzy Set. In this section we study the properties of the fuzzy set obtained via the algorithm presented above. Let X be a given data set, and let A and L be the fuzzy set and its prototype, respectively, as they were determined by the algorithm Modified Fuzzy 1-Lines. The following relations are valid (the notations are those used previously):

A(x) = 1 ⟺ d(x,L) = 0   (i)

A(x) = R ⟺ dr(x,L) = 1   (ii)

A(x) ∈ [R,1], for every x ∈ X   (iii)

R = 0 ⟺ A(x) = 0, for every x ∈ X   (iv)

R = 1 ⟺ A(x) = 1, for every x ∈ X   (v)

A(xi) < A(xj) ⟺ d(xj,L) < d(xi,L)   (vi)

A(xi) = A(xj) ⟺ d(xj,L) = d(xi,L)   (vii)

Table 4. First Example: Values of New Quality Coefficients

method         coefficient   Mo           Cr           Co           Pb           Ni(a)        Ni(b)
LS             QC5 (NQC5)    1.50 (0.27)  1.87 (0.48)  1.67 (0.37)  1.56 (0.31)  1.62 (0.33)  1.70 (0.38)
               QC6 (NQC6)    3.51 (0.13)  3.06 (0.04)  3.42 (0.11)  3.31 (0.09)  3.33 (0.09)  3.21 (0.07)
LSA            QC5 (NQC5)    1.30 (0.16)  1.32 (0.17)  1.39 (0.21)  1.62 (0.34)  1.52 (0.28)  1.66 (0.36)
               QC6 (NQC6)    4.34 (0.29)  4.44 (0.31)  5.19 (0.45)  3.69 (0.16)  3.55 (0.14)  3.73 (0.17)
LMA            QC5 (NQC5)    2.28 (0.70)  2.14 (0.62)  2.71 (0.93)  1.90 (0.49)  2.04 (0.57)  2.20 (0.66)
               QC6 (NQC6)    2.90 (0.01)  3.07 (0.04)  2.83 (0.00)  3.40 (0.11)  3.27 (0.08)  3.17 (0.06)
IRLS6          QC5 (NQC5)    1.29 (0.16)  1.84 (0.46)  1.39 (0.21)  1.54 (0.29)  1.60 (0.33)  1.64 (0.35)
               QC6 (NQC6)    4.37 (0.29)  3.07 (0.04)  5.33 (0.48)  3.31 (0.09)  3.35 (0.10)  3.23 (0.07)
IRLS9          QC5 (NQC5)    1.45 (0.25)  1.86 (0.47)  1.61 (0.33)  1.56 (0.30)  1.61 (0.33)  1.68 (0.37)
               QC6 (NQC6)    3.59 (0.14)  3.06 (0.04)  3.51 (0.13)  3.31 (0.09)  3.34 (0.09)  3.22 (0.07)
MFV            QC5 (NQC5)    1.29 (0.16)  1.31 (0.17)  1.39 (0.21)  1.56 (0.31)  1.49 (0.27)  1.51 (0.27)
               QC6 (NQC6)    4.34 (0.29)  4.83 (0.38)  5.35 (0.48)  3.32 (0.09)  3.81 (0.19)  3.43 (0.11)
SM             QC5 (NQC5)    1.31 (0.17)  1.55 (0.30)  1.39 (0.21)  1.61 (0.33)  1.56 (0.31)  1.52 (0.28)
               QC6 (NQC6)    4.06 (0.23)  3.25 (0.08)  5.23 (0.46)  3.32 (0.09)  3.60 (0.15)  3.58 (0.14)
RM             QC5 (NQC5)    1.34 (0.19)  1.31 (0.17)  1.40 (0.22)  1.70 (0.38)  1.31 (0.17)  1.37 (0.20)
               QC6 (NQC6)    3.84 (0.19)  4.55 (0.33)  5.10 (0.43)  3.37 (0.10)  3.89 (0.20)  4.98 (0.41)
LMS            QC5 (NQC5)    1.29 (0.16)  1.31 (0.17)  1.39 (0.21)  1.52 (0.28)  1.26 (0.14)  1.37 (0.20)
               QC6 (NQC6)    4.40 (0.30)  4.86 (0.39)  5.32 (0.48)  3.51 (0.13)  4.53 (0.32)  5.19 (0.45)
FR (R = 0.05)  QC5 (NQC5)    1.29 (0.16)  1.31 (0.17)  1.39 (0.21)  1.71 (0.38)  1.50 (0.27)  1.77 (0.42)
               QC6 (NQC6)    4.36 (0.29)  4.85 (0.39)  5.32 (0.48)  3.94 (0.21)  3.78 (0.18)  3.71 (0.17)
XWLS           QC5 (NQC5)    1.57 (0.31)  1.44 (0.24)  1.69 (0.37)  1.49 (0.26)  1.23 (0.13)  1.41 (0.22)
               QC6 (NQC6)    3.41 (0.11)  3.96 (0.22)  3.48 (0.12)  3.34 (0.09)  4.34 (0.29)  4.14 (0.25)
WLS            QC5 (NQC5)    1.47 (0.25)  1.46 (0.25)  1.41 (0.22)  1.51 (0.28)  1.38 (0.21)  1.42 (0.23)
               QC6 (NQC6)    3.49 (0.12)  3.41 (0.11)  4.49 (0.32)  3.32 (0.09)  3.76 (0.18)  3.99 (0.22)

(a) Measured at wavelength 221.6 nm. (b) Measured at wavelength 231.6 nm.

Table 5. Second Example: Relevant Heteroscedastic Data

case 11: x = 1, 3, 6, 10, 30, 60, 100; y = 1, 3, 5, 11, 34, 54, 92
case 12: (x, y) = (1, 0.9), (1, 1.0), (1, 1.1), (3, 2.9), (3, 3.2), (3, 3.5), (10, 8.5), (10, 9.5), (10, 10.5)
case 21: x = 0.002, 0.003, 0.005, 0.008, 0.012; y = 0.12, 0.14, 0.27, 0.40, 0.52
case 22: x = 0, 2, 4, 6, 8, 10; y = 0.009, 0.158, 0.301, 0.472, 0.577, 0.739
case 31: x = 1, 2, 3, 4, 5; y = 1.1, 2.0, 3.1, 3.8, 6.5
case 32: x = 5.1, 10.1, 20.2, 30.2, 46.5; y = 0.1083, 0.1977, 0.4278, 0.5916, 0.9050
case 4: (x, y) = (0.564, 1.433), (1.107, 3.00), (1.715, 4.90), (2.27, 6.10), (2.28, 7.85), (3.44, 9.51), (4.61, 12.78), (5.16, 13.89), (5.73, 15.97)

The algorithm presented here converges toward a local minimum. Normally, the results of algorithms of this type are influenced by the initial partition considered.25,26 In this case, the initial fuzzy set considered being X itself, the optimal fuzzy set obtained is the one situated in the vicinity of X, and this makes the algorithm even more attractive. Let us remark that the role of the constant R is to affect the polarization of the partition {A, Ā}; it is now also clear why R was chosen in [0,1] and why the values 0 and 1 were avoided. Because the relative dissimilarities dr are used, the method is independent of linear transformations of the space. With properties i-vii and the remarks above in mind, the fuzzy set A determined here may be called the "fuzzy set associated with the classical set X and the membership threshold R". On the model of the theory presented above, we may build a theory to determine the fuzzy set A represented by a point prototype, or by any other geometrical prototype.

As seen above, the (Modified) Fuzzy 1-Lines algorithm produces the fuzzy set associated with the classical set X and the membership threshold R, together with its linear representation. This particular procedure may also be called Fuzzy Regression.

RESULTS AND DISCUSSION

Some relevant cases of calibration from the most popular books and well-cited papers are considered to illustrate the advantages of the fuzzy regression method described above.

Illustrative Example 1. To illustrate the performance of the fuzzy regression algorithm (FR) in the case of small deviations from homoscedasticity or in the presence of outliers, we refer to the data discussed by Rajkó,7 concerning the determination by ICP-AES of Mo, Cr, Co, Pb, and Ni in subsurface and drinking water (Table 1).
Considering the results obtained (Table 2) using eight different calibration methods, namely, ordinary least squares (LS), least sum of absolute residuals (LSA), least maximum absolute residuals (LMA), iteratively reweighted least squares with tuning constants 6 and 9 (IRLS6 and IRLS9), most frequent values (MFV), single median (SM), repeated median (RM), and least median of squares (LMS), Rajkó concluded that the best results were

(25) Dumitrescu, D. Classification Theory (Romanian); Babes-Bolyai University Press: Cluj-Napoca, 1991.
(26) Bezdek, J. C. Pattern Recognition with Fuzzy Objective Function Algorithms; Plenum Press: New York, 1981.

Table 6. Second Example: Values of the Estimated Parameters

Entries are (a0, a1) pairs, listed for each data set in the method order FR (R = 0.05), OLS, WLS, GWLS, IRLS, ELS; methods not applied to a given case are omitted.

case 11: (-0.030, 0.9174), (1.217, 0.9118), (0.15, 0.95), (0.0146, 0.9803)
case 12: (0.078, 0.9434), (0.2134, 0.9328), (0.0212, 0.9954), (0.0212, 0.9954)
case 21: (0.039, 40.0461), (0.0400, 41.6666), (0.0277, 43.935), (0.0272, 44.0390)
case 22: (0.01001, 0.0728), (0.0132, 0.0725), (0.0091, 0.0738), (0.00899, 0.0737)
case 31: (0.1852, 0.9238), (-0.4800, 1.2600), (-0.0390, 1.0945), (0.19, 0.92)
case 32: (0.0082, 0.0193), (0.0141, 0.0193), (0.0057, 0.0199), (0.0073, 0.0196), (0.0072, 0.0197)
case 4: (-0.112, 2.795), (0.257, 2.718), (-0.099, 2.782), (-0.1634, 2.896)

Table 7. Second Example: Values of New Quality Coefficients

Entries are (QC5, NQC5, QC6, NQC6) groups, listed for each data set in the same method order as in Table 6.

case 11: (1.05, 0.03, 4.60, 0.45), (1.15, 0.09, 3.56, 0.21), (1.33, 0.20, 3.57, 0.21), (1.50, 0.30, 3.63, 0.22)
case 12: (1.54, 0.27, 4.51, 0.25), (1.48, 0.24, 4.24, 0.20), (1.17, 0.08, 4.46, 0.24), (1.17, 0.08, 4.46, 0.24)
case 21: (1.34, 0.27, 2.98, 0.27), (1.76, 0.61, 2.43, 0.07), (1.45, 0.36, 2.47, 0.08), (1.42, 0.34, 2.47, 0.08)
case 22: (1.20, 0.13, 3.97, 0.43), (1.22, 0.15, 3.69, 0.35), (1.39, 0.27, 3.40, 0.26), (1.43, 0.29, 3.45, 0.28)
case 31: (1.00, 0.00, 4.34, 0.76), (1.43, 0.34, 2.72, 0.17), (1.13, 0.11, 3.12, 0.32), (1.00, 0.00, 4.36, 0.76)
case 32: (1.01, 0.01, 4.00, 0.64), (1.17, 0.14, 2.73, 0.17), (1.43, 0.35, 2.61, 0.13), (1.22, 0.18, 2.76, 0.19), (1.41, 0.33, 2.64, 0.14)
case 4: (1.04, 0.02, 6.02, 0.50), (1.11, 0.05, 4.68, 0.28), (1.04, 0.02, 5.80, 0.46), (1.29, 0.14, 4.17, 0.19)

produced by LMS. MFV yielded appropriate results for the measurements of Mo, Cr, Pb, Co, and Ni (221.6 nm). The calibration line computed by RM was acceptable for Cr, Pb, and Ni measured at both wavelengths, and that computed by LSA only for Mo. IRLS9 and LS gave nearly the same results. LMA gave very biased parameters; in fact, it was the most sensitive to the outliers.

Table 2 also includes the results obtained by the weighted least-squares method using 1/x^2 as a weighting factor (XWLS)27 and by the ordinary least-squares method using the reciprocals of the variances as weighting factors (WLS); the variances were computed from the only two replicates available in the data of Rajkó. Concerning the XWLS and WLS methods, we note good results, especially in the cases of heteroscedasticity; XWLS appears to be closer to LMS. Comparing the results obtained by the fuzzy regression method (FR) discussed above, using a constant R = 0.05, presented also in Table 2, it is easy to notice that FR is closer to LMS, in some cases being the best. In addition, for a more realistic comparison of the regression methods presented in Table 2, we computed the estimates of sample concentrations corresponding to two values of the signal y, 100 and 600. A careful examination of the results presented in Table 3 illustrates that the performance of FR equals that of LMS in some cases and exceeds that of the majority of the other methods.

To evaluate the linearity of the methods studied, this is a good opportunity to compare the different quality coefficients (QC) used so far in analytical chemistry for judging the goodness of fit of a regression line. Thus, we computed the values of QC1, defined as28

QC1 = { Σ(i=1..N) [(yi - ŷi)/ŷi]^2 / (N - 1) }^(1/2) × 100%   (6)

Table 8. Third Example: Data from Atomic Absorption Spectrometry

concn, mg L-1: 0, 10, 20, 30, 40, 60, 80, 100
Al (two data sets; the assignment of the extracted values to the two columns is not fully recoverable): 0.000, 0.030, 0.056, 0.000, 0.133, 0.177, 0.194, 0.245, 0.330, 0.084
Mn: 0.000, 0.046, 0.089, 0.171, 0.256, 0.320, 0.372
Mn: 0.000, 0.110, 0.226, 0.430, 0.637, 0.765, 0.870
Cu: 0.000, 0.097, 0.182, 0.256, 0.331, 0.417
Cu: 0.000, 0.088, 0.162, 0.222, 0.285, 0.344
Pb: 0.000, 0.025, 0.055, 0.078, 0.097, 0.156, 0.194
Pb: 0.000, 0.061, 0.130, 0.183, 0.230, 0.358, 0.433

where yi and ŷi are, respectively, the response measured at each datum and that predicted by the model in Table 2, and N is the number of data points. We also computed the values of QC2, constructed similarly to QC1, with the difference that the measured responses yi replace the estimates ŷi in the denominator,29 and also QC3 and QC4, which refer, respectively, to the mean estimated signal and the mean measured signal rather than to the signal itself.30 The main conclusion drawn from these results, also presented in ref 27, by comparison with the statements of Rajkó, is that QC2 and QC1 are more suitable for evaluating the goodness of fit. The quality coefficients QC3 and QC4 appear to be a better solution

(27) Sârbu, C. Anal. Lett. 1995, 28, 2077.
(28) Knegt, K.; Stork, G. Fresenius' Z. Anal. Chem. 1974, 270, 97.
(29) Koscielniak, P. Anal. Chim. Acta 1993, 278, 177.
(30) Xiaoning, W.; Smeyers-Verbeke, J.; Massart, D. L. Analusis 1992, 20, 209.

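Several of the reference methods compared in these examples are weighted least-squares line fits. A minimal sketch of such a fit is given below; taking wi = 1/si^2 gives the WLS weighting and wi = 1/xi^2 the XWLS weighting discussed above (the function name is ours).

```python
import numpy as np

def weighted_ls_line(x, y, w):
    """Weighted least-squares fit of y = a0 + a1*x with weights w."""
    x, y, w = (np.asarray(a, float) for a in (x, y, w))
    xb = (w * x).sum() / w.sum()    # weighted mean of x
    yb = (w * y).sum() / w.sum()    # weighted mean of y
    a1 = (w * (x - xb) * (y - yb)).sum() / (w * (x - xb) ** 2).sum()
    a0 = yb - a1 * xb
    return a0, a1
```

With all weights equal, this reduces to the ordinary least-squares line; heteroscedastic data are handled by giving the noisier points proportionally less influence.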

Table 9. Third Example: Values of the Estimated Parameters

algorithm       parameter    Al      Al      Mn      Mn      Cu(a)   Cu(a)   Pb      Pb
FR (R = 0.05)   a0 (×10-3)   -0.15   -0.54   8.36    12.12   16.78   40.11   -0.11   1.99
                a1 (×10-3)   2.94    4.10    3.91    9.44    3.99    3.04    2.60    5.93
LS              a0 (×10-3)   0.03    6.40    11.42   38.83   19.90   29.70   2.31    11.08
                a1 (×10-3)   3.04    4.10    3.79    8.92    3.95    3.18    2.45    5.49
FUZR            a0 (×10-3)   -1.11   8.84    1.37    8.74    23.66   35.32   9.13    4.72
                a1 (×10-3)   2.97    3.94    4.10    10.40   3.87    3.09    2.32    5.89
FUZL            a0 (×10-3)   -0.04   0.48    0.00    4.66    16.81   40.48   0.23    1.98
                a1 (×10-3)   2.95    4.10    4.28    10.55   4.00    3.04    2.57    5.93
SM              a0 (×10-3)   0.00    1.00    6.00    14.38   19.25   35.25   0.75    4.00
                a1 (×10-3)   2.98    4.11    4.00    9.56    3.95    3.11    2.43    5.70
RM              a0 (×10-3)   0.00    0.00    4.62    7.25    24.25   36.00   0.40    7.33
                a1 (×10-3)   2.97    4.12    4.14    10.28   3.86    3.10    2.46    5.57
LMS(b)          a0 (×10-3)   0.00    0.00    4.00    4.60    33.00   40.67   0.00    29.00
                a1 (×10-3)   2.95    4.13    4.20    10.54   3.73    3.03    2.43    5.05
LMS(c)          a0 (×10-3)   0.25    0.75    0.38    4.60    32.75   40.33   0.38    29.75
                a1 (×10-3)   2.95    4.13    4.20    10.54   3.73    3.03    2.43    5.05

(a) Data sets do not include the zero point. (b) Obtained with the method of Massart et al. (c) Obtained with Progress by Rousseeuw and Leroy.

Figure 1. First example: variation of NQC6 as a function of R.

Table 10. Third Example: Values of New Quality Coefficients

algorithm      coefficient   Al           Al           Mn           Mn           Cu(a)        Cu(a)        Pb           Pb
FR (R = 0.05)  QC5 (NQC5)    1.01 (0.01)  1.00 (0.00)  1.16 (0.09)  1.33 (0.20)  1.13 (0.08)  1.04 (0.02)  1.35 (0.28)  1.00 (0.00)
               QC6 (NQC6)    4.05 (0.66)  4.31 (0.75)  3.76 (0.25)  3.53 (0.20)  4.44 (0.41)  4.72 (0.47)  3.20 (0.35)  4.16 (0.69)
LS             QC5 (NQC5)    1.18 (0.15)  1.11 (0.09)  1.60 (0.36)  1.65 (0.40)  1.50 (0.30)  1.70 (0.42)  1.43 (0.35)  1.58 (0.47)
               QC6 (NQC6)    3.05 (0.29)  2.84 (0.22)  3.15 (0.11)  3.09 (0.10)  2.89 (0.05)  3.07 (0.09)  2.53 (0.10)  2.55 (0.11)
FUZR           QC5 (NQC5)    1.01 (0.01)  1.08 (0.06)  1.07 (0.04)  1.08 (0.05)  1.63 (0.38)  1.05 (0.03)  1.35 (0.29)  1.09 (0.08)
               QC6 (NQC6)    3.85 (0.58)  3.21 (0.35)  4.02 (0.31)  4.78 (0.49)  3.35 (0.16)  4.52 (0.43)  2.63 (0.14)  3.07 (0.30)
FUZL           QC5 (NQC5)    1.01 (0.01)  1.00 (0.00)  1.08 (0.04)  1.09 (0.05)  1.16 (0.10)  1.04 (0.02)  1.34 (0.28)  1.00 (0.00)
               QC6 (NQC6)    4.11 (0.68)  4.29 (0.74)  4.92 (0.52)  4.96 (0.53)  3.93 (0.29)  4.71 (0.47)  3.21 (0.35)  4.17 (0.70)
SM             QC5 (NQC5)    1.04 (0.03)  1.00 (0.00)  1.08 (0.05)  1.19 (0.11)  1.26 (0.16)  1.22 (0.13)  1.48 (0.39)  1.06 (0.05)
               QC6 (NQC6)    3.70 (0.53)  4.32 (0.75)  4.03 (0.31)  3.62 (0.22)  3.74 (0.25)  3.57 (0.21)  2.66 (0.15)  3.31 (0.38)
RM             QC5 (NQC5)    1.03 (0.02)  1.00 (0.00)  1.06 (0.03)  1.08 (0.05)  1.37 (0.22)  1.52 (0.32)  1.33 (0.26)  1.04 (0.03)
               QC6 (NQC6)    3.83 (0.57)  4.39 (0.77)  4.79 (0.49)  4.55 (0.43)  3.41 (0.17)  3.24 (0.13)  2.68 (0.16)  3.48 (0.45)
LMS(b)         QC5 (NQC5)    1.02 (0.01)  1.00 (0.00)  1.07 (0.04)  1.09 (0.05)  1.28 (0.17)  1.48 (0.29)  1.39 (0.31)  1.01 (0.00)
               QC6 (NQC6)    4.13 (0.68)  4.33 (0.76)  5.01 (0.54)  4.97 (0.53)  3.98 (0.30)  3.92 (0.29)  3.32 (0.39)  4.16 (0.69)
LMS(c)         QC5 (NQC5)    1.02 (0.01)  1.00 (0.00)  1.06 (0.04)  1.09 (0.05)  1.27 (0.16)  1.46 (0.28)  1.35 (0.29)  1.01 (0.01)
               QC6 (NQC6)    4.02 (0.64)  4.14 (0.69)  4.53 (0.43)  4.97 (0.53)  3.86 (0.28)  3.85 (0.27)  3.42 (0.42)  3.99 (0.63)

(a) Data sets do not include the zero point. (b) Obtained with the method of Massart et al. (c) Obtained with Progress by Rousseeuw and Leroy.

when comparing methods based on the same algorithm, i.e., least squares. However, taking into account the contradictory values of the different quality coefficients and the diversity of the methods with respect to their underlying algorithms, we have introduced two new quality coefficients, QC5 and QC6.


Figure 2. Second example: variation of NQC6 as a function of R.

The first coefficient, QC5, refers the residuals to the maximum of the absolute residuals,

QC5 = [ Σ(i=1..N) (ri / max|ri|)^2 ]^(1/2)   (7)

and QC6 refers them to the mean of the absolute residuals,

QC6 = [ Σ(i=1..N) (ri / r̄)^2 ]^(1/2)   (8)

It is very easy to prove that the possible values of QC5 lie in the range [1, N^(1/2)] and that the possible values of QC6 lie in the range [N^(1/2), N]. The results obtained by computing QC5 and QC6 are given in Table 4. Moreover, in the same table we give the values of the normalized coefficients NQC5 (eq 9) and NQC6 (eq 10), which take values within the range [0,1] and thus appear to be more practical:

NQC5 = (QC5 - 1) / (N^(1/2) - 1)   (9)

NQC6 = (QC6 - N^(1/2)) / (N - N^(1/2))   (10)
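The four coefficients of eqs 7-10 depend only on the residuals of a fitted line and can be computed directly; a sketch:

```python
import numpy as np

def quality_coefficients(residuals):
    """QC5, QC6 (eqs 7 and 8) and their normalized forms (eqs 9 and 10)."""
    r = np.abs(np.asarray(residuals, float))
    N = r.size
    qc5 = np.sqrt(((r / r.max()) ** 2).sum())     # eq 7: relative to max |r_i|
    qc6 = np.sqrt(((r / r.mean()) ** 2).sum())    # eq 8: relative to mean |r_i|
    nqc5 = (qc5 - 1.0) / (np.sqrt(N) - 1.0)       # eq 9: maps [1, sqrt(N)] onto [0, 1]
    nqc6 = (qc6 - np.sqrt(N)) / (N - np.sqrt(N))  # eq 10: maps [sqrt(N), N] onto [0, 1]
    return qc5, qc6, nqc5, nqc6
```

Equal residuals drive QC5 to its upper bound and QC6 to its lower bound; a single dominant residual does the opposite, which is why the pair diagnoses outliers and lack of fit in complementary ways.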

Examining Table 4, it is easy to notice a very good agreement with Rajkó's statements, especially for QC6. The fuzzy regression method appears to be the best in some cases and much closer to LMS in others. Overall, it may be stated that these new quality coefficients for the goodness of fit proposed in this paper confirm our main conclusions and are in good agreement with the statements in the analytical literature (see the following examples).

Illustrative Example 2. Let us consider the relevant heteroscedasticity examples discussed by Garden et al.,31 who used the inverse of the variance as a weighting factor (wi = si^-2) (cases 11 and 12), and those of Caulcutt and Boddy32 (case 21) and of Miller and Miller2 (case 22), who used wi = si^-2/(Σsi^-2/N). The data are presented in Table 5, and the computed results are presented in Table 6. Case 31, also shown in Table 5, refers to the data constructed by Phillips and Eyring9 to illustrate the insensitivity of the technique of iteratively reweighted least squares (IRLS) to erroneous observations. We also compare the results given by FR using R = 0.05 with the results reported by Aarons33 using OLS, WLS, and an extended least-squares method (ELS) for studying the reproducibility of an ibuprofen assay (case 32). The last example in Table 5 (case 4) considers the paper of Schwartz10 concerning the gas chromatographic determination of benzene34 under conditions of nonuniform variance. In that paper, Schwartz observed that the weighting factor must reflect both the replication number ni and the variance at the location of the point, so that wi = ni/si^2. In all these cases, we included the values obtained using XWLS, which produces better results than WLS. Considering the parameters shown in Table 6, we computed the normalized quality coefficients NQC5 and NQC6 (see Table 7). Again, the FR method seems to be the best.

Illustrative Example 3.
The last example reported in this paper refers to data analyzed by Massart et al.13 concerning the determination of Al, Mn, Cu, and Pb in biological samples by AAS. The data listed in Table 8 were selected to cover a wide range of outlier types. Considering the results obtained (see Tables 9 and 10) using seven different calibration methods, namely, LS, SM, RM, LMS, the two fuzzy methods FUZR16 and FUZL,17 and a modified least median of squares, Massart et al. concluded that the best results were given by LMS. With respect to SM and RM, it was noted that RM is not only more robust but also faster than SM. In the method FUZR of Otto and Bandemer,16 a circle acts as the support of the fuzzy observations; in the method FUZL,17 a line, which appeared to be more suitable, serves as the support of the fuzzy observations. By a careful examination of the data in Table 10, we notice once again a very good agreement between the conclusions expressed by Massart et al. and the values evaluated for NQC5 and NQC6.

(31) Garden, J. S.; Mitchell, D. S.; Mills, W. N. Anal. Chem. 1980, 52, 2310.
(32) Caulcutt, R.; Boddy, R. Statistics for Analytical Chemists; Chapman & Hall: London, 1983; p 104.
(33) Aarons, L. J. Pharm. Biomed. Anal. 1984, 2, 395.
(34) Bocek, P.; Novak, J. J. Chromatogr. 1970, 51, 375.

Figure 3. Third example: variation in NQC6 as a function of R.

Figure 4. Third example: informational energy as a function of R.

Figure 5. Third example: informational entropy as a function of R.

By comparing the three fuzzy methods, we may conclude that the FUZL method17 seems to be the best and the FUZR method16 the poorest. FR is much closer to FUZL when a value of R = 0.05 is used.

CONCLUSIONS

A new fuzzy regression algorithm has been described in this paper and compared with conventional ordinary and weighted least-squares and robust regression methods. The application of these different methods to relevant data sets proved that the performance of the procedure described here exceeds that of the ordinary least-squares method and equals, and often exceeds, that of weighted or robust methods, including the two fuzzy methods proposed in refs 16 and 17.


Moreover, we underline the effectiveness and the generality of the two new criteria proposed in this paper for diagnosing the linearity of calibration lines in analytical chemistry. The effectiveness of these criteria allowed us to select an optimal value for the constant R, as may be seen in Figures 1-3. In addition, we emphasize that the fuzzy regression method discussed above provides not only the estimates of the parameters but also additional information in the form of membership degrees. In this way, the fuzzy regression algorithm gives a new aspect to regression methods. Using the membership degrees of each point

(35) Sârbu, C. Anal. Chim. Acta 1993, 271, 269.



to the regression line, we may compute the informational energy, E = (1/N) Σ(j) [A(xj)]^2,35 and/or the informational entropy, as is well illustrated in Figures 4 and 5. These quantities allow us to appreciate the presence of outliers and the linearity of the regression line.
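From the membership degrees, the informational energy follows directly from the formula just given. The exact entropy expression used for Figures 4 and 5 is not reproduced in the text, so the Shannon-type form over {A, 1 - A} below is our assumption.

```python
import numpy as np

def informational_energy(A):
    """E = (1/N) * sum_j A(x_j)^2 (ref 35)."""
    A = np.asarray(A, float)
    return float((A ** 2).mean())

def informational_entropy(A):
    """Assumed Shannon-type entropy of the memberships, averaged over points."""
    A = np.clip(np.asarray(A, float), 1e-12, 1.0 - 1e-12)
    return float(np.mean(-A * np.log(A) - (1 - A) * np.log(1 - A)))
```

Memberships near 1 (all points close to the line) give energy near 1 and entropy near 0; outliers pull the energy down and the entropy up, which is what makes these quantities useful diagnostics.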

Received for review June 7, 1995. Accepted October 31, 1995.

AC950549U

Abstract published in Advance ACS Abstracts, January 1, 1996.