Stable Numerical Solution of the Adsorption Integral Equation Using

Mar 10, 1994 - A stable numerical method for the solution of the adsorption integral equation ... real solids are characterized by continuous distribu...
0 downloads 0 Views 900KB Size
2778

Langmuir 1994,10, 2778-2785

Stable Numerical Solution of the Adsorption Integral Equation Using Splines Jacek Jagiellot Department of Chemical Engineering a n d Materials Science, Syracuse University, Syracuse, New York 13244-1190 Received March 10, 1994. I n Final Form: May 18, 1994@ A stable numerical method for the solution of the adsorption integral equation (SAIEUS)is proposed. The method combinesregularization principle with the B-splinerepresentation ofthe distribution function. Such a representation is convenient and in the case of smooth functionsrequires fewer variables as compared to the discreet representation. The stability of the solution is imposed by the regularization method. The inherent problem of the method, which is the choice of the optimal degree of smoothing, is solved using a comprehensive analysis of the variance of the solution and the effective bias due to the smoothing effect. This approach ensures that the chosen solution contains all the information which can be extracted from the data while the artifacts are excluded. The problem of the resolution of the method and the accuracy of the solutionin context of the error in the data and the complexity of the true adsorption energy distribution are discussed using several simulated examples of adsorption isotherms.

Introduction Adsorption of gases on solid surfaces occurs due to the existence of the gas-solid attraction forces. Energy of interaction between the solid and adsorbed molecule is called adsorption energy, q. It is usually assumed that real solids are characterized by continuous distribution of adsorption energies,flq), rather than a single adsorption energy value which is characteristic for a n ideal homogeneous surface. The fundamental relationship between the measured adsorption isotherm, 0,and unknown energy distribution function is given in the form of a n adsorption integral equation

where p is the equilibrium pressure and B(p,q) is the socalled local adsorption isotherm which describes adsorption on the adsorption sites characterized by energy q. The range of adsorption energies is assumed to be included in the integration interval (a,,B). Function 8(p,q) is the kernel ofthe integral equation and its mathematical form represents the accepted model of adsorption. Since the distribution function f l q ) is a quantitative description of energetic heterogeneity of the solid surface, and as such can be used for characterization of adsorbents, many methods have been proposed to solve eq 1 with respect toflq). Reviews of existing methods for evaluating the adsorption energy distributions can be found in recent monographs by Jaroniec and Madey,l by Rudzinski and Everett,2 and in the review paper by Cerofolini and Re3 who emphasized the mathematical aspect of solution of eq 1. In view of practical applications it appears that two groups of methods were specially popular in adsorption literature: the local approximate solution^^-^ due to their ~~~

~

+ On leave from The Institute of Energochemistry of Coal and Physicochemistry of Sorbents, University of Mining and Metallurgy, 30-059 Krakbw,Poland. Abstract published in Advance ACS Abstracts, July 1, 1994. (1) Jaroniec, M.; Madey, R. Physical Adsorption on Heterogeneous Solids; Elsevier: Amsterdam, 1988. (2)Rudziriski, W.; Everett, D. H.AdsorptwnofGases on Heterogeneous Surfaces; Academic Press: London, 1991. (3) Cerofolini, G. F.; Re, N. Riv. Nuovo Cimento SOC.Ital. Fis. 1993, 16 (71, 1. (4)Harris, L. B. Surf. Sci. 1968,10,128. (5) Rudzinski, W.; Jagiello, J.; Grillet, Y. J . Colloid Interface Sci. 1982,87,478. @

simplicity and numerical methods7-" due to their potential accuracy. Equation 1 is a linear Fredholm integral equation of the first kind and the inherent difficulty in solving this equation numerically is related to its ill-posed character which is manifested by the fact that even small changes in the data cause large changes in the solution. In order to obtain meaningful results one has to stabilize the solution. The best known method for stabilization is the so-called regularization method introduced by Tikhonov,12 first applied to the adsorption integral equation by Houseg and used by other authors.13J4 In all these applications of the regularization method the distribution function, flq), was discretized and evaluated a t a certain number of nodes. There exists, however, another approach in whichflq) is represented by piecewise continuous function. This approach to the solution of linear Fredholm integral equations of the first kind was proposed by Hanson and Phillips15who used linear splines. Also cubic B-splines were proposed by Provencher16J7 as a n option in the numerical package (CONTIN) for inverting noisy linear operator equations. It is surprising that although splines were used in different methods for the solution of the adsorption integral equationaJ8J9they have not yet been used as basis functions in the direct regularization method. The major problem in successful application of the regularization method, regardless of the way in which the unknown function is represented, is the degree of regularization which is controlled by the so-called smoothing parameter to be chosen. Although there exist some (6) Nederlof, M. M.; Van Riemsdijk, W. H.; Koopal, L. K. J . Colloid Interface Sci. 1990,135,410. (7) JagieHo,J.; Ligner, G.;Papirer, E. J . Colloid Interface Sci. 1990,

--".

1.77, 1 9 A --,

(8) House, W. A.; Jaycock, M. J . Colloid Polym. Sci. 1978,256,52. (9) House, W. A. J . Colloid Interface Sci. 1978,67,166. (10)Sacher, R. S.; Morrison, I. D. J.Colloid Interface Sci. 1979,70, 153. (11)Vos, C. H. W.; Koopal, L. K. J . Colloid Interface Sci. 1985,105, 183. (12) Tikhonov, A. N. Dokl. h a d . Nauk U S S R 1943,39,195. (13) Merz, P. H. J . Comput. Phys. 1980,38,64. (14) Von Szombathely,M.; BrBuer, P.;Jaroniec,M. J . Comput. Chem. 1992,13,17. (15) Hanson, R. J.; Phillips, J. L. Numer. Math. 1975,24,291. (16) Provencher, S. W. Computer Phys. Commun. 1982,27,213. (17) Provencher, S. W. Computer Phys. Commun. 1982,27,229. (18) Brauer, P.; Fassler, M.; Jaroniec, M. Thin Solid Films 1985, 123,245.

0 1994 American Chemical Society

Solution of Adsorption Integral Equation

Langmuir, Vol. 10, No. 8,1994 2779

criteria for choosing a smoothing p a r a . ~ n e t e r , ~it ~ seems J~-~~ that there is no general consensus on the method of its optimal choice.23 The objective of this paper is to demonstrate the efficiency of the numerical method for solution of the adsorption integral equation using B-splines (SAIEUS). In the proposed method the choice of a smoothing parameter is based on the analysis of the variance of the solution.

Outline of the Method It is assumed that the solution of eq 1 may be represented in the following form

where bj are the coefficients to be established, q5j are cubic B-splines with equally spaced knots in the range of (a,p), and m is the number of splines taken to the linear combination. B-splines are well-known in the numerical analysis l i t e r a t ~ r e .Here ~ ~ it is important to know that each cubic B-spline is a cubic polynomial in four subsequent subintervals between the knots and is equal to zero outside this region. It is a continuous nonnegative function and its first two derivatives are also continuous. The advantages that the spline representation of the unknown function has as compared to the discreet representation are fewer variables are required to approximate the solution with a given accuracy and the accuracy of evaluating the integrals is independent of the number of variables. Equation 1 may now be written for n isotherm data points having statistical weights wi in the following form m

(i = 1,2, ...)n)

yz =

(3)

j= 1

where

and

or in the matrix notation

y=Ab

(6)

Such a system of linear equations is usually solved with respect to the unknown vector b by the least-squares(LS) principle Minimize llAb - y1I2with respect to b

(7)

which leads to the well-known solution

where 6Ls is the LS estimate of b. At this point we digress to the ill-posed character of eq 1. P h i l i i p ~illustrated ~~ this problem in a very elegant way by showing that addition of some fast oscillating function, even with large amplitude, to the solution may have a negligible effect on the values of the integral. This is due to the smoothing effect of the integration itselfwhich makes the value of the integral insensitive to the exact form of the f(q) function. Conversely, small changes in the data may have a strong effect on the solution. As a consequence of low sensitivity of the integral to the particular form of the integrated function, when m increases and functions #j become more similar to each other the columns of matrix A become more linearly dependent and the system of linear equations (6)becomes an ill-conditionednumerical problem. The formal solution given by eq 8 is not stable in this case since matrix ATA is close to singular. The magnitude of this problem depends on the number of unknown parameters, m, and, thus, on the desired accuracy or amount of information about the solution. It is obvious that a stable least-squares solution will be obtained for a low number of parameters. I t follows that the controlled adjustment of m may be a way of stabilizing the solution. This approach, however, does not guarantee the sufficient accuracy in general. Let the root-mean-square (rms) error, s

be a measure of the accuracy of fit the equation to the data and let SO be the minimum value of this quantity obtained from the least-squares solution. Now let us consider such changes of this solution which would make s only slightlylarger thanso but still within the uncertainty of the data. In the case of a well-posed (stable) problem such changes may only be small, whereas in the case of the ill-posed problem the changes in the solution may be large without considerably affecting the s value. Consequently there is a number of different solutions which are acceptable from the view point of the accuracy of fit. A n approach is sought, which leads to that solution which would be optimal in the sense that it would extract maximum information from the given data but would not introduce artifacts in the form of details which are not justified by these data. In light of the fact that many fimctions which fit the data with acceptable accuracy show spurious oscillations, we will choose the function which is the least oscillatory, and the smoothest in some sense. To accomplish this we modify the least-squares principle (7) by adding the smoothing term and thus obtain the regularization principle.12 The commonly accepted measure of smoothness of a curve is the integral of its squared second derivative. Adopting this measure, the regularization principle is written as

+

Minimize IlAb - y1I2 AJ:f”(q)’

dq with respect to b

(10) (19)Contescu,C.;Jagiello, J.;Schwarz,J. A.Langmuir 1993,9,1754. (20) Wahba, G.SIAM J.Numer. Anal. 1977, 14, 651. (21) Butler, J. P.;Reeds, J. A.; Dawson, S.V. SLAM J.Numer. Anal. 1981, 18,381. (22) Brown, L.F.;Travis, B. J.InFundamentals ofuorption; Myers, A. L.,Belfort, G., Eds.; Engineering Foundation: New York, 1984; p 125. (23) Koopal, L. IC;Vos, C. H. W. Langmuir 1993, 9, 2593. (24) De Boor, C.APractica1 Guide to Splines;Springer-Verlag: New York, 1978.

where 1 is the so-called smoothing parameter which controls the balance between the smoothnesslroughness of the solution and the accuracy of fit. If the solution is given in the form of eq 2, then the integral in expression 10 may be written in terms of b parameters as (25) Phillips, D. L.J.Assoc. Comput. Mach. 1962,9, 84.

Jagidto

2780 Langmuir, Vol. 10,No. 8,1994

Min I Ab - yl i2

+ AbTGb

(11)

where matrix G is defined by

(12) A necessary condition for a minimum, that is the first derivatives with respect to variables are equal to zero, leads to the solution of the regularization problem (11)

6 = ( A ~ A+ A G ) - ~ A ~ Y

(13)

This result is different from the LS solution (8) by the stabilizing term AG. An equivalent approach, called ridge regression, was proposed by H o e r P in the context of the unstable LS problems. It was demonstrated2' that there exists a parameter, I =- 0, such that the mean square error of the ridge regression estimate of variables is less than the mean square error ofthe LS estimate. Therefore the regularized solution not only is more stable but also may be more accurate than the least-squares solution provided the optimal smoothing parameter I is established. Before addressing the problem of choice of the smoothing parameter, it is useful to consider some properties of the regularized solution. Unlike the LS solution the regularized solution (13) is a biased estimator of b which means that the expected value of 6, ~ ~ 6depends 3 , on the I parameter. Consequently also the variance of 6 is dependent on I var(6) = ~ Z A ~ A Z

(14)

+

(15)

where

Z = (ATA IG1-l

and u2is the error variance of the data. The expression for the variance is derived from eq 13 and is based on the well-known2* property of the linear transformation of random variables that is var(u) = Lvar(t)LT

(16)

where t and u are vectors of random variables related by the linear transformation u = Lt. Hoerl and Kennard2' in their discussion of ridge regression represented the bias introduced by this method in terms of the unknown vector b. Such an approach, although useful for general considerations, is not practical for the estimation of the amount of bias introduced by using a given value of I . Let us consider another approach that introduces the measure ofthe effectivebias based on regularized solution 6. Suppose that 6 ( I ) estimates the true solution with sufficient accuracy for certainil. Then the predicted values of y are given by

y(A) =

= AZATy

(17)

Now, if y in eq 13 is replaced by its predicted vector 4, a new estimate of b is obtained

6(A)= ZATg(A)= ZATA6(I)= ZATAZATy (18) which is apparently different from 6 unless d = 0 when (26) Hoerl, A. E.Chem. Eng. Prog. 1962, 58, 54. (27) Hoerl, A. E.;Kennard, R. W. TechnometTics 1970,12, 55. (28) Scheffe, H. The Analysis of Variance; Wiley: New York, 1959.

both estimates reduce to the LS solution. The difference given by

d(I)= 6(A)- b(A) = ( U T A- I)6(A)= (ZATA- I)ZATy (19) represents the effective bias introduced by the regularization procedure when 6(A)is assumed to be the estimate of b for a given I . The effective bias plays an important role in our further analysis of the regularized solution of the adsorption integral equation. At this point it is convenient to introduce the singular value decomposition SVD29 which is a fundamental computational tool used in regression problems. The properties and numerical applications of the SVD are wellLet P be a matrix defined by

p =~

~ - 1 ' 2

(20)

and let the singular value decomposition of P be

P = usvT

(21)

where U and V are orthogonal matrices and S is the diagonal matrix of singular values. Using eqs 20 and 21 and the properties of orthogonal matrices in combination with eq 13, we obtain the solution in the new form

6(A)= G-1/2V(S2+ ,WISUTy

(22)

From this expression the role of the smoothing parameter is explicitly seen. Another, computatipal, advantage of using the SVD is that from eq 22 b(I)can be easily calculated for different values ofA by matrix multiplication only. It is straightforward to express also other quantities associated with the regularized solution in temrs of SVD. Here, for instance are expressions for the variances associated with vectors 6 and d which are important in our further analysis: var(6) = 2

~1 / -2

~ 2 ~ 2 1 (/ 2~~ )- T

(23)

+

var(d) = dG-1/%(K2S2- 2K3S4 K4S6)(G-1/2V)T (24) where

K = (5'

+ AI)-'

(25)

The Choice of the Smoothing Parameter. Since the regularized solution (13)is biased and depends on the smoothing parameter, the choice of I is a n inherent part of the regularization method. There exist several methods of choosing this parameter. The literature on related problems such as ridge r e g r e ~ s i o n ,nonparametric ~~ r e g r e s ~ i o n , linear ~ ~ , ~ ~operator equations,20 and recent review paper^^^"^ reveal that the method used most is the generalized cross-validation method GCV.20This method does not require knowledge of u2and estimates A from the data. The I parameter is obtained as a minimizer of the following expression: (29) Golub, G.H.; Reinsch, C. Numer. Math. 1970, 14, 403. (30) Lawson, C. L.; Hanson, R. J. Solving Least Squares Problems; Prentice-Hall: Englewood Cliffs, NJ, 1974. (31) Golub, G . H.; Heath, M.; Wahba, G. Technometrics 1979,21, 215. (32) Craven, P.;Wahba, G. Numer. Math. 1979,31, 377. (33) Silverman, B. W. J . R.Statist. SOC.B 1986, 47,1. (34) Hall, P.;Titterington,D. M. J . R.Statist. SOC.B 1987,49, 184. (35) Thompson,A.M.;Brown, J. C.; Kay, J. W.; Titterington, D. M. IEEE Trans. Pattern Anal. Machine Intell. 1991, 13, 326.

Solution of Adsorption Integral Equation

GCV =

Langmuir, Vol. 10, No. 8, 1994 2781

nl I(I - AzAT!yl l2

(26)

[trace(I - AZA~)]'

The theoretical rational for this method31is that for large n the GCV estimate ofd minimizes the mean square error of prediction of y values. From a number of applications it follows that in practice this method usually gives good results. However, there are reports that in some cases it underestimates the values of ;113,36o r can even fail ~atastrophically3~ giving no positive d or grossly underestimated value of this parameter. If GCV is the only method one applies to find d, then there is no way to verify whether the result is correct or not. Therefore we prefer more comprehensive analysis of the regularized solution in order to choose the optimal smoothing parameter. Intuitively, it is natural to consider in this purpose the balance between the stability (robustness) of the solution and the effective bias introduced by the smoothing parameter. Since we are not interested in the values of particular bi parameters but rather in their average properties, we define the following average quantities

(27)

Now let us examine the properties of the effective bias. Using similar arguments as above we have rms(E[dl) = q ( d )

(33)

for large OM). Another important property can be derived comparing eqs 23 and 24. Taking into account definition 25 it is easy to show for d > 0 that

K?S? - 2K:S:

+ K;S:

< Ki2Si2

for i = 1, 2, ..., m (34)

and consequently

q(d) < ~ ( b ) for ;1 > 0

(35)

It follows that as long as eq 33 applies, the expected value of effective bias is smaller than uncertainty of 16 estimate. On the other hand i t can be deduced from eq 23 that $6) is a strictly decreasing fbnction of d which is consistent with the stabilizing role of this parameter. In light of these properties, we propose, as a second criterion, to choose a maximum value of A for which the effective bias is still smaller than the uncertainty of the 16 estimate rms(d) < q(b)

(36)

In practice it is convenient to plot the relative effective bias defined by where u is a vector such as 16 or d . In this notation ~(16) represents the average uncertainty of bi parameters while rms(d) is a measure of the average effective bias. Let us look more closely a t the following fundamental property of the diagonal elements of the variance matrix associated with the vector u = (16 or d) m

where E[ul denotes the expected value of u. It follows from this relationship that for ui having large individual variances, we can write m i=l

which is equivalent to

-

-

for

dui)

-

for

a(u,)

~E[ui2I/traceEvar(u)1 1

rms(E[ul)/~(u) 1

-

00

(30)

00

(31)

This is the case when our problem is ill-posed and the values of b have large variances. Conversely we can say that if for vector 6,for some d, the relationship (30 or 31) holds, then we have unstable estimate of b which is completely useless as a solution. In practice, however, we do not have at our disposal the expected values E[61. Nevertheless, it is still reasonable to reject such an estimate 16 for which rms(16) is of the same order of magnitude as its measure of uncertainty q(16). This leads us to our first criterion of choice of d parameter, that is, only suchb can be accepted as a meaningful solution for which q(b) is considerably smaller than rms(16)

q(ii)