Analytical moments of skewed Gaussian distribution functions

P. F. Rusch and J. P. Lelieur
Laboratoire de Chimie-Physique, 13, rue de Toul, 59-Lille, France

Many experiments yield data in the form of distribution functions which are analyzed to obtain physical quantities of interest. Symmetric distribution functions are generally described as bell-shaped curves and are characterized by several parameters. A general distribution function is given by Equation 1

$$ y = f(Y, W, x_0, x) \tag{1} $$

where the parameters which describe the function are: $Y$, the maximum ordinate value which, by definition, occurs at abscissa value $x_0$, and $W$, the full width of the distribution function at half-maximum intensity. The parameter $W$ is referred to as the half-width. The most often encountered line shapes are the well-known Gaussian and Lorentzian distribution functions. Several other types of distribution functions have been defined and may be found in standard reference texts (1). Both Gaussian and Lorentzian distribution functions are symmetric with respect to $x_0$.

Some observed experimental data are not symmetric distribution functions but rather asymmetric or skewed. Skewness may often be the result of overlapping symmetric distribution functions. In this case, a resolution of the overlapping components is indicated. It may also be that a change of abscissa leads to equally meaningful data which are symmetric. There are, of course, cases where a skewed distribution function is predicted. It is these cases which are of concern here. A few examples are the electronic spectra of complex molecules (2) and chromatographic elution curves (3, 4).

There have been several attempts to find empirical functions which describe these skewed distributions. One common method is to use the quotient of two polynomials. Another method is that of describing the distribution by two different functions (5) with common $Y$ and $x_0$ but different $W$. Both of the above methods are successful, but the resulting functions lack certain properties of interest, such as the ability to calculate easily, or even define, the moments. Although it is possible to calculate the moments of a discrete distribution function (1), these moments are usually underestimated because of the lack of data extending to large values of $x$.
It is the purpose of this paper to show that for a skewed Gaussian distribution, all the moments are defined and can be expressed analytically.

Skewed Gaussian Distribution Function. In 1969, Fraser and Suzuki (6) proposed an empirical function which can be used to describe a skewed distribution. This function, given in Equation 2, contains an exponential factor and is termed a “skewed Gaussian.”

$$ y = Y \exp\left\{ -\ln 2 \left[ \frac{\ln\left( 1 + 2b(x - x_0)/\Delta x \right)}{b} \right]^2 \right\} \tag{2} $$

(1) “Handbook of Mathematical Functions,” M. Abramowitz and I. A. Stegun, Ed., U.S. Government Printing Office, Washington, D.C., 1965 (Nat. Bur. Stand. (U.S.) Appl. Math. Ser. 55), p 927.
(2) D. E. Siano and D. E. Metzler, J. Chem. Phys., 51, 1856 (1969).
(3) O. Grubner, Anal. Chem., 43, 1934 (1971).
(4) S. N. Chesler and S. P. Cram, Anal. Chem., 43, 1922 (1971).
(5) J. L. Dye, M. De Backer, and L. M. Dorfman, J. Chem. Phys., 52, 6251 (1970).
(6) R. D. B. Fraser and E. Suzuki, Anal. Chem., 41, 37 (1969).

In the skewed Gaussian, both $Y$ and $x_0$ are defined in the same way as for a symmetric Gaussian. The parameter $b$ is the asymmetry parameter, which is positive for skewing in the direction of $x > x_0$ and negative for skewing in the opposite direction. As $b$ approaches 0, the skewed Gaussian tends to a symmetric Gaussian (6). The parameter $\Delta x$ is related to the half-width by

$$ W = \Delta x \, \frac{\sinh b}{b} \tag{3} $$
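As a quick check on Equation 3, the following minimal sketch evaluates the skewed Gaussian at its two half-maximum abscissae and compares their separation with $\Delta x \sinh b / b$. It assumes the Fraser and Suzuki form of the function, $y = Y \exp\{-\ln 2\,[\ln(1 + 2b(x - x_0)/\Delta x)/b]^2\}$. The function names and the parameter values are illustrative and do not come from the paper.

```python
import math

def skewed_gaussian(x, Y, x0, delta_x, b):
    """Skewed Gaussian in the assumed Fraser-Suzuki form (b != 0)."""
    arg = 1.0 + 2.0 * b * (x - x0) / delta_x
    if arg <= 0.0:
        # The logarithm is undefined here; the intensity is taken as zero.
        return 0.0
    return Y * math.exp(-math.log(2.0) * (math.log(arg) / b) ** 2)

def half_max_points(x0, delta_x, b):
    """Abscissae where y = Y/2, i.e. where ln(arg)/b equals -1 and +1."""
    lower = x0 + delta_x * (math.exp(-b) - 1.0) / (2.0 * b)
    upper = x0 + delta_x * (math.exp(b) - 1.0) / (2.0 * b)
    return lower, upper

def half_width(delta_x, b):
    """Equation 3: W = delta_x * sinh(b) / b."""
    return delta_x * math.sinh(b) / b

# Illustrative parameter values (assumptions, not data from the paper).
Y, x0, delta_x, b = 3.0, 10.0, 2.0, 0.5
lower, upper = half_max_points(x0, delta_x, b)
print(skewed_gaussian(lower, Y, x0, delta_x, b),
      skewed_gaussian(upper, Y, x0, delta_x, b))   # both should equal Y/2
print(upper - lower, half_width(delta_x, b))        # both should equal W
```

The two half-maximum points are not equidistant from $x_0$ when $b \neq 0$, which is the asymmetry the parameter $b$ controls; only their separation is fixed by Equation 3.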

After initial estimation (6) of the parameters in Equation 2, the method of non-linear least squares (7) may be used to refine the values for a better fit to the data.

Moments. The $k$th moment ($\nu_k$) of a distribution function is defined (1) by Equation 4, where $f(q)$ is the distribution function.

$$ \nu_k = \int_{-\infty}^{+\infty} q^k f(q)\, dq \tag{4} $$

For a symmetric Gaussian function all the moments ($k \geq 0$) are defined (1) and may be calculated from Equation 4. We report here the moments for the skewed Gaussian distribution function. Moments are not defined for Lorentzian distribution functions (1) and thus the skewed Lorentzian is of little interest (6).

The integration in Equation 4 requires that the distribution function be continuous. The skewed Gaussian in Equation 2 is, in fact, not continuous from $-\infty$ to $+\infty$. For values of $x$ at or beyond the limit $x_L$ given by Equation 5, the natural logarithm becomes undefined.

$$ x_L = x_0 - \frac{\Delta x}{2b} \tag{5} $$

These values of $x_L$ always lie to the side of $x_0$ which is opposite the skewing (for $b > 0$, $x_L < x_0$, while for $b < 0$, $x_L > x_0$). At values of $x$ approaching $x_L$ from $x_0$, the intensity of the skewed Gaussian becomes negligibly small so that the discontinuity is of little consequence.

The zeroth moment ($k = 0$) in Equation 4 is simply the area under the distribution function. The area under the skewed Gaussian distribution function was reported by Fraser and Suzuki (6). The method of integrating Equation 2 is of general interest and will be given here in some detail. In the following development, it will be assumed that $b > 0$. The integration of Equation 2 is accomplished by changing the variable as shown in Equation 6.

$$ X = \frac{1}{b} \ln\left[ 1 + \frac{2b(x - x_0)}{\Delta x} \right] \tag{6} $$

With the change in variable, the lower limit of integration (for $b > 0$) changes from $x > x_L$ to $-\infty$. In the case of $b < 0$, the upper limit of integration would change from $x < x_L$ to $+\infty$. Thus, the change of variable permits the calculation of the area (and higher moments) under the definition given in Equation 4. Substitution of Equation 6 into Equation 2 yields the definition of the area given in Equation 7.

(7) W. E. Wentworth, J. Chem. Educ., 42, 96 (1965).
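The change of variable can be checked numerically before the integration is carried through. The sketch below assumes the Fraser and Suzuki form of the skewed Gaussian and the substitution $x - x_0 = \Delta x\,[\exp(bX) - 1]/(2b)$, so that $dx = (\Delta x/2)\exp(bX)\,dX$. It integrates the function once directly over $x$, starting at $x_L$, and once over $X$. The Simpson routine, the function names, the integration limits, and the parameter values are all illustrative choices rather than anything specified in the paper.

```python
import math

LN2 = math.log(2.0)

def simpson(f, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n subintervals (made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def area_in_x(Y, x0, delta_x, b, reach=100.0, n=40000):
    """Area by direct integration over x, from x_L upward (assumes b > 0)."""
    x_L = x0 - delta_x / (2.0 * b)

    def y(x):
        arg = 1.0 + 2.0 * b * (x - x0) / delta_x
        if arg <= 0.0:
            return 0.0  # at or beyond x_L the intensity is taken as zero
        return Y * math.exp(-LN2 * (math.log(arg) / b) ** 2)

    return simpson(y, x_L, x0 + reach, n)

def area_in_X(Y, delta_x, b, limit=12.0, n=4000):
    """Area after the substitution: Y * (delta_x / 2) * integral over X."""
    def integrand(X):
        return math.exp(-LN2 * X * X + b * X)

    return Y * delta_x / 2.0 * simpson(integrand, -limit, limit, n)

# Illustrative parameter values (assumptions, not data from the paper).
print(area_in_x(3.0, 10.0, 2.0, 0.5))
print(area_in_X(3.0, 2.0, 0.5))
```

The two printed values should agree closely. The integral over $X$ needs far fewer points because the integrand is a smooth Gaussian-like function on a symmetric range, whereas the integral over $x$ must resolve a one-sided cutoff and a long tail.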

ANALYTICAL CHEMISTRY, VOL. 45, NO. 8, JULY 1973

1541

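$$ \text{area} = \nu_0 = Y \frac{\Delta x}{2} \int_{-\infty}^{+\infty} \exp\left[ -\ln 2 \times X^2 + bX \right] dX \tag{7} $$

The problem now is clearly one of integrating the exponential polynomial on the right-hand side of Equation 7. The polynomial in $X$ can be written as shown in Equation 8.

$$ -\ln 2 \times X^2 + bX = -\ln 2 \left( X - \frac{b}{2 \ln 2} \right)^2 + \frac{b^2}{4 \ln 2} \tag{8} $$

By substitution of Equation 8 into Equation 7, the integral of interest can be written as shown in Equation 9.

$$ \int_{-\infty}^{+\infty} \exp\left[ -\ln 2 \times X^2 + bX \right] dX = \exp\left[ \frac{b^2}{4 \ln 2} \right] \int_{-\infty}^{+\infty} \exp\left[ -\ln 2 \left( X - \frac{b}{2 \ln 2} \right)^2 \right] dX \tag{9} $$

By changing to polar coordinates, the integral on the right-hand side of Equation 9 can be calculated. The value of the integral in Equation 7 is then given in Equation 10.

$$ \int_{-\infty}^{+\infty} \exp\left[ -\ln 2 \times X^2 + bX \right] dX = \exp\left[ \frac{b^2}{4 \ln 2} \right] \left( \frac{\pi}{\ln 2} \right)^{1/2} \tag{10} $$

This result is perfectly general in terms of $b$ being any constant, a fact which is important in calculation of the higher moments. The area of the skewed Gaussian is found by substitution of Equation 10 into Equation 7. The area given in Equation 11

$$ \text{area} = Y \frac{\Delta x}{2} \left( \frac{\pi}{\ln 2} \right)^{1/2} \exp\left[ \frac{b^2}{4 \ln 2} \right] \tag{11} $$

is identical with that reported by Fraser and Suzuki (6).

For many types of analysis, it is of interest to have the normalized moments. These normalized moments are calculated with a normalized function so that the zeroth moment is unity. The normalizing condition is given by Equation 12

$$ \int_{-\infty}^{+\infty} N f(b, Y, \Delta x, x_0, x)\, dx = 1 \tag{12} $$

where $N$ is the normalization constant. It is obvious that $N$ is simply the inverse of the area so that the normalized skewed Gaussian, $F$, is given by Equation 13, which is independent of $Y$.

$$ F(b, \Delta x, x_0, x) = \frac{2}{\Delta x} \left( \frac{\ln 2}{\pi} \right)^{1/2} \exp\left[ -\frac{b^2}{4 \ln 2} \right] \exp\left\{ -\ln 2 \left[ \frac{\ln\left( 1 + 2b(x - x_0)/\Delta x \right)}{b} \right]^2 \right\} \tag{13} $$

Table I. Normalized Moments (Relative to $x_R$) of the Skewed Gaussian Distribution Function, $\mu_k$ ($k > 1$)
[Table body not recoverable from the source; its footnote defines the binomial coefficient for $0 \leq n \leq m$.]

The normalized moments are then defined by Equation 14

$$ \bar{\nu}_k = \int_{-\infty}^{+\infty} x^k F(b, \Delta x, x_0, x)\, dx \tag{14} $$

where the distribution function is normalized so that $\bar{\nu}_0 = 1$.

The first moment ($\bar{\nu}_1$) is the average value of $x$ ($\langle x \rangle$) and for a symmetric Gaussian distribution function is equal to $x_0$. For the skewed Gaussian $\langle x \rangle \neq x_0$ and is defined by Equation 15.

$$ \langle x \rangle = \bar{\nu}_1 = \int_{-\infty}^{+\infty} x\, F(b, \Delta x, x_0, x)\, dx \tag{15} $$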

To perform the integration in Equation 15, it is necessary to change the variable according to Equation 6. The general integral of Equation 10 is used, recalling that the result is general for any constant $b$. Substitution into Equation 15 yields Equation 16

$$ \langle x \rangle = x_0 + \frac{\Delta x}{2b} \left\{ \exp\left[ \frac{3b^2}{4 \ln 2} \right] - 1 \right\} \tag{16} $$

which reduces to the value for a symmetric Gaussian if $b = 0$. Higher moments may be calculated in a similar manner, yielding the $\bar{\nu}_k$ for $k > 1$.

The purpose of calculating the moments of a distribution function is to obtain information about fundamental parameters of the system under investigation. Physical interpretation of the higher moments (3, 8) ($k \geq 2$) may be more convenient in terms of the moments relative to some reference value. These moments are defined by Equation 17.

$$ \mu_k = \int_{-\infty}^{+\infty} (x - x_R)^k F(x)\, dx \tag{17} $$
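The first moment can be checked numerically in the same spirit. The sketch below assumes the closed form $\langle x \rangle = x_0 + (\Delta x / 2b)\,[\exp(3b^2 / 4\ln 2) - 1]$, which follows from applying the general Gaussian integral twice (once with $b$ replaced by $2b$), and compares it with a direct quadrature in the transformed variable $X$. The function names, the Simpson routine, and the parameter values are illustrative assumptions.

```python
import math

LN2 = math.log(2.0)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with n subintervals (made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def mean_closed_form(x0, delta_x, b):
    """<x> = x0 + (delta_x / 2b) * [exp(3 b^2 / (4 ln 2)) - 1], for b != 0."""
    return x0 + delta_x / (2.0 * b) * (math.exp(3.0 * b * b / (4.0 * LN2)) - 1.0)

def mean_numeric(x0, delta_x, b, limit=12.0):
    """<x> as a ratio of two integrals over X; Y and delta_x / 2 cancel."""
    def weight(X):
        return math.exp(-LN2 * X * X + b * X)

    def x_of_X(X):
        return x0 + delta_x * (math.exp(b * X) - 1.0) / (2.0 * b)

    numerator = simpson(lambda X: x_of_X(X) * weight(X), -limit, limit)
    denominator = simpson(weight, -limit, limit)
    return numerator / denominator

# Illustrative parameter values (assumptions, not data from the paper).
for b in (0.5, -0.3, 1e-4):
    print(b, mean_closed_form(10.0, 2.0, b), mean_numeric(10.0, 2.0, b))
```

For positive $b$ the mean lies above $x_0$, for negative $b$ below it, and as $b$ approaches 0 the shift vanishes, consistent with the symmetric Gaussian limit.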

The reference value ($x_R$) may be any convenient, unique abscissa value but usually is taken to be $x_0$ for a symmetric distribution function. For the skewed Gaussian, either $x_0$ or $\langle x \rangle$ may be used. To calculate the $\mu_k$ it is necessary to use Equation 13 and change the variable according to Equation 6. The change of variable leads to the following expressions for $(x - x_R)^k$. For $x_R = x_0$:

$$ (x - x_0)^k = \left( \frac{\Delta x}{2b} \right)^k \left[ \exp(bX) - 1 \right]^k \tag{18} $$

and for $x_R = \langle x \rangle$:

$$ (x - \langle x \rangle)^k = \left( \frac{\Delta x}{2b} \right)^k \left[ \exp(bX) - \exp\left( \frac{3b^2}{4 \ln 2} \right) \right]^k \tag{19} $$

The first term on the right-hand side of Equations 18 and 19 is a constant. The second term may be expanded to the $k$th power by use of the binomial theorem (9). Final evaluation of Equation 17 depends on evaluating the integral of a sum of $k + 1$ terms. This problem reduces to the sum of $k + 1$ integrals of the type in Equation 10. Once again it is recalled that Equation 10 is a general result for any constant $b$. The result of these calculations is given in Table I. Results are given with sums from 0 to $k$ and from 1 to $k + 1$. This latter result is more amenable to rapid calculation in computer programs.

Units. The asymmetry parameter ($b$) in Equation 2 is a dimensionless constant. Both $x_0$ and $\Delta x$ have the units of the abscissa of the experimental data as, of course, does $x$. Depending on the type of experimental data being considered, the ordinate parameter $Y$ may or may not be dimensioned. Thus, the value of the skewed Gaussian is always in ordinate units while the value of $W$ (Equation 3) is in abscissa units. The area in Equation 11 has units of ordinate times abscissa. All the moments ($\bar{\nu}_k$, $\mu_k$) have values in abscissa units to the $k$th power.

(8) M. Lax, J. Chem. Phys., 20, 1752 (1952).
(9) "Handbook of Chemistry and Physics," The Chemical Rubber Publishing Co., Cleveland, Ohio, 1961, p 334.
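The binomial expansion described for the relative moments can be sketched as follows. The closed form used here is a re-derivation under stated assumptions, not a transcription of Table I: expanding $[\exp(bX) - c]^k$ with the binomial theorem and applying the general Gaussian integral term by term gives $\mu_k = (\Delta x / 2b)^k \sum_{n=0}^{k} \binom{k}{n} (-c)^{k-n} \exp[n(n+2)\,b^2 / 4\ln 2]$, with $c = 1$ for $x_R = x_0$ and $c = \exp(3b^2 / 4\ln 2)$ for $x_R = \langle x \rangle$. The function names and parameter values are hypothetical, and the quadrature is included only as a cross-check.

```python
import math

LN2 = math.log(2.0)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with n subintervals (made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def relative_moment(k, delta_x, b, about_mean=False):
    """mu_k from the binomial sum (re-derived here, sum from n = 0 to k)."""
    q = b * b / (4.0 * LN2)
    c = math.exp(3.0 * q) if about_mean else 1.0
    total = 0.0
    for n in range(k + 1):
        total += math.comb(k, n) * (-c) ** (k - n) * math.exp(n * (n + 2) * q)
    return (delta_x / (2.0 * b)) ** k * total

def relative_moment_numeric(k, delta_x, b, about_mean=False, limit=12.0):
    """mu_k by quadrature over X, normalized by the zeroth integral."""
    c = math.exp(3.0 * b * b / (4.0 * LN2)) if about_mean else 1.0

    def weight(X):
        return math.exp(-LN2 * X * X + b * X)

    def integrand(X):
        return (delta_x / (2.0 * b) * (math.exp(b * X) - c)) ** k * weight(X)

    return simpson(integrand, -limit, limit) / simpson(weight, -limit, limit)

# Illustrative parameter values (assumptions, not data from the paper).
for k in (2, 3, 4):
    print(k, relative_moment(k, 2.0, 0.5, about_mean=True),
          relative_moment_numeric(k, 2.0, 0.5, about_mean=True))
```

One design note: the alternating sum loses precision through cancellation when $|b|$ is very small, so for nearly symmetric peaks the symmetric Gaussian values (for example a variance of $\Delta x^2 / 8 \ln 2$) may be the safer choice.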

CONCLUSION

Determination of the four parameters ($Y$, $\Delta x$, $x_0$, $b$) of the skewed Gaussian distribution function (6) provides a complete description of an asymmetric peak. Not only can the shape of the peak be reproduced but the moments of the distribution can be calculated. Use of the non-linear least squares method (7) to obtain the four parameters permits the calculation of the errors of the "final values." Errors of the calculated moments of the distribution may then also be estimated. This is in marked contrast to the numerical calculation of moments of the distribution from discrete experimental values, where no such estimates of the errors are possible although they can be quite large.

Received for review July 3, 1972. Accepted January 22, 1973. One of us (PFR) would like to acknowledge the receipt of a NSF-CNRS fellowship which permitted him to study with the research group of the Laboratoire de Chimie-Physique.

Electrophoretic Desalting of Acid Hydrolysates for the Isolation of Amino Acids

Mario R. Stevens¹
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, Calif.

The successful application of the quantitative procedures for the analysis of amino acids, as developed by Gehrke and Stalling (1), Roach and Gehrke (2), and Gehrke, Kuo, and Zumwalt (3), is predicated upon the elimination of any inorganic salts present in the amino acid-sample matrix. Each of the above cited procedures makes use of gas-liquid chromatography for the separation and identification of the individual amino acids. Since the free amino acid is not volatile enough to permit a direct gas chromatography analysis, one must first convert the amino acid into a volatile derivative. Several such derivatives have been prepared, the most prominent being the N-TFA-n-butyl esters (1, 4), N-TFA-2-butyl esters (5), the N-acetyl-n-butyl esters (6), and the N-trimethylsilyl-0-1-butyl esters (7).

Generally, the amino acids of interest are those associated with soils, sedimentary rocks, and biological materials. In most procedures, the initial sample processing step involves acid hydrolysis using 6N HCl. High salt concentrations are characteristic of such hydrolysates. These salts interfere with the preparation of the required derivatives and/or with the subsequent gas chromatography of the derivatives. Gehrke and Leimer (8) made a study of the effects of salts upon the derivatization and chromatography of amino acids. They found that iron(III) salts interfere with the derivative-chromatographic process. Iron(III) salts are most common to geological samples. They demonstrated interference at 1:1 concentration ratios of amino acids to salts whereas, in geological samples, the ratio is more likely to be 1:10³ or 1:10⁴.

Ion exchange and electrolytic desalting techniques are commonly employed. Both are very effective. In particular cases, adsorption dialysis, solvent extraction, and ultrafiltration are also applicable, as shown by Smith (9). Both ion exchange and electrolytic methods may prove to be inadequate when trace quantities of amino acids are present. The former requires controlled elution techniques and subsequent concentrating steps. Losses may take place in each step. This method may prove cumbersome for automated, remote control applications. The latter method will lead to losses with respect to acidic and basic amino acids. These will behave as anions and cations, respectively, and will be removed from the solution under the electrolysis conditions.

¹ Present address, Towne-Paulsen & Co. Inc., Monrovia, Calif.

(1) C. W. Gehrke and D. L. Stalling, Separ. Sci., 2, 101 (1967).
(2) D. Roach and C. W. Gehrke, J. Chromatogr., 44, 269 (1969).
(3) C. W. Gehrke, K. Kuo, and R. W. Zumwalt, J. Chromatogr., 57, 209 (1971).
(4) E. Gelpi, W. A. Koenig, J. Gilbert, and J. Oro, J. Chromatogr. Sci., 7, 604 (1969).
(5) G. E. Pollock, A. K. Mijamoto, and V. I. Oyama, "Life Science and Space Research," Vol. VII, North Holland Publishing Co., 1970, pp 99-107.
(6) S. C. J. Fu and D. S. H. Mak, J. Chromatogr., 54, 205 (1972).
(7) J. P. Hardy and S. L. Kerrin, Anal. Chem., 44, 1497 (1972).


(8) C. W. Gehrke and K. Leimer, J. Chromatogr., 53, 195 (1970).
(9) "Chromatographic and Electrophoretic Techniques," Ivor Smith, Ed., Vol. 1, Chapter 3, Interscience Publishers Inc., New York, N.Y., 1960.