
Factor Analysis as a Complement to Band Resolution Techniques. I. The Method and its Application to Self-Association of Acetic Acid

J. T. Bulmer and H. F. Shurvell*

Department of Chemistry, Queen's University, Kingston, Ontario, Canada (Received July 28, 1972)

Band resolution of digitized infrared spectra of the carbonyl region of acetic acid-CCl4 solutions yields direct evidence for the existence of an equilibrium competing with the monomer-cyclic dimer equilibrium traditionally assumed to occur in carboxylic acid solutions. As the total acid concentration increases, four instead of two Lorentzian bands are required to obtain a satisfactory fit of the spectra. The positions and intensity trends of these additional bands as a function of concentration are consistent with the formation of a chain structure. The mathematical technique of factor analysis has been modified into a form applicable to digitized infrared spectra in order to act as a complement to the band resolution method. Factor analysis confirms the presence of four absorbing components over the concentration range 1.72 × 10⁻⁴ to 4.31 × 10⁻² M without the band shape assumption inherent in the curve-fitting procedure.

Introduction

A great deal of work has been published on association equilibria in carboxylic acid systems. However, in general there has been poor agreement between results obtained from different experimental methods and from different methods of analysis of data obtained from a single technique.1,2 To illustrate the extent of the problem, consider the data summarized in Tables I and II for acetic acid and its chlorinated derivatives. Table I (refs 3-7) lists results for the determination of monomer-cyclic dimer equilibrium constants in carbon tetrachloride solution using infrared absorption measurements, while Table II (refs 1, 2, 8-12) compares results for benzene solutions using three different experimental techniques. There is clearly no agreement among the results.

TABLE I: Comparison of Dimerization Equilibrium Constants (M⁻¹) for Acetic Acid and its Chlorinated Derivatives in CCl4 Solution

Column headings: Acid; Ref; T, °C; Concn, M. Acids: CH3COOH, CH2ClCOOH, CHCl2COOH, CCl3COOH.
Entries as extracted (cell positions not recoverable): 0.07; 0.0016; 0.2; 4000, 1610, 1070; 2000; 2370; 530; 270; 1000-2650; 877, 280; 410.

TABLE II: Comparison of Dimerization Equilibrium Constants (M⁻¹) for Acetic Acid and its Chlorinated Derivatives in Benzene

Column headings: Acid; Ref; T, °C; Concn, M. Methods: dielectric constant, differential vapor pressure, infrared. Acids: CH3COOH, CH2ClCOOH, CHCl2COOH, CCl3COOH.
Entries as extracted (cell positions not recoverable): CH3COOH 380; CH2ClCOOH 102; 0.01, 1.6; 10; 226/131; 24; 1.0; 500; 28, 24; 48.4; 11, 1.1; 27.1, 6.7; 50, 5.

Header entries as extracted, not assignable to Table I or Table II: 6; 9; 30; 25; 3; 4; 6; 7; N.R.; 25; 0.05; 0.001; 5; 25; 25; 24; 20/30; 26; N.R.; N.R.; 11; 1; 12; 20; 25; 25; 0.5; 2.

There are several reasons for this lack of agreement. The usual assumption made in studies of hydrogen bonding in carboxylic acids is that the only equilibrium occurring is between monomer and cyclic dimer, independent of the total acid concentration. However, there is considerable evidence that this is not the only equilibrium occurring.6,13-16 Chiang and Hammaker13 studied trimethylacetic acid in CCl4 and noted a concentration dependence of the cyclic dimerization equilibrium constant. They concluded that cyclic dimerization is the major association process operating at acid concentrations below 4 × 10^[?] M, but suggested that chain structure formation also occurs to a significant extent, especially as the concentration increases above 4 × 10^[?] M. Similarly Barrow and Yerger,6 studying acetic acid in CCl4, found that the apparent dimerization equilibrium constant tends to decrease with rising acid concentration, as would be expected if a competitive equilibrium were setting in. The existence of other hydrogen-bonded species such as the chain structure invalidates the previous mathematical approaches, which depend on the mass balance equation resulting from the assumption of only cyclic dimer formation.1-3,5,6,15 As indicated in Tables I and II, for solutions with concentrations as high as 1.0 M a monomer-cyclic dimer equilibrium only has been assumed, whereas there is now some indirect evidence indicating the existence of a competitive equilibrium even at acid concentrations as low as 10^-[?] M.

Infrared spectroscopy was selected as the experimental technique for a renewed investigation of the hydrogen bonding of carboxylic acids in view of the advantages documented by Pimentel and McClellan.17 In particular it was decided to pursue the computer techniques applicable to digitized spectra, since their use encourages rationalization of the entire band shape rather than mere observation of the positions and heights of the major spectral features. In view of the large molar absorptivity associated with the carbonyl stretching vibration, as well as the dependence of the frequency and intensity of this vibration on hydrogen bonding, the carbonyl region was selected for band resolution studies. This approach may be considered as an application of very sensitive data analysis to a normal mode which is itself sensitive to the nature and extent of the hydrogen bonding. Allowance for overlap is also permitted in the band resolution process.

(1) J. Steigman and W. Cronkright, Spectrochim. Acta, Part A, 26, 1805 (1970).
(2) D. P. N. Satchell and J. L. Wardell, Trans. Faraday Soc., 61, 1199 (1965).
(3) J. T. Harris and M. E. Hobbs, J. Amer. Chem. Soc., 76, 1419 (1954).
(4) A. M. Melnick, et al., Spectrochim. Acta, 20, 285 (1964).
(5) J. Wenograd and R. A. Spurr, J. Amer. Chem. Soc., 79, 5844 (1957).
(6) G. M. Barrow and E. A. Yerger, J. Amer. Chem. Soc., 76, 5248 (1954).
(7) R. E. Kagarise, Naval Research Laboratory Report No. 4955 (1957).
(8) H. A. Pohl, M. E. Hobbs, and P. M. Gross, J. Chem. Phys., 9, 408 (1941).
(9) R. J. W. LeFevre and H. Vine, J. Chem. Soc., 1795 (1938).
(10) (a) E. A. Moelwyn-Hughes, J. Chem. Soc., 850 (1940); (b) C. P. Smith and H. E. Rogers, J. Amer. Chem. Soc., 52, 1624 (1930).
(11) Y. Nagai and O. Simamura, Bull. Chem. Soc. Jap., 35, 132 (1962).
(12) S. Bruckenstein and A. Saito, J. Amer. Chem. Soc., 87, 698 (1965).
(13) C. Chiang and R. M. Hammaker, J. Phys. Chem., 69, 2715 (1965).
(14) E. S. Hanrahan and B. D. Bruce, Spectrochim. Acta, Part A, 23, 2497 (1967).
(15) S. Chang and H. Morawetz, J. Phys. Chem., 60, 782 (1956).
(16) L. J. Bellamy, R. F. Lake, and R. J. Pace, Spectrochim. Acta, 19, 443 (1963).

The Journal of Physical Chemistry, Vol. 77, No. 2, 1973
The simplest theoretical treatment of the shape of infrared bands predicts a Lorentzian band contour.18-21 Deviations from Lorentzian contours in the form of a Gaussian perturbation have been interpreted in terms of specific solute-solvent interactions, isotopic species, conformational isomers, or rotational freedom in the solvent cavity.22 Since the Gaussian perturbation is expected to be small for dilute solutions in nonpolar solvents such as CCl4, Lorentzian band shapes were used for identifying the number of absorbing components. The programs of Pitha and Jones23 were used, in which the experimental band envelope is approximated by the Lorentzian function

where the parameters for the $P$th band, $x_1(P)$, $x_2(P)$, and $x_3(P)$, are estimates of $\log(T/T_0)$ at $\nu(0)$, of $\nu(0)$, and of $2/\Delta\nu_{1/2}$, respectively; a is the baseline displacement in absorbance; and NC is the number of absorbing components. Whether this Lorentzian function or a Gaussian perturbation in terms of a sum or a product is used, these band resolution programs require the number of absorbing components as an input parameter. Correct knowledge of this parameter is a prerequisite to any band shape analysis or assignment of physical interpretation to the spectrum. One procedure is to choose a probable band shape and sequentially increase the number of absorbing components until the value of the minimization function $\Phi$, given by


$$\Phi = \sum_i f_i^2 \qquad (2)$$

approaches the optimum fit value of $S^2 \cdot NP$, where NP is the number of points in the digitized spectrum and S is the standard deviation in the transmittance measurements. Fluctuation of the wavenumber of maximum discrepancy between the calculated and experimental spectra (WFM) from iteration to iteration in the convergence process indicates that the correct number of absorbing components is being used, whereas an invariant WFM indicates the necessity of an additional absorbing component. In view of the importance of the parameter NC and the dependence of the above approach on an assumed band shape and initial estimates of the parameters, an independent method for the determination of NC was sought.

Factor analysis provides an independent determination of the number of absorbing components (NC) for a series of spectra. It is based on standard theorems of linear algebra rather than on any assumed band shape. In general the method of factor analysis is applicable when a set of constitutive properties is measured for each of a series of mixtures. The absorbance values normalized to unit path length exemplify such measurements, as indicated by the Lambert-Beer law, which in matrix notation is

$$A = EC \qquad (3)$$

in which A is the NW × NS normalized absorbance matrix, E is the NW × NC molar absorptivity matrix, and C is the NC × NS concentration matrix, where NW is the number of wavenumbers at which digitization is performed and NS is the number of solutions studied. Under the experimental arrangement where NW > NC and NS > NC, it can be shown that the rank of the matrix A is equal to NC,24-30 provided that the spectrum of no individual component is a linear combination of the spectra of the other components and that the concentration of no species can be expressed as a linear combination of the concentrations of the other species for all the experiments performed. The problem now is to determine the rank of the absorbance matrix. Previous spectrophotometric approaches24-26 have formulated the matrix $P = AA^T$, where $A^T$ is the transpose of A. The matrix P has the advantages of being positive semidefinite and of having the same rank as A, namely NC. Moreover, such a matrix may be diagonalized to give nonnegative eigenvalues. In the absence of experimental or computational errors the rank of P, and hence the rank of A, is the number of nonzero eigenvalues of P. In practice, however, these errors are always present and statistical criteria must be used to determine which eigenvalues are nonzero.

The matrix P has dimension NW × NW, which can be very large for the detailed spectrum digitization required for curve fitting. However, from the properties of characteristic polynomials and the Cayley-Hamilton theorem it follows that if C and D are matrices of dimensions (m × n) and (n × m), respectively, then CD and DC possess the same set of nonzero eigenvalues. Therefore $P = AA^T$ and $Q = A^TA$ have the same nonzero eigenvalues. This simplifies the problem of obtaining NC enormously, because the dimension of Q will be relatively small (NS × NS). The number of different concentrations used (NS) must of course be considerably greater than the possible number of absorbing components (NC) to be determined.

Having diagonalized the matrix Q, the problem remains of deciding which eigenvalues are in fact nonzero. The following three statistical methods have been used.

1. Residual Standard Deviation Method. The residual standard deviation $S_k$ is defined by24,25

where $r_i$ are the eigenvalues of Q arranged in descending order. If $S_k > S_a$, then NC > k, where $S_a$ is the standard deviation in the absorbance measurements, either estimated or calculated using the above procedure on a system with a known number of components. Although the latter procedure is more rigorous, the use of an estimated $S_a$ should be permissible, since the accuracy of $S_a$ need only be sufficient to distinguish the integer value of k which may properly be taken as NC. For the estimation of $S_a$ it must be recalled that infrared spectra are normally recorded linearly in transmittance, with a constant error in the transmittance rather than in the absorbance. In general eq 5, which relates an increment in the transmittance $\Delta T$ to the corresponding increment in the absorbance $\Delta A$, may be used to calculate the absorbance error from the transmittance error for any given value of T. For the residual standard deviation method $S_a$ may be estimated from the estimated standard deviation in the transmittance and a typical transmittance. However, this approach may be faulted on the grounds that it does not allow for the continuous variation of the transmittance as the bands are scanned, but rather requires the simplifying assumption of a typical value for the transmittance. The next two statistical criteria permit use of eq 5 for estimating the absorbance error for each digitized data point.

$$\Delta A = -\log_{10} e \, \frac{\Delta T}{T} \qquad (5)$$

2. Estimation of the Square Root of the Variance of Each Eigenvalue. Hugus and El-Awady26 have derived an expression for the square root of the variance in the $l$th eigenvalue of P. In the case of $Q = A^TA$ the square root of the variance of the $l$th eigenvalue, denoted $\sigma_{r_l}$, is given by

where $S_{jl}(Q)$ and $S_{kl}(Q)$ are the $j$th and $k$th components of the $l$th eigenvector of Q and

In these equations the standard deviation in the absorbance, $\sigma_A$, is determined using eq 5. Thus this statistical analysis permits inclusion of the variation of the absorbance error as bands of the spectrum are scanned. The number of absorbing components is taken as the number of eigenvalues that are each larger than the square root of their estimated variance.

3. Estimation of $\chi^2$. In order to establish whether m components are sufficient to fit the experimental data, Hugus and El-Awady26 have also derived an expression for $\chi_m^2$ using the first m eigenvectors of P to regenerate the data space. For $Q = A^TA$ the formulation is somewhat different. If S(Q) is the matrix of eigenvectors of Q then

$$S^T(Q)\,Q\,S(Q) = S^T(Q)\,A^TA\,S(Q) = D$$

where D is a diagonal matrix of eigenvalues and $S^T(Q)S(Q) = I$ (the identity matrix). If $M = AS(Q)$ then

$$M\,S^T(Q) = A\,S(Q)\,S^T(Q) = A$$

These equations are exact if all NS eigenvectors of Q are used, whereas use of the first m eigenvectors corresponding to the largest m eigenvalues yields an approximation to A. Then $\chi_m^2$ is given by

where $V_{ij} = \sum_{l=1}^{m} M_{il}\,S^T_{lj}(Q)$ (i.e., V is the approximation to A using only the first m eigenvectors) and where $M_{il} = \sum_{k} A_{ik}\,S_{kl}(Q)$ (by definition). When m = NC, $\chi_m^2$ should have a value near the expectation value (NW - m)(NS - m).31 Consideration of the quantity in brackets in eq 8 indicates that a distribution of the misfit between the observed absorbance matrix and the absorbance matrix estimated using the first m eigenvectors may be obtained as a function of the standard deviation in the absorbance measurements. Inspection of this distribution provides an additional criterion for determining NC.

For simplicity the above discussion has neglected the influence of path length by simply considering the normalized absorbance matrix, in which the absorbance readings for each solution were divided by the path length of the cell. However, if concentrations are varied over several orders of magnitude the path lengths must be varied appropriately. Division of the absorbances by the path length then leads to an unfortunate weighting in which intensity measurements for the more concentrated solutions are weighted more heavily than those for dilute solutions. Since factor analysis essentially consists of removing the linear combinations of rows of a matrix to obtain the number of independent components which span the data space, multiplication of a row of the absorbance matrix by a constant (e.g., the reciprocal of the path length) is irrelevant from an algebraic point of view. Neglecting the path length normalization does not change the rank of the absorbance matrix, but has the computational advantage of not introducing large weighting factors. Also, omission of the path length in the factor analysis eliminates the path length determination as a source of error.

(17) G. C. Pimentel and A. L. McClellan, "The Hydrogen Bond," W. H. Freeman, San Francisco, Calif., 1960.
(18) K. S. Seshadri and R. N. Jones, Spectrochim. Acta, 19, 1013 (1963).
(19) S. Abramowitz and R. P. Bauman, J. Chem. Phys., 39, 2757 (1963).
(20) J. T. Shimozawa and K. M. Wilson, Spectrochim. Acta, 22, 1591 (1966).
(21) R. N. Jones, et al., Can. J. Chem., 41, 750 (1963).
(22) J. G. David and H. E. Hallam, J. Mol. Struct., 6, 31 (1970).
(23) R. N. Jones, et al., National Research Council Bulletin No. 11 and 12 (1968); No. 13 (1969).
(24) J. J. Kankare, Anal. Chem., 42, 1322 (1970).
(25) G. Wernimont, Anal. Chem., 39, 554 (1967).
(26) Z. Z. Hugus and A. A. El-Awady, J. Phys. Chem., 75, 2954 (1971).
(27) S. Ainsworth, J. Phys. Chem., 65, 1968 (1961); 67, 1613 (1963).
(28) R. M. Wallace, J. Phys. Chem., 64, 899 (1960).
(29) G. Weber, Nature (London), 190, 27 (1961).
(30) D. Katakis, Anal. Chem., 37, 876 (1965).
(31) Z. Z. Hugus, private communication.
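The rank determination described above can be illustrated numerically. The following Python sketch is an illustration under stated assumptions, not the authors' program: the three Lorentzian component spectra, the concentrations, the noise level `sigma_a`, and the tolerance of twice `sigma_a` are all hypothetical, and `residual_std` uses a commonly quoted form of the residual standard deviation (sum of the discarded eigenvalues divided by NW(NS - k), then square-rooted) as a stand-in, since the paper's own definition is not reproduced in the text. It builds A = EC with added noise, diagonalizes the small NS × NS matrix Q, and checks that Q and P share their nonzero eigenvalues.

```python
import numpy as np

def lorentzian(nu, height, center, half_width):
    """Lorentzian band with peak `height` at `center`; `half_width` is the half-width at half-height."""
    return height / (1.0 + ((nu - center) / half_width) ** 2)

def residual_std(eigvals, n_rows, k):
    """Residual standard deviation after keeping the k largest eigenvalues (assumed form)."""
    n_cols = len(eigvals)
    return float(np.sqrt(eigvals[k:].sum() / (n_rows * (n_cols - k))))

def estimate_nc(eigvals, n_rows, s_a):
    """Smallest k whose residual standard deviation does not exceed s_a."""
    for k in range(len(eigvals)):
        if residual_std(eigvals, n_rows, k) <= s_a:
            return k
    return len(eigvals)

rng = np.random.default_rng(0)
nu = np.arange(1670.0, 1851.0)            # NW = 181 wavenumbers at 1 cm^-1 spacing

# Hypothetical molar absorptivity spectra of NC = 3 components (columns of E).
E = np.column_stack([
    lorentzian(nu, 500.0, 1767.0, 6.0),
    lorentzian(nu, 800.0, 1715.0, 8.0),
    lorentzian(nu, 300.0, 1735.0, 10.0),
])
n_solutions = 9                            # NS
C = rng.uniform(1e-4, 1e-3, size=(3, n_solutions))   # hypothetical concentrations, M
sigma_a = 1e-4                             # assumed absorbance noise

# Absorbance matrix A = EC (NW x NS) plus measurement noise.
A = E @ C + rng.normal(0.0, sigma_a, size=(nu.size, n_solutions))

# Diagonalize the small matrix Q = A^T A instead of the large P = A A^T.
Q = A.T @ A
r = np.clip(np.sort(np.linalg.eigvalsh(Q))[::-1], 0.0, None)   # descending order

# The NS largest eigenvalues of P coincide with the eigenvalues of Q.
P = A @ A.T
p_top = np.sort(np.linalg.eigvalsh(P))[::-1][:n_solutions]

n_components = estimate_nc(r, nu.size, 2.0 * sigma_a)
print("eigenvalues of Q:", r)
print("estimated number of components:", n_components)
```

Working with Q keeps the eigenvalue problem at 9 × 9 rather than 181 × 181 in this example, which is the computational point made in the text.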

Experimental Section

Carbon tetrachloride (Fisher Spectranalyzed) was dried over molecular sieve 4A. Dryness was checked by monitoring the 3709-cm⁻¹ νa(H2O) and 3618-cm⁻¹ νs(H2O) water bands.32 Fresh Fisher reagent grade glacial acetic acid was used without further purification. Stock solutions were prepared by weight, with subsequent dilutions being performed volumetrically. In order to determine the effect of water on the acetic acid-CCl4 system, solutions employing CCl4 saturated with respect to water were used and compared with solutions where drying and manipulations were performed in a drybox.

Infrared spectra were recorded from 1850 to 1670 cm⁻¹ on a Perkin-Elmer 180 spectrophotometer using a resolution of 2.0 cm⁻¹ at a scan rate of 4 cm⁻¹/min. An R.I.I.C. variable path length cell with KBr windows was used. This cell was calibrated by the interference fringe method. A similar cell was placed in the reference beam to compensate for the slight CCl4 absorption and reflections at the window interfaces. The temperature of the sample was monitored using a chromel-alumel thermocouple system. A Honeywell Electronik 194 recorder was used to monitor the thermocouple emf as a function of time in order to ensure thermal equilibrium for the sample. The thermocouple system was calibrated against thermometers built and calibrated to A.S.T.M. specifications E1 (ASTM No. 64C, 65C, 66C) using improved thermocouple reference tables.33 At least two spectra were recorded at each concentration, with thermal equilibrium being ensured by the reproducibility of the replicate spectra and by a constant thermocouple reading. Nine solutions were studied in the concentration range 1.72 × 10⁻⁴ to 4.31 × 10⁻² M and the sample temperature during recording of the spectra was 33.8 ± 0.5°.
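The interference fringe calibration mentioned above is not spelled out in the text. A minimal Python sketch follows, assuming the standard relation for an empty cell, b = N / [2n(ν1 - ν2)], where N is the number of fringes counted between wavenumbers ν1 and ν2 (in cm⁻¹) and n is the refractive index of the medium in the cell (1.0 for air). The function name and the fringe count are hypothetical; only the 1850 to 1670 cm⁻¹ limits are taken from the recorded range.

```python
def path_length_from_fringes(n_fringes, nu_high, nu_low, refractive_index=1.0):
    """Return the cell path length in cm from a fringe count between two wavenumbers (cm^-1)."""
    if nu_high <= nu_low:
        raise ValueError("nu_high must be greater than nu_low")
    return n_fringes / (2.0 * refractive_index * (nu_high - nu_low))

# Hypothetical reading: 10 fringes counted between 1850 and 1670 cm^-1 in the empty cell.
b = path_length_from_fringes(10, 1850.0, 1670.0)
print(f"path length = {b:.4f} cm")
```

Because the fringe count enters linearly, a miscount of one fringe in ten changes the computed path length by 10%, so a wide wavenumber interval with many fringes is preferable.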
The spectra were digitized by the manual technique described in a previous publication.34 Replicate spectra were averaged and checked for reproducibility by a small computer program. The programs of Jones, et al.,23 were used for the subsequent band resolution. Programs were written to treat the NS × NS factor analysis approach described above. An IBM 360/50 computer was used for the calculations.

The influence of water on the acetic acid-CCl4 system was studied. Figures 1 and 2 show the spectra in the carbonyl region of 5 × 10⁻⁴ and 1.4 × 10⁻² M acetic acid-CCl4 solutions saturated with water. The water apparently hydrogen bonds to the acetic acid, with a band appearing in the carbonyl region at 1745 cm⁻¹. However, the intensity of this 1745-cm⁻¹ band suggests that the hydration is not very competitive compared to the self-association of the acetic acid. The appearance of this 1745-cm⁻¹ band is similar to the observation of De Villepin, et al.,32 who found that methanol forms a complex with carboxylic acids with an absorption band appearing between the monomer and cyclic dimer bands. Similarly Haurie and Novak35 found that the formation of complexes of proton acceptors with acetic acid yields a band in the neighborhood of 1745 cm⁻¹. From water solubility studies Snead36 has recently concluded that no detectable hydration of formic acid in dilute moist CCl4 oc