Anal. Chem. 1997, 69, 1477-1484
Resolution of Complex Liquid Chromatography-Fourier Transform Infrared Spectroscopy Data
F. Cuesta Sánchez,† B. G. M. Vandeginste,‡ T. M. Hancewicz,§ and D. L. Massart*,†
ChemoAC, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium, Unilever Research Laboratorium Vlaardingen, P.O. Box 114, 3130 AC Vlaardingen, The Netherlands, and Unilever Research US, 45 River Road, Edgewater, New Jersey 07020
The analysis of a reaction product by high-performance liquid chromatography coupled with Fourier transform infrared spectroscopy (LC-FT-IR) produces a series of overlapping peaks. The applicability of the orthogonal projection approach, the fixed size window evolving factor analysis approach, and the evolving factor analysis approach for the resolution of those overlapping peaks into individual chromatograms and IR spectra is discussed. The results are evaluated by identifying the different compounds present using the spectral characteristics of the resolved pure compound spectra and the chemistry of the system.

The hyphenation of liquid chromatography (LC) with Fourier transform infrared spectrometry (FT-IR) constitutes an important advance for the identification of unknown compounds in complex mixtures. In this technique, the separation capability of chromatography is coupled with the identification ability of infrared spectrometry. The characteristics and different interfaces of this hyphenated technique (LC-FT-IR) have been discussed by Griffiths et al.1 and Fujimoto and Jinno.2

The reaction product of a straight-chain alkyl alcohol and citric anhydride was analyzed by HPLC-FT-IR. The complete chromatographic run produces a series of overlapping chromatographic peaks. A better separation of the eluting compounds could, perhaps, be achieved by modifying the chromatographic conditions. However, this is tedious work that substantially increases the analysis time. Therefore, resolution of the overlapping chromatographic peaks was attempted by the application of chemometric techniques.

The ability of chemometric techniques to resolve data matrices obtained from hyphenated techniques into individual concentration profiles and pure compound spectra has been proven with different experimental systems.3-18 The self-modeling curve resolution approach proposed by Lawton and Sylvestre3 constitutes the basis of the self-modeling approaches. This curve resolution approach consists of the rotation of the abstract spectra obtained by the decomposition of the data matrix by singular value decomposition (SVD) into real spectra. The main limitation of this method is that its applicability is restricted to binary mixtures. This curve resolution approach was extended to more complex mixtures by Vandeginste et al.,4 by Malinowski,19 and by Schostack and Malinowski.20

The applicability of the orthogonal projection approach (OPA) and the evolving factor analysis approach (EFA) for the resolution of HPLC-DAD overlapping peaks was discussed in a previous paper.18 The OPA starts by selecting as many spectra of the data matrix as compounds are eluting. The selection procedure is based on a dissimilarity criterion.21,22 The location of the selected spectra is closely related to the retention times of the eluting compounds. Once the number of absorbing compounds has been determined, the data matrix is decomposed into concentration profiles and spectra of the individual compounds using an iterative least-squares procedure.14-16

The EFA approach7,8 determines the number of compounds present in the system by inspecting the singular values obtained from the analysis of an increasing number of consecutive spectra. The singular values plot is also used to define the concentration window (region of existence) of each eluting compound. The individual concentration profiles are then estimated by rotating the abstract chromatograms obtained by SVD of the complete data matrix. The rotation matrix is found using the zero concentration regions for each compound. Den and Malinowski12 pointed out the difficulty in defining the concentration window for each compound using the EFA approach when several strongly overlapping peaks are present. Therefore, the fixed size window evolving factor analysis approach (FSW-EFA)23,24 is applied here to determine the local rank of the data matrix.

In this paper, the OPA, the FSW-EFA, and the EFA are applied to resolve the HPLC-FT-IR data matrix of a reaction product resulting from the process described earlier. The HPLC-DAD data set used in the previous paper18 was an “artificial” experimental example, since reference spectra were available. Those reference spectra were used to evaluate the results obtained. The HPLC-FT-IR system discussed in this paper is a real experimental example, so no reference spectra of the eluting compounds are available. The results obtained are evaluated through the identification of the eluting compounds based on the spectral features of the resolved IR spectra and on the chemistry of the reaction.

† Vrije Universiteit Brussel. ‡ Unilever Research Laboratorium Vlaardingen. § Unilever Research US.
(1) Griffiths, P. R.; Pentoney, S. L., Jr.; Giorgetti, A.; Shafer, K. H. Anal. Chem. 1986, 58, 1349A-64A.
(2) Fujimoto, C.; Jinno, K. Anal. Chem. 1992, 64, 476A-81A.
(3) Lawton, W. H.; Sylvestre, E. A. Technometrics 1971, 13, 617-33.
(4) Vandeginste, B. G. M.; Essers, R.; Bosman, T.; Reijnen, J.; Kateman, G. Anal. Chem. 1985, 57, 971-85.
(5) Vandeginste, B. G. M.; Derks, W.; Kateman, G. Anal. Chim. Acta 1985, 173, 253-64.
(6) Gemperline, P. J. Anal. Chem. 1986, 58, 2656-63.
(7) Maeder, M. Anal. Chem. 1987, 59, 527-30.
(8) Maeder, M.; Zilian, A. Chemom. Intell. Lab. Syst. 1988, 3, 205-13.
(9) Maeder, M.; Zuberbühler, A. D. Anal. Chem. 1990, 62, 2220-4.
(10) Liang, Y.-z.; Kvalheim, O. M.; Rahmani, A.; Brereton, R. G. J. Chemom. 1993, 7, 15-43.
(11) Malinowski, E. R. J. Chemom. 1992, 6, 29-40.
(12) Den, W.; Malinowski, E. R. J. Chemom. 1993, 7, 89-98.
(13) Scarminio, I.; Kubista, M. Anal. Chem. 1993, 65, 409-16.
(14) Karjalainen, E. J.; Karjalainen, U. P. Anal. Chim. Acta 1991, 250, 169-79.
(15) Tauler, R.; Durand, G.; Barceló, D. Chromatographia 1992, 33, 244-54.
(16) Tauler, R.; Izquierdo-Ridorsa, A.; Casassas, E. Chemom. Intell. Lab. Syst. 1993, 18, 293-300.
(17) Windig, W.; Guilment, J. Anal. Chem. 1991, 63, 1425-32.
(18) Cuesta Sanchez, F.; Rutan, S. C.; Gil Garcia, M. D.; Massart, D. L. Chemom. Intell. Lab. Syst. 1996, 34, 139-71.
(19) Malinowski, E. R. Anal. Chim. Acta 1982, 134, 129-37.
(20) Schostack, K. J.; Malinowski, E. R. Chemom. Intell. Lab. Syst. 1989, 6, 219.
(21) Cuesta Sanchez, F.; Toft, J.; van den Bogaert, B.; Massart, D. L. Anal. Chem. 1996, 68, 79-85.
(22) Cuesta Sanchez, F.; Khots, M. S.; Massart, D. L. Anal. Chim. Acta 1994, 290, 249-58.

S0003-2700(96)01036-0 CCC: $14.00
© 1997 American Chemical Society
Analytical Chemistry, Vol. 69, No. 8, April 15, 1997

THEORY

The instrument produces a data matrix, X (m × n), where the m rows are spectra measured at regular time intervals, and the n columns are chromatograms measured at different wavenumbers. The data matrix X is bilinear, i.e., it can be decomposed into the product of the individual chromatograms matrix (C) and the pure compound spectra matrix (S):
$$X = C \cdot S^{T} \tag{1}$$

The OPA,18,21 FSW-EFA,23,24 and EFA7,8 approaches have been extensively explained elsewhere; therefore, only a short explanation of each one will be given here.

Orthogonal Projection Approach. The OPA consists of two main steps: (i) determination of the number of compounds present in the mixture and (ii) resolution of the data matrix into concentration profiles and pure compound spectra.

(i) Determination of the Number of Compounds. Spectra are sequentially selected, taking into account their dissimilarity. The dissimilarity of the ith spectrum, di, is defined as the determinant of the dispersion matrix of Yi. In general, matrices Yi consist of one or more reference spectra and the spectrum measured at the ith analysis time.

$$d_i = \det(Y_i^{T} \cdot Y_i) \qquad \text{for } i = 1, \ldots, m \tag{2}$$

A dissimilarity plot is then obtained by plotting the dissimilarity values, di, as a function of the analysis time. Initially, Yi (n × 2) consists of one reference spectrum, which is the mean (average) spectrum of matrix X, and the spectrum at the ith analysis time. The spectrum having the highest dissimilarity value is the least correlated with the mean spectrum, and it is the first spectrum selected, xs1. Then, the mean spectrum is substituted by xs1 as reference in matrices Yi (Yi = [xs1:xi]), and a second dissimilarity plot is obtained by applying eq 2. The spectrum most dissimilar to xs1 is selected (xs2) and added to matrix Yi. Therefore, for the determination of the third dissimilarity plot, Yi contains three columns [xs1:xs2:xi], i.e., two reference spectra and the spectrum at the ith time.

In summary, the selection procedure consists of three steps: (1) comparison of each spectrum of X with all spectra already selected by applying eq 2 (initially, when no spectrum has been selected, the spectra are compared with the average spectrum of matrix X), (2) plotting of the dissimilarity values as a function of the analysis time (dissimilarity plot), and (3) selection of the spectrum with the highest dissimilarity value by including it as reference in matrix Yi. The selection of the spectra is finished when the dissimilarity plot shows a random profile. It is considered that there are as many compounds as there are selected spectra.

(23) Keller, H. R.; Massart, D. L. Anal. Chim. Acta 1991, 246, 379-90.
(24) Keller, H. R.; Massart, D. L.; De Beer, J. O. Anal. Chem. 1993, 65, 471-5.

(ii) Resolution of the Data Matrix. An iterative least-squares approach14-16,18 is used for the resolution of the overlapping peaks into individual chromatograms and spectra. The procedure starts by calculating the concentration profiles (matrix C) by least-squares:

$$C = (X \cdot S) \cdot (S^{T} \cdot S)^{-1} \tag{3}$$

where S (n × nc) is a matrix containing in the columns the (nc) spectra selected in the previous step (i). The concentration profiles are then modified, taking into account the nonnegativity and unimodality (within a tolerance limit) of the chromatograms,18 and a new set of spectra (matrix S) is obtained by least-squares:

$$S = (X^{T} \cdot C) \cdot (C^{T} \cdot C)^{-1} \tag{4}$$
The sum of squares of the residuals, SSR, is calculated:
$$R = X - C \cdot S^{T} \tag{5}$$

$$\mathrm{SSR} = \sum_{i=1}^{m} \sum_{j=1}^{n} r_{ij}^{2} \tag{6}$$
Equations 3-6 are repeated iteratively until the relative difference in the SSR values of two consecutive iterations is lower than an a priori defined convergence limit. In this case, it was set equal to 0.1%. The pure compound spectra are then normalized to unit area by dividing each element of a column of S by the total sum of the elements in the column:
$$S_N = S \cdot N \tag{7}$$

where N (nc × nc) is a diagonal matrix with the elements defined by

$$n_{kk} = 1/z_k \qquad \text{for } k = 1, \ldots, n_c \tag{8}$$

$$z_k = \sum_{j=1}^{n} s_{jk} \qquad \text{for } k = 1, \ldots, n_c \tag{9}$$

The concentration matrix then has to be normalized by

$$C_N = C \cdot Z \tag{10}$$
where Z is a diagonal matrix, the elements of which are the zk values defined by eq 9.
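As an illustration only, the iterative procedure of eqs 3-6 and the normalization of eqs 7-10 can be sketched in Python with NumPy. The function name `als_resolve`, its arguments, and the iteration cap are hypothetical and do not come from the original work. Only the nonnegativity constraint is imposed in this sketch; the unimodality constraint mentioned in the text is omitted for brevity, and the columns of S are assumed to have nonzero sums.

```python
import numpy as np

def als_resolve(X, S0, tol=1e-3, max_iter=200):
    """Sketch of the iterative least-squares resolution (eqs 3-10).

    X  : (m, n) data matrix, rows are spectra.
    S0 : (n, nc) initial spectra, e.g. the spectra selected by OPA.
    tol: relative SSR change used as convergence limit (0.1% in the text).
    """
    S = np.asarray(S0, dtype=float).copy()
    ssr_old = None
    for _ in range(max_iter):  # max_iter is a safety cap (assumption)
        # eq 3: concentration profiles by least squares
        C = X @ S @ np.linalg.pinv(S.T @ S)
        # nonnegativity only; the unimodality constraint is not implemented
        C = np.clip(C, 0.0, None)
        # eq 4: spectra by least squares
        S = X.T @ C @ np.linalg.pinv(C.T @ C)
        # eqs 5 and 6: residuals and their sum of squares
        R = X - C @ S.T
        ssr = float(np.sum(R ** 2))
        if ssr_old is not None and abs(ssr_old - ssr) <= tol * ssr_old:
            break
        ssr_old = ssr
    # eqs 7-9: scale each spectrum to unit sum
    z = S.sum(axis=0)
    SN = S / z
    # eq 10: compensate the scaling in the concentration profiles
    CN = C * z
    return CN, SN, ssr
```

Because N and Z are inverses of each other, the product of the normalized matrices, CN·SN^T, equals C·S^T, so the normalization does not change the fit to X.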
Fixed Size Window Evolving Factor Analysis. The FSW-EFA approach23,24 determines the singular values of a window containing p consecutive spectra. The window is moved from the first to the last spectrum of the data matrix, and the singular values obtained are plotted as a function of the window number. The number of singular values above the noise level in each window indicates the local rank of the window and, therefore, the number of absorbing compounds in that local region.

Evolving Factor Analysis. The EFA approach starts by analyzing an increasing number of consecutive spectra by SVD, starting from the first spectrum of matrix X until the last one. The same procedure is repeated in the backward direction, i.e., starting from the last spectrum of matrix X and going back to the first one. The combination of the singular value plots obtained in the forward and backward directions provides the elution region for each eluting compound. It is assumed that the compounds disappear in the same order as they appear. Once the number of compounds present is determined and the corresponding concentration window is defined, the complete data matrix is decomposed by SVD,

$$X = U \cdot \Lambda \cdot V^{T} \tag{11}$$

and the nc first abstract chromatograms, contained in matrix U (m,nc), are transformed into real ones (matrix C). Λ (nc,nc) is a diagonal matrix containing the singular values of the nc first factors, and V (n,nc) is the matrix of the nc significant column singular vectors:

$$C = U \cdot T \tag{12}$$

$$c_k = U \cdot t_k \qquad \text{for } k = 1, \ldots, n_c \tag{13}$$

The chromatogram of the kth compound, ck, is obtained by multiplying matrix U by the kth column of the transformation matrix T. Each column of matrix T, tk, is obtained using the zero concentration region for the kth compound, where its concentration (c0,k) is known to be equal to 0:

$$c_{0,k} = U_{0,k} \cdot t_k \qquad \text{for } k = 1, \ldots, n_c \tag{14}$$
Matrix U0,k consists of the rows of U with no contribution from the kth compound. The system defined by eq 14 is underdetermined due to the elimination of the contribution of the kth compound, making it necessary to give an arbitrary value to one of the elements of vector tk.

EXPERIMENTAL SECTION

Apparatus. The data were obtained using an LC-FT-IR interface (Lab Connections) for performing pseudo-real-time LC data acquisition with infrared detection. The LC interface uses a liquid deposition solvent volatilization technique which produces a thin, narrow film of solute on a germanium (Ge) disk. The disk spins slowly, at a given rotation speed, as the solute trail is deposited in a circular geometry onto the disk. The disk is then removed from the LC interface module and placed in the FT-IR interface module. The FT-IR portion of the LC-FT-IR uses beam condensing optics to focus the IR beam onto the disk as it traces along the previously deposited solute “trail” at the same rotation speed used at the time of deposition. The FT-IR spectrometer operating in fast acquisition (kinetics) mode then records spectra as the disk rotates.

Materials. A sample of monoalkyl (C14/C15) citrate, produced by the reaction of a C14/C15 straight-chain aliphatic alcohol and the anhydride of citric acid, was analyzed by LC-FT-IR. The anticipated reaction product, and indeed the expected major component of the mixture, is assumed to be the disodium salt of monoalkyl (C14/C15) citrate ester. Also present in the sample is 5-10% (w/w) citric acid (trisodium salt), as determined by a separate analytical HPLC analysis. The separation was performed using a C-18 column (Regis) with a 65:35 water/acetonitrile mobile phase buffered with ammonium acetate. The flow rate was set to 1 mL/min, and 100 µL of a 0.5% (w/w) sample diluted in mobile phase was injected onto the column. FT-IR spectra were recorded using a Bio-Rad FTS-60A FT-IR in GC-IR kinetics data acquisition mode. Spectra were collected at 30 s intervals for 66 min (132 spectra) and scanned in the wavenumber range from 798 to 3600 cm-1, with a wavenumber interval equal to 2 cm-1 (1453 wavenumbers). Each spectrum is the average of 16 scans.

RESULTS AND DISCUSSION

The mean baseline-corrected chromatogram is shown in Figure 1. Baseline correction was performed by subtracting the linear interpolation of the spectra at the start and end of each chromatographic peak at each wavenumber. The three peak clusters observed in Figure 1 are studied separately. Therefore, the complete data matrix is split into three submatrices, the first from spectra 1-50 (cluster 1), the second from spectra 51-110 (cluster 2), and the third from spectra 111-132 (cluster 3).

Figure 1. Average baseline-corrected chromatogram.

Discussion of the eluted chromatographic peaks and their associated spectra is given in terms of possible and/or likely interaction of components expected in the reaction sample. It should be emphasized that the information extracted using the OPA techniques provides a basis, or starting point, for further structural characterization.
Given the ambiguity inherent in infrared characterization of very similar chemical species, there remain some aspects of the results that are, at best, estimated assignments. An attempt has been made to interpret the results in light of what is known about the chemistry, but additional experiments and additional analyses are warranted for a more complete identification of each compound and a confirmation of the given assignments.

Cluster 1. For the analysis of the data, the spectral range is reduced to the region between 989.48 and 3495 cm-1 to eliminate noisy parts of the spectra without valuable information. The analysis of the data by OPA indicates the elution of one main compound with retention time around time 27. The pure spectrum of the compound is given in Figure 2. The most significant spectral features extracted from the spectrum are the N-H stretching; Fermi resonance modes near 3200, 3070, and 2550 cm-1; the acid C=O and associated C-O stretching modes at 1718 and 1210 cm-1, respectively; and the carboxylate salt asymmetric and symmetric CO2- stretches at 1570 and 1420 cm-1, respectively. Since the mobile phase was buffered with ammonium acetate at pH ∼4, very near the pK2 of citrate anion,26 a strong interaction might be expected between the buffer and the citrate due to partitioning between acid and carboxylate forms of the anion leading to subsequent exchange of cationic species. Furthermore, citrate is known to complex with large cations such as metal ions, ammonium ions, and other positively charged amino groups. This phenomenon has been observed, for example, in GPC-IR analyses, where polycarboxylate salts interact with ammonium acetate buffer.27 The infrared spectral features of the compound in cluster 1 suggest a complex form of ammonium citrate, perhaps existing as a mixed cationic species with both acid and carboxylate carbonyls present (diacid monoammonium citrate). This interpretation is supported by the analytical HPLC results found previously,25 which indicate the presence of 5-10% (w/w) citric acid (sodium salt form) in the sample.

Figure 2. Spectrum of the compound present in cluster 1.
The singular values plot obtained with the FSW-EFA approach using a window of size 3 (results not shown here) also indicates the presence of one compound, since the second and third singular values plots present similar structure and they have practically the same value along the peak.

(25) Bautista, B.; Dalton, J. Internal Unilever report, July 1991.
(26) Bates, R. G.; Pinching, G. D. J. Am. Chem. Soc. 1949, 71, 1274-83.
(27) Hancewicz, T. M.; Jilani, M. 1996 FACSS Conference, Sept 29-Oct 4, 1996, Kansas City, MO, Poster No. 466.

Cluster 2. The spectral range is reduced to the region between 931.61 and 3599.1 cm-1, to eliminate the noisy spectral regions without valuable information. Four spectra at times 65, 75, 71, and 81 are selected in this order by OPA. The dissimilarity plots obtained are shown in Figure 3. The first dissimilarity plot (Figure 3a) represents the dissimilarity of each spectrum of X with respect to the mean (average) spectrum of the data matrix. The spectrum scanned at time 65 is the least correlated with the mean spectrum, and it is, therefore, the first spectrum selected. In the next step, each spectrum of the data matrix is compared with the first spectrum selected, the one at time 65, and a second dissimilarity plot is obtained (Figure 3b). The spectrum with the highest dissimilarity value, the spectrum at time 75, is then selected. The third dissimilarity plot (Figure 3c) shows the dissimilarity of each spectrum with respect to both spectra already chosen, the spectra at times 65 and 75. As before, the spectrum with the highest dissimilarity value, the one measured at time 71, is selected. The process continues comparing each measured spectrum with all spectra previously selected, until the dissimilarity plot shows a random profile such as the one given in Figure 3e.

Figure 3. Cluster 2: dissimilarity of each spectrum with respect to (a) the average spectrum, (b) spectrum at time 65, (c) spectra at times 65 and 75, (d) spectra at times 65, 75, and 71, and (e) spectra at times 65, 75, 71 and 81.

In the mean chromatogram plot (Figure 1), a peak with retention time around 96 is observed. This peak is also seen in the first dissimilarity plot (Figure 3a) but not in the second dissimilarity plot (Figure 3b). This suggests that the pure spectra of the compounds eluting around tR = 65 and tR = 96 are quite correlated. The selection procedure was therefore repeated using as first reference the spectrum at time 96, instead of the mean spectrum. The spectrum at time 96 is included as a selected spectrum in matrices Yi, and is considered in the determination of all the dissimilarity plots (results not shown here). The spectrum least correlated with the spectrum at time 96 is the one measured at time 75, and it is the second spectrum selected. The procedure is repeated by comparing each spectrum with respect to both spectra already selected. The second dissimilarity plot shows a peak around analysis time 68, and this spectrum is the third one selected. The selection procedure continues until the dissimilarity plot shows a random profile. In conclusion, by starting the procedure with the spectrum at time 96, six spectra, at times 96, 75, 68, 65, 82, and 72 (in this order), are selected, two more than previously (Figure 3).
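The selection procedure used for cluster 2, including the variant that starts from a chosen spectrum such as the one at time 96, can be sketched as follows. This is a minimal Python/NumPy illustration of eq 2 and of the three selection steps, not the authors' implementation. The function names, the `n_select` argument, and the `first_index` option are assumptions made for this sketch. In the paper, the selection stops when the dissimilarity plot looks random, a visual judgment that is replaced here by a fixed number of spectra. No scaling of the spectra is applied because none is specified in this text.

```python
import numpy as np

def dissimilarity(X, refs):
    """Eq 2: d_i = det(Y_i^T Y_i), with Y_i = [reference spectra : x_i].

    X    : (m, n) data matrix, rows are spectra.
    refs : list of length-n reference spectra.
    """
    d = np.empty(X.shape[0])
    for i, x in enumerate(X):
        Y = np.column_stack(refs + [x])  # (n, number of refs + 1)
        d[i] = np.linalg.det(Y.T @ Y)
    return d

def opa_select(X, n_select, first_index=None):
    """Sequentially select the most dissimilar spectra (row indices of X).

    n_select    : number of spectra to select (hypothetical stopping rule).
    first_index : optional row used as first reference instead of the mean
                  spectrum; it is then counted as a selected spectrum.
    """
    selected, plots = [], []
    if first_index is None:
        refs = [X.mean(axis=0)]       # initial reference: mean spectrum
    else:
        selected.append(first_index)
        refs = [X[first_index]]
    while len(selected) < n_select:
        d = dissimilarity(X, refs)    # one dissimilarity plot per step
        plots.append(d)
        selected.append(int(np.argmax(d)))
        refs = [X[s] for s in selected]  # the mean is replaced after step 1
    return selected, plots
```

Each array in `plots` corresponds to one dissimilarity plot of the kind shown in Figure 3, so it can be inspected in the same way to judge when the profile has become random.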
Figure 4. Individual chromatograms of the compounds present in cluster 2 estimated with the OPA approach.
Once the number of eluting compounds is determined, the data are resolved into elution profiles and spectra using an iterative least-squares procedure. The initial estimates are the spectra selected in the first part of the OPA approach, i.e., spectra at times 65, 68, 72, 75, 82, and 96. The concentration profiles and IR spectra obtained are shown in Figures 4 and 5, respectively. The spectra of the compounds A and F (Figure 5a,f) are very similar. They present the spectral features of a long-chain alkyl C-H stretch (2950-2800 cm-1) and a broad carbonyl peak near 1730 cm-1 that is a combined acid and ester carbonyl superimposed over each other. Close inspection of the spectra in this region clearly shows a less intense shoulder near 1710 cm-1 and a more intense central peak near 1730 cm-1. This phenomenon is common in polycarbonyl compounds such as citric acid and citrate esters and is due to the presence of two distinct types of carbonyls in the compound. Also apparent is a very broad acid O-H stretch directly underneath the alkyl stretch. The only significant difference between the spectra in Figures 5a and 5f is the relative intensities of the carbonyl absorbances compared to the C-H stretch absorbances, which are explained by differences in the degree of ester substitution. These spectral assignments are consistent with a monoalkyl (C14/C15) citrate ester (diacid form) for compound A and a dialkyl (C14/C15) citrate ester (monoacid form) for compound F. Compounds B, C, and E (tR = 67, 71, and 75, respectively) also present quite similar spectral characteristics (Figure 5b,c,e). They exhibit long-chain alkyl C-H stretching modes (2950-2800 cm-1), ester carbonyl stretching modes (1735, 1210 cm-1), and carboxylate salt stretching modes (1570, 1430 cm-1). Compounds B and C both have similar C-H stretch intensities that would indicate a monoalkyl substitution.
The most significant difference between the two is the relative intensity and wavenumber position of the carboxylate features. This indicates a difference in the cation associated with the carboxylate.28 For these compounds, this would be limited to either sodium or ammonium ion since they are the only species known to be present in the sample. The less intense, higher wavenumber position for the asymmetric carboxylate stretch in compound B would suggest ammonium ion rather than sodium ion. The opposite argument is made for the carboxylate features in compound C. In addition, compound B shows a very broad low-intensity band directly under the C-H stretch between 3500 and 2300 cm-1 which also suggests ammonium ion. Although another assignment for this band might be that of a dimer acid O-H stretch, there is no associated acid carbonyl stretch (1710 cm-1) in the spectrum. Also, the peak shape of the band is symmetrical as opposed to the skewed acid hydroxyl band observed in previously identified compounds (clusters 1, 2a, and 2f). In fact, there is no indication of acid functionality in any of the spectra of compounds B, C, D, and E. Compound E is differentiated from B and C by the presence of a non-hydrogen bonded hydroxyl stretch near 3500 cm-1 that is indicative of water of hydration. Compound E otherwise has spectral features very much like those of compound C and indicates the presence of sodium cation only. Based on these observed spectral features, compounds B, C, and E are assigned as diammonium monoalkyl citrate, ammonium-sodium monoalkyl citrate, and hydrated disodium monoalkyl citrate, respectively.

Characterization of compound D (Figure 5d) was not straightforward since the spectral features were vastly different from those expected from the chemistry of the system. While the major spectral features appear to be similar to those of compound E (the disubstituted species), it is clear that the carboxylate features (1568 cm-1) dominate, and the ester features (1730 cm-1) are nearly nonexistent. Identification of compound D was facilitated through a search of library reference spectra. The spectrum was identified as a good match to that of sodium acetate trihydrate. This assignment is consistent with those given for previous compounds and would be expected if, in fact, the buffer cation had exchanged with the analyte and produced sodium acetate.

(28) Lin-Vien, D.; et al. The Handbook of Infrared and Raman Characteristic Frequencies of Organic Molecules; Academic Press: New York, 1991.
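The rank analysis and the rotation step discussed next for cluster 2 can be illustrated with a short Python/NumPy sketch of the FSW-EFA scan, the forward and backward EFA scans, and the rotation of eqs 11-14. All function names and arguments are hypothetical. The sketch returns singular values only, since the noise level is judged visually in the paper. In the rotation, the first element of each tk is the one fixed arbitrarily (to 1), which assumes that this element is nonzero in the true solution and that at least two compounds are present.

```python
import numpy as np

def fsw_efa(X, window):
    """Singular values of every window of `window` consecutive spectra."""
    m = X.shape[0]
    return np.array([np.linalg.svd(X[i:i + window], compute_uv=False)
                     for i in range(m - window + 1)])

def efa(X, n_sv):
    """Forward and backward evolving singular values (first n_sv each)."""
    m = X.shape[0]
    fwd = np.zeros((m, n_sv))
    bwd = np.zeros((m, n_sv))
    for i in range(1, m + 1):
        f = np.linalg.svd(X[:i], compute_uv=False)[:n_sv]      # first i spectra
        b = np.linalg.svd(X[m - i:], compute_uv=False)[:n_sv]  # last i spectra
        fwd[i - 1, :f.size] = f
        bwd[m - i, :b.size] = b
    return fwd, bwd

def efa_rotate(X, windows):
    """Rotate abstract chromatograms into concentration profiles.

    windows : one (first, last) pair of 0-based, inclusive row indices per
              compound, giving its concentration window.
    """
    nc = len(windows)
    U = np.linalg.svd(X, full_matrices=False)[0][:, :nc]  # eq 11, nc factors
    C = np.zeros((X.shape[0], nc))
    for k, (first, last) in enumerate(windows):
        zero = np.ones(X.shape[0], dtype=bool)
        zero[first:last + 1] = False   # rows where compound k is absent
        U0 = U[zero]
        # eq 14 with t_k[0] fixed to 1; remaining elements by least squares
        rest = np.linalg.lstsq(U0[:, 1:], -U0[:, 0], rcond=None)[0]
        t = np.concatenate(([1.0], rest))
        c = U @ t                      # eq 13
        C[:, k] = c if c.sum() >= 0 else -c  # choose the positive sign
    return C
```

The profiles returned by `efa_rotate` are defined only up to a scale factor, because one element of each tk is set arbitrarily.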
The singular values plot obtained by the FSW-EFA approach using a window of size 7 is given in Figure 6. The plot indicates that the maximum number of compounds eluting simultaneously is three. Initially there is only one compound. From window 8, a second compound is detected by the increase in the second singular value. From windows 10-21, three compounds are coeluting, as indicated by the presence of a peak in the third singular value. From windows 21-34, it is difficult to determine the number of coeluting compounds. From windows 34-40, a peak is observed in the second singular value, together with a valley in the first singular value, which indicates the end of the elution of a compound and the start of another one. This is the transition from the main peak to the small one (Figure 1). It is not possible to estimate the total number of eluting compounds in cluster 2, or their concentration windows, using the singular values plot obtained by the FSW-EFA approach. The severe overlap of the chromatographic peaks, together with the correlation of the pure compound spectra (Figures 4 and 5), makes the interpretation of the evolving singular values plots very difficult.

Figure 5. Pure compound spectra of the compounds present in cluster 2 estimated with the OPA approach.

Figure 6. Cluster 2: plot of the singular values obtained by the FSW-EFA approach using a window of size 7.

The EFA approach was applied to estimate the individual concentration profiles and pure compound spectra. The evolving singular values obtained in the forward and backward directions (plots not presented here) do not clearly show the rank of matrix X, making the selection of the concentration window for each eluting compound very difficult. The concentration windows for each eluting compound were determined using the individual chromatograms estimated with OPA (Figure 4), and the rotation step of the EFA approach was performed (eqs 13 and 14). The concentration profiles obtained after the rotation of the abstract chromatograms (matrix U) are given in Figure 7. The concentration windows used are 53-70 (A), 63-74 (B), 66-80 (C), 70-85 (D), 76-99 (E), and 86-104 (F). The concentration profiles obtained with the EFA approach are quite similar to the ones obtained with OPA, except for the last compound (F). This compound is present at low concentration, and the spectra in its concentration window are quite noisy (Figure 1). This fact, together with the strong correlation between the spectra of compounds F and A, could explain the incorrect determination of the chromatogram for compound F using the EFA approach.

Figure 7. Individual chromatograms of the compounds present in cluster 2 estimated with the EFA approach.

Cluster 3. For the analysis of the data, the spectral range has been reduced to the region between 989.48 and 3599.1 cm-1 to eliminate noisy regions without valuable information. The analysis of cluster 3 with OPA indicates the presence of two compounds with retention times around 115 and 119. The individual unit spectra are given in Figure 8. The spectrum of the main compound (Figure 8b) presents features similar to those of cluster 1 (Figure 2) and of compounds A and E of cluster 2 (Figure 5a,e) in terms of ratio of ester carbonyl to carboxylate absorbance. The spectra show similarity in functional group information with respect to ester (1732 and 1202 cm-1), carboxylate (1576 and 1431 cm-1), and ammonium ion (3500-2300 cm-1) features. The compounds of clusters 2 and 3 are differentiated from that of cluster 1 by the presence of alkyl chain features (3000-2800 cm-1) and ester rather than acid carbonyl (cluster 1) in addition to carboxylate carbonyl. This suggests that the main compound of cluster 3 is an ammonium salt of dialkyl citrate ester. This is possible, given the dramatic difference in retention time between the respective clusters, the increase in alkyl to carbonyl functional group features, and the otherwise very similar carbonyl fingerprint characteristics. The spectrum of the minor compound (Figure 8a) exhibits features of an alkyl ester compound (3000-2800 cm-1) and shows no strong evidence of either acid or carboxylate carbonyl features (1740-1700 cm-1). The only likely compound that would produce this type of spectrum and would also be a theoretically possible side product of the primary reaction would be the trisubstituted ester compound (trialkyl citrate ester).

Figure 8. Pure compound spectra of the compounds present in cluster 3 estimated by the OPA approach.

Cluster 3 was analyzed by FSW-EFA using a window of size 3. The singular values plot (not shown here) presents small peaks in the second singular values profile around windows 4 and 13. It may indicate the presence of minor compounds, but the plot is not conclusive. The evolving singular values in the forward and backward directions obtained by EFA (results not shown here) indicate only the presence of one compound.

CONCLUSIONS

In this article, the applicability of the orthogonal projection approach (OPA), the fixed size window evolving factor analysis approach (FSW-EFA), and the evolving factor analysis approach (EFA) for the analysis of HPLC-FT-IR overlapping peaks has been shown. Among them, OPA appears to be the most suitable approach for the analysis of several strongly overlapping peaks. In this system, the evolving singular values obtained by EFA and FSW-EFA do not provide conclusive information about either the number of eluting compounds or their elution regions. All the identified compounds are explained by the chemistry of the reaction and the interaction between the compounds and the HPLC mobile phase.

Further analysis should be performed to investigate the bilinearity of the data matrix that is assumed in the application of these multivariate techniques. Several minor compounds may be present that have not been selected. The high noise and baseline present in the data make it more difficult to detect the presence of those minor compounds.

Received for review October 9, 1996. Accepted January 17, 1997. AC9610366
Abstract published in Advance ACS Abstracts, February 15, 1997.