Anal. Chem. 2001, 73, 1587-1594
Resolving Factor Analysis Caroline Mason, Marcel Maeder,* and Andrew Whitson
Department of Chemistry, The University of Newcastle, Callaghan NSW 2308, Australia
Bilinear data matrices may be resolved into abstract factors by factor analysis. The underlying chemical processes that generated the data may be deduced from the abstract factors by hard (model fitting) or soft (model-free) analyses. We propose a novel approach that combines the advantages of both hard and soft methods, in that only a few parameters have to be fitted, but the assumptions made about the system are very general and common to a range of possible models: The true chemical factors span the same space as the abstract factors and may be mapped onto the abstract factors by a transformation matrix T, since they are a linear combination of the abstract factors. The difference between the true factors and any other linear combination of the abstract factors is that the true factors conform to known chemical constraints (for instance, nonnegativity of absorbance spectra or monomodality of chromatographic peaks). Thus, by excluding linear combinations of the abstract factors that are not physically possible (assuming a unique solution), we can find the true chemical factors. This is achieved by passing the elements of a transformation matrix to a nonlinear optimization routine, to find the best estimate of T that fits the criteria. The optimization routine usually converges to the correct minimum with random starting parameters, but more difficult problems require starting parameters based on some other method, for instance EFA. We call the new method resolving factor analysis (RFA). The use of RFA is demonstrated with computer-generated kinetic and chromatographic data and with real chromatographic (HPLC) data. RFA produces correct solutions with data sets that are refractory to other methods, for instance, data with an embedded nonconstant baseline. There are two fundamentally different ways of analyzing data matrices: hard, model fitting approaches and soft, model-free analyses. Model fitting is essentially a two-step process. First, a model must be chosen that forms the basis of a mathematical function to quantitatively describe the measured data; typical models are reaction mechanisms for kinetic investigations or the set of expected species formed during an equilibrium investigation. In a secondary step, after the choice of the chemical model, estimates for the underlying parameters (e.g., rate or equilibrium constants) have to be guessed and subsequently they are iteratively refined by a least-squares fitting routine. Reliable optimization procedures are available, and this secondary step is fairly routine. The choice 10.1021/ac991141q CCC: $20.00 Published on Web 03/02/2001
© 2001 American Chemical Society
of the correct model is not so straightforward, and often not much guidance is available; consequently, relatively exhaustive trials have to be conducted in order to be able to choose the appropriate model. Multivariate data allow soft, or model-free analyses of the measurements. A variety of model-free analyses have been developed recently. These include EFA,1 HELP,2 SIMPLISMA,3 and ALS.4 All these methods attempt to analyze the data based on very few fundamental restrictions, such as nonnegativity of concentrations and molar absorptivities or restrictions implicit in basic laws such as Beer-Lambert’s law. No specific chemical model is required, so the cumbersome selection of the right model is not necessary. Further, some measurements cannot be quantitatively described by any mathematical function (typical examples are concentration profiles in chromatography, which do not follow a simple or generally useful function5,6); here model-free methods are mandatory.7 Model-free analyses also can give preliminary insight into the chemical processes in the data set and guide the choice of chemical models that are subsequently fitted to the data. Model-free analyses are very attractive: “The methods can be classified into two groupssthose in which assumptions about the system are made (e.g., number or identity of compound, line shapes) and those in which no assumptions are made, other than (perhaps) linear system behavior. In the former case you get back what you assumed, in the latter you get what you get.”8 The main advantage of model fitting is the significantly higher robustness compared to model-free analyses. The number of variable parameters reflects the robustness in a simple and representative way. It is easiest to consider an example: in order to investigate the consecutive reaction X f Y f Z, 100 absorption spectra, measured at 50 wavelengths, have been recorded during the reaction. In the hard, model-fitting approach, the two rate constants are the only parameters to be optimized (the molar absorptivities of the three components at 50 wavelengths are linear parameters and can effectively be eliminated in the nonlinear leastsquares fitting process9). In model-free analyses, the number of unknowns is usually orders of magnitude higher. They include all the concentrations of the 3 components at 100 reaction times, 300 values, and the 150 molar absorptivities. However, we have (1) Maeder, M.; Zuberbu ¨ hler, A. D. Anal. Chim. Acta 1986, 181, 287-291. (2) Liang, Y.-Z.; Kvalheim, O. M. Chemom. Intell. Lab. Syst. 1993, 20, 115125. (3) Windig, W.; Guillment, J. Anal. Chem. 1991, 63, 1425. (4) Tauler, R.; Barcelo, D. TRAC-Trends Anal. Chem. 1993, 12, 319-327. (5) Foley, J. P. Anal. Chem. 1987, 59, 1984-1987. (6) Felinger, A. Anal. Chem. 1994, 66, 3066-3072. (7) Maeder, M. Anal. Chem. 1987, 59, 527-530. (8) Delaney, M. F. Anal. Chem. 1984, 56, 261R-277R. (9) Maeder, M.; Zuberbu ¨ hler, A. D. Anal. Chem. 1990, 62, 2220-2224.
Analytical Chemistry, Vol. 73, No. 7, April 1, 2001 1587
to recognize that this very high number of variable parameters does not adequately represent the situation, as there are restrictions such as the nonnegativity of all these numbers, closure of the concentrations, etc. Further, the concentration profiles and absorption spectra are related and knowing one immediately defines the other one via a linear regression step. This is similar to the situation mentioned above in the model-fitting algorithm. The number of unknown parameters to be determined is clearly a badly defined quantity; nevertheless, it is a substantial quantity. The very rigid constraints of a chemical model form a framework within which the fit is confined and which results in a robust analysis; in model-free analyses, this framework is dramatically wider and looser and these methods suffer generally from a severe lack of robustness. It must be remembered, however, that the choice of the wrong model necessarily results in the wrong analysis and wrong resulting parameters. In this contribution, we propose a novel combination of the two approaches; we propose to call it resolving factor analysis (RFA). RFA attempts to combine the strengths of hard modeling fits (small number of parameters) and of soft, model-free analysis (no a priori model required). The method is applicable to any bilinear data matrices. We will present the results of the application of RFA to several sets of kinetic and chromatographic data; some of them are computer generated and some are real measurements. Systematic analysis of computer-generated model data allows the exploration of the limits of RFA: Three component chromatograms with variable resolution and relative noise levels were analyzed. Similarly kinetic data were generated and analyzed. Kinetic data notoriously resist model-free analyses, as there is usually a lack of appropriate concentration windows. These kinetic data were also used to demonstrate the increase of robustness by the introduction of additional knowns, such as absorption spectra of selected compounds. We also successfully analyzed real and heavily overlapped chromatograms which suffer from baseline drifts. Such difficulties generally preclude the application of traditional model-free analyses. It must be emphasized that, since RFA makes no assumptions about the data other than the validity of basic laws, it is a proper soft-modeling method. Unlike classical soft-modeling techniques (and like classical hard-modeling methods), the method requires only a few parameters to describe the data. This is achieved because the parameters relate the true chemical factors to eigenvectors of the data, which may be regarded as a sort of internal model. THEORY The notation used in this paper is consistent with traditional matrix notation: matrices are represented by boldface, upper case characters (D), vectors (as column vectors) by boldface lower case characters (a), and scalars by italic lower case characters (n). The transpose of vectors and matrices is denoted by superscript t (Dt) and the pseudoinverse by superscript + (J+). To be amenable to all the model-free analyses developed previously and also for the present technique, the data must bilinear. In matrix notation, the data matrix D must be the product of a matrix A and a matrix B.
D)A×B
(1)
The ultimate goal of the model-free analysis of the data matrix D 1588
Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
is the restoration of the true factors, the matrices A and B. Typical examples of such data are spectrophotometric measurements, e.g., the collection of absorption spectra measured as a function of reaction time in kinetics, elution time in chromatography (using multivariate detectors such as diode array detectors in HPLC, mass spectrometers, or FT-IR detectors in GC), or as a function of reagent addition in equilibrium studies. If the absorption spectra are collected row-wise in the matrix D, the matrix A contains, as columns, the concentration profiles of the components and the rows of B their corresponding molar absorption spectra. This is the direct result of Beer-Lambert’s law. Equation 1 represents a system of r × c (number of rows times number of columns in D) equations with r × n + n × c (number of elements in A and B) unknowns. n
D(i,j) )
∑A(i,k) × B(k,j)
(2)
k)1
n is the number of factors, components, and number of columns in A or rows in B. If n is sufficiently small, then the number of equations exceeds the number of unknowns, and we can hope to find a unique solution for A and B. In chemical data, we often find situations where this prerequisite is met: the number of components, n, in the system is normally much smaller than the number of measured spectra, r, and also smaller than the number of wavelengths, c; n , r, c. A bilinear data matrix may be decomposed into the product of two (or more) matrices which share the same vector spaces as the original A and B. One specific decomposition is achieved by the singular value decomposition algorithm:10
D ) U × S × Vt
(3)
The matrices U, S, and V are abstract factors and have the following properties: U consists of the eigenvectors of D × Dt, V consists of the eigenvectors of Dt × D, and S is a diagonal matrix of the singular values; the positive square roots of the eigenvalues of D × Dt and Dt × D. U and V are orthonormal, with the columns ordered from the most to the least significant eigenvectors. In the absence of experimental error (noise), there will be n (number of components) positive singular values; all the other ones are zero. In real, noisy, data sets, all singular values are positive and statistical tests are available to determine the number n of significantly positive singular values.11 Unfortunately, these tests often suffer from a certain ambiguity and the results may be difficult to interpret. However, the discussion of these questions is beyond the scope of this contribution. Setting the insignificant singular values to zero is an efficient way of reducing the noise component of the data.12 It is most convenient to reduce the size of S to the upper left n × n submatrix of the original S. Correspondingly, only the first n columns of U and V are retained. (10) Golub, G. H.; van Loan, F. Matrix Computations; Johns Hopkins University Press: Baltimore, MD, 1983. (11) Malinowski, E. R. Factor Analysis in Chemistry, 2nd ed.; Wiley-Interscience: New York, 1991. (12) Press: W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes in C, 1st ed.; Cambridge University Press: Cambridge, 1988.
We assume in the following that the correct number of factors has been determined and the appropriate number of columns and rows in the matrices U, S, and Vt are chosen. Unfortunately there is no unique solution for the decomposition of D. In the absence of other information, A and B are not distinguishable from any other matrices that D can be decomposed into, e.g., the eigenvector matrices U and Vt; they are no more “right” than any other pair of matrices in the factor space. Since all of the solutions for the decomposition of D are linear combinations of each other, any particular solution can be transformed into any other by multiplication by an n × n transformation matrix T. The information that allows us to distinguish A and B from any other arbitrary or abstract matrices in the factor space is that A and B have chemical meaning. For instance, an estimate of A that is supposed to consist of concentration values may be rejected if some of the elements of A are negative (since negative concentrations do not exist and, therefore, have no chemical meaning). The relationship between the chemically meaningful matrices A and B and the abstract matrices U and Vt can be established in the following way. For any nonsingular matrix T, the product T × T-1 is the identity matrix, which can be inserted into eq 3:
D ) U × (T × T-1) × S × Vt
(4)
Substitute eq 1 into eq 4
A × B ) (U × T ) × (T-1 × S × Vt)
(5)
It is the goal to find the unique T for which
A)U×T
(6a)
B ) T-1 × S × Vt
(6b)
and
where A and B are the true, chemically and physically meaningful factors. In this contribution, an approach combining the advantages of the small number of parameters of model-based analyses with the lack of model constraints of the model-free methods is presented: The elements of T are treated as a set of unknown parameters which define A and B. The task is defined as finding the one set T for which the products A ) U × T and B ) T-1 × S × Vt are physically correct (e.g., all elements in A and B g 0) and which reproduce the original data matrix D as well as possible. The quality of the fit is indicated by the traditional sum of squares of the residuals, the matrix R, between measured data D and its recalculated equivalent Dcalc. R is a function of the matrix T, R ) f (T).
R ) D - Dcalc ) D - A ˆ ×B ˆ
(7)
where A ˆ and B ˆ are the matrices A and B as calculated by eqs 6a and 6b after appropriate corrections such as negative elements set to zero, etc. Other restrictions that can be applied if appropriate
include the removal of secondary maximums, e.g., if the concentration profiles are known to be unimodal13-15 or the application of symmetry constraints in ESR spectra.16 It is also possible to incorporate concentration windows as determined by EFA, HELP, or related methods. In a recent paper, the conditions for unique solutions were discussed:17 the criteria are based on the noncomplete overlap of concentration profiles. These restrictions apply in chromatography and often also in equilibrium data, but generally they do not apply to kinetic data where no clear concentration windows can be defined. In such instances, further restrictions may result in unique solutions. The symmetry of eq 1 ensures that restrictions apply analogously to both matrices A and B. There are usually no generally valid restrictions, in addition to nonnegativity, that can be applied to absorption spectra. However, it is possible or even likely that the absorption spectra of one or several of the components are known (rows of B). It is straightforward to implement such constraints into the algorithm. Similar restrictions could be applied to the concentration profiles; however, situations where concentration profiles are known are rare. Particularly in chromatography concentration profiles are often not well reproduced. The task of finding the optimal matrix T can be set up as a nonlinear least-squares fit. Many general programs exist for that task.12,18,19 In this contribution, we suggest an adaptation of the Newton-Gauss-Marquardt algorithm, which is generally robust and significantly faster than the conceptually simpler algorithms.9 The nonlinear least-squares fit of the elements of the matrix T by the Newton-Gauss-Marquardt algorithm is best described in the following steps: (a) Initial Guess of the Elements of T. The initial guess can be based on the concentration profiles resulting from a traditional model-free analysis such as EFA. The initial T is then calculated as T ) Ut × AEFA. In many instances, the algorithm converges straightforwardly starting from T matrices formed from random numbers. (b) Calculation of the Matrices A and B.
A)U×T B ) T-1 × S × V (c) Application of Corrections to A and B.
AfA ˆ BfB ˆ This includes all the physical restrictions such as (i) nonnegativity of the elements of A and B (set negative elements of A, B to 0), (ii) monomodality of concentration profiles (remove secondary maximums in A), (iii) symmetry constraints of spectra in B, and (13) Tauler, R.; Cassasas, E. J. Chemom. 1988, 3, 151. (14) de Juan, A.; Vanderheyden, Y.; Tauler, R.; Massart, D. L. Anal. Chim. Acta 1997, 346, 307-318 (15) Nagypa´l, I.; Beck, M. T.; Zuberbu ¨ hler, A. D. Talanta 1983, 30, 593. (16) Steinbock, O.; Neumann, B.; Cage, B.; Saltiel, J.; Muller, S. C.; Dalal, N. S. Anal. Chem. 1997, 69, 3708-3713. (17) Manne, R. Chemom. Intell. Lab. Syst. 1995, 27, 89-94. (18) Bevington, P. R. Data Reduction and Error Analysis for the Physical Sciences; McGraw-Hill: New York, 1969. (19) Gans, P. Coord. Chem. Rev. 1976, 19, 99-124.
Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
1589
(iv) set elements of A outside additionally known concentration windows to zero. (d) Residuals and Sum of Squares.
Dcalc ) A ˆ ×B ˆ , R ) D - Dcalc
Dt ) Bt × At ) U × (T × T-1) × S × Vt
(8)
Bt ) U × T
(9)
and
btcalc is the projection of the known spectrum (btknown ) onto U; it forms a column of Bt, and thus can be written as
and sum of squares,
ssq )
∑∑
R(i,j)2
btcalc ) U × tknown
(10)
where tknown is the corresponding column of T, calculated as Exit loop if ssq remains constant. (e) Calculation of the Jacobian J. The Jacobian contains the derivatives of the residuals with respect to the parameters J ) ∂R/∂T. In this application, the Jacobian has to be approximated numerically as
Ji )
R(T) - R(T + ∆Ti) ∆Ti
where ∆Ti is a small shift applied to the ith element of T. (f) Calculation of Parameter Shifts. The shifts ∆T that are applied to the parameters T are calculated as ∆T ) J+ × R (J+ is the pseudoinverse of J) and then added to T: T ) T + ∆T. (g) Loop. The algorithm then continues at (b). The Newton-Gauss algorithm is based on a Taylor series expansion of the residuals R as a function of the parameters T and thus relies on the differentiability of R with respect to T.18 Numerical calculation of the Jacobian can be corrupted by the fact that the continuity and thus the differentiability of R with respect to T can be lost in the steps at (c) above, particularly if the corrections to A and B are large. The calculated Jacobian is then not accurate and the resulting shift vector can be seriously wrong. The Marquardt algorithm is usually beneficial, but there is no guarantee for convergence and in rare cases the calculation collapses entirely. Any analysis that is not based on additional concentration information (e.g., total concentrations of the reacting components in kinetic or equilibrium investigations; closure) will necessarily only result in the shapes of concentration profiles and spectra. Multiplication of a column of A by some value is compensated by the multiplication of the corresponding row of B by its inverse. The multiplication of a column of A is achieved by multiplication of the corresponding column in T. In our algorithm, we normalize the columns of T by dividing by the element on the main diagonal. This leaves the diagonal elements normalized to 1. Thus, instead of n2 parameters (the n × n elements of T), only the n2 - n parameters required to define the shape need be iteratively refined. Known Components. If one or more components of the mixture being analyzed are known, this information may be included in the transformation matrix, and the number of parameters to be optimized is reduced accordingly. Usually the spectrum of a component is known rather than its concentration profile. To incorporate this additional information into the algorithm, we take advantage of the symmetry of eq 1 and analyze the transpose of D: 1590
Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
tknown ) Ut × btknown
(11)
The elements tknown of T remain fixed during the iteration of the algorithm, while the other elements are iteratively refined, as described above. Naturally, more than one known component spectrum can be incorporated into the algorithm. In the case of an independently known column of A, the implementation is carried out in an analogous way. This effectively reduces the number of parameters to be fitted and thus the robustness of the analysis. For m known absorption spectra, this number is reduced to (n - m)(n - 1). EXPERIMENTAL SECTION All algorithms were written in MATLAB 520 and the calculations were performed on PCs. The performance of the algorithm was extensively tested on simulated and real chromatographic data as well as simulated kinetic data. Simulated Chromatograms. To investigate the performance of the algorithm, chromatograms of varying noise level and overlap were generated and subsequently analyzed. For a three-component overlapping peak system, Gaussian concentration profiles (the columns of A) were generated in the following way: the first two components had means at 200 and 250 time units with widths of 30, resulting in a chromatographic resolution of 0.42. The peak heights were set to 0.5 and 1. The third component, with a peak height of 0.8 and a width of 30, was moved from means of 358 to 263 in steps of 5, resulting in 20 resolutions between 0.90 and 0.11 for the component pair 2 and 3. Spectra were generated between 0 and 500 at time intervals of 5 (r ) 101). The spectra B for each run were also Gaussian functions, with means at 400, 450, and 510 nm, widths of 30 nm, and heights of 0.8, 0.6, and 1, respectively. The spectra were generated at 61 wavelengths between 300 and 600 nm, at intervals of 5 nm (c ) 61). For each resolution, 19 different levels of normally distributed random noise were added to the generated data matrix. The standard deviation of the random numbers was varied from 0.0005 to 0.08, which amounted to relative levels between 0.06 and 9% of the maximum signal. The starting parameters, or initial guesses for the off-diagonal elements of the matrix T, for each run were random numbers between -1 and 1. Kinetic Simulations. Kinetic data generally result in nonunique resolutions; this is the result of a lack of appropriate windows for the concentration profiles.17 The ambiguity can be removed by the introduction of known component spectra or (20) The MathWorks, Inc.: Natick, MA, 1999.
Figure 1. (a) Residual standard deviation as a function of chromatographic overlap and the level of added noise. (b) shows the quality (as defined by the angle between the true and caculated absorption spectra for the second component) as a function of the same parameters.
concentration profiles. Simulated kinetic data were used to test k1 this feature of the algorithm. The consecutive reaction X 98 k2 Y 98 Z was simulated. Absorption spectra for the three components were the same as for the chromatograms (means 400, 450, and 510 nm, widths 30 nm, and maximum heights 0.8, 0.6, and 1, c ) 61). The initial concentration [X]0 was set to 1, the total reaction time was 500 time units (t), and 101 spectra were taken at 5-unit intervals (r ) 101). Two sets of data were generated with the rate constants k1 ) 0.05t-1, k2 ) 0.01t-1 and k1 ) 0.02t-1, k2 ) 0.05t-1. Gaussian random numbers of standard deviation 0.001 were added, amounting to 0.1% of the highest signal. Real Chromatograms. To ensure serious overlap in the concentration profiles, only a guard column was used! The column was an SGE SCX guard column of 4 mm i.d. × 10 mm long. Mixtures of the coordination compounds [Co(NH3)5Cl]Cl2, [Co(NH3)5NO2]Cl2, and [Co(NH3)5OCOCH3](NO3)2 were prepared at concentrations of 0.2, 0.2, and 0.4 g L-1, respectively. The eluent was 0.1M NaCl, with a flow rate of 1 mL min-1. Injected volumes were 1, 10, and 20 µL. Spectra were acquired at 1-s intervals for a total of 560 s. They were measured at 5-nm intervals between 200 and 520 nm. Spectra measured below 220 nm were very noisy, so the lowest wavelengths of the spectra were trimmed (r ) 60). Similarly, spectra measured earlier than 157 s showed an injection artifact that interfered with the analysis, so they too were trimmed (c ) 400). A Waters 600E system with a Waters 490E diode array detector was used. Additional sets of experiments were run to collect the deionized water blank data and also pure component runs for verification of the results. RESULTS AND DISCUSSION Simulated Chromatograms. As described above, the algorithm terminates when there is no further improvement in the residual sum of squares. For an acceptable fit, we can expect the standard deviation of the residuals to match the level of noise in the data. This value may be known for a particular instrument, but it is naturally known for the generated data in our chromatographic simulations. Figure 1a is a plot of the logarithms of the
standard deviation of the residuals σR as defined by
σR )
x
ssq r × c - (n2 - n)
(12)
as a function of the logarithm of the standard deviation of the added noise and the resolution between the second and third component. It may be seen that, with the exception of two outliers, the standard deviation of the residuals (σR) are the same as the standard deviation of added noise (σN) and surprisingly little influenced by the peak resolution. There is only a marginal increase of the residuals with increasing overlap which is just visible in Figure 1a. This indicates that the algorithm has found a solution that represents the data within the noise level. In the two outlying runs, the algorithm converged to some local minimum instead of the global minimum; starting with different initial guesses for the elements of T resulted in the right minimum. A better indication for the quality of the result can be defined as the difference between the true absorption spectra, B, and the fitted spectra, B ˆ . As there is no absolute information available, only their shapes are defined. The criterion for the performance was the angle between each of the calculated molar absorbance spectra in B ˆ (bˆ k) and the corresponding known spectrum in B (bk): The angle of the pair of vectors representing the spectra is independent of normalization and thus is a useful measure of similarity. The spectrum of the second component (which does not itself move but is subject to varying overlap by component 3) was chosen to represent the similarity between calculated and true spectra. Figure 1b is a 3-D mesh plot of the angle between bˆ 2 and b2 for each run, plotted against added noise and resolution. It may be seen that the angle is very small (good fit) over a wide range of noise levels and peak resolutions; it only increases to unacceptable levels at very high overlap and noise. Figure 2 displays the outcome for one particular calculation. The arrows in Figure 1 show its position in the series of simulations; it represents a relatively difficult case. Panels a and b of Figure 2 show plots of calculated concentration profiles A ˆ and absorbance spectra B ˆ , respectively, calculated in a run with Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
1591
Figure 2. (a) Concentration profiles and (b) absorption spectra for one particular calculation.
Figure 3. (a) concentration profiles for a consecutive reaction with k2 > k1. (b) corresponding absorption spectra. The SOLID lines represent the result of unrestricted RFA, and the dashed line represents the (correct) result after the introduction of the known absorption spectrum for the intermediate.
an added noise level of 5.9% of the maximum signal and a resolution of 0.23. The angle between the spectra of the true and caculated second components at this point is 0.11 rad. As the peak resolution decreases, the amount of distinct information in the third eigenvector pair (the third columns of U and V) decreases, and at high noise levels, the third eigenvectors are indistinguishable from the noise eigenvectors. The apparent noise in the calculated concentration profiles A ˆ and absorption spectra B ˆ are a direct result of that; they are linear combinations of noisy eigenvectors and thus noisy themselves. Even if the quality of the calculation as defined by the angle between spectra is relatively low, the shapes of both concentration profiles and absorption spectra are essentially right. Kinetics. The use of known spectra to constrain possible solutions and to reduce the number of parameters is demonstrated with simulated kinetic data based on the consecutive reaction k1 k2 X 98 Y 98 Z. In the first experiment with k1 < k2, inspection of the concentration profiles reveals that the conditions for separate concentration profiles were essentially met and therefore the unique solution was found straightforwardly. If k2 > k1, however, almost no Y exists outside the concentration window of X, the intermediate concentration profile is almost parallel to the profile of X, and therefore, the first of the 1592 Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
requirements for a possible solution defined by Manne17 is not met. As predicted by theory (ibid.), it is not possible to resolve the spectrum of Y or the concentration profiles of X and Z; see Figure 3. The “possible” areas defining values for the spectrum of Y and the concentration profile of X are not sufficiently constrained by the criteria of nonnegativity to permit unique solutions. It is interesting to note that new developments based on three-way data allow detailed analyses that are not restricted by the same requirements as for bilinear data21 Additional constraints in the form of known spectra may be used to remove the ambiguities, and this can result in a unique solution. Theory suggests that using knowledge of the spectrum of Y will permit complete resolution of X, Y, and Z, which is found to be the case (Figure 3). The resolution is essentially perfect with A ˆ and B ˆ indistinguishable from A and B used to generate the measurement. Incorporation of known absorption spectra for X or Z does not result in a unique solution. It might be argued that knowing the spectrum of the intermediate is not the most likely scenario in a consecutive reaction, and knowing the spectra of either X or Z is much more likely. This argument is certainly true in this example, but this does not invalidate the general idea. (21) Bijlsma, S.; Louverse, D. J.; Smilde, A. K. J. Chemom. 1999, 13, 311-329.
Figure 4. (a) Normalized concentration profiles for a three-component chromatogram with nonconstant baseline. (b) Corresponding normalized spectra.
Real Chromatograms. First we have to stress again that chromatographic resolution was performed on a cation exchange guard column only. This has several distinctly negative consequences: (a) The column is heavily “overloaded”. The retention times of the cationic coordination compounds are predominantly determined by the ionic strength of the eluent. Under the present overloaded conditions, the ionic strength is strongly influenced by the volume of injected analyte solution. To complicate matters further, the ionic strength of the analyte solution is defined by the analytes themselves, and as a result, the retention times and peak widths both are strongly dependent on injection volumes. As an additional consequence, the concentration profiles are highly asymmetric (skewed). (b) Additionally, because of significant changes in the ionic strength of the eluting solution, the refractive index changes concomitantly and this results in strong apparent shifts in the“baseline”. This is the case in the chromatographic time dimension as well as in the wavelength dimension. These experiments were performed under nonrealistic chromatographic conditions in order to generate data of maximal overlap and the additional complications mentioned above. Many potential difficulties in chromatography are exaggerated, whereas the noise structure is realistic. Such extreme difficulties are not easily achieved on modern columns, and of course, we are not suggesting analytical chemists should measure under such conditions. Baseline shifts during the chromatograms are difficult to deal with, and under such conditions, the classical model-free approaches are not applicable. For example, the analysis with EFA failed because of the presence of the baseline. Injecting various amounts of deionized water revealed that the “baseline” in the chromatographic direction can be approximated by a straight line. Thus, a straight line can be used as a “fixed concentration profile”. The measured chromatograms are now evaluated in terms of a four-component mixture with three chemical components, the fourth “component” being the baseline for which we know the “concentration profile”. The number n of parameters in the transformation matrix T is thus 9. The results of the analysis based on these restrictions are shown in Figure 4. Figure 4a displays the concentration profiles, including that of the baseline. As expected, the proper concentration profiles are asymmetric and the profiles for the first two components overlap
substantially. The first concentration profile is also slightly confounded by the third profile. More interesting is the “baseline profile” (btcalc, eq 10): Its size is highly exaggerated as it has been normalized to a maximum of 1; visual inspection of the data shows that the baseline signal is less than 1.5% of the largest peak over the times measured. It also shows characteristic fluctuations with an interval of approximately 3.5 s; these fluctuations may straightforwardly be assigned to the high-pressure pump. The absorption spectra in Figure 4b are correct (normalized); the vector angles between the calculated spectra of the three compounds and spectra measured for the pure components are 0.097, 0.054, and 0.044 rad. The“baseline spectrum” mainly reflects the emission spectrum of the light source, confounded to some extent by the absorption spectra of the complexes, which all peak around 230-240 nm (obviously we did not apply the nonnegativity constraint to the baseline spectrum). Again, the magnitude of the baseline spectrum is highly exaggerated as it has been normalized. The number of components nc was chosen on an ad hoc basis, by using the number that gave meaningful results. More formal methods of determining nc have been documented by Malinowski,11 but in this example, the algorithm does not converge properly for nc * 4, not even for the correct number of true chemical components (nc ) 3). CONCLUSIONS The model-free analysis of bilinear data can be performed as a nonlinear least-squares fit where the parameters are the elements of the transformation matrix T which relates the abstract eigenvectors U and V to the chemically meaningful factors A and B (e.g., concentration profiles and absorption spectra). Without compromising the model-free aspect of the analysis, the resolution is strongly enhanced with respect to traditional model-free analyses due to the much smaller number of variable parameters. Experiments with computer-generated and real data indicate that in those cases where there is a unique solution (as defined by Manne17) convergence is reliable and is achieved often from random starting values. RFA allows the introduction of any conceivable restrictions such as monomodality of concentration profiles or known columns in A or rows in B (e.g., absorption spectra). Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
1593
The range of potential applications of RFA is large; we have used it successfully with simulated kinetic and simulated and real chromatographic and titration data. RFA has sucessfully resolved real chromatographic data with a nonzero baseline in both the time and the spectral dimensions; this is a very difficult class of problem that is not easily resolvable by other factor analysis based model-free techniques. Provided there exist constraints that define
1594
Analytical Chemistry, Vol. 73, No. 7, April 1, 2001
a unique solution, the method can in principle be used with any bilinear data set in any field.
Received for review October 4, 1999. Accepted January 12, 2001. AC991141Q