Anal. Chem. 1984, 56, 991-995
Figure 3. Elution curve of an alkylphenol with an average 24 hydrocarbon chain on Sephadex LH-20: (1) dialkylphenol, (2) bis(hydroxyphenyl)alkane, (3) monoalkylphenol, (4) paraffinic hydrocarbon.
CONCLUSIONS
The concentration of the compound type in industrial grade alkylphenols with CI5-C3 isoalkane chains can be determined by gas chromatography, elution liquid chromatography, and gel permeation chromatography. The main groups of components separated by the three methods are paraffinic hydrocarbons, monoalkylphenols, dialkylphenols, and bis(hydroxyphenyl)alkanes. The accuracies of the three methods are almost equivalent. Gel permeation chromatography gives results more quickly than gas chromatography and elution liquid chromatography, but the gas chromatography method is slightly more precise.
ACKNOWLEDGMENT
The authors wish to express their gratitude to K. Béldi and E. Kerényi for their attention to detail in performing the laboratory analysis.
RECEIVED for review September 23, 1983. Accepted January 10, 1984.
Multivariate Curve Resolution in Liquid Chromatography
David W. Osten and Bruce R. Kowalski*
Laboratory for Chemometrics, Department of Chemistry BG-10, University of Washington, Seattle, Washington 98195
Self-modeling curve resolution has been shown to allow resolution of two coeluting chromatographic peaks without requiring any assumption of an underlying peak shape. The subsequent problem of quantitation of these coeluting peaks is limited both by the chromatographic resolution (separation in time and difference in elution profile) and by the degree of spectral uniqueness. An experimental system of two water-soluble vitamins has been used to examine the effects of varying chromatographic resolution on the quantitative accuracy of the curve resolution method.
The importance of high-performance liquid chromatography in the analysis of complex chemical mixtures has been recognized for some time. The chromatographer is often faced with the problem of unresolved or only partially resolved chromatographic peaks eluting from the analytical column. The accepted response to inadequate resolution has generally been to try to improve the chromatographic conditions, often with complex and expensive gradient elution methods; however, this will not always provide a solution to the general chromatographic problem of coeluting peaks. Davis and Giddings (1) have shown that the likelihood of overlapping chromatographic components is much higher than previously
realized. The availability of multiwavelength absorbance data from the new generation of linear diode array UV/visible liquid chromatography detectors presents a challenge to analytical chemists to develop methods for analyzing the large volume of data generated. This additional information can be used to determine the number of components in a single chromatographic peak and to accomplish both qualitative and quantitative resolution of the underlying components.
Single wavelength absorbance detectors provide a record of the total amount of material eluting from the chromatographic column as a function of time. Unfortunately, these detectors force the chromatographer to rely on peak shape criteria to determine if the eluting peak is a single component or a mixture. Poor peak shape or the presence of a valley point can suggest the presence of at least two components, but with only the absorbance at a single wavelength, no resolution of the mixture is possible. Dual wavelength detectors provide an additional means of detecting the presence of a mixture eluting from the chromatographic column. The ratio of the absorbance at two different wavelengths can be used both to evaluate peak purity and to confirm peak identity (2). A multiwavelength absorbance detector, such as the linear diode array detectors now available from several manufacturers, does provide the chromatographer with the raw data necessary to determine the number and identity of the eluting chromatographic components.
0003-2700/84/0356-0991$01.50/0 © 1984 American Chemical Society
ANALYTICAL CHEMISTRY, VOL. 56, NO. 6, MAY 1984
Poile and Coulon (3) have described an extension of the absorbance ratioing method, which they term the absorbance index technique. This method uses the absorbance at multiple wavelengths (Poile and Coulon recommend nine or more) to evaluate peak purity or confirm peak identity. Knorr, Thorsheim, and Harris (4) have used multichannel detectors to obtain numerical resolution of overlapping chromatographic peaks. Their method assumes the individual component chromatograms can be modeled as a Gaussian function convoluted with a single sided exponential to account for tailing. Recently, McCue and Malinowski (5) have proposed using rank annihilation factor analysis to resolve coeluting chromatographic peaks. This method allows one to quantify a partially resolved chromatographic component provided it has a somewhat unique absorbance spectrum and chromatographic profile. It does not require knowledge of the number or identity of the other interferents, but it does require a standard chromatographic run of the desired analyte under identical chromatographic conditions. The standard sample must chromatograph with the same elution profile as the analyte in the mixture sample.
Self-modeling curve resolution, based on the method described by Lawton and Sylvester (6), provides a method for resolving two unresolved chromatographic peaks without requiring knowledge of the identity of either component. The measured absorbance data at several wavelengths are arranged into a data matrix where each row represents an absorbance spectrum measured at a given point on the chromatographic peak and each column represents a chromatogram at a single wavelength. Factor analysis of this data matrix allows determination of the total number of components, n, which are present. The original absorbance data matrix can then be rotated onto n new orthogonal eigenvectors.
If two significant eigenvectors are found, deconvolution and quantitation can be accomplished as described for GC/MS data by Sharaf and Kowalski (7, 8). Spjøtvoll, Martens, and Volden (9) recently described a restricted least-squares solution of the multivariate two-component mixture problem that models the mixture as a combination of the mean of the data matrix and only one eigenvector. Regardless of the specific notation used, curve resolution uses the information provided by the absorbance at all available wavelengths to resolve the underlying components and provide quantitative results to the limit of the instrumental and spectral resolution. In this paper, quantitation performed by curve resolution is compared to the results obtained by the more familiar perpendicular dropline method for the analysis of two overlapping liquid chromatographic peaks. Several reasons are discussed to account for the differences observed.
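The dual-wavelength ratio check mentioned in the introduction can be illustrated with a small sketch. It is not taken from the paper: the function names, the 2% tolerance, the baseline cutoff, and the synthetic elution profiles are all assumptions chosen for illustration. For a pure peak the ratio of the absorbances at two wavelengths stays constant across the peak, whereas a coeluting component with a different spectrum makes the ratio drift.

```python
def absorbance_ratios(abs_1, abs_2, min_signal=1e-3):
    """Ratio of absorbances at two wavelengths, point by point across a peak.

    Points where the second signal is near the baseline are skipped,
    because the ratio is dominated by noise there.
    """
    return [a / b for a, b in zip(abs_1, abs_2) if b > min_signal]


def looks_pure(abs_1, abs_2, tolerance=0.02):
    """True if every ratio lies within +/- tolerance (relative) of the mean ratio."""
    ratios = absorbance_ratios(abs_1, abs_2)
    mean_ratio = sum(ratios) / len(ratios)
    return all(abs(r - mean_ratio) <= tolerance * mean_ratio for r in ratios)


# Synthetic elution profiles (arbitrary units), invented for illustration.
profile_a = [0.1, 0.5, 1.0, 0.5, 0.1]   # component A
profile_b = [0.0, 0.1, 0.5, 1.0, 0.5]   # component B, eluting slightly later

# Pure peak: component A alone, absorbing twice as strongly at wavelength 1.
pure_w1 = [2.0 * a for a in profile_a]
pure_w2 = [1.0 * a for a in profile_a]

# Mixture: component B absorbs more strongly at wavelength 2.
mix_w1 = [2.0 * a + 0.5 * b for a, b in zip(profile_a, profile_b)]
mix_w2 = [1.0 * a + 1.0 * b for a, b in zip(profile_a, profile_b)]

print(looks_pure(pure_w1, pure_w2))  # constant ratio across the peak
print(looks_pure(mix_w1, mix_w2))    # ratio drifts across the peak
```

A check of this kind can flag a mixture but, as the text notes, it cannot by itself resolve the components.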
EXPERIMENTAL SECTION
Data were collected on a Waters Associates Model ALC/GPC-244 liquid chromatograph equipped with a Waters Model 660 solvent programmer. A Hewlett-Packard Model 1040A high speed spectrophotometric detector was used to monitor the absorbance of the material eluting from the analytical column at six wavelengths: 210, 225, 254, 275, 295, and 315 nm.
Materials. Niacinamide and riboflavin were obtained from the Sigma Chemical Co., St. Louis, MO. All HPLC solvents were Burdick & Jackson “distilled in glass” grade.
Procedures. Standards and mixture samples were analyzed on a Varian NH2-10 chromatographic column (30 × 0.4 cm). In order to obtain varying resolution, the solvent was varied from 20% H2O in CH3CN to 58% H2O in CH3CN. Absorbance data were collected by using the default conditions for the 1040A: TIMEBASE = 160 ms, PEAKWIDTH = 0.1 min, and MEM BUNCHING = 1 (enabled). This means the six chromatographic signals, one corresponding to each wavelength, are available from the detector every 160 ms. An integral power of two sequential data points is then averaged, or bunched, such that at least eight data points are stored within the time interval specified by the PEAKWIDTH parameter. These default conditions result in chromatographic data points being stored every 640 ms. At the conclusion of the experimental runs, the stored data were analyzed by the standard 1040A operating software. A portion of the data analysis was performed with the MCR-2 curve resolution package for the 1040A detector available from Infometrix, Inc., Seattle, WA.
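As a rough check on the arithmetic above, the following sketch reproduces the 640 ms storage interval from the stated detector settings. The function name, and the rule of taking the largest power of two that still leaves at least eight stored points per PEAKWIDTH interval, are assumptions made for illustration; the actual 1040A firmware logic is not described in the text.

```python
def bunching_factor(timebase_ms, peakwidth_min, min_points=8):
    """Largest power of two n such that bunching n raw points still stores
    at least min_points points within one PEAKWIDTH interval (assumed rule)."""
    peakwidth_ms = peakwidth_min * 60_000
    n = 1
    # Keep doubling while the doubled bunch size still satisfies the constraint.
    while 2 * n * timebase_ms * min_points <= peakwidth_ms:
        n *= 2
    return n


n_bunched = bunching_factor(timebase_ms=160, peakwidth_min=0.1)
storage_interval_ms = n_bunched * 160
print(n_bunched, storage_interval_ms)  # 4 640
```

With four raw points averaged per stored point, a 0.1 min (6000 ms) window holds about nine stored points; bunching eight raw points would leave fewer than five.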
THEORY
The development of a curve resolution algorithm for use with the HP 1040A spectrophotometric detector employed a different mathematical formulation than Lawton and Sylvester’s original work (6). For the case of two coeluting components, each mixture spectrum in Lawton and Sylvester’s notation is expressed as a linear combination of the two largest eigenvectors of the correlation matrix. In order to avoid a matrix diagonalization to obtain the first two eigenvectors, the data matrix is first centered about the mean and then factor analyzed. This reduces the number of eigenvectors necessary to describe a two-component mixture to one, allowing a faster executing iterative algorithm to be used to extract only the first eigenvector instead of performing a complete matrix diagonalization. Martens (10) has described the advantages of this transformation for the factor analysis of chemical mixtures.
The chromatographic data are first formed into a data matrix, X, composed of NP rows and NW columns, where NP is the number of spectral points recorded across the chromatographic peak and NW is the number of wavelengths stored. The data are normalized so that the sum of the absorbances in each spectrum is equal to one, and then mean centered.
$$R_i = \sum_{j=1}^{NW} X_{ij} \quad \text{for all } i = 1, \ldots, NP \tag{1}$$
$$X'_{ij} = X_{ij} / R_i \tag{2}$$
$$\bar{X}_j = \frac{1}{NP} \left[ \sum_{i=1}^{NP} X'_{ij} \right] \quad \text{for all } j = 1, \ldots, NW \tag{3}$$
$$Y_{ij} = X'_{ij} - \bar{X}_j \tag{4}$$
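The normalization and mean-centering step can be sketched in plain Python as follows. This is a minimal illustration, assuming the data matrix is held as a list of NP spectra with NW absorbances each; the function and variable names are invented here and are not those of the authors' software.

```python
def normalize_and_center(X):
    """Normalize each spectrum to unit total absorbance, then mean center.

    X is a list of NP spectra (rows); each spectrum is a list of NW absorbances.
    Returns (R, X_norm, mean, Y).
    """
    n_points = len(X)
    n_wavelengths = len(X[0])
    # Total absorbance (row sum) of each spectrum.
    R = [sum(row) for row in X]
    # Normalized spectra: every row now sums to one.
    X_norm = [[x / r for x in row] for row, r in zip(X, R)]
    # Mean normalized spectrum, one value per wavelength.
    mean = [sum(row[j] for row in X_norm) / n_points
            for j in range(n_wavelengths)]
    # Mean-centered, normalized data matrix.
    Y = [[row[j] - mean[j] for j in range(n_wavelengths)]
         for row in X_norm]
    return R, X_norm, mean, Y


# Two spectra at two wavelengths (values invented for illustration).
X = [[2.0, 2.0],
     [1.0, 3.0]]
R, X_norm, mean, Y = normalize_and_center(X)
print(R)       # [4.0, 4.0]
print(X_norm)  # [[0.5, 0.5], [0.25, 0.75]]
print(mean)    # [0.375, 0.625]
print(Y)       # [[0.125, -0.125], [-0.125, 0.125]]
```

Because every normalized spectrum sums to one, each row of Y sums to zero, which is why a two-component mixture needs only one eigenvector after centering.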
If the chromatographic peak consists of only one pure component, then the normalized spectrum of each data point recorded across the peak will be identical within the limits of the detector noise and the mean vector will be sufficient to explain the variance of the data matrix. The fraction of the total variance described by the mean is tested against a preset noise threshold, usually 1%, to determine if only one component is present. If the percentage of the variance due to the mean is greater than (100 - noise threshold), it is concluded that only one component is present.
Assuming two components are present, the above test will fail and the analysis continues. The theory of factor analysis indicates that n components can be described by a linear combination of the mean and the first n - 1 eigenvectors of the normalized, centered data matrix, Y. The first eigenvector, L, of the correlation matrix $Y^{T}Y$ is extracted
$$[Y^{T}Y] \, L^{T} = \lambda L^{T} \tag{5}$$
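The single-component test and the extraction of the first eigenvector might look like the sketch below. Two assumptions are made loudly here: the "variance" is taken as the sum of squared normalized absorbances (the text does not give the exact definition), and power iteration stands in for the unnamed "iterative algorithm". Function names are illustrative only.

```python
import math


def fraction_variance_by_mean(X_norm, Y):
    """Fraction of the total sum of squares explained by the mean spectrum
    (assumed definition of the variance test)."""
    total = sum(x * x for row in X_norm for x in row)
    residual = sum(y * y for row in Y for y in row)
    return 1.0 - residual / total


def first_eigenvector(Y, iterations=100):
    """Power iteration for the dominant eigenvector of Y^T Y (unit length)."""
    # Start from the largest row of Y, which cannot be orthogonal to the row space.
    start = max(Y, key=lambda row: sum(y * y for y in row))
    norm = math.sqrt(sum(y * y for y in start))
    if norm == 0.0:
        raise ValueError("Y is all zeros: the mean alone explains the data")
    L = [y / norm for y in start]
    for _ in range(iterations):
        scores = [sum(y * l for y, l in zip(row, L)) for row in Y]   # Y L^T
        w = [sum(s * row[j] for s, row in zip(scores, Y))
             for j in range(len(L))]                                  # Y^T (Y L^T)
        norm = math.sqrt(sum(x * x for x in w))
        L = [x / norm for x in w]
    return L


# Normalized and centered data for two spectra at two wavelengths (invented).
X_norm = [[0.50, 0.50], [0.25, 0.75]]
Y = [[0.125, -0.125], [-0.125, 0.125]]

fraction = fraction_variance_by_mean(X_norm, Y)
one_component = fraction * 100.0 > 100.0 - 1.0   # 1% noise threshold
L = first_eigenvector(Y)
print(round(fraction, 4), one_component)
print([round(l, 4) for l in L])
```

In this toy case the mean explains about 94% of the sum of squares, so the single-component hypothesis is rejected and the eigenvector is extracted.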
The normalized, mean centered data are projected onto the first eigenvector, L, yielding a score for each spectrum point
$$S_i = Y_i L^{T} \tag{6}$$
Each normalized mixture point can now be described as a combination of the mean and the first eigenvector of the centered data matrix, Y
$$X'_i = \bar{X} + S_i L + \epsilon_i \tag{7}$$
The variance of the residual, $\epsilon_i$, is tested against the noise threshold to ensure the two-component model is an adequate representation of the original data. If only two components are present, resolution of the mixture spectra can now be accomplished. Since all normalized mixtures of the two underlying components are combinations of $\bar{X}$ and L, the normalized pure components must also be combinations of $\bar{X}$ and L. This condition yields the following equations:
$$(PS)_1 = \bar{X} + tL \tag{8}$$
$$(PS)_2 = \bar{X} + t'L \tag{9}$$
where $(PS)_1$ is the pure spectrum of the first component and $(PS)_2$ is the pure spectrum of the second component. Two constraints are also imposed which limit the possible ranges of t and t'. First, absorbances may only be positive or zero, which implies
$$\bar{X}_j + tL_j \geq 0 \quad \text{for all } j \tag{10}$$
$$\bar{X}_j + t'L_j \geq 0 \quad \text{for all } j \tag{11}$$
Table I. Retention Time of Pure Vitamin Standards (in minutes)
% H2O in CH3CN    niacinamide    riboflavin
20                3.65           6.11
25                3.46           4.13
30                3.29           4.07
35                3.16           3.64
40                3.08           3.38
45                3.02           3.21
50                2.95           3.09
55                2.94           3.04
The second constraint is that all mixture spectra must be positive combinations of the two pure spectra. This yields the relationship
$$X' = \alpha (PS)_1 + \beta (PS)_2 \tag{12}$$
where $\alpha \geq 0$ and $\beta \geq 0$. Equations 7-12 above are analogous to equations 2-7 in Sharaf and Kowalski (7) for the discussion of the constraint conditions for the two-eigenvector notation. Combining these two constraints yields two possible regions for the location of the underlying pure component spectra
$$\max_k S_k \leq t \leq -\max_{L_j < 0} \left( \bar{X}_j / L_j \right) \tag{13}$$
$$-\min_{L_j > 0} \left( \bar{X}_j / L_j \right) \leq t' \leq \min_k S_k \tag{14}$$
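The two feasible regions can be computed directly from the mean spectrum, the first eigenvector, and the scores. The sketch below is an illustration under stated assumptions: the function names are invented, the numbers are a toy two-wavelength case, and the scores and outer limits follow the nonnegativity and positive-combination constraints as described in the text.

```python
import math


def scores_on_eigenvector(Y, L):
    """Score of each centered spectrum on the first eigenvector."""
    return [sum(y * l for y, l in zip(row, L)) for row in Y]


def feasible_ranges(mean, L, scores):
    """Return ((t_low, t_high), (t_prime_low, t_prime_high)).

    Inner bounds come from the scores (mixtures are positive combinations
    of the pure spectra); outer bounds come from nonnegative absorbances.
    """
    t_low = max(scores)
    t_high = min(-m / l for m, l in zip(mean, L) if l < 0)
    t_prime_low = max(-m / l for m, l in zip(mean, L) if l > 0)
    t_prime_high = min(scores)
    return (t_low, t_high), (t_prime_low, t_prime_high)


# Toy data: mean spectrum, unit eigenvector, centered spectra (invented).
mean = [0.375, 0.625]
L = [math.sqrt(0.5), -math.sqrt(0.5)]
Y = [[0.125, -0.125], [-0.125, 0.125]]

scores = scores_on_eigenvector(Y, L)
t_range, t_prime_range = feasible_ranges(mean, L, scores)

# Inner bounds: take the purest recorded spectra as the pure-component estimates.
t, t_prime = t_range[0], t_prime_range[1]
pure_1 = [m + t * l for m, l in zip(mean, L)]
pure_2 = [m + t_prime * l for m, l in zip(mean, L)]
print([round(v, 4) for v in t_range], [round(v, 4) for v in t_prime_range])
print([round(v, 4) for v in pure_1], [round(v, 4) for v in pure_2])
```

Choosing the inner bounds recovers the two recorded spectra themselves, while the outer bounds correspond to hypothetical pure spectra with zero absorbance at one wavelength.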
In order to obtain a specific solution for the quantitative amounts of the two underlying components, a single point within each range must be selected for the pure spectra. Trials with simulated and real chromatographic data have indicated the inner bounds are the best choice if no other information is available. This corresponds to selecting the purest spectra recorded as the estimates of the pure components.
Sharaf and Kowalski (8) have shown that for any mixture point, m, the ratio of the distance between m and t' to the distance between t and t' is equivalent to the fraction of the normalized response due to component 1. By an analogous argument the ratio of the distance between m and t to the distance between t and t' is equivalent to the fraction of the normalized response of component 2. The chromatographer is most often interested in the absolute absorbances of each component at a given wavelength. The method of Sharaf and Kowalski has been extended to provide this information as follows. The fraction of the absorbance due to component 1 in spectrum i is computed as in ref 8
$$F(1)_i = \frac{S_i - t'}{t - t'} \tag{15}$$
The total absorbance due to component 1 in spectrum i is determined by multiplying the fractional amount, F(1), by the original area of the spectrum. This is then corrected to the absorbance of component 1 at wavelength k in spectrum i by multiplying by the fraction of the total absorbance of component 1 which occurs at wavelength k. Since all mixture spectra were normalized to a total absorbance of one, the fraction of component 1 at wavelength k is simply given by
$$F(1)_{\text{at } k} = \bar{X}_k + tL_k \tag{16}$$
Therefore, the absorbance of component 1 at wavelength k in spectrum i is given by
$$A(1)_{ik} = F(1)_i \, R_i \, (\bar{X}_k + tL_k) \tag{17}$$
The analogous expression for component 2 is
$$A(2)_{ik} = F(2)_i \, R_i \, (\bar{X}_k + t'L_k) \tag{18}$$
where $F(2)_i = (t - S_i)/(t - t')$.
Figure 1. Pure spectra of niacinamide (solid line, 685 mAU full scale) and riboflavin (dotted line, 132 mAU full scale). Abscissa: wavelength (nm).
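The quantitation step described above can be sketched for a single spectrum and a single wavelength. The function name and the numerical values are assumptions for illustration only. The fractions follow the distance ratios stated in the text, the spectrum area is the original sum of absorbances for that spectrum, and the wavelength fraction is the corresponding element of the estimated pure spectrum.

```python
import math


def resolve_point(area_i, score_i, t, t_prime, mean_k, L_k):
    """Absorbances of components 1 and 2 at wavelength k in spectrum i."""
    span = t - t_prime
    fraction_1 = (score_i - t_prime) / span   # distance from t' over distance t to t'
    fraction_2 = (t - score_i) / span         # distance from t over distance t to t'
    absorbance_1 = fraction_1 * area_i * (mean_k + t * L_k)
    absorbance_2 = fraction_2 * area_i * (mean_k + t_prime * L_k)
    return absorbance_1, absorbance_2


# Toy values: a mixture spectrum lying exactly at the mean (score 0),
# with total absorbance 4.0 summed over the wavelengths (invented numbers).
root_half = math.sqrt(0.5)
a1, a2 = resolve_point(area_i=4.0, score_i=0.0,
                       t=0.25 * root_half, t_prime=-0.25 * root_half,
                       mean_k=0.375, L_k=root_half)
print(round(a1, 6), round(a2, 6), round(a1 + a2, 6))
```

When the residual of the two-component model is zero, the two resolved absorbances add up to the measured absorbance at that wavelength, here 4.0 × 0.375 = 1.5.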
RESULTS AND DISCUSSION
Table I lists the observed retention times of niacinamide and riboflavin with the chromatographic conditions described earlier. It was possible to selectively control the resolution of these two compounds from base line separation to complete overlap by varying the percentage of water in the mobile phase from 20 to 58%. The pure spectra of these two compounds, shown in Figure 1, indicate that they both have two absorption maxima below 300 nm. There is about a 10-nm shift in the position of the lower wavelength band, 215 nm for niacinamide and approximately 225 nm for riboflavin. There is also a definite difference in the relative strength of the two absorption bands. In the case of niacinamide, the maximum at 215 nm is about three times as intense as the band near 260 nm. For riboflavin, the two major absorption bands are of about equal intensity. With these two particular compounds it would be possible to select a wavelength above 320 nm which would be specific for riboflavin. As in a normal chromatographic analysis, specific absorption signals are desirable when performing curve resolution. Since the objective of this study was to evaluate the performance of curve resolution in a general chromatographic analysis, the longest wavelength stored was set at 315 nm. The monitor or pilot wavelength used for peak detection in the HP 1040A detector was arbitrarily set at 225 nm. At this wavelength 65% of the mixture absorbance is due to the niacinamide and 35% is due to the riboflavin.
Figure 2 illustrates the resolution obtained with four different solvent mixtures. The actual resolution obtained in these four runs ranged from slightly greater than 1 to near 0.4. The first two chromatograms, recorded at 38 and 43%
Figure 2. Chromatographic separation of niacinamide and riboflavin analytical mixtures at 225 nm with CH3CN/H2O on a Varian NH2-10 column: (A) 38% H2O in CH3CN, (B) 43% H2O in CH3CN, (C) 48% H2O in CH3CN, (D) 58% H2O in CH3CN. In all cases niacinamide elutes before riboflavin. Abscissa: time (min).
Table II. Quantitation at 225 nm (a)
% H2O: 23, 25, 30, 33, 38, 43, 48, 53, 58
integrator, % Nia: 65.0, 64.6, 65.8, 65.8, 64.6, 62.7
integrator, % Rib: 35.0, 35.4, 34.2, 34.2, 35.4, 37.3
curve resolution, % Nia: 66.6, 66.0, 65.3, 72.3, 63.3, 59.5
curve resolution, % Rib: 33.4, 34.0, 34.7, 27.7, 36.7, 40.5
(a) Actual concentrations: niacinamide, 65%; riboflavin, 35%.
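The tabulated niacinamide percentages can be compared with the actual value of 65% by simple subtraction. The sketch below uses only the numbers listed in the table; the helper names are invented, and reading "within ±10%" as percentage points of composition (rather than relative error) is an assumption.

```python
ACTUAL_NIACINAMIDE = 65.0  # percent, from the table footnote

# Values as listed in the table, in the order given there.
integrator_nia = [65.0, 64.6, 65.8, 65.8, 64.6, 62.7]
curve_resolution_nia = [66.6, 66.0, 65.3, 72.3, 63.3, 59.5]


def deviations(values, actual):
    """Signed deviation of each result from the actual value, in percentage points."""
    return [round(v - actual, 1) for v in values]


def worst_case(values, actual):
    """Largest absolute deviation, in percentage points."""
    return max(abs(d) for d in deviations(values, actual))


print(deviations(integrator_nia, ACTUAL_NIACINAMIDE))
print(deviations(curve_resolution_nia, ACTUAL_NIACINAMIDE))
print(worst_case(curve_resolution_nia, ACTUAL_NIACINAMIDE) <= 10.0)
```

Under that reading, the largest curve resolution deviation is 7.3 percentage points and the largest integrator deviation is 2.3 percentage points.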
water, respectively, both possess a valley point indicating the presence of at least two components. Quantitation of these two chromatograms by dropping a perpendicular from the valley point yields 64.6 and 62.7% for niacinamide. When the water concentration is increased to 48%, the resolution drops to a value of 0.6. At this degree of resolution no valley point is obtained, making it impossible to quantitate the components by the perpendicular dropline method. It is still possible to identify the presence of two components in the peak due to the shoulder present on the trailing edge. A further decrease in the resolution to 0.4, caused by an increase in the water concentration to 58%, makes it impossible to even identify the presence of the second component.
Figure 3 illustrates the results obtained when curve resolution is used to resolve these four chromatograms making use of all the available absorbance information. Even in the worst case of resolution of less than 0.5, curve resolution is able to correctly identify the number of components in the mixture. The quantitative results, summarized in Table II, indicate the performance of curve resolution compared to the perpendicular dropline method. For the chromatograms that possessed a definite valley point, the results obtained by both methods provide a reasonable estimate of the areas of the individual chromatographic bands. Curve resolution is most advantageous when the resolution is even poorer. Curve resolution performed on chromatograms without a valley point still yields quantitative information within ±10% of the correct value. Even in a situation of virtually complete overlap, curve resolution is able to provide the number of components and an estimate of the amounts of the analytes present.
Figure 3. Resolved chromatographic absorbance at 225 nm for niacinamide-riboflavin mixtures: (A) 38% H2O in CH3CN, (B) 43% H2O in CH3CN, (C) 48% H2O in CH3CN, (D) 58% H2O in CH3CN. Abscissa: time (min).
Anal. Chem. 1984, 56, 995-1003
The reason for the improvement in resolution made possible by curve resolution is 2-fold. First, all of the available absorbance information is used in selecting the best estimates of the spectra of the underlying components. Second, the multivariate absorbance data provided by a diode array detector provide a signal averaging and signal to noise improvement. Since curve resolution performs the quantitative resolution using all the absorbance information, the problem of varying apparent resolution which affects the perpendicular dropline method is avoided. The accuracy of the dropline method is greatest when the two chromatographic peaks are of equal size and it progressively decreases as the relative difference in the individual bands increases. Curve resolution performs the quantitation in the eigenvector space using all the wavelengths recorded and then displays the results at the selected wavelength; therefore, it is less susceptible to errors caused by changing the wavelength at which the quantitation is displayed. The limit to the accuracy obtainable using curve resolution is dependent on the degree of chromatographic resolution and the degree of spectral uniqueness.
Curve resolution makes use of two assumptions: first, only nonnegative quantities of material are present, and, second, only nonnegative absorbances are permitted. In order to assure these assumptions are valid, the base line absorbance immediately before and after the coeluting band must be tested to assure that the base line is not negative at any of the stored wavelengths. Experience with the HP 1040A detector has indicated that drift in the detector base line can result in negative absorbance being recorded. Proper base line subtraction to remove this detector offset is the most significant practical limitation in determining the accuracy of the quantitative results obtained. The problem of base line correction is not limited to curve resolution. It is also a factor in determining the accuracy when quantitation is accomplished by using any method to separate partially resolved peaks.
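For comparison, the perpendicular dropline method discussed above can be sketched as follows. This is a deliberately naive illustration, not the integrator's actual algorithm: the function name is invented, peak maxima are found by a simple neighbor comparison that would need smoothing on noisy data, areas use the trapezoid rule, and the chromatogram is synthetic.

```python
def dropline_areas(times, signal):
    """Split a two-peak chromatogram at its valley point and integrate each side.

    Raises ValueError when fewer than two maxima are found, which mirrors the
    case where no valley point exists and the method cannot be applied.
    """
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i - 1] < signal[i] >= signal[i + 1]]
    if len(peaks) < 2:
        raise ValueError("no valley point found; the dropline method does not apply")
    # Valley: lowest point between the first and last maxima.
    valley = min(range(peaks[0], peaks[-1] + 1), key=lambda i: signal[i])

    def area(lo, hi):
        # Trapezoid rule over the index range [lo, hi].
        return sum(0.5 * (signal[i] + signal[i + 1]) * (times[i + 1] - times[i])
                   for i in range(lo, hi))

    return area(0, valley), area(valley, len(signal) - 1)


# Synthetic chromatogram with a valley at t = 3 (values invented).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
signal = [0.0, 2.0, 4.0, 1.0, 3.0, 1.0, 0.0]
first, second = dropline_areas(times, signal)
print(first, second)  # 6.5 4.5
```

The sketch makes the limitation in the text concrete: once the second component appears only as a shoulder, no second maximum exists and the function raises instead of returning areas.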
ACKNOWLEDGMENT
The authors wish to express their appreciation to Steve George of Hewlett-Packard for loan of a 1040A detector.
Registry No. Nia, 98-92-0; Rib, 83-88-5.
LITERATURE CITED
(1) Davis, J. M.; Giddings, J. C. Anal. Chem. 1983, 55, 418.
(2) Yost, R.; Stoveken, J.; MacLean, W. J. Chromatogr. 1977, 134, 73.
(3) Poile, A. F.; Coulon, R. D. J. Chromatogr. 1981, 204, 149.
(4) Knorr, F. J.; Thorsheim, H. R.; Harris, J. M. Anal. Chem. 1981, 53, 821.
(5) McCue, M.; Malinowski, E. R. J. Chromatogr. Sci. 1983, 21, 229.
(6) Lawton, W. H.; Sylvester, E. A. Technometrics 1971, 13, 617.
(7) Sharaf, M. A.; Kowalski, B. R. Anal. Chem. 1981, 53, 518.
(8) Sharaf, M. A.; Kowalski, B. R. Anal. Chem. 1982, 54, 1291.
(9) Spjøtvoll, E.; Martens, H.; Volden, R. Technometrics 1982, 24, 173.
(10) Martens, H. Anal. Chim. Acta 1979, 112, 423.
Received for review September 15, 1983. Accepted January 24, 1984. The work was supported in part by the Department of Energy. D.W.O. was a Chevron Research Fellow during the time this work was performed.
Statistical Approach for Estimating the Total Number of Components in Complex Mixtures from Nontotally Resolved Chromatograms
David P. Herman, Marie-France Gonnord, and Georges Guiochon*
Laboratoire de Chimie Analytique Physique, Ecole Polytechnique, 91128 Palaiseau Cedex, France
A recently proposed statistical model of peak overlap in complex mixtures is tested by computer simulation. The theory, originally conceived by Davis and Giddings, assumes that solute retention in complex mixtures is random and can thus be described by Poisson statistics. Consistent with this model, the present computer simulation results indicate that by defining the number of observed peaks, p, in a chromatogram as being the number of occurrences of peak maxima, over a limited range of peak capacities, Nc, the number of observed peaks is related in a simple manner to the column's peak capacity through the relationship ln ( p 1) = ln ( m 1) m/(Nc - 1), where m is the actual true number of components (resolved and unresolved). Significant deviations from the log-linear relationship between peak number and reciprocal peak capacity are however observed when m / ( Nc 1) ratios are greater than 1.0. A method of determining this ratio knowing neither m nor Nc independently is provided. The theory is tested to the extent that real-world complex mixture chromatograms can be made to adhere to its underlying assumptions by using TLC retention data from the literature and from data generated herein for a crude oil sample and one of its distillates by GLC. The theory is also applied to estimate the total number of components present in the crude oil samples and the probability that any one observed peak in their optimum efficiency chromatograms is a quantifiable singlet.
0003-2700/84/0356-0995$01.50/0 © 1984 American Chemical Society
Column performances and technologies have reached the point where high efficiency columns (e.g., half a million theoretical plates in open tubular capillary gas chromatography) have come into common use and are now commercially available. The primary impetus behind their development has been the growing need to analyze real mixtures of increasing complexity. Despite these very high efficiencies, many complex mixtures have been shown to contain many unresolved peaks. Although it is the state of the art of the chromatographer’s skill to adjust phase selection and column characteristics so as to improve the resolution of overlapping components, when a large number of compounds within a single sample have to be analyzed, experience shows that it is virtually impossible to resolve all of them on any one single stationary phase. Hence, our ability to perform qualitative and quantitative analyses of complex mixtures based solely on chromatographic data is severely limited. For example, accurate quantitative analyses based upon peak height or peak area measurements require either that each chromatographic peak correspond to a single identifiable chemical component (i.e., require extremely high column efficiencies) or that the detector response be highly specific to the compounds of interest. The latter solution, although viable, results in an overall loss of chemical information and does not allow an estimate of the total number of components to be made. Real-time spectral generating detectors, on the other hand, dramatically increase the amount of available chemical data