Multivariate resolution of overlapped peaks in liquid chromatography


Anal. Chem. 1986, 58, 299-303

Multivariate Resolution of Overlapped Peaks in Liquid Chromatography Using Diode Array Detection

Walter Lindberg* and Jerker Öhman
Department of Analytical Chemistry, University of Umeå, Umeå, Sweden

Svante Wold
Research Group for Chemometrics, University of Umeå, Umeå, Sweden

A curve resolution program for diode array data based on the partial least-squares (PLS) method has been developed. The method is primarily useful for the resolution of two constituents. It is also applicable to more constituents if spectral regions exist where the constituents only pairwise overlap. This approach is compared with the commonly employed method of principal components analysis (PCA) followed by a rotation of the solution. Two versions of the rotation program have been constructed: one where the rotation parameters are set by the computer and a second where the parameters are selected by the operator in an interactive way. PLS and interactive PCA/rotation perform well and are comparable with respect to resolving power, while the automatic PCA/rotation method is less successful. Owing to the use of prior information from the chromatogram in PLS, the PLS vectors are more similar to real spectra and chromatograms than are the nonrotated PCA vectors. Possible implications of this fact in more complex chromatograms are discussed.

Incomplete resolution of chromatographic peaks and difficulties in assessing peak purity are problems often met in the chromatographic separation of complex samples. Detectors that give three-dimensional chromatograms, such as the diode array detector, open new possibilities for addressing such problems. The extra dimension of spectral information can be used to increase selectivity, in the simplest case by selecting a specific wavelength for each constituent. If the situation is more complex, methods such as multivariate curve resolution can be applied. Constituents can thus be mathematically resolved and quantified even when no specific wavelength is available.

Principal Components Analysis. A method in common use in multivariate curve resolution is principal components analysis (PCA) and the closely related factor analysis. If a linear response between signal and constituents is assumed, these methods can be used to find the number of constituents under a peak, without prior information about what these compounds are. Another prerequisite is that the retention times of the constituents are different. The principal components (PCs) describe the systematic part of the 3-D chromatogram, which under ideal experimental conditions will consist only of the peaks and spectra of the constituents. These are, however, not true spectra and chromatographic profiles; each is a linear combination of all peaks and spectra. This means that the PCs must be transformed (rotated) so that a physically meaningful solution is obtained. The transformation is accomplished by adding chemical information about the system, such as nonnegativity of peaks and spectra and maximum dissimilarity of the constituents. This vector rotation is the crucial step of the method. However, since the criteria are often too weak to give a unique solution, the step is fraught with difficulties. The degree of success depends on the uniqueness of each spectrum, the overlap between peaks, and the noise in the data. In our experience the situation also worsens if the linearity assumption of the response is violated, since there will then be more significant principal components than constituents in the overlapped peak.

Partial Least Squares (PLS). In order to diminish some of the difficulties associated with PCA/rotation, we have investigated the utility of the 2-block PLS method for curve resolution. PLS is a method for describing relationships between blocks of variables. The philosophy of the method is that maximum variance of both blocks should be explained, while still maintaining the maximum correlation between them. In analytical chemistry the PLS method has been used for calibration and prediction purposes in quantitative analysis (1, 2). In curve resolution one block is the unresolved 3-D chromatogram, while the other contains information about each constituent individually. As in ordinary calibration, a relationship between these blocks can be calculated. This model is subsequently used for prediction of the 3-D chromatogram for each constituent. Qualitatively, PLS shows some similarities to target factor rotation, which is based on regressing the principal component scores against a priori information about the system (3). We note that PLS scores are more informative about the actual system than PC scores, provided that the a priori information fed into the PLS analysis is accurate. In this paper we describe and exemplify the use of PLS for curve resolution and compare the results with the PCA/rotation method. The method is mainly applicable to two-constituent mixtures, but it will also be useful in mixtures of more constituents if areas with only two overlapping constituents can be found in the 3-D chromatogram.
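The rotation ambiguity discussed above can be made concrete with a small numerical sketch. It assumes a noise-free, two-constituent bilinear model (3-D chromatogram = elution profiles times spectra); the Gaussian profiles, the matrix `R`, and the helper names `matmul`, `inv2`, and `gaussian` are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inv2(r):
    """Invert a 2 x 2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = r
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def gaussian(n, centre, width):
    """Gaussian-shaped profile sampled at n points (illustrative shape only)."""
    return [math.exp(-0.5 * ((i - centre) / width) ** 2) for i in range(n)]


# Hypothetical overlapped elution profiles (columns of C) and spectra (rows of S).
peak1 = gaussian(30, 13.0, 3.0)
peak2 = gaussian(30, 17.0, 3.0)
C = [[p1, p2] for p1, p2 in zip(peak1, peak2)]           # 30 time points x 2
S = [gaussian(20, 6.0, 4.0), gaussian(20, 12.0, 5.0)]    # 2 x 20 wavelengths
Y = matmul(C, S)                                         # rows = spectra at each time point

# Any invertible 2 x 2 matrix R yields a different factor pair with the same product.
R = [[1.0, 0.4], [-0.3, 1.0]]
C_rot = matmul(C, R)
S_rot = matmul(inv2(R), S)
Y_rot = matmul(C_rot, S_rot)

# Largest element-wise difference between the two reconstructions.
max_diff = max(abs(u - v) for ru, rv in zip(Y, Y_rot) for u, v in zip(ru, rv))

# The rotated factors fit equally well but contain negative values.
has_negative = any(v < 0 for row in C_rot + S_rot for v in row)
```

Both factor pairs reproduce the same data matrix to rounding error, so the data alone cannot choose between them. The rotated pair contains negative entries, which is why a nonnegativity criterion narrows the range of acceptable rotations without necessarily making the solution unique.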

THEORY

In the following presentation the 3-D chromatogram is arranged with the spectra at each time point as rows of the matrix and the chromatograms at each wavelength as columns. Boldface capital letters represent matrices, boldface lower case letters represent vectors, and scalars are denoted by italic lower case letters.

PCA/Rotation. The principal components were calculated by using the NIPALS algorithm. In this method the components are calculated one by one, in order of the variance in the data that they explain. After each component, its significance is tested by cross validation, and the procedure is terminated after the calculation of the last significant component. The NIPALS algorithm and cross validation are presented in detail elsewhere (4, 5). Rotation of the principal components was implemented according to concepts described in the literature (6-8). Nonnegativity of spectra and chromatograms was used primarily for ranking the response. In some cases other described criteria were also added, but these did not, in combination with the nonnegativity criteria, perform well in our data sets. The calculation of the rotation matrix was done iteratively

0003-2700/86/0358-0299$01.50/0 © 1986 American Chemical Society


ANALYTICAL CHEMISTRY, VOL. 58, NO. 2, FEBRUARY 1986

by using the simplex method (9), until the preset requirements were fulfilled. For comparison with the automatic rotation method, we also made an interactive program that allowed the manual setting of the rotation matrix elements. The program was constructed so that a coordinate system is drawn on the screen. To select the rotation parameters, a cursor is moved in the coordinate system, and at any combination of parameters the corresponding peaks and spectra can be shown on the screen. In this way different parameter values can be entered manually and tested until a satisfactory result is obtained. The concepts of the PCA/rotation are as follows:

(1) Decompose the 3-D chromatogram, Y, into the smaller matrices T and B, which describe the systematic part of Y:

$$\mathbf{Y} = \mathbf{T}\mathbf{B} + \mathbf{E}$$

where E is the unexplained part of Y.

(2) Rotate the matrices T and B so that the true chromatograms and spectra are obtained:

$$\mathbf{Y} = \mathbf{T}\mathbf{B}$$

$$\mathbf{Y} = \mathbf{T}\mathbf{I}\mathbf{B} \quad \text{where } \mathbf{I} \text{ is the identity matrix}$$

$$\mathbf{Y} = \mathbf{T}\mathbf{R}\mathbf{R}^{-1}\mathbf{B}$$

$\mathbf{T}\mathbf{R}$ is then equal to the resolved chromatograms, while $\mathbf{R}^{-1}\mathbf{B}$ is equal to the resolved spectra: chromatograms = $\mathbf{T}\mathbf{R}$ and spectra = $\mathbf{R}^{-1}\mathbf{B}$. For two constituents R is a 2 × 2 matrix. Because of the special properties of the inverse of such matrices, the spectrum of constituent 1 and the chromatogram of constituent 2 together change independently of the spectrum of constituent 2 and the chromatogram of constituent 1. This is not the case for matrices of higher rank, which means that this simple relationship is not applicable in such cases.

Partial Least Squares. The concepts of PLS are based on the work by Wold and provide a general framework for studying relationships between blocks of variables (10). The method has been adapted for chemical purposes (11) and has previously been used in, for instance, calibration problems. In multivariate calibration the task is to find a model that relates signals from analytical instruments/sensors to concentrations of chemical constituents so that future concentrations can be predicted as accurately as possible. Curve resolution can be readily formulated in similar terms: the problem is to relate a priori information about chromatographic peaks and spectra to the 3-D chromatogram and to use this model to predict 3-D chromatograms for each constituent individually. This curve resolution method has basically three steps.

1. Prior Information. Two conditions are essential for the a priori information used, namely that it is accurate and informative. Soft criteria such as nonnegativity of chromatographic peaks and spectra usually are accurate, but because of their low informative power the range of solutions becomes fairly broad.
On the other hand, with models assuming a functional relationship for, for instance, peak shape, model errors will cause a biased result. As an alternative we found it suitable to use prior information from the actual 3-D chromatogram. This was achieved by calculating models for each constituent from data chosen as selectively as possible. A time window at the edge of the peak is selected (Figure 1). Thereafter the most selective wavelength vector in the time window is copied from the matrix to form a separate variable. In this way a block containing the most selective time window ($\mathbf{X}_1$) and one vector of the most selective

Figure 1. Top view of a 3-D chromatogram showing two overlapped constituents. Retention time is shown on the vertical axis, and spectra are depicted horizontally. For a presentation of the matrix operations see "Prior Information" in the Theory section.

Figure 2. Calibration of the 3-D chromatogram (Y) against the extracted prior information (X), thus obtaining latent vectors of the matrices, which subsequently are used to predict the constituents individually.

wavelength ($\mathbf{y}_1$) is obtained. Subsequently, for this window a PLS relationship is calculated, where the resulting loadings ($\mathbf{b}_1$) provide an estimate of the spectrum of the constituent. The same procedure is repeated at the other side of the overlapped peak, thereby giving an estimate ($\mathbf{b}_2$) for the second constituent. Details of the PLS algorithm used in this section and in the calibration section below are described elsewhere (11).

2. Calibration. In the calibration step the estimates of the spectra (b) from both windows are used as X variables in a PLS calibration with the 3-D chromatogram as the Y block (Figure 2). In this way latent vectors, u and c, corresponding to the peaks/chromatograms of the 3-D chromatogram, respectively, are calculated. Also obtained are the weight vectors, w, expressing the combination of the spectra b giving the best calibrated model. Two components, representing the two overlapped peaks, are extracted. Hence, a relationship between the prior information and the 3-D chromatogram is established. Because of the nonorthogonality of the constituent spectra, the latent vectors, c, still do not comprise the "true" chromatograms but must be further "rotated", which is done as described below.

3. Prediction. To calculate the 3-D chromatogram of constituent 1, the spectrum estimate ($\mathbf{b}_1$) from the particular window is entered in the model, while the other ($\mathbf{b}_2$) is set to zero. The scores of the X matrix are calculated and, to obtain the individual 3-D chromatogram, are multiplied with the matrix relating X and Y (D) and the loadings of the Y block (c). Thus a new Y matrix of only one constituent is formed. The algorithm for the prediction step is, in the notation of Figure 2, as follows. Start with PLS component j = 1:

(1) $\mathbf{Y} = \mathbf{0}$
(2) $\mathbf{t}_j = \mathbf{w}_j'\mathbf{X}$ (w from the calibration step)
(3) $\mathbf{Y} = \mathbf{Y} + d_j\mathbf{c}_j\mathbf{t}_j$
(4) if j = 2 then terminate; else
(5) $\mathbf{X} = \mathbf{X} - \mathbf{w}_j\mathbf{t}_j$
(6) j = j + 1, return to (2).

For calculation of the second constituent the same procedure is utilized: the first spectrum estimate is set to zero while the second estimate is entered. In comparison with the two PCA rotations described above, this method can be regarded as a semiautomatic rotation, since the only decision made by the operator is the selection of the time windows and their most selective wavelengths. All succeeding steps can be readily automated. An alternative approach is to omit the prediction step and make an interactive rotation of the PLS solution analogous to PCA rotation.
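The six-step prediction loop given in the Theory section can be sketched in a few lines. This is only one reading of the printed notation, chosen because it makes the dimensions agree: X is taken as a 2 × (number of wavelengths) matrix whose rows are the spectrum estimates (one of them zeroed), each w_j as a weight vector of length 2, each c_j as a Y-block latent vector over time, and each d_j as a scalar. The function names `outer` and `predict_constituent` and the toy numbers are hypothetical; in practice w, c, and d would come from the PLS calibration step, which is not reproduced here.

```python
def outer(col, row):
    """Outer product of a column vector and a row vector."""
    return [[c * r for r in row] for c in col]


def predict_constituent(X, w, c, d):
    """Predict a one-constituent 3-D chromatogram, steps (1)-(6).

    Assumed shapes (an interpretation, not stated in the paper):
      X : 2 x n_wavelengths, rows are spectrum estimates, one set to zero
      w : two weight vectors, each of length 2
      c : two Y-block latent vectors, each of length n_times
      d : two scalars relating the X and Y blocks
    """
    X = [row[:] for row in X]                     # copy; X is deflated in step (5)
    n_times, n_wl = len(c[0]), len(X[0])
    Y = [[0.0] * n_wl for _ in range(n_times)]    # step (1): Y = 0
    for j in range(2):
        # step (2): t_j = w_j' X, a row vector over wavelengths
        t = [sum(w[j][k] * X[k][i] for k in range(len(X))) for i in range(n_wl)]
        # step (3): Y = Y + d_j c_j t_j
        contribution = outer(c[j], t)
        for r in range(n_times):
            for i in range(n_wl):
                Y[r][i] += d[j] * contribution[r][i]
        if j == 1:                                # step (4): stop after component 2
            break
        # step (5): X = X - w_j t_j
        for k in range(len(X)):
            for i in range(n_wl):
                X[k][i] -= w[j][k] * t[i]
    return Y


# Toy inputs chosen so the result can be checked by hand.
b1 = [1.0, 2.0, 0.5]
b2 = [1.0, 1.0, 1.0]
w = [[1.0, 0.0], [0.0, 1.0]]
c = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
d = [2.0, 3.0]

Y1 = predict_constituent([b1, [0.0] * 3], w, c, d)   # second estimate set to zero
Y2 = predict_constituent([[0.0] * 3, b2], w, c, d)   # first estimate set to zero
```

With these toy weights each call returns a matrix that contains only the constituent whose spectrum estimate was entered, which is the behaviour the prediction step is meant to produce. With real calibration weights the two components mix the estimates, and the deflation in step (5) then matters.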


EXPERIMENTAL SECTION

Apparatus. The chromatographic system consisted of a Constametric III pump (Laboratory Data Control), a Rheodyne 7125 injection valve, and an LKB 2140 diode array detector. For separation a 150 × 4 mm Nucleosil C18 column with 5 µm particle diameter was used (Macherey-Nagel). Data recording and manipulation were provided by an IBM-PC microcomputer with LKB software.

Chemicals. Methanol, anthracene, and phenanthrene were all of p.a. quality and used as received.

Procedure. Standard mixtures of anthracene and phenanthrene were prepared and chromatographed with eluents of methanol/water ratios ranging from 90/10 to 98/2. Thus different degrees of overlap were obtained, reflecting different complexities for the statistical resolution. Spectra from 200 to 300 nm were sampled on-line in 1-s intervals with maximum spectral resolution (4 nm SBW). The stored data were analyzed by the standard software package of the detector, and curve resolution was done by software developed at this laboratory.


RESULTS AND DISCUSSION

In order to evaluate the PLS method for curve resolution and for comparison with the PCA/rotation method, anthracene and phenanthrene were chosen as model substances. By change of the eluent composition from 90/10 to 98/2 in methanol/water ratio, various degrees of overlap were obtained. Anthracene and phenanthrene both have their absorption maximum at 250 nm, albeit with a higher absorption for anthracene. Another maximum is at approximately 220 nm, but in this region the absorption of phenanthrene is higher. Also, at wavelengths above 250 nm phenanthrene absorbs, while anthracene does not. The absorbance never exceeded 0.7 AU, and thereby a linear response was obtained in all cases. Contrary to some previous authors we did not normalize the spectra (12, 13). The advantage of normalization is that for two constituents, only one principal component is needed to describe the data. The second constituent is then obtained as the total spectrum minus the first component. The drawback of this approach is that the estimates of spectra and peak shapes will not be independent, i.e., errors, random or deterministic, in one estimate will necessarily also be present in the other. This problem will be most apparent if the assumptions of linear response and no background are not fully met. Nonlinearities may arise as a function of high absorbance as well as from the influence of stray light due to unequal lamp intensity over the spectral range. Thus nonlinearities in both rows and columns may occur. This will result in more principal components than constituents and subsequently decrease the accuracy of the estimates of peaks/spectra. If base-line drift is present, this also will cause an increased number of components, with similar problems. Finally, if there is a constant background from, e.g., improper zeroing, the systematic deviation also will distort the chromatograms/spectra. 
With these possible problems we found it advantageous to develop a method where a model was calculated separately for each constituent, thus lowering the


Figure 3. Deconvolution using PLS: (a) original 3-D chromatogram, (b) and (c) corresponding resolved peaks.

risk of propagating the errors to both constituents.

Semiautomatic PLS Method. Figure 3a shows the recorded 3-D chromatogram of anthracene and phenanthrene chromatographed with the weakest eluent (90% MeOH). The compounds are clearly distinguished as two peaks, with a shorter retention time for phenanthrene (left). The corresponding resolved peaks in parts b and c of Figure 3 show that peak shape, the location of the peak, and spectra are identical with those for the individual constituents and thus give a completely satisfactory result. In Figure 4a, a stronger eluent (95% MeOH) is used, and thereby peaks with a higher degree of overlap are obtained. In this case the overlap is so severe that only one peak can be recognized, albeit with a slightly skewed shape indicating two constituents. These increased difficulties are also seen in the resolved peaks. In Figure 4b, showing phenanthrene, there is a hump in the middle of the spectrum at the position of anthracene. In this area of the 3-D chromatogram anthracene has its maximum absorbance. A corresponding valley is observed at the phenanthrene position of the anthracene chromatogram (Figure 4c), thus indicating that in this area the two peaks are mixed up by the model. Nevertheless, the result probably would be sufficiently accurate even for quantitative analysis. This degree of overlap proved to be the limit for an accurate resolution with the PLS method as it is presently implemented. For higher overlaps (98% MeOH), deficiencies in the PLS method resulted in the peak shapes becoming somewhat distorted and the individual chromatograms no longer summing to the measured chromatogram. However, the spectra were still adequate, thus


providing qualitative information.

Figure 4. Deconvolution using PLS: (a) original 3-D chromatogram, (b) and (c) corresponding resolved peaks.

PLS/Rotation. Resolution by interactively rotating the PLS vectors from the calibration step was also investigated. These attempts gave results similar to those obtained by rotating the PC vectors, which is described below. The rotation angle was, however, smaller with the PLS vectors. This confirms that by using prior information from the chromatogram and the PLS method, a model of closer similarity to the physical reality is obtained than by using a principal components model.

PCA/Rotation Method. The same data sets were also analyzed by PCA/rotation using the simplex method to find the rotation matrix. This approach was in general less successful, which was seen as shoulders or even double peaks of the "resolved" constituents. The reason for this is most probably that the nonnegativity criteria for chromatograms and spectra provide insufficient information to give a unique solution. When attempts to add other criteria were made, problems arose in obtaining an appropriate weighting between them. We believe that this illustrates the general problem of expressing the quality of a complex system in one criterion. On the other hand, if the PC vectors were rotated by using an interactive approach, good results were obtained even where the semiautomatic PLS method was less successful. This is exemplified in parts a-d of Figure 5, where the chromatogram with the 98% MeOH eluent is shown. For unknown constituents, the utility of this excellent result may, however, be limited, since the shapes of the spectra are not known in advance and the shapes of consecutive peaks are usually similar. Hence, it is difficult to judge when the peaks are pure in the different rotated solutions. Figure 5d shows the rotated spectra and chromatograms as they are depicted on the screen during iterations. Thus the resolved 3-D chromatograms in parts b and c of Figure 5 are obtained by multiplying the particular chromatogram by its corresponding spectrum.

Figure 5. Deconvolution using PCA and interactive rotation: (a) original 3-D chromatogram, (b) and (c) corresponding resolved peaks, (d) rotated vectors showing the chromatogram and spectrum of each constituent.

CONCLUSIONS

In these data sets PLS gave accurate results in all situations of practical importance. Because of the two-block arrangement, with the possibility to use prior information effectively, we believe that PLS is a flexible approach to curve resolution problems. Provided that the prior information is accurate, PLS should have an advantage over PCA/rotation, in particular in situations where other systematic variations are present. This is often the case for real samples, where for instance other

Anal. Chem. 1986, 58, 303-307

unknown constituents may interfere, base line may drift, or nonlinearities in some regions of the 3-D chromatogram may occur. These theoretical standpoints must, of course, be verified experimentally.

ACKNOWLEDGMENT

Loan of an LKB 2140 rapid spectral detector from LKB-Produkter AB, Bromma, Sweden, is greatly appreciated.

LITERATURE CITED

(1) Otto, M.; Wegscheider, W. Anal. Chem. 1985, 57, 63.
(2) Lindberg, W.; Persson, J. A.; Wold, S. Anal. Chem. 1983, 55, 643.
(3) McCue, M.; Malinowski, E. R. Anal. Chim. Acta 1981, 133, 125.
(4) Wold, S.; Sjöström, M. In "Chemometrics: Theory and Application"; Kowalski, B. R., Ed.; American Chemical Society: Washington, DC, 1977; ACS Symposium Series, No. 52.
(5) Wold, S. Technometrics 1978, 20, 397.
(6) Lawton, W. H.; Sylvestre, E. A. Technometrics 1971, 13, 617.
(7) Knorr, F. J.; Futrell, J. H. Anal. Chem. 1979, 51, 1236.
(8) Meister, A. Anal. Chim. Acta 1984, 161, 149.
(9) Nelder, J. A.; Mead, R. Comput. J. 1965, 7, 308.
(10) Wold, H. In "Systems under Indirect Observation, Part II"; Jöreskog, K. G., Wold, H., Eds.; North-Holland: Amsterdam, 1982.
(11) Wold, S.; Martens, H.; Wold, H. In "Proceedings on the Symposium Matrix Pencils, Piteå, 1982"; Ruhe, A., Kågström, B., Eds.; Springer-Verlag: Berlin and Heidelberg, 1983.
(12) Osten, D. W.; Kowalski, B. R. Anal. Chem. 1984, 56, 991.
(13) Martens, H. Anal. Chim. Acta 1979, 112, 423.

RECEIVED for review April 24, 1985. Accepted August 27, 1985. Financial support from the Swedish Natural Science Research Council (NFR) and the National Swedish Board for Technical Development (STU) is gratefully acknowledged.

Separation of Styrene-Methyl Methacrylate Random Copolymers According to Chemical Composition and Molecular Size by Liquid Adsorption and Size Exclusion Chromatography Sadao Mori,* Yoshitaka Uno, and Masami Suzuki Department of Industrial Chemistry, Faculty of Engineering, Mie University, Tsu, Mie 514, Japan

The copolymers, P(S-MMA), were first fractionated according to chemical composition by liquid adsorption chromatography (LAC). Next, the molecular weight distribution of the LAC fractions was analyzed by size exclusion chromatography. Silica gel having a pore size of 30 Å was used as an adsorbent, and mixtures of 1,2-dichloroethane (DCE) and chloroform (including 1% ethanol as a stabilizer) were used as the mobile phase for LAC. The initial mobile phase for LAC was DCE, and then the content of chloroform in the mobile phase was increased stepwise up to 100%. When DCE was used as the mobile phase, the copolymers adsorbed on the external surface of silica gel. With increasing chloroform content in the mobile phase, the copolymers desorbed as a function of their composition. The early-eluted fraction in LAC had lower molecular weight averages and higher styrene content than those for the unfractionated copolymer. The late-eluted fraction had the opposite values.

It is well-known that copolymer properties are affected by composition in addition to molecular weight. Most copolymers have a chemical composition distribution (CCD) and a molecular weight distribution (MWD). Although MWD of homopolymers can be measured by size exclusion chromatography (SEC) rapidly and precisely, accurate information on MWD of copolymers cannot be obtained by SEC alone. This is because separation in SEC is achieved according to the sizes of molecules in solution, and the molecular weights of the copolymers are not proportional to molecular size unless the composition fluctuation is negligible and the chemical structure is the same across the whole range of molecular weights (1). Information on these distributions (MWD and CCD) should be obtained by separating the copolymer by composition independently of molecular weight and then determining MWD of each fraction. Or, inversely, MWD is determined first, independently of composition, and then CCD of the same molecular weight species is measured. This is the principle of cross-fractionation, which can be performed by means of a combination of several chromatographic methods.

Some workers used SEC for the first fractionation, and as the second chromatographic method, thin-layer chromatography (TLC) (2, 3) and high-performance precipitation liquid chromatography (4, 5) were applied. Balke and Patel (6, 7) performed cross-fractionation by orthogonal chromatography (a combination of SEC-SEC) in which different mobile phases were used. Inagaki and his co-workers (8) used the combination of column adsorption chromatography (first fractionation) and SEC (second fractionation). They separated styrene-methyl methacrylate graft copolymer by LAC using a mixture of ethyl acetate and benzene as the mobile phase. Styrene-methyl methacrylate random copolymer was also separated according to chemical composition by LAC on silica (9).

The aim of the present work was to investigate chromatographic techniques first to separate styrene-methyl methacrylate random copolymer according to chemical composition and then to measure MWD of each fraction. The combination of high-performance liquid adsorption chromatography (LAC) with SEC was applied in this work. A 1,2-dichloroethane-chloroform/silica gel system was used for LAC.

EXPERIMENTAL SECTION

High-Performance Liquid Adsorption Chromatography. LAC measurements were performed on a Jasco TRIROTAR high-performance liquid chromatograph (Japan Spectroscopic Co., Ltd., Hachioji, Tokyo 192, Japan) suited for gradient elution with a gradient programmer Model GP-A30. A variable-wavelength ultraviolet absorption detector Model UVIDEC-100 IV was used at a wavelength of 254 nm. The column of 50 mm length and 4.6 mm i.d. was packed with silica gel with a pore size of 30 Å and a mean particle diameter of 5 µm (Nomura Chemical Co., Seto 489, Japan) by a high-pressure high-viscosity slurry-packing technique. The number of theoretical plates (N) of the column
