

Anal. Chem. 1985, 57, 971-985

of some electroactive compounds that are to be used as reference standards. However, not all electroactive compounds may give a stoichiometric response, since hydrodynamic voltammograms that lack well-defined limiting-current regions are commonly reported in the literature. The crushed RVC electrode yields optimum results for the purity determination of compounds in the 100-400 nmol injected range, a range with convenient sample handling and high accuracy and precision. For trace analysis with standards, however, the higher figure of merit for the Coulochem detector indicates that it or a similar coulometric detector with an electrode area smaller than the area of the crushed RVC detector might be superior. The downstream amperometric electrode is useful for determining the coulometric yield and for monitoring the condition of the crushed RVC electrode (after 1 month of use, a sudden low in yield of the crushed RVC electrode as detected by the downstream electrode necessitated repacking the crushed RVC).

ACKNOWLEDGMENT

The loan of the Coulochem Model 5100A dual electrode system from Environmental Science Associates, Bedford, MA, is greatly appreciated.

Registry No. C, 7440-44-0; acetaminophen, 103-90-2; L-dopa, 59-92-7; ascorbic acid, 10504-35-5; hydroquinone, 123-31-9; 2,5-dihydroxybenzoic acid, 490-79-9.

RECEIVED for review November 2, 1984. Accepted January 22, 1985.


Three-Component Curve Resolution in Liquid Chromatography with Multiwavelength Diode Array Detection

Bernard Vandeginste,* Raymond Essers, Theo Bosman, Joost Reijnen, and Gerrit Kateman

Department of Analytical Chemistry, University of Nijmegen, Toernooiveld, 6525 ED Nijmegen, The Netherlands

Diode array detection in HPLC produces a data matrix of spectra (rows) and chromatograms (columns). The fundamentals of mathematical curve resolution on a data matrix of maximally three coeluting components are presented. The curve resolution is based on a factor analysis of the spectra or chromatograms of the mixtures. The three factors with the minimal area to norm ratio are good estimates of the spectra or chromatograms. Three-component curve resolution which is directly applied on the chromatograms is preferable because of the better quality of the estimated spectra and resolved chromatograms. The performance of the method is demonstrated on a number of model systems of polycyclic aromatic hydrocarbons and on the separation of proteins.

ANALYTICAL CHEMISTRY, VOL. 57, NO. 6, MAY 1985

0003-2700/85/0357-0971$01.50/0 © 1985 American Chemical Society

The introduction of the multiwavelength diode array detector in HPLC has extended the potentials of the method to those of a true hyphenated method such as gas chromatography-mass spectrometry (GC-MS). Conventional detectors such as a single or dual wavelength detector in HPLC or a flame ionization detector in GLC produce relatively poor qualitative information in comparison to specific qualitative methods such as MS, NMR, and IR. In many cases several separations are required under various conditions before a conclusive answer is obtained for complex mixtures. In HPLC with diode array detection the potentials of UV-Vis spectrometry for qualitative analysis are combined with the good separation capabilities of chromatography in order to unscramble the composition of the mixture. As long as all compounds are well separated no special difficulties arise in obtaining the requested qualitative and quantitative information. Problems begin with the occurrence of partially separated compounds in the mixture. In this case, signals obtained with detectors equipped with a single sensor, i.e., a single wavelength detector in HPLC or an FID in GLC, do not offer much possibility for extracting the qualitative and quantitative information. The less the compounds are separated, the more powerful the chemometric methods that should be applied in order to derive relevant parameters such as retention times and peak areas. Nonlinear regression or curve fitting is the most advanced method in that respect, but with very clear limitations (1). Firstly, a mathematical model of the shape of the chromatographic elution profiles should be available. Secondly, ambiguous results are obtained under conditions of a very poor separation.

A fast recording of complete spectra at short time intervals during the elution of the compounds yields a two-dimensional data matrix. The spectra, which are obtained during the coelution of several compounds together, are linear combinations of the spectra of the compounds in the mixture. Therefore when the identities of the compounds are known and their spectra are available, the contribution of each compound in the measured spectra can be calculated by solving an overdetermined system of linear equations by a least-squares method (2). A plot of these concentrations as a function of the elution time (spectrum number) gives the concentration profiles from which the concentrations of the compounds can be derived. The result is a resolution of the overlapping concentration profile without any supposition of the peak shape.

When the identities of the compounds are unknown or their spectra are not available, the aforementioned resolution of the concentration profiles should be preceded by a determination of the number of compounds in the peak cluster and an estimation of their spectra. The latter is the subject of this paper. Because of the linear additivity of the signals, the number of coeluting compounds can be determined by a principal components analysis (3). Recently, it has been demonstrated that besides the number of compounds, the pure spectra can be derived as well. Kowalski (4) adapted the method developed by Lawton and Sylvestre (5) for the curve resolution of two-component data matrices obtained in GC-MS (4) and HPLC-UV-Vis (6). Chen and Hwang (7) developed an algorithm for a three-component curve resolution in GC-MS. Both methods first estimate the pure spectra of the compounds, whereafter the relative contributions of the compounds in the overall signal are evaluated. In the case of chromatography this gives the elution profile of each of the components. Basically the "pure spectra" are estimated by an extrapolation of the two (or three) "purest measured spectra" until negative absorbances would appear. The assumptions on the elution profiles are that all concentrations are nonnegative, but no assumptions are made on the shape of the elution profiles.

This paper describes a modification of the three-component curve resolution algorithm of Chen and Hwang (7) adapted for data matrices obtained in HPLC combined with a diode array detector (8). Furthermore an alternative curve resolution method is presented where the elution profiles of the compounds are directly estimated from a factor analysis, whereafter the pure spectra are calculated. The advantage of this method is that impure spectra are no longer extrapolated to zero absorbance at one of the wavelengths and that the inclusion of additional but realistic constraints on the elution profiles helps in improving the results of the factor analysis. These assumptions are that besides the elution profiles of each of the compounds being positive they also contain only one maximum and have a minimal area (under a given normalization). The performance of the two- and three-component curve resolution technique in HPLC is demonstrated on the resolution of a number of two- and three-component clusters which are obtained for an 8- and 13-component model system of polycyclic aromatic hydrocarbons (PAH). Furthermore practical results of curve resolution for the resolution of difficultly separable proteins are described. All software has been incorporated in one of the commercially available software packages for the operation of a diode array detector and is available for a microcomputer.

Figure 1. Thirteen-component system of polycyclic aromatic hydrocarbons in 80% methanol/water (M80P13). Plot of the maximum values in the spectra (maxplot). Horizontal axis: elution time (min).

THEORY

When NS spectra are recorded at NW wavelengths during the elution of a mixture of compounds from an HPLC, an NS*NW nonnegative data matrix $D$ is obtained. The rows represent spectra $M_i$ of various mixtures in various proportions, which are obtained at selected times ($t_i$) during the elution. The columns of the data matrix are chromatograms ($H_j$) recorded at fixed wavelengths ($\lambda_j$). According to the Lambert-Beer law (the absorbances of) the mixture spectra are linear combinations of the NC spectra of the pure (but unknown) compounds ($S(1), ..., S(NC)$) or "pure" spectra. Equally all chromatograms ($H_j$) are linear combinations of the concentration (or elution) profiles of the pure compounds ($E(1), ..., E(NC)$) or "pure profiles". In principle two approaches are possible for the multivariate analysis of the data matrix $D$: either the spectra are considered to be a point (or vector) in the space spanned by the NW wavelengths (this is called Q mode) or the chromatograms are considered to be a point (or vector) in the space spanned by the NS elution times (called R mode).

Irrespective of the mode one chooses, the first step in curve resolution is the selection of that part of the data matrix which belongs to a given peak cluster. Because it is very improbable to find one of the wavelengths being sensitive for all compounds, no clear picture of the presence and location of all peak clusters can be obtained by plotting one of the columns of the data matrix. Clusters of coeluting substances, however, are easily detected from a plot of the maximal value (absorbance) of each row (spectrum) in the data matrix as a function of its row number (time) (Figure 1). By a cursor-controlled indication of the beginning and the end of a peak cluster, all spectra belonging to one cluster are easily selected and form a so-called "cluster". The number of components in a "cluster" is determined by a principal components analysis.
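The maxplot and the selection of a cluster described above can be sketched in a few lines. This is a minimal illustration and not the authors' software: the function names `maxplot` and `select_cluster`, the Gaussian elution profiles, and the three-wavelength spectra are all assumptions made for the example.

```python
import numpy as np

def maxplot(D):
    """Maximum absorbance of each spectrum (row) of the NS x NW data matrix D."""
    return np.asarray(D, dtype=float).max(axis=1)

def select_cluster(D, start, stop):
    """Spectra (rows) start..stop inclusive, i.e. the rows of one peak cluster."""
    return np.asarray(D, dtype=float)[start:stop + 1, :]

# Hypothetical data: two Gaussian elution profiles, each compound
# absorbing mainly at a different wavelength.
t = np.arange(60)
E = np.vstack([np.exp(-0.5 * ((t - 20) / 3.0) ** 2),
               np.exp(-0.5 * ((t - 26) / 3.0) ** 2)])   # 2 x NS profiles
S = np.array([[1.0, 0.0, 0.2],
              [0.0, 1.0, 0.1]])                         # 2 x NW spectra
D = E.T @ S                                             # NS x NW data matrix

m = maxplot(D)                      # one value per spectrum (row)
cluster = select_cluster(D, 8, 38)  # stand-in for the cursor-selected range
```

In this synthetic case the first wavelength alone hardly shows the second compound, whereas the maxplot shows both, which is the reason given in the text for plotting row maxima rather than a single column.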
Therefore one calculates the eigenvectors and eigenvalues of the dispersion matrix $D'D$ (in Q mode) or $DD'$ (in R mode) (9). The eigenvectors are orthonormal and each spans, in decreasing order, a part of the variance in the data (3). In the Q mode, the original data matrix $D$ can be represented in a space spanned by NW orthonormal eigenvectors, or

$$D = B_q V_q \tag{1}$$

where $V_q$ is the NW*NW Q-mode factor loading matrix of the NW factors, and $B_q$ is the NS*NW Q-mode factor score matrix which gives the contribution of a particular factor in a given spectrum. In the R mode the data matrix is represented in a space spanned by NS orthonormal eigenvectors, or

$$D' = H = B_r V_r \tag{2}$$

where $V_r$ is the NS*NS R-mode factor loading matrix of the NS factors, and $B_r$ is the NW*NS R-mode factor score matrix. The number of components (NC) in the cluster corresponds with the number of factors necessary to reconstruct the matrices $D$ or $D'$ within the noise in the data. Thus the length $\lVert D - \hat{D} \rVert$ of the differences between the data matrix $D$ and the reconstructed data matrix $\hat{D}$ with NC factors should be within the limits expected from the noise, where

$$\hat{D} = B_q V_q \tag{3}$$

where $V_q$ is an NC*NW matrix (Q mode) of the NC first eigenvectors and $B_q$ is the NS*NC Q-mode factor score matrix. The scores are calculated by a least-squares procedure, which becomes very straightforward because of the orthonormality of the NC eigenvectors in $V_q$, namely

$$B_q = D V_q' \tag{4}$$

Figure 2. (a) Spectra of a two-component peak cluster, projected on the plane defined by the two first eigenvectors of the data matrix (M80P8P3): ($\phi_1$, $\phi_2$) are the coordinates of the purest measured spectra; ($\phi_I$, $\phi_{II}$) are the coordinates of the "estimated pure spectra". (b) "Estimated pure spectra". (c) Resolved elution profile.

Although the measured spectra are linear combinations of the first NC eigenvectors of $V_q$, they do not represent the pure spectra. These vectors are "abstract factors" or "abstract spectra". Equally the factor scores are not true concentrations. In the R mode the data matrix $H = D'$ is reconstructed with the first NC eigenvectors of $V_r$, where

$$\hat{H} = B_r V_r \tag{5}$$

$V_r$ is here an NC*NS matrix (R mode). The scores are calculated by

$$B_r = H V_r' \tag{6}$$

Here all measured chromatograms are linear combinations of the first NC eigenvectors of $V_r$, but they do not represent the true chromatograms. The next problem is to transform abstract factors and scores into physically meaningful spectra and chromatograms. The classical way of rotating the abstract factors into physically meaningful spectra (10) requires the restriction that each pure component spectrum contains a peak in a region where the other pure components do not absorb. This method does not exploit the typical profile of the relative contents of the compounds during the elution in a peak cluster. One knows that the first spectrum in a two-component HPLC peak cluster contains the most of the first component and the last spectrum the most of the second component. The same logical sequence is found in a three-component system as well: purest I; mixtures of I and II; mixtures of I, II, and III; mixtures of II and III; finally, the purest III. On the basis of this property, Lawton and Sylvestre (5) developed a method to extrapolate the purest measured spectra of a two-component system into the estimated pure spectra, and Chen (7) proposed a similar method for a three-component system. Here we present a theoretical development which unifies both methods for the estimation of the pure spectra (Q mode). Furthermore an alternative method is presented which estimates the pure elution profiles first (R mode) and derives thereafter the pure spectra from these estimates. In the section on results and discussion some examples are given to demonstrate the advantages of that method.
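The principal components step (eigenvectors of $D'D$, scores from eq 4, and the residual length of eq 3) can be illustrated as follows. This is a sketch under stated assumptions, not the published implementation: the function names, the noise threshold `noise_norm`, and the synthetic two-component data are hypothetical, and the stopping rule simply compares the residual norm with an expected noise norm.

```python
import numpy as np

def q_mode_factors(D, nc):
    """First nc eigenvectors of D'D (rows of V) and scores B = D V' (eq 4)."""
    evals, evecs = np.linalg.eigh(D.T @ D)       # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:nc]         # indices of the nc largest
    V = evecs[:, order].T                        # nc x NW, orthonormal rows
    if V[0].sum() < 0:                           # make the first eigenvector positive
        V[0] *= -1.0
    B = D @ V.T                                  # NS x nc factor scores
    return V, B

def residual_norm(D, nc):
    """Length ||D - D_hat|| of the reconstruction with nc factors (eq 3)."""
    V, B = q_mode_factors(D, nc)
    return np.linalg.norm(D - B @ V)

def estimate_nc(D, noise_norm, max_nc=3):
    """Smallest number of factors that reproduces D within the noise."""
    for nc in range(1, max_nc + 1):
        if residual_norm(D, nc) <= noise_norm:
            return nc
    return None

# Hypothetical two-component cluster with a little measurement noise.
rng = np.random.default_rng(0)
t = np.arange(40)
E = np.vstack([np.exp(-0.5 * ((t - 15) / 4.0) ** 2),
               np.exp(-0.5 * ((t - 22) / 4.0) ** 2)])
S = np.array([[1.0, 0.8, 0.3, 0.1, 0.0, 0.0],
              [0.0, 0.1, 0.4, 0.9, 1.0, 0.5]])
sigma = 1e-4
D = E.T @ S + sigma * rng.standard_normal((40, 6))

noise_norm = 2.0 * sigma * np.sqrt(D.size)       # assumed noise limit
nc = estimate_nc(D, noise_norm)
```

The factor of 2 in `noise_norm` is an arbitrary safety margin chosen for this example; in practice the limit would come from the known noise level of the detector.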

A. Q Mode: Sequence Is: Estimate Pure Spectra-Resolve Elution Profiles. Let us assume a system with NC components. Each spectrum $M_i$ of a two-component and a three-component cluster can be written as

two-component cluster:

$$M_i = b_{1,i} V_1 + b_{2,i} V_2 + r_i \tag{7}$$

three-component cluster:

$$M_i = b_{1,i} V_1 + b_{2,i} V_2 + b_{3,i} V_3 + r_i \tag{8}$$

$V_1$, $V_2$, and $V_3$ are the first three eigenvectors in $V_q$, hereafter called $V$. The factor scores $b_i$ are found by solving

$$b_i = M_i V' \tag{9}$$

where $b_i$ is a row of the NS*NC factor score matrix $B_q$ and $r_i$ is the vector of the residuals between model and observations.

After normalization of the vector $b_i$ and the spectrum $M_i$ to a norm equal to one ($\lVert b_i \rVert = 1$, $\lVert M_i \rVert = 1$) one can transform $b_i$ into polar coordinates:

two-component cluster:

$$M_i = \cos\phi_i\, V_1 + \sin\phi_i\, V_2 + r_i \tag{10}$$

three-component cluster:

$$M_i = \cos\theta_i\, V_1 + \sin\theta_i \cos\phi_i\, V_2 + \sin\theta_i \sin\phi_i\, V_3 + r_i \tag{11}$$

where $0 < \phi_i < 2\pi$ and $0 < \theta_i < \pi/2$. Consequently all spectra of a two- and three-component cluster can be represented in a two-dimensional plane, described by the values of $\theta$ and $\phi$. The spectra of a two-component and three-component cluster are points $(p, q)$ in a two-dimensional $\theta$-$\phi$ plot where $\theta$ is the length of a vector from the origin and $\phi$ is the angle between the vector and the $\phi = 0$ axis. For the two-component case $\theta$ is given an arbitrary value (Figures 2 and 3).

$$p = \theta \cos\phi \tag{12a}$$

$$q = \theta \sin\phi \tag{12b}$$

Figure 3. Two-dimensional representation (in polar coordinates) of the spectra of a three-component peak cluster projected on the space spanned by the three first eigenvectors of the data matrix: $\theta$-$\phi$ plot.

A first approximation of the pure spectra is obtained from the fact that $\phi_1$ and $\phi_2$ represent the two "purest measured spectra" in the two-component system (Figure 2a). In the three-component case it is usually observed that the spectra form two sides of a triangle of which the corner (Figure 3) and both extremes represent the three "purest measured spectra". For the extrapolation of the "purest measured spectra" into the "estimated pure spectra", the constraints formulated by Lawton and Sylvestre (5) are imposed: (constraint 1) all absorbances of the "estimated pure spectra" are nonnegative; (constraint 2) all components contribute to the measured signal in a positive way. It will be discussed below that the solution of the three-component system requires the selection of three spectra from all candidates which satisfy constraint 1 (solution space). Therefore a third constraint has been formulated, namely: (constraint 3) the "estimated pure spectra" are the simplest spectra found in the solution space. The simplest spectra have the smallest area to norm ratio. In the rare case that fewer or more than three minima are found, the spectra with maximal absorbance are considered to be the simplest spectra.

a. Two-Component Cluster. Any spectrum with a score $\phi_i$ for which $\phi_1 < \phi_i < \phi_2$ represents a mixture which is less pure than the measured spectra with scores $\phi_1$ and $\phi_2$. Obviously any spectrum with a score $\phi$ with $\phi < \phi_1$ or $\phi > \phi_2$ is purer than the spectra with the scores $\phi_1$ and $\phi_2$. The application of constraint 1 however gives two boundaries for the extrapolation, namely, the scores of the first (I) estimated pure spectrum are given by

$$b_{1,I}\, v_{1j} + b_{2,I}\, v_{2j} = 0 \ \text{for some } j, \qquad b_{1,I} = (1 - b_{2,I}^2)^{1/2} \tag{13a}$$

where $v_{1j}$ and $v_{2j}$ are respectively the jth element of the first and second eigenvector. The scores of the second (II) estimated pure spectrum are given by

$$b_{1,II}\, v_{1j} + b_{2,II}\, v_{2j} = 0 \ \text{for some } j, \qquad b_{1,II} = (1 - b_{2,II}^2)^{1/2} \tag{13b}$$

Because all elements of the first eigenvector are positive, these two sets of score pairs correspond with the two linear combinations of the two first eigenvectors which give somewhere a zero value, one on either side of the measured spectra. The outer limits for $\phi$ of the two "estimated pure spectra" are therefore (Figure 2a)

$$\phi_I = \arcsin b_{2,I} \tag{14a}$$

$$\phi_{II} = \arcsin b_{2,II} \tag{14b}$$

resulting for the "estimated pure spectra"

$$S(1) = \cos\phi_I\, V_1 + \sin\phi_I\, V_2, \qquad S(2) = \cos\phi_{II}\, V_1 + \sin\phi_{II}\, V_2 \tag{15a}$$

or

$$S = X V \quad \text{where} \quad S = \begin{pmatrix} S(1) \\ S(2) \end{pmatrix} \quad \text{and} \quad X = \begin{pmatrix} \cos\phi_I & \sin\phi_I \\ \cos\phi_{II} & \sin\phi_{II} \end{pmatrix} \tag{15b}$$

Checks on a possible violation of the other constraints are discussed with the three-component system.

b. Three-Component Cluster. On the analogy of the previous two-component system, the three "estimated pure spectra" are found by extrapolating the scores $\phi$ and $\theta$ until one of the constraints is no longer satisfied. For each value of $\phi$ ($0 < \phi < 2\pi$) a value of $\theta$ can be calculated for which one of the elements of the spectrum becomes zero (i.e., for a larger value of $\theta$, constraint 1 would be violated). Instead of finding a direct solution of three spectra, a line of ($\theta$, $\phi$) values is obtained which obey constraint 1 while having a zero in the spectrum (Figure 4b). By the application of constraint 3 the solution can be reduced to three ($\theta$, $\phi$) pairs. Namely, a plot of the area of all spectra corresponding with the scores ($\theta$, $\phi$) on the triangle shows three distinct minima (Figure 4c). These minima correspond to the three simplest spectra in the set: ($\theta_I$, $\phi_I$), ($\theta_{II}$, $\phi_{II}$), and ($\theta_{III}$, $\phi_{III}$).
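The three-component extrapolation just described can be sketched numerically: for each $\phi$ the largest $\theta$ still satisfying constraint 1 is computed, and the minima of the area to norm ratio along that line are located. This is an illustrative sketch only. The grid search over $\phi$, the function names, and the small orthonormal matrix `V_demo` (whose rows play the role of $V_1$, $V_2$, $V_3$, with $V_1$ strictly positive) are assumptions for the example, not part of the paper.

```python
import numpy as np

def theta_limit(V, phi):
    """Largest theta for which cos(theta) V1 + sin(theta) w stays nonnegative,
    with w = cos(phi) V2 + sin(phi) V3 (constraint 1)."""
    w = np.cos(phi) * V[1] + np.sin(phi) * V[2]
    neg = w < 0
    if not neg.any():                 # cannot occur when V1 > 0 and w is orthogonal to V1
        return np.pi / 2
    return np.arctan(np.min(V[0][neg] / -w[neg]))

def spectrum(V, theta, phi):
    """Spectrum belonging to a score pair (theta, phi), as in eq 11 without residual."""
    return (np.cos(theta) * V[0]
            + np.sin(theta) * (np.cos(phi) * V[1] + np.sin(phi) * V[2]))

def simplest_spectra(V, n_phi=720):
    """Score pairs on the boundary line with a locally minimal area to norm ratio."""
    phis = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    thetas = np.array([theta_limit(V, p) for p in phis])
    ratio = np.array([spectrum(V, th, p).sum() / np.linalg.norm(spectrum(V, th, p))
                      for th, p in zip(thetas, phis)])
    # Local minima on the closed line (phi is periodic).
    is_min = (ratio < np.roll(ratio, 1)) & (ratio < np.roll(ratio, -1))
    idx = np.flatnonzero(is_min)
    return phis[idx], thetas[idx], ratio[idx]

# Hypothetical abstract factors: an orthonormal basis of a three-wavelength
# space whose first vector has only positive elements.
V_demo = np.array([[1.0, 1.0, 1.0],
                   [1.0, -1.0, 0.0],
                   [1.0, 1.0, -2.0]])
V_demo /= np.linalg.norm(V_demo, axis=1, keepdims=True)

phis, thetas, ratios = simplest_spectra(V_demo)
S_est = np.array([spectrum(V_demo, th, p) for th, p in zip(thetas, phis)])
```

For this toy basis the three minima fall on the corners of the solution triangle, and the corresponding spectra are the three unit vectors, i.e. three completely non-overlapping "pure spectra". With real, overlapping spectra the minima need not coincide with the true spectra, which is the situation the correction of eq 20 addresses.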

Figure 4. (a) Three-component $\theta$-$\phi$ plot of cluster M80P13P6 (Q mode). (b) Line of ($\theta$, $\phi$) pairs of candidate "estimated pure spectra". (c) Area to norm ratio of all candidate "estimated pure spectra" (minplot). ($\theta_I$, $\phi_I$), ($\theta_{II}$, $\phi_{II}$), and ($\theta_{III}$, $\phi_{III}$) are the coordinates of the estimated pure spectra. (d) Estimated pure spectra. (e) Resolved elution profiles, using "estimated pure spectra" from (d). (f) Corrected estimated pure spectra. (g) Corrected elution profile, using the corrected "estimated pure spectra".

The "estimated pure spectra" are therefore

$$\begin{aligned}
S(1) &= \cos\theta_I\, V_1 + \sin\theta_I \cos\phi_I\, V_2 + \sin\theta_I \sin\phi_I\, V_3 \\
S(2) &= \cos\theta_{II}\, V_1 + \sin\theta_{II} \cos\phi_{II}\, V_2 + \sin\theta_{II} \sin\phi_{II}\, V_3 \\
S(3) &= \cos\theta_{III}\, V_1 + \sin\theta_{III} \cos\phi_{III}\, V_2 + \sin\theta_{III} \sin\phi_{III}\, V_3
\end{aligned} \tag{16a}$$

or

$$S = X V \quad \text{where} \quad S = \begin{pmatrix} S(1) \\ S(2) \\ S(3) \end{pmatrix} \quad \text{and} \quad X = \begin{pmatrix}
\cos\theta_I & \sin\theta_I \cos\phi_I & \sin\theta_I \sin\phi_I \\
\cos\theta_{II} & \sin\theta_{II} \cos\phi_{II} & \sin\theta_{II} \sin\phi_{II} \\
\cos\theta_{III} & \sin\theta_{III} \cos\phi_{III} & \sin\theta_{III} \sin\phi_{III}
\end{pmatrix} \tag{16b}$$

Typically, the score pairs of the "estimated pure spectra" coincide with the corners of the solution line. It may occur that although constraints 1 and 3 are satisfied, constraint 2 is still violated. A check on constraint 2 requires the calculation of the contribution of each of the components in the overall measured signal. Because all measured spectra are linear combinations of the "estimated pure spectra", it can be stated that

$$D = C S \tag{17}$$

where $D$ is the NS*NW matrix of the measured spectra, $C$ is the NS*3 matrix of the contributions of the "estimated pure spectra" in the measured spectra, and $S$ is the 3*NW matrix of the "estimated pure spectra", with the least-squares solution

$$C' = (S S')^{-1} S D' \tag{18}$$

In fact $C$ represents the curve-resolved chromatogram. A complete example of curve resolution is given in Figure 4.
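Equation 18 and the check on constraint 2 can be written compactly. The following is a minimal sketch assuming the "estimated pure spectra" are already available as the rows of `S`; the function names, the tolerance `tol`, and the synthetic data are hypothetical and serve only to illustrate the least-squares step.

```python
import numpy as np

def resolve_profiles(D, S):
    """Least-squares contributions C (NS x NC) from C' = (S S')^-1 S D' (eq 18)."""
    Ct = np.linalg.solve(S @ S.T, S @ D.T)   # solves (S S') C' = S D'
    return Ct.T

def violates_constraint_2(C, tol=1e-9):
    """True if any component contributes negatively (beyond a rounding tolerance)."""
    return bool((C < -tol).any())

# Hypothetical three-component cluster built from known profiles and spectra.
t = np.arange(50)
C_true = np.column_stack([np.exp(-0.5 * ((t - 18) / 3.0) ** 2),
                          np.exp(-0.5 * ((t - 24) / 3.0) ** 2),
                          np.exp(-0.5 * ((t - 30) / 3.0) ** 2)])   # NS x 3
S = np.array([[1.0, 0.6, 0.1, 0.0],
              [0.2, 1.0, 0.5, 0.1],
              [0.0, 0.1, 0.7, 1.0]])                               # 3 x NW
D = C_true @ S                                                     # eq 17

C = resolve_profiles(D, S)
ok = not violates_constraint_2(C)
```

Solving the normal equations with `np.linalg.solve` mirrors eq 18 directly; `np.linalg.lstsq` would be a numerically more robust alternative when the estimated spectra are nearly collinear.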

Our experience is that the resolved elution profiles usually satisfy constraint 2 but may contain double maxima (Figure 4e). The reason is that although the true pure spectra are inherently simpler than each of the mixture spectra, this does not necessarily imply that the pure spectra should coincide with the simplest spectra on the solution triangle. Let us consider the incomplete separation of a mixture of anthracene (A), benzene (B), and naphthalene (N) (ABN). A straightforward curve resolution yields the spectra ($S(1)$, $S(2)$, and $S(3)$). Comparison of the spectra with the true spectra shows that $S(2)$ is the spectrum of A, B can be considered to be a combination of A and $S(3)$ (Figure 5b), while N can be considered to be a combination of B and $S(1)$. Thus all mixtures can be adequately described as linear combinations of $S(1)$, $S(2)$, and $S(3)$, namely

$$B = x \cdot A + S(3), \qquad N = y \cdot B + S(1) \quad \text{or} \quad N = x \cdot y \cdot A + y \cdot S(3) + S(1) \tag{19}$$

It should thus be noted that although $S(1)$ and $S(2)$ are bad "estimated pure spectra", better estimates of the pure spectra can be obtained by taking the proper linear combination of the three "estimated pure spectra" ($S(1)$, $S(2)$, and $S(3)$). These linear combinations can be derived from an inspection of the elution profiles found when $S(1)$, $S(2)$, and $S(3)$ are substituted in eq 18. Let us assume that $l(1,1) \ldots l(3,3)$ represent the heights of the three elution profiles at the locations of the three coinciding maxima (see Figure 5c). Thus $l(i,j)$ is the height of the profile belonging to spectrum $S(j)$ ($j = 1, 3$) at the location of the ith maximum ($i = 1, 3$). The linear combinations to be taken are

$$\begin{pmatrix} S^*(1) \\ S^*(2) \\ S^*(3) \end{pmatrix} = \begin{pmatrix} l(1,1) & l(1,2) & l(1,3) \\ l(2,1) & l(2,2) & l(2,3) \\ l(3,1) & l(3,2) & l(3,3) \end{pmatrix} \begin{pmatrix} S(1) \\ S(2) \\ S(3) \end{pmatrix} \tag{20}$$

After the correction it is necessary to renormalize the obtained corrected spectra: $\lVert S^*(i) \rVert = 1$. The result of such a correction procedure applied on the anthracene, benzene, and naphthalene (ABN) system is shown in Figure 5d and demonstrates that good "estimated pure spectra" are now obtained.

Figure 5. (a) True spectra of anthracene (A), benzene (B), and naphthalene (N). (b) Estimated pure spectra of A, B, and N, respectively, $S(1)$, $S(2)$, and $S(3)$. (c) Elution profiles calculated with $S(1)$, $S(2)$, and $S(3)$. $l(1,1)$, $l(2,1)$, $l(3,1)$, ..., $l(3,3)$ are the correction factors. (d) Corrected estimated pure spectra. (e) Elution profile using the corrected estimated pure spectra.

When the number of spectra (NS) in the peak cluster is smaller than the measured number of wavelengths (NW), corresponding elements of the Q-mode eigenvectors ($V_1 \ldots V_3$) will be zero. Consequently no extrapolation to the pure spectra is possible because the first constraint is fulfilled for all measured spectra. When NS is relatively small, as is the case when narrow elution profiles are measured, NW should be kept low, which limits the resolution of the "estimated pure spectra". However an improved resolution of the "estimated pure spectra" can be obtained by carrying out the following procedure. The final results of curve resolution when NS < NW are (i) "estimated pure spectra" at NS wavelengths and (ii) the resolved elution profiles in NS points. The resolution of the "estimated pure spectra" can be enhanced to the full NW wavelengths measured for the mixtures. Namely, each chromatogram (column $H_j$ in the original data matrix $D$), recorded at one of the wavelengths ($j$) which was not included in the set of NS wavelengths used for the curve resolution, can be written as

$$H_j = s_{1,j} C_1 + s_{2,j} C_2 + s_{3,j} C_3 \tag{21a}$$

or

$$H_j = C s_j \tag{21b}$$

which can be solved for $s_{1,j}$, $s_{2,j}$, and $s_{3,j}$, namely

$$s_j = \begin{pmatrix} s_{1,j} \\ s_{2,j} \\ s_{3,j} \end{pmatrix} = (C' C)^{-1} C' H_j \tag{22}$$

where $C_1$, $C_2$, and $C_3$ are the three elution profiles (NS points) or columns of the NS*3 matrix $C$ as obtained from eq 18. $H_j$ is the chromatogram (NS data points) at wavelength $j$. This is the jth column of the original data matrix $D$, which has not been included in the set of NW wavelengths used for the curve resolution. ($s_{1,j}$, $s_{2,j}$, and $s_{3,j}$) are the absorptivities of the "estimated pure spectra" at wavelength $j$. The results of such a procedure are shown in Figure 6d-f.

Figure 6. (a) Three-component $\theta$-$\phi$ plot of cluster M80P13P6 (Q mode) and line of ($\theta$, $\phi$) pairs of candidate pure spectra. (b) Estimated pure spectra (20 wavelengths, HP85). (c) Resolved elution profile. (d) Corrected estimated pure spectra. (e) Corrected elution profile. (f) Spectra at full resolution after interpolation with the resolved elution profiles.

B. R Mode: Sequence Is: Estimate Elution Profiles-Resolve Pure Spectra. The NC eigenvectors obtained in the R mode represent now abstract elution profiles. Each chromatogram $H_i$ can be written as a linear combination of the abstract factors $V_1$, $V_2$ (and $V_3$), which are the first eigenvectors of $V_r$, hereafter called $V$.

two-component cluster:

$$H_i = \cos\phi_i\, V_1 + \sin\phi_i\, V_2 + r_i \tag{23}$$

three-component cluster:

$$H_i = \cos\theta_i\, V_1 + \sin\theta_i \cos\phi_i\, V_2 + \sin\theta_i \sin\phi_i\, V_3 + r_i \tag{24}$$

The values of $\phi$ and $\theta$ are calculated by a least-squares procedure and all NW chromatograms can be represented in a two-dimensional plane (see Q mode). The two-component chromatograms lie on a circle in the plane, and the three-component chromatograms can be represented using the $\theta$-$\phi$ plot projection given in eq 11 (see Figure 7a). In an identical way as explained for Q-mode curve resolution, abstract factors are transformed into real factors (here the elution profiles) by extrapolating the "purest measured elution profiles" (these are the chromatograms with one dominant peak) into the estimated elution profiles of the pure compounds, herein called "estimated pure profiles". Constraints which limit the number of solutions to two or three profiles are (1) the elution profiles are nonnegative and (2) the elution profiles with the smallest area to norm ratio are selected. The application of constraint 1 yields directly the scores ($\phi_I$, $\phi_{II}$) of the two "estimated pure profiles" ($E(1)$ and $E(2)$) in the two-component system, while a line of solutions is obtained for the three-component system. By the application of constraint 2, the three pairs of scores ($\theta_I$, $\phi_I$), ($\theta_{II}$, $\phi_{II}$), and ($\theta_{III}$, $\phi_{III}$) are obtained for which the simplest elution profiles are obtained by substituting these scores in eq 25, which is an analogue of eq 16

$$\begin{aligned}
E(1) &= \cos\theta_I\, V_1 + \sin\theta_I \cos\phi_I\, V_2 + \sin\theta_I \sin\phi_I\, V_3 \\
E(2) &= \cos\theta_{II}\, V_1 + \sin\theta_{II} \cos\phi_{II}\, V_2 + \sin\theta_{II} \sin\phi_{II}\, V_3 \\
E(3) &= \cos\theta_{III}\, V_1 + \sin\theta_{III} \cos\phi_{III}\, V_2 + \sin\theta_{III} \sin\phi_{III}\, V_3
\end{aligned} \tag{25a}$$

or

$$E = X V \quad \text{where} \quad E = \begin{pmatrix} E(1) \\ E(2) \\ E(3) \end{pmatrix} \quad \text{and} \quad X = \begin{pmatrix}
\cos\theta_I & \sin\theta_I \cos\phi_I & \sin\theta_I \sin\phi_I \\
\cos\theta_{II} & \sin\theta_{II} \cos\phi_{II} & \sin\theta_{II} \sin\phi_{II} \\
\cos\theta_{III} & \sin\theta_{III} \cos\phi_{III} & \sin\theta_{III} \sin\phi_{III}
\end{pmatrix} \tag{25b}$$

The sequence of an R-mode curve resolution is depicted in Figure 7. It is noted that each combination of the scores ($\theta$, $\phi$) in eq 25 yields normalized (norm = 1) "estimated pure profiles". Consequently the three "estimated pure elution profiles" have approximately equal heights (for equal half height widths).

Figure 7. R-mode curve resolution, cluster M80P13P6: (a) three-component ($\theta$, $\phi$) plot of the chromatograms; (b) line of ($\theta$, $\phi$) pairs of candidate "estimated pure elution profiles"; (c) area to norm ratio of all candidate "estimated pure elution profiles" (minplot), with ($\theta_I$, $\phi_I$), ($\theta_{II}$, $\phi_{II}$), and ($\theta_{III}$, $\phi_{III}$) the coordinates of the estimated pure elution profiles; (d) estimated pure elution profiles (normalized); (e) estimated pure spectra; (f) rescaled estimated pure elution profiles.

On the analogy of eq 17, the pure spectra (at NW wavelengths) are derived from the measured chromatograms (columns in the data matrix $D$) and the three "estimated pure elution profiles" ($E(1)$, $E(2)$, and $E(3)$) by solving the following set of equations:

$$H_i = a_{1,i} E(1) + a_{2,i} E(2) + a_{3,i} E(3) = (a_{1,i}\ a_{2,i}\ a_{3,i}) \begin{pmatrix} E(1) \\ E(2) \\ E(3) \end{pmatrix}, \qquad i = 1, NW \tag{26}$$

where $H_i$ is the measured chromatogram at wavelength $i$ (NS data points) and $a_{1,i}$, $a_{2,i}$, and $a_{3,i}$ are the respective absorbances of the three pure spectra at wavelength $i$. For all chromatograms

$$H = A E \tag{27}$$

where $A$ is the NW*NC matrix of the absorbances of the pure spectra. $A$ is estimated by a least-squares procedure

$$A' = (E E')^{-1} E H' = S \tag{28}$$
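The R-mode least-squares step of eq 27 and 28 mirrors eq 18 with the roles of spectra and profiles exchanged. The sketch below assumes the normalized "estimated pure profiles" are already available as the rows of `E`; the function name and the synthetic data are hypothetical. Since $H = D'$, the product $E H'$ in eq 28 is simply $E D$.

```python
import numpy as np

def estimate_spectra(D, E):
    """Pure spectra S = A' (NC x NW) from A' = (E E')^-1 E H' with H = D' (eq 28)."""
    return np.linalg.solve(E @ E.T, E @ D)   # solves (E E') A' = E D

# Hypothetical cluster: three unit-norm elution profiles and known spectra.
t = np.arange(50)
E = np.vstack([np.exp(-0.5 * ((t - 18) / 3.0) ** 2),
               np.exp(-0.5 * ((t - 24) / 3.0) ** 2),
               np.exp(-0.5 * ((t - 30) / 3.0) ** 2)])
E /= np.linalg.norm(E, axis=1, keepdims=True)          # norm = 1, as in eq 25
S_true = np.array([[2.0, 1.2, 0.2, 0.0],
                   [0.1, 0.5, 0.25, 0.05],
                   [0.0, 0.3, 2.1, 3.0]])               # unnormalized spectra
D = E.T @ S_true                                        # NS x NW data matrix

S = estimate_spectra(D, E)
norms = np.linalg.norm(S, axis=1)                       # one norm per spectrum
```

Because the profiles in `E` have unit norm while the data are not normalized, the recovered spectra carry the signal magnitude of each compound in their norms.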

Because the measured chromatograms were not normalized, unnormalized spectra will be obtained. The norms of these calculated pure spectra however are the scale factors with which the elution profiles B(1), E(2), and E ( 3 ) should be multiplied in order to obtain the true contributions of each of the compounds in the total signal. The scaled elution profiles are therefore = E(i).IIS(i)ll for i = 1, 2, and 3 (29) The condition for the application of R-mode curve resolution is that the number of wavelengths (NW) in the data matrix D is larger than the number of spectra in the cluster considered. This condition is easily met. From that condition it follows that "estimated pure spectra" are directly obtainable with the highest resolution attainable with the diode array


EXPERIMENTAL SECTION

Figure 9. Three-component Q-mode resolution of A65P13P6: (a) three-component (θ, φ) plot of the spectra; (b) line of the (θ, φ) pairs of candidate "estimated pure spectra"; (c) area to norm ratio of all candidate "estimated pure spectra".


Figure 8. Modules for curve resolution added to the (a) HP 01040-10301 (cartridge) software and (b) HP 01040-10001 (floppy disk) software.

detector. When the HP85 is used, the elution profiles are calculated over at most 20 points, and a subsequent recycling of the results is necessary to obtain more values for the elution profiles. This procedure is equivalent to eq 21 and 22 for the enhancement of the spectra in the Q-mode curve resolution. Each measured spectrum $M_i$ (row of matrix D) that was not included in the curve resolution can be written as

$$M_i = C_{i,1} \cdot S(1) + C_{i,2} \cdot S(2) + C_{i,3} \cdot S(3) \tag{30}$$

which can be solved for $C_{i,1}$, $C_{i,2}$, and $C_{i,3}$, each of which is a point of one of the three resolved profiles. Here S(1), S(2), and S(3) are the three rows of the 3*NW data matrix S.

It should be remarked that the elution profiles indicate the relative contribution of the compounds to the signal in absorbance units, but the area under the curves does not indicate the relative concentration of the compounds. The reason is that the absorptivity coefficients, which relate absorbance to concentration, have not been determined. A clear advantage of the R-mode curve resolution is the fact that the constraint on the area to norm ratio is more realistic for elution profiles than for spectra. The smallest area for a given norm is found for those solutions on the solution triangle which represent the sharpest single-peaked elution profiles. A calibration procedure is being tested for quantitative analysis.
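The statement that the sharpest single-peaked profile has the smallest area for a given norm can be checked numerically. The Python sketch below is illustrative only: it assumes that the area is the plain sum of the points and that the norm is the Euclidean norm, and the helper names `area_to_norm` and `gauss` as well as the Gaussian test profiles are hypothetical.

```python
import numpy as np

def area_to_norm(profile):
    """Area (sum of the points) divided by the Euclidean norm of a profile."""
    profile = np.asarray(profile, dtype=float)
    return profile.sum() / np.linalg.norm(profile)

t = np.arange(200.0)

def gauss(center, width):
    """Gaussian test peak on the grid t (hypothetical elution profile)."""
    return np.exp(-0.5 * ((t - center) / width) ** 2)

sharp = gauss(100.0, 4.0)                       # narrow single peak
broad = gauss(100.0, 12.0)                      # wide single peak
double = gauss(70.0, 4.0) + gauss(130.0, 4.0)   # profile with a false side maximum

for name, p in (("sharp", sharp), ("broad", broad), ("double", double)):
    print(name, round(area_to_norm(p), 3))
```

The ratio does not change when a profile is multiplied by a constant, so it depends on shape only. Both the broad peak and the double-peaked profile give a larger ratio than the sharp single peak, which is why the minima of the ratio pick out the simplest profiles.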

The chromatograms were obtained with equipment consisting of a M6000A pump (Waters Associates, Milford, MA), a Rheodyne-7125 injector, and home-packed columns filled with ODS-Hypersil 5 μm (Shandon). A Hewlett-Packard Model 1040A diode array UV-Vis detector was used in conjunction with a HP85 microcomputer for the acquisition and processing of the spectra. Standard HP software was used for the operation of the detector from a built-in tape cartridge (package HP 01040-10001) and from a floppy disk as well (package HP 01040-10301). Software for the curve resolution has been incorporated in both HP software packages. A main-frame version has been developed as well.

The structure of the HP85 program is shown in Figure 8. All routines in the solid frame are extensions which are called in an overlay structure. All routines with an "s" in front of the name are Q-mode routines and routines with an "e" are R-mode routines (see Theory). Routines with neither prefix are shared by both modes.

A specific routine in the tape version is the storage of spectra, which allows the user to dump all spectra stored in the buffer of the 1040A detector (at most 30 spectra recorded over 115 wavelengths) into a file for further processing. Because of the limited storage capacity of the detector buffer and the slow transfer speed to the tape, only one full buffer (30 spectra) can be transferred and stored during a run. Tape operation, therefore, is appropriate for the data acquisition and storage of one peak cluster per run only.

Unlimited storage of spectra (200 wavelengths) on floppy disk, with fast data transfer between buffer and disk, is provided with the standard HP software for disk. Specific routines which have been incorporated in the disk system are string, select, and maxchr:
"string" decodes the HP coded strings, which contain the spectral information, into decimal notation (ASCII).
"maxchr" creates a file which contains the maximal absorbance of each spectrum and plots the so-called "maximum chromatogram."
"select" allows a cursor-controlled selection and storage of the spectra which belong to a given cluster. For this purpose the cursor is positioned at the beginning and at the end of a peak cluster in the "maximum chromatogram" plotted on the screen.

The common routines for the tape and disk system are as follows:
"(s/e)EVALUC" controls the subprograms for curve resolution.
"(s/e)COMAT" selects the data file for curve resolution and


Figure 10. Summary of "estimated pure spectra" and resolved elution profiles (R mode) found for system M80P13.


Figure 11. True spectra of the compounds in the 13-compound system.

Table I. Overview of the Curve Resolution of the Seven-Cluster System M80P13

cluster    no. of components    estimated pure spectra^a
1          solvent
2          1                    1
3          4                    2, 3, 4, and 5
4          2                    6 and 7
5          2                    8 and 9
6          3                    10, 11, and 12
7          1                    13

^a The estimated pure spectra (Figure 10) can be compared with the corresponding true spectra given in Figure 11 (same numbering).

calculates the dispersion matrix D'D (in Q mode) or DD' (in R mode).
"HQRII" calculates the eigenvalues and eigenvectors of the dispersion matrix. The maximum dimensionality of the dispersion matrix is 20 for the HP85 version (90 for the main-frame version). This limits the Q-mode curve resolution (on spectra) to the inclusion of at most 20 wavelengths, and the R-mode curve resolution to 20 spectra (1 cluster), when a HP85 is used.
"tfPLOT" makes the θ-φ plot and calculates the scores of the "estimated pure spectra" (Q mode) or "estimated pure elution profiles" (R mode).
"eELUTI" derives the normalized "estimated pure elution profiles".
"eCOR" calculates the "estimated pure spectra" and the values of the elution profiles for spectra not included in the curve resolution. Recombination of any fragments of the pure spectra is automatically carried out by positioning the cursor at the locations of the maxima of the elution profiles.
"sCOR" derives the normalized pure spectra and calculates the elution profiles. The routine calculates the absorbance of the "estimated pure spectra" at wavelengths not included in the curve resolution. Recombination of any fragments of the pure spectra is automatically carried out by positioning the cursor at the locations of the maxima of the elution profiles.
"(s/e)WAVE" plots, at each wavelength of the user's choice, the separate chromatograms, the overall chromatogram, and the measured chromatogram.

RESULTS AND DISCUSSION

The performance of the proposed curve resolution method

has been tested on a number of two- and three-component clusters of polycyclic aromatic hydrocarbons separated with mobile phases of various compositions. An application to the separation of proteins is discussed as well. In Figure 1 the maximum plots are given for the separation of an 8-component and a 13-component system, respectively, under the conditions described in the Experimental Section. The peak clusters are coded as follows: M80P13P6 means that the mobile phase was 80% methanol in water (M80) and that the sixth peak cluster (P6) of the 13-component system (P13) is considered. A65 is 65% acetonitrile in water; T40 is 40% tetrahydrofuran in water.

(A) Q-Mode Three-Component Curve Resolution (QCR-3). Without proof, Chen and Hwang (7) stated that the three "estimated pure spectra" are found in the three corners of a triangular line of (θ, φ) values which bounds the region where constraint 1 is satisfied. However, in many instances a rather diffuse solution line is obtained, with more than three corners (M80P13P6, Figure 4b) or without any corner at all (A65P13P6, Figure 9b), which requires the formal approach we described for finding the three solutions. A plot of the area to norm ratio of all spectra represented by these (θ, φ) pairs reveals that three of these corners coincide with the three minima of the area to norm ratio. In other words, these corners coincide with the three simplest spectra.

Q-mode curve resolution is demonstrated on the M80P13P6 cluster. This cluster contains 56 spectra recorded at 87 wavelengths (230-402 nm, resolution 2 nm). Processing with an HP85 allows a maximum of 20 wavelengths to be included in the curve resolution. Figure 6a gives the θ-φ plot of all spectra, surrounded by a line of (θ, φ) pairs which satisfy constraint 1. Note the bilinear shape of the plot of the measured spectra. The two spectra at the outer ends of the two lines are the purest spectra of the first and last components of the cluster, while the spectrum in the corner belongs to the compound which elutes between those two. The three spectra corresponding to the three (θ, φ) pairs with minimum area to norm ratio are calculated by applying eq 16 and are shown in Figure 6b. The resolution of the obtained "estimated pure spectra" is poor because of the large encoding interval. By use of more extensive computing facilities, a maximum of 56 wavelengths (the number of spectra in the cluster) can be included, which gives spectra with higher resolution


calculating the absorbances of the "estimated pure spectra" at the 67 wavelengths which were not included in the curve resolution. The final result (Figure 6f) is in close agreement with the true spectra (Figure 11, spectra no. 10, 11, and 12), and no doubt is left about their identification out of a set of 13. In addition, the curve resolution gives the exact sequence of elution of the compounds.

(B) Three-Component R-Mode Curve Resolution. The R-mode curve resolution leads to the same solution for the M80P13P6 cluster but in fewer steps. The limitation is now set on the number of spectra (at most 20) when a HP85 is used. Although in principle all wavelengths can be included, the limited memory space of the HP85 allows the inclusion of at most 60 wavelengths. All zero columns and rows of the data matrix D should be eliminated before curve resolution. All chromatograms can be represented in a (θ-φ) plot by applying eq 24 and are surrounded by the line which is the boundary of the region where constraint 1 is fulfilled (Figure 7a,b). Note the absence of any structure in the points which represent the measured chromatograms and the sharp corners in the solution line, which generally occur in the R-mode curve resolution. The three "estimated pure elution profiles" coinciding with the three corners (Figure 7d) are normalized and do not contain false side maxima. By the application of eq 28 the "estimated pure spectra" of the three components are calculated (Figure 7e), which are very similar to the spectra obtained in the Q mode (Figure 4f). As explained in the theoretical section, the elution profiles should be multiplied by the norms of these spectra in order to obtain the resolved elution profiles in their correct ratio. The result is in good agreement with those obtained in the Q mode. Note that the resolved elution profiles do not represent concentration profiles; obtaining the latter requires a calibration with mixtures of known constitution.
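The rescaling of the normalized elution profiles by the norms of the spectra (eq 29) can be sketched as follows. The function name `rescale_profiles`, the array layout (one component per row), and the small numerical example are assumptions made for illustration; they are not taken from the HP85 program.

```python
import numpy as np

def rescale_profiles(E_norm, S):
    """Move the scale from the spectra to the elution profiles (eq 29).

    E_norm : (NC, NS) array of normalized estimated pure elution profiles
    S      : (NC, NW) array of unnormalized estimated pure spectra
    Returns (scaled profiles E(i)*||S(i)||, unit-norm spectra S(i)/||S(i)||).
    """
    norms = np.linalg.norm(S, axis=1)   # one Euclidean norm per spectrum
    return E_norm * norms[:, None], S / norms[:, None]

# Hypothetical two-component example with two wavelengths and two time points
S = np.array([[3.0, 4.0],               # norm 5
              [0.0, 2.0]])              # norm 2
E_norm = np.array([[0.6, 0.8],
                   [1.0, 0.0]])

E_scaled, S_unit = rescale_profiles(E_norm, S)
print(E_scaled)                          # [[3. 4.] [2. 0.]]
# The modeled signal is unchanged: only the scale moves between the factors
print(np.allclose(S_unit.T @ E_scaled, S.T @ E_norm))
```

The scaled profiles are in absorbance units, which is why they give the correct ratio of the contributions to the signal but not the concentrations.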
An overview of the results of curve resolution (R mode) on all clusters in the M80P13 system is shown in Figure 10 and Table I, which can be compared with the true spectra given in Figure 11.

(C) Curve Resolution with a Faulty Number of Compounds in the Model. Several methods have been proposed


Figure 12. One-component cluster treated as a two-component system (M80P8P4).
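The number of components assumed in the model is tied to the dispersion matrix that COMAT builds and HQRII diagonalizes. As a rough illustration, the Python sketch below counts the dominant eigenvalues of D'D (Q mode) and DD' (R mode) for a synthetic, noise-free three-component cluster. Counting eigenvalues above a relative threshold is an assumption of this sketch and is not stated in the text; the threshold value, the function name `count_components`, and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
NS, NW, NC = 15, 30, 3                   # spectra, wavelengths, components

t = np.arange(NS, dtype=float)
# Columns of C: overlapping Gaussian elution profiles (hypothetical)
C = np.stack([np.exp(-0.5 * ((t - c) / 2.0) ** 2) for c in (5.0, 7.5, 10.0)], axis=1)
S = rng.uniform(0.1, 1.0, size=(NC, NW)) # rows: hypothetical pure spectra
D = C @ S                                # data matrix, one spectrum per row

# eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse them
eig_q = np.linalg.eigvalsh(D.T @ D)[::-1]  # Q mode: NW x NW dispersion matrix
eig_r = np.linalg.eigvalsh(D @ D.T)[::-1]  # R mode: NS x NS dispersion matrix

def count_components(eigenvalues, rel_tol=1e-8):
    """Count eigenvalues above rel_tol times the largest one (assumed criterion)."""
    return int(np.sum(eigenvalues > rel_tol * eigenvalues[0]))

print(count_components(eig_q), count_components(eig_r))
print(np.allclose(eig_q[:NC], eig_r[:NC]))  # the nonzero eigenvalues coincide
```

Both dispersion matrices carry the same nonzero eigenvalues, so the choice between Q and R mode only changes the size of the matrix to be diagonalized, which is what the limit of 20 on the HP85 constrains. With real, noisy data the threshold would have to be chosen from the noise level.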

(Figure 4d). The elution profiles obtained by solving eq 18 are clearly impure (Figure 6c). The adequate description of the last elution profile (called component 1 by the computer program) requires the combination of spectra 1 and 2. Equally, the description of the middle elution peak (component 2) requires the combination of spectra 2 and 3. This shows that the "estimated pure spectra" only represent fragments of the true pure spectra, from which better "estimated pure spectra" can be assembled. Combination of the fragments with the ratios of the heights of the three elution profiles at the positions of the three maxima, according to eq 20, yields the spectra and resolved elution profiles given in Figure 6d,e. Improvement of the resolution of the spectra is obtained by
