
Anal. Chem. 2007, 79, 7924-7927

Observations on “Orthogonality” in Comprehensive Two-Dimensional Separations

Nathanial E. Watson,†,‡ Joe M. Davis,§ and Robert E. Synovec*,†

Department of Chemistry, Box 351700, University of Washington, Seattle, Washington 98195, Department of Chemistry and Life Science, United States Military Academy, West Point, New York 10996, and Department of Chemistry and Biochemistry M/S 4409, Southern Illinois University, Carbondale, Illinois 62901

The concept and definition of orthogonality in the context of comprehensive two-dimensional (2D) separations are interesting topics of active discussion. Over the years, several approaches have been taken to quantify the degree of orthogonality, primarily to serve as a metric to optimize (and compare) comprehensive 2D separations. Recently, a mathematical function was reported that is qualitatively instructive for the purpose of providing such a metric. However, the mathematical function has some quantitative shortcomings. Herein, we both explore and partially correct this function. The orthogonality metric, referred to previously and herein as the orthogonality, O, was mathematically related to the fraction of the 2D separation space occupied by compounds (i.e., fractional coverage) and the peak capacity, P, for one dimension of the 2D separation. The fractional coverage, f, is simply related to the percentage coverage, which is equal to 100% × f. Our main finding was that the values for O as a function of P for a given percentage coverage achieve a constant value at large P but deviate severely to lower O values at small P. For comprehensive 2D separations operated such that the second dimension is at small P, the findings we report have consequences for those who consider applying the O metric. Finally, we discuss why the percentage coverage may be a better metric to gauge the extent to which the compounds in a given sample mixture have been disseminated in the 2D separation space.

The basic premise of applying comprehensive two-dimensional (2D) separations is to obtain more information than from a more traditional one-dimensional separation.1,2 To this end, the dissemination of compounds in a sample mixture as much as possible in the 2D separation seems to be a reasonable means to maximize the information content, or at least to enhance the potential for maximizing the information content. Thus, a reasonable working hypothesis is that the percentage coverage (or %Coverage) of the 2D separation should be maximized for a given number of mixture compounds and 2D peak capacity. Furthermore, it is appealing to correlate the %Coverage to the notion of optimizing the extent to which two separations provide complementary information in a comprehensive 2D separation, through the concept of orthogonality between the two separation dimensions. This appears to be the approach taken in a recent report.3 However, the definition of orthogonality, and quantification of it, is a matter of much discussion, and several interesting approaches have been described, with varying degrees of general utility.4-7 Herein we consider some issues that arose when we attempted to apply the definition of orthogonality (reported as a mathematical function) by Gilar and co-workers.3 The definition is qualitatively instructive but has some quantitative shortcomings, which we both explore and partially correct. Furthermore, we provide some insight for the general practitioner of comprehensive 2D separations who is interested in optimizing the “orthogonality” of 2D separations, albeit keeping in mind that orthogonality is a slippery subject.

* Corresponding author. Tel.: +1-206-685-2328. Fax: +1-206-685-8665. E-mail: [email protected].
† University of Washington.
‡ United States Military Academy.
§ Southern Illinois University.
(1) Giddings, J. C. Unified Separation Science; John Wiley & Sons, Inc.: New York, 1991.
(2) Bushey, M. M.; Jorgenson, J. W. Anal. Chem. 1990, 62, 161-167.
(3) Gilar, M.; Olivova, P.; Daly, A. E.; Gebler, J. C. Anal. Chem. 2005, 77, 6426-6434.
(4) Schoenmakers, P.; Marriott, P.; Beens, J. LC-GC Eur. 2003, 16, 335-339.
(5) Massart, D. L.; Kaufman, L. The Interpretation of Analytical Chemical Data by the Use of Cluster Analysis; John Wiley & Sons: New York, 1983.
(6) Slonecker, P. J.; Li, X.; Ridgway, T. H.; Dorsey, J. G. Anal. Chem. 1996, 68, 682-689.
(7) Liu, Z.; Patterson, D. G.; Lee, M. L. Anal. Chem. 1995, 67, 3840-3845.
10.1021/ac0710578 CCC: $37.00 © 2007 American Chemical Society. Published on Web 09/19/2007. Analytical Chemistry, Vol. 79, No. 20, October 15, 2007.

THEORY

In our attempts to utilize the mathematical procedure to quantify orthogonality, using the variable O outlined in the cited report,3 we recognized that eq 3 of this report has shortcomings. This equation is recited as eq 1 below

$$O = \frac{\sum \mathrm{bins} - \sqrt{P_{\max}}}{0.63\,P_{\max}} \tag{1}$$

where Pmax = P1 × P2 is the number of contiguous bins spanning the 2D space, with P1 and P2 equaling the one-dimensional peak capacities of the two dimensions and Pmax equaling the 2D peak capacity. In the referenced report, the discussion was simplified by the assumption, P1 = P2 ≡ P, resulting in a square 2D separation space. Here, we retain this simplification, with the recognition that P1 often exceeds P2 in practice, such that, in the event P1 > P2, eq 1 must be applied piecewise to the 2D separation, with each piece having a first-dimension capacity equaling P2. In eq 1, ∑bins is the total number of bins (i.e., separation space) occupied by compound retention times, as illustrated in Figure 3 of the cited reference.3 This figure illustrates three different types of coverage: (A) the so-called nonorthogonal situation of compounds occupying bins along the diagonal only, (B) the hypothetical situation of a single compound occupying each bin center, and (C) the situation in which bins are randomly occupied, resulting in 63% bin coverage. The measure of orthogonality per eq 1 was intended to span from 0 to 1, for nonorthogonal to fully orthogonal conditions, respectively. In terms of a percent orthogonality, O% was intended to span from 0 to 100%. In an attempt to describe and correct the shortcoming(s) in eq 1, the following derivations are useful to consider. First, since Pmax is defined by P² in the cited reference and also above, one can simplify eq 1 to

$$O = \frac{\sum \mathrm{bins} - P}{0.63\,P^{2}} \tag{2}$$

It is also useful to define fractional coverage of the occupied separation space, i.e., the fraction of bins occupied by compound retention times

$$f = \frac{\sum \mathrm{bins}}{P^{2}} \tag{3}$$

and relate fractional coverage to percentage coverage (%Coverage).

$$\%\mathrm{Coverage} = 100\% \times f \tag{4}$$
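As a minimal sketch of eqs 1-4 (not code from the cited report), the definitions can be written in a few lines of Python. The function names are hypothetical and chosen here for illustration only, and a square separation space (P1 = P2 = P) is assumed, as in the text.

```python
import math


def fractional_coverage(occupied_bins: float, p: float) -> float:
    """Eq 3: fraction of the P x P bins occupied by compound retention times."""
    return occupied_bins / p**2


def percent_coverage(f: float) -> float:
    """Eq 4: percentage coverage from the fractional coverage f."""
    return 100.0 * f


def orthogonality_original(occupied_bins: float, p_max: float) -> float:
    """Eq 1: O = (sum(bins) - sqrt(Pmax)) / (0.63 * Pmax)."""
    return (occupied_bins - math.sqrt(p_max)) / (0.63 * p_max)


if __name__ == "__main__":
    p = 10          # peak capacity per dimension (illustrative value)
    occupied = 63   # occupied bins, i.e., 63 of the 100 bins
    f = fractional_coverage(occupied, p)
    print("f =", f, " %Coverage =", percent_coverage(f))
    print("O (eq 1) =", orthogonality_original(occupied, p**2))
```

With these illustrative inputs (P = 10, 63 occupied bins), eq 1 evaluates to 53/63, or about 0.84, rather than 1.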

Equations 1-4 are consistent with eq 3 in the cited report3 (eq 1 herein) and merely define some of the corresponding detail more concisely. Hypothetically, and ideally, orthogonality as defined by O should be independent of P, i.e., constant for a given %Coverage. Unfortunately, this is not the case as will be demonstrated. One shortcoming of eq 2 (as defined from eq 1) is in the denominator; the value of 0.63 does not appear valid for all P. The maximum value of O, equal to 1, purportedly is realized at f = 0.63, the value proposed by the authors of ref 3 as a reasonable limit based on the %Coverage of randomly distributed mixture compounds equal in number to Pmax (ref 8; the origin of the number, 0.63, is discussed later). By “reasonable limit”, the authors of ref 3 implied that f = 0.63 is likely the best %Coverage that one can expect and should be used to define the optimal orthogonality, O = 1. Utilizing the following modification of eq 2

$$O = \frac{\sum \mathrm{bins} - P}{X\,P^{2}} \tag{5}$$

the variable, X, is substituted for 0.63. With eq 3, one can express ∑bins in eq 5 as fP², leading to the conclusion that for O = 1 and f = 0.63,

$$X = 0.63 - \frac{1}{P} \tag{6}$$

(8) Davis, J. M. J. Sep. Sci. 2005, 28, 347-359.
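The algebra behind eq 6 is a single step. Written out as a worked restatement, using only the symbols already defined (O = 1 and ∑bins = 0.63P² inserted into eq 5):

```latex
\begin{align*}
1 &= \frac{0.63\,P^{2} - P}{X\,P^{2}} \\
X &= \frac{0.63\,P^{2} - P}{P^{2}} = 0.63 - \frac{1}{P}
\end{align*}
```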

The authors’ original equation (eq 1) states that X should be a constant equal to 0.63, but eq 6 shows that this is only true in the limit as P → ∞. Thus, at P = 10 as applied in the cited report,3 the value of X in eq 5 (per eq 6) is 0.53, and not 0.63, in order to satisfy the O = 1 boundary condition. In an attempt to correct this shortcoming, eq 6 is substituted into eq 5, producing eq 7.

$$O = \frac{\sum \mathrm{bins} - P}{0.63\,P^{2} - P} \tag{7}$$
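Equations 6 and 7 can be checked numerically with a short sketch. The function names below are hypothetical, ∑bins is written as fP² via eq 3, and the sketch assumes P > 1/0.63 (about 1.6) so that the denominator of eq 7 is positive.

```python
def x_required(p: float) -> float:
    """Eq 6: value of X for which eq 5 returns O = 1 at f = 0.63."""
    return 0.63 - 1.0 / p


def orthogonality_corrected(f: float, p: float) -> float:
    """Eq 7, with sum(bins) expressed as f * P**2 (eq 3).

    Assumes p > 1/0.63 so that the denominator is positive.
    """
    occupied_bins = f * p**2
    return (occupied_bins - p) / (0.63 * p**2 - p)


if __name__ == "__main__":
    for p in (5, 10, 50, 1000):   # illustrative peak capacities
        print(
            f"P = {p:>4}  X = {x_required(p):.3f}  "
            f"O(f = 0.63) = {orthogonality_corrected(0.63, p):.3f}  "
            f"O(f = 1/P) = {orthogonality_corrected(1.0 / p, p):.3f}"
        )
```

By construction, the corrected expression returns O = 1 at f = 0.63 and O = 0 at f = 1/P for every P in the loop, while X falls well below 0.63 at small P.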

As P → ∞, the number of occupied bins (fP²) greatly exceeds P for any complex multicomponent mixture (which relates to the practical situation faced regarding a suitably diverse sample), and eq 7 reduces to eq 8

$$O = \frac{\sum \mathrm{bins}}{0.63\,P^{2}} \tag{8}$$

which interestingly is equal to f/0.63. We now consider the other boundary condition, O = 0, from the cited report.3 It is determined as the sum of only the number of bins along the diagonal of the square 2D space (i.e., as ∑bins = P), equating to a totally correlated 2D separation (the so-called nonorthogonal situation of compounds occupying bins along the diagonal only). Thus, the O = 0 boundary condition is related to fractional coverage (eq 3) by eq 9

$$f = \frac{1}{P} \tag{9}$$
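Equation 9 is easy to tabulate. The sketch below (hypothetical function name, illustrative list of P values) converts f = 1/P to a percentage, which gives the %Coverage at which O = 0 for each P, and the printed values can be compared with Table 1.

```python
def zero_orthogonality_coverage(p: int) -> float:
    """Eq 9 as a percentage: %Coverage when only the P diagonal bins are filled."""
    return 100.0 / p


if __name__ == "__main__":
    for p in (1, 2, 3, 4, 5, 10, 25, 50, 100, 200, 500, 1000):
        print(f"P = {p:>4}  %Coverage at O = 0: {zero_orthogonality_coverage(p):.1f}%")
```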

Substitution of the condition, ∑bins = P, into eq 7 does indeed result in the proper answer of O = 0. However, as we shall see, the orthogonality equation (eq 7, as evolved from eq 1) still does not behave ideally as a function of P.

RESULTS AND DISCUSSION

In order to characterize the equations described above, a few instructive plots were created. Figure 1 was calculated at six %Coverage values (1, 10, 25, 40, 55, and 63%) by evaluating ∑bins for the specified f and different 2D peak capacities P² using eqs 3 and 4, substituting these calculations into eq 1 (or, equivalently, eq 2), and then plotting the results. As is evident based on consideration of eqs 5-7, the resulting O values are skewed at low P, as P goes to zero. Figure 2 displays eq 6 graphically and shows how X is reduced drastically as P approaches low values, in relation to its limiting value of 0.63 as P → ∞. Finally, Figure 3 displays the results of applying eq 7 at the same %Coverage values utilized in Figure 1, with ∑bins calculated as before. Here, eq 7 has been properly corrected for changes of X as a function of P, which is to say the orthogonality equation has been corrected for the O = 1 boundary condition. Indeed, a constant value of O = 1 is observed in Figure 3 for a 63% coverage, and the severity of skew at small P is reduced as the %Coverage approaches this limit. However, this correction has not solved the overall problem.

Figure 1. Orthogonality (O) versus peak capacity (P) of either first or second dimension, plotted utilizing the original equation (eq 1 expressed as eq 2) at 1, 10, 25, 40, 55, and 63% coverage. In the limit as P → ∞, the relation between O and P stabilizes, but short of the limit skew exists and is significant at small P values. It is important to note that both boundary conditions have shortcomings and the “maximum” 63% coverage (based on a random compound distribution) does not correspond to O = 1 until P approaches ∼100 or more.

Figure 2. Equation 6, X, versus P. The value of X approaches 0.63 as P → ∞. Short of P → ∞, application of the limiting value, X = 0.63, to all P in eq 5 (per eqs 1 and 2) causes some of the skew in O depicted in Figure 1. Application of the correct value of X, expressed by eq 6, leads to Figure 3.

Figure 3. Orthogonality (O) versus peak capacity (P) of either first or second dimension, plotted utilizing the “more correct” eq 7 at 1, 10, 25, 40, 55, and 63% coverage. Even with the corrections derived herein the correspondence between %Coverage and O is skewed at low P. The O = 1 boundary condition has been corrected, but the O = 0 boundary condition is still problematic. The O values do approach their limits with better behavior with the corrected orthogonality equation (eq 7).

The reason that eq 7 is still problematic, and the overall approach to define orthogonality as such remains compromised, is that at low P the %Coverage, corresponding to filling only the bins along the diagonal, changes drastically. These results are depicted in Table 1 and are merely the solution of eq 9 at the listed values of P. Whereas the O = 1 boundary condition conforms well to 63% coverage per the “more correct” eq 7, the O = 0 boundary condition is a moving target, dependent upon the value of P. This compromises the notion that orthogonality, as defined by O, should be independent of P. Thus, the equations for defining orthogonality appear limited, primarily because the minimum %Coverage is a function of P and not a constant. The problem is most severe at small P, whereas O values at large P are asymptotically correct. In the prior report,3 the authors described an example with the convenient value of P = 10, which inadvertently leads the reader to believe that 10% coverage is the limiting value per Figure 3 in the cited report.3 In reality, however, this is only true at P = 10 and nowhere else (as shown in Figure 3 of both this and the referenced report). Indeed, the value for orthogonality actually goes negative as P is sufficiently reduced, if the %Coverage is low enough as demonstrated in Figures 1 and 3.

Table 1. Percentage Coverage Values (%Coverage) Corresponding to the Zero Orthogonality Boundary Condition, O = 0

P        %Coverage
1        100.0%
2        50.0%
3        33.3%
4        25.0%
5        20.0%
10       10.0%
25       4.0%
50       2.0%
100      1.0%
200      0.5%
500      0.2%
1000     0.1%
∞        0.0%

Another limitation of both eqs 1 and 7 at small P is their statistical nature. The number, 0.63, in these equations is the average fractional coverage f in a 2D separation containing randomly dispersed compounds equal in number to the 2D peak capacity Pmax (ref 3). For small P, the compound number is also small, and the fractional coverage for any one separation may differ from the average f due to random fluctuation for small compound numbers. For example, 2000 simulations of 100 randomly dispersed retention times in 100 bins (P = 10) show that individual fractional coverages range from 0.51 to 0.74, whereas 2500 randomly dispersed compounds in 2500 bins (P = 50) show that individual values of f vary only from 0.61 to 0.66. The dispersion of f in the former case illustrates another bias in O values determined for small P.

We do not mean to be unduly critical of eq 1 or the concepts underlying its derivation. The authors of ref 3 have indeed provided a semiquantitative measure of orthogonality valid for large P. However, the shortcomings of this orthogonality metric merit recognition, lest misleading conclusions be drawn.

Even though P = 1 is not realistic (per Table 1), comprehensive 2D separation systems operating at low P in the second dimension are actually quite common. Although a large peak capacity is an important goal in 2D separations, achieving a large peak capacity in the second dimension is difficult while still meeting the criteria of performing a comprehensive 2D separation,2,9 and simultaneously running the first-dimension separation at a near-optimal separation efficiency. Certainly, one can run the first dimension under conditions aimed to achieve overly broad peaks, followed by “longer” second-dimension runs (and hence higher second-dimension peak capacity). However, for many optimized comprehensive 2D separations in which sampling from the first dimension to the second dimension is performed in “real time,” e.g., 2D liquid chromatography (LC × LC) and 2D gas chromatography (GC × GC), it is common to have a peak capacity P2 in the range of 5-10 (refs 10-12), with P1 larger than P2. The piecewise application of eqs 1 and 7 (i.e., by breaking the first-dimension P1 into n subsections each of P2 such that n = P1/P2) to such systems having small P2 would be misleading per the conclusions drawn from Figures 1 and 3 as well. For example, using the “more correct” eq 7, plotted in Figure 3, at %Coverage = 40% (f = 0.40), the limiting value for O is 0.63, while at P = 5 (O = 0.47) and at P = 10 (O = 0.57), the value for O differs from this limit. On the other hand, for comprehensive 2D methods using capillary electrophoresis (e.g., CE × CE (ref 13) and LC × CE (ref 14)), a relatively larger P in the second dimension is often achieved. While maintaining comprehensiveness,2,9 the larger P in the second dimension can be attributed to use of either a “stop-flow” sampling mode for interfacing the two columns (ref 13) or the combination of a less efficient LC first dimension followed by a very efficient CE second dimension.14 Further improvements to the equation for O are possible.
For example, eq 7 could be generalized by replacing the number, 0.63, by 1 - exp(-R), where R is the ratio of the compound number to Pmax (ref 8). This expression is the expected fraction of occupied bins for a random distribution of 2D retention times, applicable to all ratios of compound number and Pmax. This generalization would remove the somewhat arbitrary case considered by the authors of ref 3, R = 1 (i.e., 1 - exp(-1) ≈ 0.63). However, details of this improvement are not pursued here. Finally, one should recognize that the value of O varies with the number of bins occupied by compounds, regardless of P or the specific equation used. Consequently, it depends on experimental details affecting compound detection, such as sample recovery, the collection of eluent fractions, and various parameters such as peak width, separation speed, and detector response. These may differ among different laboratories, giving different O values. The conclusion is true for any measure of orthogonality, however, and not just the measure predicted by O. Many of these points regarding the influence of experimental details on O were previously considered.3

(9) Khummueng, W.; Harynuk, J.; Marriott, P. J. Anal. Chem. 2006, 78, 4578-4587.
(10) Porter, S. E. G.; Stoll, D. R.; Rutan, S. C.; Carr, P. W.; Cohen, J. D. Anal. Chem. 2006, 78, 5559-5569.
(11) Harynuk, J.; Marriott, P. J. Anal. Chem. 2006, 78, 2028-2034.
(12) Pierce, K. M.; Wood, L. F.; Wright, B. W.; Synovec, R. E. Anal. Chem. 2005, 77, 7735-7743.
(13) Kraly, J. R.; Jones, M. R.; Gomez, D. G.; Dickerson, J. A.; Harwood, M. M.; Eggertson, M.; Paulson, T. G.; Sanchez, C. A.; Odze, R.; Feng, Z.; Reid, B. J.; Dovichi, N. J. Anal. Chem. 2006, 78, 5977-5986.
(14) Hooker, T. F.; Jorgenson, J. W. Anal. Chem. 1997, 69, 4134-4142.

Most of the comments above address shortcomings of a specific expression of orthogonality. In addition, the use of orthogonality itself as a metric of 2D separations merits general discussion. Any 2D separation containing randomly distributed compounds approaches orthogonality (as defined in a mathematical sense) as compound number increases, with a correlation coefficient r that approaches zero (the angle between two vectors describing the relation among 2D coordinates is cos⁻¹ r (ref 5), equaling 90° for r = 0). As the parameter R is varied, both f and the fraction of resolved peaks in a random 2D separation are altered, even though orthogonality (or near orthogonality) is preserved. Thus, orthogonality alone is an incomplete measure of separation quality. This is emphasized further by the distribution of 100 retention times among 100 bins in Figure 3B and C of ref 3: In a mathematical sense, both figures contain retention times approaching orthogonality (r → 0), but the former, with structured retention times and f = 1, corresponds to a better separation than the latter, with random retention times and f = 0.63. In spite of its traditional use in 2D separations, orthogonality in a mathematical sense is a poor metric of separation quality. We think the concept of %Coverage or f introduced by the authors of ref 3 is a better metric than orthogonality. However, a constraint must be imposed on its use. For example, the f of random 2D separations, 1 - exp(-R), can be increased simply by increasing R. For a given compound number, this can be achieved by decreasing bin number and increasing bin size. Such action would incorporate retention times in a greater fraction of bins but also would degrade the separation, since the (decreasing) bin number equals the 2D peak capacity.
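The dependence of f on R can be illustrated with a small Monte Carlo sketch. This is not the simulation code used by the authors: the function names, the fixed seed, the compound number (100), and the three bin counts are all assumptions chosen for illustration, and compounds are simply assigned to bins uniformly at random.

```python
import math
import random


def expected_fraction(r: float) -> float:
    """Expected fraction of occupied bins, 1 - exp(-R), with R = compounds / Pmax."""
    return 1.0 - math.exp(-r)


def simulated_fraction(n_compounds: int, n_bins: int, rng: random.Random) -> float:
    """Place compounds in bins uniformly at random; return the occupied fraction."""
    occupied = {rng.randrange(n_bins) for _ in range(n_compounds)}
    return len(occupied) / n_bins


if __name__ == "__main__":
    rng = random.Random(0)         # fixed seed so the sketch is repeatable
    n_compounds = 100              # illustrative compound number
    for n_bins in (400, 100, 25):  # fewer, larger bins give a larger R
        r = n_compounds / n_bins
        trials = [simulated_fraction(n_compounds, n_bins, rng) for _ in range(2000)]
        mean_f = sum(trials) / len(trials)
        print(
            f"bins = {n_bins:>3}  R = {r:.2f}  1 - exp(-R) = {expected_fraction(r):.3f}  "
            f"simulated mean f = {mean_f:.3f}  range = {min(trials):.2f} to {max(trials):.2f}"
        )
```

In this sketch, merging bins raises f while the bin count, and therefore the 2D peak capacity, falls.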
Instead, one should seek to maximize %Coverage for a given R (or Pmax) by disseminating mixture compounds as completely as possible over the 2D separation space. A similar argument can be given for correlated 2D separations.

CONCLUSIONS

It may not be prudent to utilize the “orthogonality” concept as described, for the purpose of quantifying the extent to which the two columns in a 2D separation system are performing well together. Alternatively, one can simply utilize %Coverage for this purpose, given a specific R (or Pmax). Furthermore, %Coverage appears to be a good metric to discuss how well the 2D separation space is filled, while having nothing to do with “orthogonality” in a mathematical sense. From a practical standpoint, the reader must also recognize that an appropriately large test set of a wide range of compounds is required to utilize the %Coverage metric and that the size of this test set increases with P. If readers desire to deal in more depth with the concept of orthogonality, it is suggested that they reference Massart,5 Slonecker et al.,6 and Liu et al.7 for further discussions, albeit with different approaches taken.

Received for review May 22, 2007. Accepted August 16, 2007.

AC0710578
