Anal. Chem. 2000, 72, 2869-2874
Supersaturated Designs for Robustness Testing
Yvan Vander Heyden, Siriporn Kuttatharmmakul, Johanna Smeyers-Verbeke, and Desire Luc Massart*
ChemoAC, Pharmaceutical Institute, Vrije Universiteit Brussel, Laarbeeklaan 103, 1090 Brussel, Belgium
Supersaturated designs are factorial designs in which the number of factors examined exceeds the number of experiments performed. They do not allow estimation of individual effects, since even the main effects are confounded. However, the total variance of a response estimated from such a design can in principle be used as a measure of the robustness of the method. A number of case studies were examined to determine whether the variance estimated from a supersaturated design is similar to that from a Plackett-Burman design. This was found to be the case, which means that the estimated variance describes well the variation in the response caused by the variation in the factors.

In the robustness tests traditionally performed and described in the literature, a screening design, such as a Plackett-Burman or a fractional factorial design, is performed (1-5). From the design results, the effect of each factor on the response is calculated and statistically evaluated. The aim of the test is to verify the robustness of a method by identifying the factors that could be responsible for the nonrobustness of the method, i.e., the factors that cause a large change in response for a small change in their levels. For large numbers of factors, one has to perform a screening design with relatively many experiments (6). However, in most laboratories there is a tendency to limit the number of experiments as much as possible, and the literature shows that the seven-factor, eight-experiment Plackett-Burman design (2^(7-4) fractional factorial design) is often used (2,7-9). It could be interesting to have the opportunity (i) to examine even more factors with the same number of experiments as in the commonly used screening designs or (ii) to examine the same number of factors in fewer experiments than is possible in the screening designs.

Supersaturated designs are factorial designs with N experiments in which the number of factors examined is higher than N - 1 (10). The use of supersaturated designs could make robustness testing in method validation more attractive because of the reduced workload compared to the usual screening designs. Supersaturated designs do not allow estimation of the effects of the individual factors because of confounding between the main effects. However, estimation of the separate factor effects is not necessarily required in robustness testing (11-13). The ICH (International Conference on Harmonisation of Technical Requirements for the Registration of Pharmaceuticals for Human Use) guidelines (13), for instance, define robustness as follows: "The robustness of an analytical procedure is a measure of its capacity to remain unaffected by small, but deliberate variations in method parameters and provides an indication of its reliability during normal usage." The total variance of the responses in a design could be used as a measure of this capacity. The variations in method parameters that are introduced during the execution of a design for robustness testing will increase the total variance of the response, on the condition that at least one factor affects the response, compared to the variance of replicated measurements at method (nominal) conditions, which are not affected by these variations. In a traditional approach, one does not evaluate this increased variance but immediately tries to identify the source(s) of this disturbance from the estimated effects. However, no reliable effect estimates can be made from a supersaturated design. Therefore, only the total variance of the response could potentially serve as an indicator of the robustness of the method. As far as we know, there are no applications of supersaturated designs in analytical chemistry and certainly not for robustness testing.

However, to be able to use supersaturated designs in robustness testing, it is necessary to answer the question, "Can the variance of responses, estimated from a supersaturated design, be used as a measure for the robustness of the method, i.e., does the variance represent the variability introduced by the examined factors?" This question is examined in this paper.

* Corresponding author: (e-mail) [email protected]; (fax) (+32) 2 477 47 35.
(1) Wahlich, J. C.; Carr, G. P. J. Pharm. Biomed. Anal. 1990, 8, 619-623.
(2) Sun, S. W. Experimental and statistical approach for validating a test procedure in a pharmaceutical formulation. Ph.D. thesis, Laboratoire de Chimie Analytique, Faculté de Pharmacie, Montpellier, 1993.
(3) Chaminade, P.; Feraud, S.; Baillet, A.; Ferrier, D. S.T.P. Pharma Prat. 1995, 5, 17-35.
(4) Virlichie, J. L.; Ayache, A. S.T.P. Pharma Prat. 1995, 5, 49-60.
(5) Altria, K. D.; Filbey, S. D. Chromatographia 1994, 39, 306-310.
(6) Vander Heyden, Y.; Questier, F.; Massart, D. L. J. Pharm. Biomed. Anal. 1998, 17, 153-168.
(7) Youden, W. J.; Steiner, E. H. Statistical Manual of the Association of Official Analytical Chemists; The Association of Official Analytical Chemists: Arlington, 1975; pp 33-36, 70-71, 82-83.
(8) Fabre, H.; Meynier de Salinelles, V.; Cassanas, G.; Mandrou, B. Analusis 1985, 13, 117-123.
(9) Mulholland, M.; Waterhouse, J. J. Chromatogr. 1987, 395, 539-551.
(10) Booth, K. H. V.; Cox, D. R. Technometrics 1962, 4, 489-495.
(11) van de Vaart, F. J.; et al. Het Pharmaceutisch Weekblad 1992, 127, 1229-1235.
(12) Caporal-Gautier, J.; Nivet, J. M.; Algranti, P.; Guilloteau, M.; Histe, M.; Lallier, M.; N'Guyen-Huu, J. J.; Russotto, R. S.T.P. Pharma Prat. 1992, 2, 205-239.
(13) ICH Harmonised Tripartite Guideline prepared within the International Conference on Harmonisation of Technical Requirements for the Registration of Pharmaceuticals for Human Use (ICH), Text on Validation of Analytical Procedures, 1994 (http:/www.ifpma.org/ich1.html).
10.1021/ac991440f CCC: $19.00 © 2000 American Chemical Society. Published on Web 05/24/2000
Analytical Chemistry, Vol. 72, No. 13, July 1, 2000 2869
Table 1. Construction of Two Supersaturated Designs from the N = 12 Plackett-Burman Design. Branching Column: G

Plackett-Burman Design for 11 Factors (N = 12)
exp    A    B    C    D    E    F    G    H    I    J    K
  1    1    1   -1    1    1    1   -1   -1   -1    1   -1
  2   -1    1    1   -1    1    1    1   -1   -1   -1    1
  3    1   -1    1    1   -1    1    1    1   -1   -1   -1
  4   -1    1   -1    1    1   -1    1    1    1   -1   -1
  5   -1   -1    1   -1    1    1   -1    1    1    1   -1
  6   -1   -1   -1    1   -1    1    1   -1    1    1    1
  7    1   -1   -1   -1    1   -1    1    1   -1    1    1
  8    1    1   -1   -1   -1    1   -1    1    1   -1    1
  9    1    1    1   -1   -1   -1    1   -1    1    1   -1
 10   -1    1    1    1   -1   -1   -1    1   -1    1    1
 11    1   -1    1    1    1   -1   -1   -1    1   -1    1
 12   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1

First Supersaturated Design
exp    A    B    C    D    E    F    H    I    J    K
  1    1    1   -1    1    1    1   -1   -1    1   -1
  5   -1   -1    1   -1    1    1    1    1    1   -1
  8    1    1   -1   -1   -1    1    1    1   -1    1
 10   -1    1    1    1   -1   -1    1   -1    1    1
 11    1   -1    1    1    1   -1   -1    1   -1    1
 12   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1

Second Supersaturated Design
exp    A    B    C    D    E    F    H    I    J    K
  2   -1    1    1   -1    1    1   -1   -1   -1    1
  3    1   -1    1    1   -1    1    1   -1   -1   -1
  4   -1    1   -1    1    1   -1    1    1   -1   -1
  6   -1   -1   -1    1   -1    1   -1    1    1    1
  7    1   -1   -1   -1    1   -1    1   -1    1    1
  9    1    1    1   -1   -1   -1   -1    1    1   -1
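The construction summarized in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: it assumes the cyclic N = 12 Plackett-Burman design whose first row is the generator given below (which reproduces the columns of Table 1), and the function names are hypothetical.

```python
# Generator (first row) of the cyclic N = 12 Plackett-Burman design, factors A..K.
# Assumption: rows 2-11 are cyclic shifts of this row and row 12 is all (-1).
GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
FACTORS = list("ABCDEFGHIJK")


def plackett_burman_12():
    """Return the 12 x 11 design as a list of rows."""
    n = len(GENERATOR)
    rows = [[GENERATOR[(j - i) % n] for j in range(n)] for i in range(n)]
    rows.append([-1] * n)
    return rows


def split_on_branching_column(design, factors, branch):
    """Split a design into two half designs on the level of the branching column.

    Returns the remaining factor names and a dict that maps the level (-1 or +1)
    to {experiment number: row without the branching column}.
    """
    b = factors.index(branch)
    kept = [f for f in factors if f != branch]
    halves = {}
    for level in (-1, 1):
        halves[level] = {
            exp_no: [x for j, x in enumerate(row) if j != b]
            for exp_no, row in enumerate(design, start=1)
            if row[b] == level
        }
    return kept, halves


design = plackett_burman_12()
kept, halves = split_on_branching_column(design, FACTORS, "G")
print(sorted(halves[-1]))  # [1, 5, 8, 10, 11, 12]  first supersaturated design
print(sorted(halves[1]))   # [2, 3, 4, 6, 7, 9]     second supersaturated design
```

Each half design has 6 experiments and 10 factor columns, so the number of factors exceeds the number of experiments minus one.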
THEORY

Supersaturated designs can be constructed in several ways (10,14,15). Some of the designs are constructed by applying a specific criterion in order to approach orthogonality as closely as possible (10,15). Another class of supersaturated designs is constructed via half-fractions of Hadamard matrices or of Plackett-Burman designs, which are a special class of Hadamard matrices (14). We chose the latter method. A Plackett-Burman design with N experiments examining N - 1 factors can be split into two supersaturated designs examining N - 2 factors in N/2 experiments. Let us consider the Plackett-Burman design with 12 experiments (Table 1). One column (column G in this example) is defined as the branching column, i.e., the column used to split the Plackett-Burman design into two supersaturated designs. A first supersaturated design is created by taking the experiments where the branching column is at the (-1) level. Deleting the branching column from the experimental setup results in a supersaturated design with N/2 experiments and N - 2 factors, i.e., in the example a design with 6 experiments and 10 factors (Table 1). A second supersaturated design can be obtained by doing the same for the experiments with the branching column at the (+1) level (Table 1).

The Plackett-Burman designs always require a multiple (k) of four experiments, which allows one to examine up to 4k - 1 factors. If fewer than 4k - 1 factors are considered, dummy factors complete the design. Dummy variables are imaginary variables for which the change from one level to the other has no physical meaning.

In Table 2, supersaturated designs with up to 30 experiments are shown. Often a given number of factors can be examined in different supersaturated designs, i.e., designs with different numbers of experiments. For instance, to examine 18 factors, the design with 10 experiments is the most economic one, while this number of factors could also be evaluated in supersaturated designs with 12, 14, and 18 experiments. If the total number of factors that can potentially be examined in a supersaturated design exceeds the number of factors that actually have to be evaluated, the design is completed with dummies, as is also the case in Plackett-Burman designs.

Table 2. Supersaturated Designs as a Function of the Number of Factors Examined

no. of factors    no. of expts in supersaturated design    supersaturated design(a) created from
 6-10              6                                        N = 12, f = 11
10-18             10                                        N = 20, f = 19
12-22             12                                        N = 24, f = 23
14-26             14                                        N = 28, f = 27
18-34             18                                        N = 36, f = 35
22-42             22                                        N = 44, f = 43
24-46             24                                        N = 48, f = 47
26-49             26                                        N = 52, f = 51
30-58             30                                        N = 60, f = 59
(a) All Plackett-Burman designs.

(14) Lin, D. K. J. Technometrics 1993, 35, 28-31.
(15) Lin, D. K. J. Technometrics 1995, 37, 213-225.

The selection of a particular column of a Plackett-Burman design as branching column has no influence on the properties of the obtained supersaturated design (14). In other words, any column of a Plackett-Burman design can be used to create a supersaturated design. The only exception is the N = 52 Plackett-Burman design, which is not a design of the cyclic type (14). For the latter design, the first column is deleted and then one of the other columns is used as the branching column (14). This results in a supersaturated design with 49 columns (potential factors) and 26 experiments. No supersaturated designs can be created from the Plackett-Burman designs with N = 16, 32, 40, and 56. These Plackett-Burman designs were created via foldover, and their half-fractions result in designs in which each column is present twice (14). The supersaturated designs of Table 2 can be expanded to the original Plackett-Burman design by performing the missing fraction, which is another supersaturated design, and adding the branching column as a dummy factor column.
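The choice among the designs of Table 2 can be illustrated with a short sketch. The dictionary below simply restates Table 2 (parent design size N and the maximum number of factors in the half design); the function name and the returned tuple layout are hypothetical and only serve the illustration.

```python
# Parent Plackett-Burman size N -> maximum number of factors in its half design
# (N - 2 in general, 49 for N = 52), restating Table 2.
MAX_FACTORS = {12: 10, 20: 18, 24: 22, 28: 26, 36: 34, 44: 42, 48: 46, 52: 49, 60: 58}


def candidate_designs(n_factors):
    """List (experiments, dummies needed) for every Table 2 design that fits.

    A design fits when it is supersaturated for n_factors (at least as many
    factors as experiments) and has enough columns to hold all factors.
    """
    candidates = []
    for n_parent, max_factors in sorted(MAX_FACTORS.items()):
        experiments = n_parent // 2
        if experiments <= n_factors <= max_factors:
            candidates.append((experiments, max_factors - n_factors))
    return candidates


print(candidate_designs(18))  # [(10, 0), (12, 4), (14, 8), (18, 16)]
```

For 18 factors this returns the designs with 10, 12, 14, and 18 experiments mentioned in the text, the first being the most economic one.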
This is possible if one wants to examine the effects of the factors in case the variance indicates nonrobustness of the method. Such an expansion of designs is also described in ref 6, but for fractional factorial designs.

EXPERIMENTAL DATA

Four Plackett-Burman designs selected from the literature were used to create supersaturated designs. The first three were N = 12 Plackett-Burman designs, of which the first and the second contained five dummies (16,17) and the third contained two (3). The responses examined in the first two designs describe the behavior of a chromatographic method and concern retention times, capacity factors, resolution, peak areas, tailing factors, and numbers of theoretical plates. The measured responses in the third Plackett-Burman design were peak area, peak height, number of theoretical plates, peak asymmetry factor, and retention time. To be able to calculate variances from both the Plackett-Burman and the supersaturated designs, the experimental results were recalculated from the published effects by using a first-order model and arbitrarily defining average results for each response, i.e., defining the b0 coefficient of the first-order equation. The design and the recalculated results are shown in Table 3. The fourth Plackett-Burman design contains 28 experiments, 24 factors, and 3 dummies. Here too, the results were recalculated from the reported effects. The design, the effects, and the recalculated results are shown in Table 4.

Table 3. Design of Ref 3 with the Responses Recalculated on the Basis of the Effects Found in the Literature and Arbitrarily Chosen Average Results

       factors                                                    responses
exp    A    B    C    D    E    F    G    H    I  d1(a) d2(a)    area    height   plate no.   asymmetry factor   retention time
  1    1    1   -1    1    1    1   -1   -1   -1    1   -1       90.1     76.8     10016       1.15               8.98
  2   -1    1    1   -1    1    1    1   -1   -1   -1    1       93.2     56.7      9505       0.95              11.04
  3    1   -1    1    1   -1    1    1    1   -1   -1   -1      107.8    100.5      8626       0.95               8.90
  4   -1    1   -1    1    1   -1    1    1    1   -1   -1       92.3     93.6     12061       1.00              10.14
  5   -1   -1    1   -1    1    1   -1    1    1    1   -1      102.4    175.6     12620       0.99               9.66
  6   -1   -1   -1    1   -1    1    1   -1    1    1    1       97.9     78.0      9526       0.92              10.08
  7    1   -1   -1   -1    1   -1    1    1   -1    1    1      108.5    152.7     10357       1.12               9.60
  8    1    1   -1   -1   -1    1   -1    1    1   -1    1      102.4     99.9      8612       0.94               8.34
  9    1    1    1   -1   -1   -1    1   -1    1    1   -1       96.8     73.2      8681       0.91               8.80
 10   -1    1    1    1   -1   -1   -1    1   -1    1    1      107.7     85.9      9886       0.96              12.70
 11    1   -1    1    1    1   -1   -1   -1    1   -1    1      101.0    139.4     10116       1.17               8.90
 12   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1      100.0     67.7      9994       0.93              12.86
av (b0)                                                         100.0    100.0     10000       1.00              10.00
(a) d1 and d2 represent dummies.

RESULTS AND DISCUSSION

In the literature (15), supersaturated designs are applied under the assumption that the number of factors investigated is high (easily 100) and the number of significant ones is low. In robustness testing, the number of factors to be examined usually is relatively moderate while the number of potentially significant factors can be relatively high. Therefore, estimated effects would not be reliable, and application in robustness testing will be restricted to studying the variances of the responses from the supersaturated designs. To answer the question formulated in the introduction, an examination was made of whether the variance of a response estimated from a supersaturated design is similar to that calculated from a Plackett-Burman design in which the same factors were examined.
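The recalculation of responses from published effects, described under Experimental Data, can be sketched as follows. This is a minimal illustration that assumes each effect is the difference between the average responses at the (+1) and (-1) levels, so that the first-order coefficient is half the effect; the function name is hypothetical. The numbers are the 24 factor effects, the three dummy effects, and b0 = 500.0 as listed in Table 4.

```python
def recalculate_responses(design, effects, b0):
    """Return y_i = b0 + sum_j (E_j / 2) * x_ij for every row of the design.

    Assumption: E_j is the effect of factor j (average at +1 minus average
    at -1), so the coefficient of the first-order model is E_j / 2.
    """
    return [b0 + sum(e / 2.0 * x for e, x in zip(effects, row)) for row in design]


# Effects of the 24 factors followed by the three dummy effects (Table 4).
EFFECTS = [
    -26.07, -20.07, -6.79, 36.50, 12.50, -11.79, -13.50, 29.64,
    15.21, 14.21, -3.64, -15.79, -22.93, -30.93, -86.36, -22.93,
    -42.79, -13.64, -3.50, -48.79, -8.36, -32.21, -11.93, -12.36,
    6.00, 4.00, -5.00,
]

# Experiment 28 of the 28-run design has every column at the (-1) level.
all_low = [[-1] * len(EFFECTS)]
y28 = recalculate_responses(all_low, EFFECTS, 500.0)[0]
print(round(y28, 2))  # 660.66
```

Under this half-effect assumption the value for experiment 28 agrees with the 660.6 listed in Table 4 to within rounding.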
(16) Vander Heyden, Y.; Hartmann, C.; Massart, D. L.; Michel, L.; Kiechle, P.; Erni, F. Anal. Chim. Acta 1995, 316, 15-26.
(17) Vander Heyden, Y.; Luypaert, K.; Hartmann, C.; Massart, D. L.; Hoogmartens, J.; De Beer, J. Anal. Chim. Acta 1995, 312, 245-262.

The idea behind this approach is that, since a Plackett-Burman design allows an acceptable estimation of factor effects, the total variance of its responses will also reflect the robustness of the method toward the examined factors. Therefore, the variance of a supersaturated design examining the same factors should be comparable in order to provide similar information and to be potentially usable in robustness testing. For that reason, the variances were compared in an F-test. The critical F-value when comparing variances estimated from N experiments (representing the Plackett-Burman design) and N/2 experiments (representing the supersaturated design) at a given α level (0.05 and 0.1 in our case) is equal to F(N-1,N/2-1) or to F(N/2-1,N-1), depending on which measured variance is largest. This value and the use of an F-test are statistically correct only when the measurements from which the variances are estimated are normally distributed and when the variances originate from independent samples. In our examples, this is not necessarily the case, since the results from a design can be influenced by significant factor effects that cause the data not to be normally distributed. Moreover, the two samples for which the variances are compared are not completely independent, since the supersaturated design is generated from the Plackett-Burman design with which it is compared. Therefore, the use of the F-test is only indicative in this case, to demonstrate the (lack of) similarity between the variances from both types of designs.

From the Plackett-Burman designs described under Experimental Data, all possible supersaturated designs were constructed using the dummy factors as branching columns. This means that for the four cases studied, respectively, 10, 10, 4, and 6 supersaturated designs were created. Since a dummy factor is an imaginary variable, i.e., a factor without an effect on the response, it will not contribute significantly to the total variance of the design results. Also, the interactions confounded with the dummy factor, even the two-factor interactions, can in robustness testing be considered negligible (6,16,17). Therefore, occasional differences in the variances from the Plackett-Burman design and the supersaturated designs are not expected to be due to the influence of the branching column. Splitting the design using a column containing a significant method factor, on the other hand, will reduce the variance in the
1 1 -1 -1 -1 -1 1 1 1 1 -1 1 1 1 -1 1 1 -1 -1 -1 1 -1 1 -1 -1 1 -1 -1
1
3
1 -1 1 -1 -1 -1 1 1 1 -1 1 1 1 -1 1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 -1 -1
2
-1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 1 1 -1 1 1 1 -1 -1 -1 -1 1 -1 -1 1 -1
1 1 1 1 1 -1 -1 -1 -1 1 1 -1 1 -1 1 1 1 -1 -1 1 -1 -1 -1 1 -1 1 -1 -1
4
1 1 1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 -1 -1 1 -1
5 1 1 1 1 -1 1 -1 -1 -1 1 -1 1 -1 1 1 1 -1 1 1 -1 -1 -1 1 -1 1 -1 -1 -1
6 -1 -1 -1 1 1 1 1 1 -1 1 1 -1 1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 -1 -1 1 -1
7 -1 -1 -1 1 1 1 -1 1 1 -1 1 1 -1 1 1 1 1 -1 -1 -1 1 -1 -1 1 1 -1 -1 -1
8 -1 -1 -1 1 1 1 1 -1 1 1 -1 1 1 -1 1 -1 1 1 1 -1 -1 1 -1 -1 -1 1 -1 -1
9 -1 -1 1 -1 1 -1 -1 1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 1 1 1 -1 1 1 -1 -1
10 1 -1 -1 -1 -1 1 -1 -1 1 -1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 1 1 -1 1 1 -1
11 -1 1 -1 1 -1 -1 1 -1 -1 1 -1 1 -1 -1 -1 1 1 1 -1 1 1 1 -1 1 1 -1 1 -1
12 -1 1 -1 -1 -1 1 -1 1 -1 1 1 1 1 1 -1 -1 -1 -1 1 1 -1 1 -1 1 1 1 -1 -1
13 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 -1 1 1 -1
14
factors
1 -1 -1 -1 1 -1 1 -1 -1 1 1 1 1 -1 1 -1 -1 -1 1 -1 1 -1 1 1 1 -1 1 -1
15 -1 1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 1 1 1 1 -1 1 1 -1 1 1 -1 1 -1 1 -1
16 -1 -1 1 -1 -1 1 1 -1 -1 -1 -1 -1 1 1 1 -1 1 1 -1 1 1 -1 1 1 1 1 -1 -1
17 1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 1 1 1 1 -1 1 1 -1 1 1 -1 1 -1 1 1 -1
18 1 -1 1 1 1 -1 1 1 -1 -1 -1 1 -1 1 -1 -1 1 -1 1 1 -1 -1 -1 -1 1 1 1 -1
19 1 1 -1 -1 1 1 -1 1 1 1 -1 -1 -1 -1 1 -1 -1 1 -1 1 1 -1 -1 -1 1 1 1 -1
20 -1 1 1 1 -1 1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 -1 1 -1 1 -1 -1 -1 1 1 1 -1
21 1 1 -1 1 -1 1 1 1 -1 -1 1 -1 -1 -1 1 -1 1 -1 1 1 1 1 1 -1 -1 -1 -1 -1
22
-1 1 1 1 1 -1 -1 1 1 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 -1 1 1 -1 -1 -1 -1
23
1 -1 1 -1 1 1 1 -1 1 1 -1 -1 -1 1 -1 1 -1 -1 1 1 1 1 -1 1 -1 -1 -1 -1
24
effect -26.07 -20.07 -6.79 36.50 12.50 -11.79 -13.50 29.64 15.21 14.21 -3.64 -15.79 -22.93 -30.93 -86.36 -22.93 -42.79 -13.64 -3.50 -48.79 -8.36 -32.21 -11.93 -12.36
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
exp
Table 4. Plackett-Burman Design with 28 Experiments and with the Recalculated Response for the Effects Given in Ref 18
6.00
1 1 -1 1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 1 1 1 1 -1 -1
d1
1 -1 1 1 -1 1 -1 1 1 1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 1 1 1 1 -1 1 -1
d3
465.5 498.8 574.4 601.4 579.4 520.6 418.0 504.1 514.6 433.0 534.7 502.3 464.6 488.3 429.7 610.1 596.7 519.9 440.4 429.1 387.8 550.0 397.7 502.3 435.5 535.9 404.6 660.6
Yi
4.00 -5.00 500.0
-1 1 1 -1 1 1 1 1 -1 -1 -1 1 -1 -1 1 1 -1 -1 -1 -1 -1 1 1 1 -1 1 1 -1
d2
Table 5. Example, for One Response, of the Calculation and Interpretation of Variances from a 12-Experiment Plackett-Burman Design with 5 Dummies and from Its Supersaturated Designs

design                variance
Plackett-Burman       3.557
supersaturated 1      5.104
supersaturated 2      2.622
supersaturated 3      3.347
supersaturated 4      4.471
supersaturated 5      4.177
supersaturated 6      3.627
supersaturated 7      4.504
supersaturated 8      3.318
supersaturated 9      4.425
supersaturated 10     3.365

design                                       variance    F-value
minimal s2: design 2                         2.622       1.36
critical one-tailed F(11,5) (α = 0.05)                   4.71
critical one-tailed F(11,5) (α = 0.10)                   3.28
maximal s2: design 1                         5.104       1.43
critical one-tailed F(5,11) (α = 0.05)                   3.20
critical one-tailed F(5,11) (α = 0.10)                   2.45
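The F-values in Table 5 follow directly from the listed variances. The sketch below is an illustration only: it takes the variances and the tabulated critical values from Table 5 as constants rather than computing them from an F-distribution, and the function name is hypothetical.

```python
PB_VARIANCE = 3.557  # variance of the 12 Plackett-Burman results (Table 5)
SS_VARIANCES = [5.104, 2.622, 3.347, 4.471, 4.177, 3.627, 4.504, 3.318, 4.425, 3.365]

# Critical one-tailed F-values from Table 5, keyed by (df numerator, df denominator).
CRITICAL = {
    (11, 5): {0.05: 4.71, 0.10: 3.28},
    (5, 11): {0.05: 3.20, 0.10: 2.45},
}


def f_ratio(var_pb, var_ss, n_pb=12):
    """Return (F, degrees of freedom) with the larger variance in the numerator."""
    df_pb, df_ss = n_pb - 1, n_pb // 2 - 1
    if var_pb >= var_ss:
        return var_pb / var_ss, (df_pb, df_ss)
    return var_ss / var_pb, (df_ss, df_pb)


for label, var in (("minimal", min(SS_VARIANCES)), ("maximal", max(SS_VARIANCES))):
    f_value, df = f_ratio(PB_VARIANCE, var)
    significant = f_value > CRITICAL[df][0.05]
    print(label, round(f_value, 2), df, significant)
# minimal 1.36 (11, 5) False
# maximal 1.43 (5, 11) False
```

Neither extreme variance differs significantly from the Plackett-Burman variance at α = 0.05, in line with the interpretation given in the table.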
supersaturated designs compared to that of the Plackett-Burman design, since the effect of the significant factor and its contribution to the variance are removed from the data set. Therefore, such columns should not be used as branching columns.

The variances of the N results in the Plackett-Burman designs and those of the N/2 experiments from the supersaturated designs were calculated. For the Plackett-Burman design with 12 experiments and 5 dummies of refs 16 and 17, 10 different supersaturated designs were created. For one response, the variances of the 12 results from the Plackett-Burman design and of the 6 results from each of the 10 supersaturated designs are shown in Table 5. The extreme (minimal and maximal) variances of the supersaturated designs were compared with that of the Plackett-Burman design by means of F-tests. In general, the variances from the supersaturated designs are distributed around the one from the Plackett-Burman design, as can be observed in Table 5.

From the large number of responses determined for the case studies from refs 16 and 17, eight responses were selected in the first instance. Depending on the response examined, zero to three factor effects were found significant in the Plackett-Burman design. For the case study from ref 3, five responses were examined, and one response for the design of ref 18. The results of the F-tests for the extreme variances are shown in Table 6. The extreme variances (smallest and largest) estimated from the supersaturated designs were never significantly different at the α = 0.05 significance level from the one estimated from the Plackett-Burman design. However, the F-test is not very sensitive, and relatively large differences between variances are required before they are considered statistically different, partly because of the small numbers of degrees of freedom available. Nevertheless, it can be observed from Table 6 that for most responses the ratio between the extreme variances from a supersaturated design and that from the Plackett-Burman design does not deviate much from 1. In 39 cases out of 43 (or 91%), the ratio between the two variances is smaller than 2.0, and in 21 cases (or 49%), it is smaller than 1.5. This indicates that in most cases the supersaturated design yields a variance that is similar to the one found from the Plackett-Burman design.

Additionally, for the Plackett-Burman designs with 12 experiments and five dummies (16,17), 147 different responses determined on the different chromatographic peaks were considered. In the Plackett-Burman designs, varying numbers of significant factor effects were found, ranging from none to almost all factors having a significant effect. For each response, the variances of the results in the 10 supersaturated designs were determined, and again the extreme ones were considered. None of the maximal variances was found to be significantly larger than that from the Plackett-Burman design at the α = 0.05 significance level, and only five differed by more than a factor of 2.0 (i.e., by a factor of 1.4 in s). Of the 147 minimal variances, on the other hand, 11 were significantly smaller than the one from the Plackett-Burman design at the α = 0.05 significance level, and 26 at the α = 0.10 level. Relative to expectations (5% of 147 is about 7.4), this still is not high. One must expect that now and then the minimal standard deviation is underestimated in the supersaturated designs. Indeed, when a dummy effect is relatively large because of an accumulation of small but real interaction effects confounded with the dummy, use of this column as a branching column removes part of the variance from the data set compared to that observed in the Plackett-Burman design.

On the basis of the above results, one can expect that in the practical case where one applies a supersaturated design, the variance obtained from the experimental results is an acceptable measure of the method variability that was introduced by varying a given set of factors within certain limits, as requested in a robustness test.
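The comparison of extreme variances described above can be reproduced for one response, the peak area of Table 3. The sketch below is illustrative and rests on assumptions: the design is rebuilt from the cyclic generator that matches the columns of Table 3, the two dummy columns d1 and d2 are taken as the 10th and 11th columns, and the helper names are hypothetical.

```python
from statistics import variance

# First row of the cyclic N = 12 design (columns A..I, d1, d2), as in Table 3.
GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]
# Recalculated peak areas for experiments 1-12 (Table 3).
AREA = [90.1, 93.2, 107.8, 92.3, 102.4, 97.9, 108.5, 102.4, 96.8, 107.7, 101.0, 100.0]
DUMMY_COLUMNS = [9, 10]  # zero-based positions of d1 and d2


def design_rows():
    """Rebuild the 12 x 11 design: 11 cyclic shifts plus a row of (-1)."""
    n = len(GENERATOR)
    rows = [[GENERATOR[(j - i) % n] for j in range(n)] for i in range(n)]
    return rows + [[-1] * n]


def half_design_variances(rows, y, branch_columns):
    """Sample variance of y in each half design, keyed by (column, level)."""
    result = {}
    for col in branch_columns:
        for level in (-1, 1):
            subset = [yi for row, yi in zip(rows, y) if row[col] == level]
            result[(col, level)] = variance(subset)
    return result


s2_pb = variance(AREA)
s2_ss = half_design_variances(design_rows(), AREA, DUMMY_COLUMNS)
f_min = s2_pb / min(s2_ss.values())  # Plackett-Burman variance over smallest half-design variance
f_max = max(s2_ss.values()) / s2_pb  # largest half-design variance over Plackett-Burman variance
print(round(f_min, 2), round(f_max, 2))  # 1.13 1.29
```

With two dummies, four supersaturated designs are obtained, and both ratios stay well below the critical values F(11,5) = 4.71 and F(5,11) = 3.20 at α = 0.05.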
Table 6. F-Values and Critical Values for Comparison of the Minimal, s2(min), and the Maximal, s2(max), Variances from the Supersaturated Designs with the One from the Plackett-Burman Design

                     from data 1          from data 2          from data 3          from data 4
response             s2(min)   s2(max)    s2(min)   s2(max)    s2(min)   s2(max)    s2(min)   s2(max)
F-values
1                    3.97(a)   1.67       1.42      1.48       1.13      1.29       1.50      1.41
2                    2.39      1.60       1.83      1.65       1.61      1.49
3                    1.35      1.44       1.59      1.42       4.17(a)   1.79
4                    (b)       1.19       2.03      1.63       1.16      1.31
5                    1.11      1.29       1.71      1.60       1.11      1.30
6                    1.36      1.43       1.67      1.52
7                    1.62      1.58       1.82      1.65
8                    1.98      1.69       1.47      1.42
critical values      F(11,5)   F(5,11)    F(11,5)   F(5,11)    F(11,5)   F(5,11)    F(27,13)  F(13,27)
α = 0.05             4.71      3.20       4.71      3.20       4.71      3.20       2.40      2.11
α = 0.10             3.28      2.45       3.28      2.45       3.28      2.45       1.97      1.78
(a) Significant at α = 0.10 level. (b) Minimal variance was larger than Plackett-Burman variance.

To draw conclusions concerning the robustness of the method, a decision criterion to determine a limit or critical variance is required. This means that the analyst should have an idea about the variance that can be expected when the method is not affected by nonrobust factors. Since the variance in a robustness test design is an estimate of the reproducibility (or intermediate precision, depending on the context) of the method studied, a reference variance that also estimates reproducibility could be applied. If the variance from a robustness test design is significantly larger than the reference variance, the method cannot be considered robust. Interlaboratory studies to determine the reproducibility of a method are not available at the moment a robustness test is executed, since the aim of a robustness test is to avoid problems during such studies (7). Therefore, other estimates for the reproducibility should be used. Variance estimates that usually are already available when a robustness test is performed are the repeatability variance and sometimes one of the intermediate precision estimates (19). Horwitz et al. (20) made different predictions for the reproducibility, based on the repeatability results of a method on one hand and on the concentration of the substance to be determined on the other. In the first case, a reproducibility estimate is proposed based on the experimentally determined repeatability variance, and in the second, a theoretical variance is predicted based on the expected concentration of a sample. The above estimates could be used to determine a decision criterion that allows one to draw conclusions about the robustness of the method, based on the variance of a response estimated from a supersaturated design. The possibility of defining critical variances from different error estimates is a problem that also occurs in the evaluation of the significance of effects in a screening design approach (16,17,21). In both situations, the analyst should use a criterion that neither under- nor overestimates the experimental error.

When problems with the robustness of a method are observed from a supersaturated design, one can easily switch to the traditional approach by going through the procedure of Table 1 in the opposite direction, i.e., by executing the remaining supersaturated design as mentioned in the Theory section, which yields a Plackett-Burman design. Estimation of the effects from the Plackett-Burman design would then allow one to identify the factors responsible for the nonrobustness of the method.

CONCLUSIONS

For the case studies performed, the variances for a response estimated from the supersaturated designs are similar to those from the Plackett-Burman designs. Therefore, a supersaturated design could be used in robustness testing to estimate the variance caused by the variation introduced in the factors examined.

Robustness tests can be performed at several stages in the lifetime of a method. Originally, robustness tests were performed at the end of method validation, just before interlaboratory studies were executed. However, nowadays there is a tendency to apply the test much earlier in the lifetime of the method, namely, at the end of method development or early in the validation procedure. The supersaturated designs should then be considered as a way of detecting gross problems at an early stage in method validation or even during the development phase. They could also be of interest when a large number of factors has to be examined, so that the usually performed screening designs (Plackett-Burman or fractional factorial designs) require too many experiments to be feasible. Supersaturated designs could be part of a robustness strategy in which such a design is executed first and expansion to a screening design is done only if required after evaluation of the results of the supersaturated design. The origin of problems indicated by the supersaturated design should then be revealed by the corresponding Plackett-Burman design, which allows the evaluation of the main factor effects. Application of these designs seems tempting in areas of analytical chemistry in which the requirements for method validation are not as strict as, for instance, in pharmaceutical analysis, but in which one is nevertheless prepared to validate one's methods, though in a limited number of experiments.

ACKNOWLEDGMENT

Y. Vander Heyden is a Postdoctoral Fellow of the Fund for Scientific Research-Vlaanderen (FWO-Vlaanderen).

Received for review December 15, 1999. Accepted March 30, 2000. AC991440F

(18) Box, G. E. P.; Draper, N. R. Empirical Model-Building and Response Surfaces; J. Wiley: New York, 1987; p 175.
(19) ISO, International Standard, Accuracy (trueness and precision) of measurement methods and results. Part 3: Intermediate measures of the precision of a standard measurement method, 1st ed.; ISO 5725-3; 1994.
(20) Horwitz, W.; Kamps, L. R.; Boyer, K. W. J. Assoc. Off. Anal. Chem. 1980, 63, 1344-1354.
(21) Nijhuis, A.; van der Knaap, H. C. M.; de Jong, S.; Vandeginste, B. G. M. Anal. Chim. Acta 1999, 391, 187-202.