Efficient Visual Mixture Design of Experiments using Property

Jan 9, 2009 - Mixture design is a design of experiments (DOE) tool used to determine the optimum combination of chemical constituents that deliver a d...
1 downloads 8 Views 3MB Size
Ind. Eng. Chem. Res. 2009, 48, 2245–2256

2245

Efficient Visual Mixture Design of Experiments using Property Clustering Techniques Charles C. Solvason,† Nishanth G. Chemmangattuvalappil,† Fadwa T. Eljack,‡ and Mario R. Eden*,† Department of Chemical Engineering, Auburn UniVersity, Auburn, Alabama 36832, and Department of Chemical Engineering, Qatar UniVersity, Doha, Qatar

Mixture design is a design of experiments (DOE) tool used to determine the optimum combination of chemical constituents that deliver a desired response (or property) using a minimum number of experimental runs. While the approach is sufficient for most experimental designs, it suffers from combinatorial explosion and visualization difficulties when dealing with multicomponent mixtures. To circumvent these problems, a recently developed design technique called property clustering is applied. In this type of design the properties are transformed to conserved surrogate property clusters described by property operators, which have linear mixing rules even if the operators themselves are nonlinear. Product and process property targets are then used to describe a feasibility target region. To solve the mixture design, components are mixed according to property operator models in a reverse problem format until the mixture falls within the feasibility target region. Once candidate solutions are found, they can be screened with additional criteria per the experimenter’s preference. The degree of accuracy of this modeling technique depends heavily on the ability of the property operator models to adequately describe the property within the studied design space. This work utilizes linear Scheffe and Cox models as property operators of a mixture design to demonstrate the benefits of applying property clustering to chemometric techniques. 1. Introduction The terms product synthesis and design designate problems involving identification and selection of compounds or mixtures that are capable of performing certain tasks or possess certain physical properties. Since the properties of the component or mixture of components dictate whether or not the design is useful, the basis for solution approaches in this area should be based on the properties themselves. However, the performance requirements for the desired component are usually dictated by the process, and thus the identification of the desired component properties should be driven by the desired process performance. Whereas numerous contributions have been made in the areas of molecular synthesis and computer-aided molecular design (CAMD) by Harper and Gani,2 Marcoulaki and Kokossis,3 and Eljack et al.4 among others, little focus has been on utilizing experimental design techniques in situations where property prediction tools are insufficient in describing the mixture’s properties. Early in experimental mixture design, Scheffe5,6 and Cox7 developed techniques to obtain property operator models while minimizing experimental runs or design points utilizing simplex diagrams of the chemical constituent design space. However, visualization of the solution in the chemical constituent design space is prone to combinatorial explosion. Viewing the problem in the property space avoids this problem while also offering insights into the effectiveness of the design.

understanding of the experiment in cases of combinatorial explosion will be explored. The two most common mixture designs, Scheffe canonical models and Cox polynomial models, are evaluated. The results of the exercise will be used to develop additional techniques for utilization of PCR and PLS models under combinatorial explosion. 3. Experimental Design Design of experiments (DOE) is a form of experimental design that utilizes statistical methods to plan and execute informative experiments.8 Postulating a model to represent the response surface, experimental design points are placed in areas where observations can be collected to which the model can be fitted. The procedure may require much iteration until the fitted equation is determined by the experimenter to be sufficient.9 The most effective choice of model and location of design points is the focus of the experimenter. Most often, the polynomial model is selected to represent the response surface since it can be expanded through a Taylor series to improve accuracy.9 A first- or second-degree model is usually chosen to represent the surface since it requires fewer observations. Third-degree or higher ordered models are seldom utilized due to the number of experiments required. The point estimate forms of the models are listed in eqs 1 and 2. u

y ) βo +

2. Objective

∑βx

(1)

i i

i)1

The overall objective of this contribution is to integrate the property clustering framework with existing mixture design techniques. Specifically, the use of property clusters to visualize the response in the property domain to aid in the physical * To whom correspondence should be addressed. E-mail: [email protected]. † Auburn University. ‡ Qatar University.

u

y ) βo +



u

βixi +

i)1

u

∑∑β xx

ij i j

(2)

iej jgi

The response y is the point estimate of unbiased estimator of the response. βo, βi, and βij are the point estimates of the regression coefficients. The total number of i or j chemical constituents is u. To find the regression coefficients, the method

10.1021/ie800877d CCC: $40.75  2009 American Chemical Society Published on Web 01/09/2009

2246 Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009

of least-squares regression is applied by maximizing the model sum of squares and minimizing the residual sum of squares to improve model fit.10 Forms of least-squares regression can be found in many techniques such as classical least squares (CLS), inverse least squares (ILS), multiple linear regression (MLR), principal component regression (PCR), and partial least squares applied to latent surfaces (PLS). To derive the regressor solution for CLS, it is convenient to switch to matrix notation. Rewriting eq 1 in terms of matrix notation, Y ) XB

(3)

where Y, B, and X are matrices of the predicted responses, estimated regression coefficients, and component fractions, respectively. The least-squares solution is then written as eqs 4 and 5, which is the central result of CLS analysis. XTY ) XTXB

(4)

(5) B ) XTY(XTX)-1 The same procedure can be extended to second-order and third-order models. 3.1. Mixture Design of Experiments. Mixture design of experiments (MDOE) is an extension of DOE that utilizes chemical constituents as the factors in the design. The first mixture designs were developed by Scheffe5,6 as simplex-lattice designs which many researchers considered to be the foundation of mixture design.9 To develop these designs, Scheffe5,6 noted that the location of the response of a mixture made up of exactly zero constituents must be identically zero meaning that the coefficient βo in eqs 1 and 2 is zero. By definition the constituent fractions must also sum to 1 and each constituent fraction must lie between 0 and 1.

Figure 1. Theoretical second-order response surface for a three-component mixture showing the experimental design points.9

rather, they represent the pure properties of the constituents and are no longer orthogonal. The true property estimate, or orthogonal effect, is found by taking the difference in the value of the response at pure and infinitely dilute solutions while holding all other constituents constant. To execute a mixture design using canonical models, Scheffe5,6 proposed using lattice points and centroid points on a simplex diagram as shown in Figure 1. This technique simplifies the least-squares calculation of regression coefficients. Since the design is a boundary design with at least as many design points as regression terms, then the estimated terms may be taken directly from the responses at each design point.11 For example, suppose that thread elongation of yarn is explored using a simplex-lattice design with six experiments as shown in Figure 1. Postulating a second-order model, linear regression was performed and the following expression describing the response surface was estimated.

u

∑x )1 i

(6)

0 e xi e 1

(7)

y ) 11.7x1 + 9.4x2 + 16.4x3 + 19.0x1x2 + 11.4x1x3 - 9.6x2x3 (12)

i)1

Equation 6 imposes a colinearity effect by removing the independence of the factors. While it does not affect the utilization of the model, it does impact the interpretation of the regression coefficients. Combining the observations about the regressor βo with the constraint of eq 6, Scheffe arrived at the canonical models shown in eqs 8 and 9. u

y)

∑β x

* i i

(8)

i)1

u

y)

∑ i)1

u

β*i xi +

u

∑∑β xx

* ij i j

(9)

ii

These canonical models effectively remove the quadratic and higher terms from analysis and leave behind only the modified pure component properties and interaction effects. However, it * * must be noted that the modified regressors βi and βij do not represent only pure component properties or only interaction effects. On the contrary, because of the colinearity introduced in the derivation of the canonical models, the modified regressors are combinations of the pure component, interaction, and quadratic effects, as shown in eqs 10 and 11. β*i ) β0 + βi + βij

(10)

β*ij ) βij - βii - βjj

(11)

It should be noted that, for canonical models, the regression coefficients are no longer represented by half the response;

The canonical regressors can be read directly from the responses at the mixture points, but the pure component, interaction, and quadratic effects must be calculated separately. In his paper in 1971, Cox noted that Scheffe’s simplex-lattice and simplex-centroid models require pure components and a range of constituent fractions from 0 to 1. He points out that Scheffe’s models have the following drawbacks:7 (1) If two replicate experiments on the same system have the same expected responses except for a constant difference between replicates, it is obvious that by fitting separate replicates all of the regression coefficients will be different in the two replicates. (2) The absence of squared terms makes it meaningless to consider the direction and magnitude of curvature of the response to a particular component. (3) The interpretation of the regression coefficients is in terms of the responses for simple mixtures. In addition Kettaneh-Wold12 points out that the removal of the constant term to generate the canonical model makes it impossible to center these models which often leads to an illconditioned XTX matrix. Since an ill-conditioned matrix can lead to errors during inversion, it is of the utmost importance that the constituents be independent when performing CLS and MLR regressions. Colinearity may also result from additional constraints such as the upper and lower bounds on components.12 To address these issues, Cox7 proposed a variable transformation with an arbitrary reference mixture that would allow for the use of existing polynomial models with additional constraints involving a reference mixture called the standard mixture.

Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009 2247

xi ) si + ∆i

(13)

∆j sj xj ) sj (1 - si)

(14)

Rewriting the first- and second-degree models, eqs 1 and 2, in terms of the change in constituent i, ∆i, results in eqs 15 and 16 which provide a direct link between the regressors and the position of the design points to the reference mixture.9 u

y(x) ) y(s) +

∑ i

u

y(x) ) y(s) +

∑ i

( )

( )

βi ∆ 1 - si i

βi ∆+ 1 - si i

u

∑ i

(

(15) βii

(1 - si)2

)

∆i2

(16)

where y(x) is the expected response at design point x and y(s) is the expected response at the standard reference mixture. As Smith and Beverly13 point out, the gradient, or change in response per unit change in xi, at s along the Cox-effect direction is called the effect of xi, provided xi is free to range from 0 to 1. It has also been shown that the response at the standard mixture is equivalent to βo regressor.9 Likewise, the transformed variables can be represented as p

∑ i

p

∑ i

(

( )

βi ∆) 1 - si i βi

)

∆2i ) 2

(1 - si)

u

∑β x

i i

(17)

i

u

ψk(yk)M )

∑ ψ (y ) x k

k i i

(19)

i)1

Although the property operator equation must be linear, the property operator itself may be nonlinear.1 This step is often difficult as many properties cannot be described adequately by linear models. Often transformation functions, decompositions, and other mathematical operations are utilized to generate the appropriate model for the application.15-17 In this situation the experimenter must weigh the benefits of the method versus the loss of solution certainty. In most cases, this caveat leads the experimenter to limit the application of this method to highvolume screening designs. After the property operator equations are defined, the method is universal. It is presented in detail by Eden et al.;1 only highlights will be presented here. As shown in eq 20, the property operators are nondimensionalized by dividing by a reference property operator.1 Ωki )

ψk(yk)i ψk(yk)ref

(20)

This step serves the purpose of scaling the property design space to facilitate a better graphical understanding of the problem. As shown later in this paper, it is also used to ensure a solution to mixtures whose property operator equations contain negative regression coefficients. The nondimensionalized properties are then summed into the augmented property index (AUP).14 p

u

∑β x x

i i i

(18)

AUPi )

ki

(21)

k)1

i

The analysis can also be extended to other directional changes, resulting in the polynomial expressions identical to that of eqs 1 and 2, but with the added benefit of the major colinearities being removed. The resulting response surfaces generated by either the Scheffe canonical eqs 8 and 9 or the Cox polynomial eqs 1 and 2 have been shown by Cornell9 to be identical. While the transformation of the original Scheffe polynomials removes the primary colinearity introduced by eq 12, it leaves the secondary colinearities such as those introduced by constraints on the constituent ranges. Kettaneh-Wold11 suggests that the best solution may be to refrain from interpreting the coefficients and rely on the predictions only but notes this solution is not acceptable in practice since the interpretation of regression coefficients is a necessity when the objective is to find component effects in screening situations. It is in this arena that Kettaneh-Wold11 suggests the use of decomposition techniques such as PCR and PLS. However, there is another method called property clustering that may help in the interpretation of the regressors before applying decomposition techniques.

∑Ω

A cluster is then defined by dividing the nondimensionalized property by the AUP, as shown in eq 22. Cki )

Ωki AUPi

(22)

The balance of the method can be found in Eden et al.1 For systems of up to three properties, a ternary diagram, or simplex, is used in much the same way as a mixture design to graphically represent the property clusters as shown in Figures 2 and 4. Each vertex represents a pure property cluster. In situations where more than three properties are needed to describe the system, an algebraic approach may be used.4,18,19 The properties

4. Property Clustering Property clustering was developed as a tool for tracking properties in a conserved manner by Shelley and El-Halwagi.14 It was later applied to process and product design by Eden et al.1 As a design tool, the technique challenges conventional design by reversing traditional mixture designs to solve for chemical constituents that meet property constraints. This type of formulation is known as a reVerse problem formulation.1 Essentially, the technique converts properties into conserved surrogate property clusters that are described by property operators, which have linear mixing rules, even if the operators themselves are nonlinear. In eq 19, the property, y, is described by a linear property operator equation.

Figure 2. Property cluster simplex diagram with property clusters at the vertices, pure component effects represented as independent discrete streams, and the product targets represented as a sink.1

2248 Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009

Figure 3. Example of a reverse problem formulation including both mixture and molecular design parameters (Solvason et al., 2008).

of each pure component, as predicted from the property operator equations, are plotted on the simplex. Also shown in Figure 2 is a feasibility region, which is mapped to the design space using targets specified by the product designer. In this case three properties are described by six unique points which form the boundaries of the feasibility region.1 To solve the design problem, the components are mixed according to linear mixing rules.1 u

CkM )

∑δC

i ki

(23)

i)1

where the relative cluster arm δi is defined using the proof for interstream conservation of clusters given by Eden et al.1 and Shelley and El-Halwagi,14 resulting in eq 24. δi )

xi AUPi AUPM

(24)

Here each cluster arm δi can be found visually from the relative lengths of the lever arms as shown in Figure 2. To successfully meet mixing criteria, mixtures must meet three conditions:1 Rule 1. The cluster value of the source (or mixture of sources) must be contained within the feasibility region of the sink on the cluster ternary diagram. Rule 2. The values of the AUP for the source or mixture of sources and the sink must match. Rule 3. The flow rate of the source (or mixture of sources) must lie within the acceptable feed flow rate for the sink. The original rule 3 must be modified slightly to meet the criteria of mixture design. Since each source must be considered as a pure component, then the constituents are subject to the additional constraint of summing to unity as imposed by eq 12. Therefore, a more appropriate statement of rule 3 is the following: Rule 3. The constituent fraction of the candidate mixtures must match the constituent fractions of the sink. In addition to these three rules, the experimenter may wish to apply supplemental constraints, such as limiting the mixture size, type, or cost. These constraints assume an optimization role in the design to further reduce the generated list of potential candidate mixtures.

5. Integration of Property Clustering with Mixture Design As discussed earlier, property clustering is a transformation technique which facilitates a reverse problem solution (see Figure 3). Utilizing property clustering in a reverse problem solving role not only avoids combinatorial explosion but also offers the potential for solving process, mixture, and molecular design problems simultaneously.4,17 The key to successful modeling with property clustering is to utilize appropriate property operator models. Traditionally these have been based on established semiempirical property models tailored to a linear format. However, in situations where models with the necessary degree of accuracy do not yet exist, chemometric models drafted from experimentation are used. These models can be incorporated into the existing property clustering framework while providing additional benefits not found in the traditional experimental design. First, the experimental design points on which the model is based can be mapped into the property cluster design space. This visualization technique is appropriate regardless of the number of components investigated, thereby providing a means for ensuring the space has been properly explored. The procedure for evaluating the efficiency of coverage is related to how near the optimum is to the design space. It would be unusual to measure this as most designs utilize pure component measurements and involve interpolation. However, should they not, gradient, nongradient, and specialized optimization techniques can be used to find the optimum and a new mixture design can be conducted at the predicted optimum.20 Second, most experimentally based models utilize regression coefficients as estimates of the effects of each property on the response. Depending on the type of regression utilized, the interpretation of these regression coefficients can be quite difficult. In the case of Scheffe and Cox models, although the regression coefficients are different, the response surfaces obtained are the same.9 The reason for this result is that the regression coefficients represent different entities in each model. In the Scheffe model, each constituent in the cluster space represents a combined effect comprised of contributions from the pure component property, colinear effects, and nonlinear effects. The Cox models remove some of the colinearities in mixture design, but depend on good

Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009 2249

Figure 4. Property cluster simplex diagram of a seven-component system using property operators derived from an Acetaminophen excipient design.21 Clusters 1-3 represent the properties repose angle, water content, and compressibility.

experimental design controls to limit the effect of other colinearities and nonlinearities. Visualizing the problem in property space consolidates each component’s impact on the mixture, aiding in the ability to screen components. For systems of up to three properties, the entire experimental design and all associated regression coefficients can be represented on a single property cluster diagram. Additional properties are either represented with additional diagrams or solved algebraically. Consolidating the numerous effects of the various components on the response into distinguishable values may be used to surmise which of the components has the largest effect and is thus most important. 5.1. Visual Solution Using Scheffe Models. Avoiding the interpretation of the regression coefficients for now, the Scheffe model can be used for modeling the design space. Using the regression coefficients as estimates of the effects, then the nondimensionalized property operator expression for the firstorder model becomes the following: Ωk )

βk * ψk(yk)ref

(25)

Since the regression coefficients for the Scheffe model are combinations of multiple effects, they have the potential to have negative values. It can be theorized that if the coefficients only

Figure 5. Details of negative property cluster space for a three-property system.

represented pure component values, then the property would always be positive. Therefore, a negative regression coefficient is indicative of a strong colinearity and/or nonlinearity effect overwhelming the weak linear effect. This situation can cause problems within the traditional property clustering framework, but is not unsolvable. In Figures 4 and 5 a ternary cluster diagram containing property models with negative regression coefficients is shown. In the situations with negative regression coefficients, the combined effect of the component resides outside the ternary diagram. Hence, when using models comprised of regression coefficients, some solutions may require mixing with chemical constituents outside of the cluster space defined by the ternary diagram. The region outside of the cluster space is defined as negative cluster space and the region inside the simplex shall here to for be referred to as positive cluster space. The total number of negative cluster regions, NR, is a function of the number of properties, P. NR ) 2P

(26)

For a three-property solution the negative cluster space is comprised of six distinct regions. These regions are of two types: those with property clusters greater than 1 are called type I regions, and those with negative property clusters are called type II regions. Cki > 1, type I regions

(27)

Cki < 0, type II regions

(28)

These regions are related to one another since, in the derivation of property clusters, Eden et al. notes that the clusters must sum to 1.1 p

∑C

ki ) 1

(29)

k)1

For the three-property example eq 29 implies that two negative property clusters equate to a third cluster with a value greater than 1. Likewise, two clusters with a value greater than 1 equate to a third cluster with a value less than 1. The number of different types of (positive and negative) cluster regions NT is related to the number of properties evaluated according to eq 30. NT ) P

(30)

Property clusters in the positive cluster regions may also be estimated from three negative clusters and three clusters greater than 1, respectively. These values may only be used to ascertain

2250 Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009

effects and may not be used in mixing without violating the monotonically increasing rule for mixing proposed by Shelley and El-Halwagi14 and Eden et al.1 To discuss this limitation, it is noteworthy when dealing with negative property clusters to first investigate the constraints on the augmented property index. In eq 21, the nondimensionalized properties are summed to create the AUP. While a negative regression coefficient may create a negative property operator, the advent of such constructs must be constrained by the following rule: Rule 4. All AUP values of the components must be positive. AUP > 0 (31) Without the constraint of eq 31, a negative AUP is mathematically possible. When applied to the linear mixing rules, it gives an infeasible solution to the lever arm and violates eq 23, interstream conservation. This situation is therefore avoided by adjusting the reference values of the properties until the constraint of eq 31 is met. The procedure for executing this reference selection can get quite complicated for more than k> 3 properties. The procedure can be written mathematically as follows:

(33) y(x) ) y(s) + yz where y(s) is the response at the standard mixture and yz is the pseudo property value that represents the property, k, contribution to the mixture. It has been shown that the response at the standard mixture is equivalent to the βo regressor.9 Normalizing the Cox property operator model results in eqs 34-36. Ωzk )

βk ψk(yk)ref

(34)

Ωsk )

βko ψk(yk)ref

(35)

Ωk ) Ωsk + Ωzk

(36)

Likewise the augmented property index is rewritten as eq 37. (37) AUP ) AUPs + AUPz Redefining the property cluster of eq 22 in terms of the pseudo property cluster Czk and the standard property cluster Csk gives eqs 38 and 39. Czk )

u

max

∑ AUP

Fz )

p

∑Ω

ki

The procedure essentially finds the set of reference property operators that give the largest values of AUP. Although the adjusted references give different values for the clusters, the underlying property values are unchanged. This is an important aspect of property clustering as the flexibility to adjust the property operator reference values allows negative regressors to be utilized. 5.2. Visual Solution Using Cox Models. The above discussion is a valid solution for mixing and developing lists of candidate solutions. However, if the objective is to screen constituents, then some knowledge of the pure component effects will be needed in order to reduce additional experimentation each time a new constituent is added to the list of candidates. Since the objective of the screening design is to understand the effects of the constituents on the mixture, then the component clusters need to represent the pure property values void of colinearities and nonlinearities. One method for removing the colinearities is to utilize the Cox modifications on the Scheffe canonical models. In this method the major colinearity introduced by eq 12 is removed. It has been shown by Cornell9 that the response surfaces for the Scheffe and Cox models are identical. This result is also true of the property clustering solutions utilizing the Scheffe and Cox models. Rewriting eq 17 in terms of a pseudo property value yz results in eq 32 p

i

( )

βi ∆) 1 - si i

AUPz AUP

(40)

AUPs (41) AUP Fz is the pseudo correction factor and Fs is the standard correction factor. These are combined in eq 42 to give the relationship between the true property cluster, the pseudo property cluster, and the standard property cluster. Fs )

k)1



Ωsk

(39) AUPs The sum of these clusters will not give the true cluster, Ck, as defined in eq 28, without correcting for the different AUP values. This is done using a set of correction factors as shown in eqs 40 and 41.

AUPi > 0 i ∈ u AUPi < M i ∈ u ψk(yk)ref > 0 k ∈ p ψk(yk)i Ωki ) ψk(yk)ref AUPi )

(38)

AUPz

Csk )

i

i)1

s.t.

Ωzk

u

∑βx )y

z

i i

(32)

i

By inserting eq 32 in eq 21, the property operator expression assumes the form of eq 33,

Ck ) FsCsk + FzCzk

(42)

Rewriting eq 42 in terms of the mixture with pseudo properties and standard properties gives eq 43. u

∑xΩ i

CkM )

z ki

i

AUPM

+

Ωsk s AUPM

s FM

(43)

Inserting the cluster definition of eq 39 in eq 43 and rearranging gives eq 44. u

∑xΩ i

s s CkM ) CkM - FM

z ki

i

AUPM

(44)

Noting that the left-hand side of eq 44 is the same as the product of the pseudo correction factor and the pseudo property cluster of the mixture, the equation is rewritten as eq 45. u

∑xΩ i

z z CkM ) FM

z ki

i

AUPM

(45)

Inserting the pseudo cluster definition to remove the property operator in favor of the cluster and rewriting the AUP of the mixture in pseudo terms gives eq 46.

Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009 2251 u

∑ x AUP C

z z i ki

i

z CkM )

i

z AUPM

(46)

Equation 47 assumes a familiar form of the relative cluster arm using the pseudo relative cluster arm δz. δz )

xiAUPzi z AUPM

(47)

The pseudo relative cluster arm maintains the monotonically increasing criteria imposed by Eden et al.1 provided that all of the augmented property indices used in the solution of the problem are positive, a constraint placed on the solution by eq 31. Combining eqs 46 and 47 into the final form for mixing pseudo clusters gives eq 48. u

z CkM )

∑δ C

z z ki

(48)

i

Visually inspecting the pseudo mixing rule shows that the relative cluster arms are indicative of a mix involving a pseudo feasibility region. The pseudo feasibility region is defined in the same manner as the true feasibility region but corrects for the standard mixture property values using eq 42. Often this pseudo region is located in the negative cluster space. The resulting pseudo relative cluster arms represent the addition of the pseudo components to move the mixture into this region, only now each pseudo component better represents its contribution to the mixtures’ properties, void of most colinearities. It can be seen in Figure 6 that the candidate solutions found using traditional property clustering solution with Scheffe property operators are the same as the candidates found using the pseudo property clustering solution with Cox operators. This is expected since the normal Scheffe and Cox mixing designs achieve the same property response plots.9 Hence, for ease of use, the traditional property clustering method should be used with the Scheffe property operators, while the pseudo property clustering method using Cox property operators should be reserved for interpretation of component effects on the mixture properties. An interesting benefit of using pseudo property clustering with Cox models over the traditional method is the ability to visualize the component’s impact on the mixture properties simultaneously. In the traditional techniques, each response plot is mapped onto the component space. Ignoring the combinatorial explosion issue for a moment, it can be seen that when two or more isoproperties are parallel or conflict in direction, it can be difficult to know where to move the mixture visually. This problem is compounded exponentially when multiple components are evaluated, leading researchers to either limit the number of components in the experiment or use powerful statistical techniques such as PLS. Pseudo property clustering offers a medium ground between the two methods and in some cases can be used in conjunction with PCR and PLS to further clarify solutions, especially when performing screening designs. Rules governing the interpretation of the cluster points in the property cluster space are listed as follows: Rule 5. The visual distance from a pseudo component cluster point to the pseudo feasibility region, as defined by the product targets and the standard mixture, is indicative of the relative magnitude of the component’s effect on the response. Rule 6. If the pseudo constituents lie on opposite sides of a line which passes through the standard reference mixture, then the constituents are said to be inversely related.

5.3. Optimization. The determination of optimal conditions from an experimental design is among the most complex problems for a researcher.20 In many cases because of the complexity of the interactions, especially those across multiple scales, deterministic models are not sufficient and the optimization occurs by analytical or by a hybrid analytical-computational method. In the context of DOE, interpretation of the models, model parameters, and the effects of each of the components on the properties of the mixture is the objective of screening designs. Beyond screening designs, the prediction of the optimum from the property models can help to focus future experimentation and improve the interpretation of the property models. Reaching the optimum is more efficient if the obtained model is adequate.20 The F-test is most often used as a measure of the adequacy or lack-of-fit of the model. Increasing the number of model parameters is usually the preferred choice to improve the lack-of-fit, often resulting in quadratic, specialcubic, or even special-quartic models.22,23 For these higher order models, transformations would need to be applied in order to sufficiently linearize them for use in the property clustering algorithm. Once the model is deemed adequate, the region of applicability needs to be determined. Since the mixture design is performed in the local domain, then the resulting regression coefficients of the property models express factor effects exactly in that part of the domain. If the predicted optimum lies outside of the experimental region studied, then the experiments need to be recentered around the optimum. Many techniques exist to achieve this repositioning including gradient, nongradient, and other optimization methods. Gradient methods are based on the derivative of the response surface model and, as such, are used only when the model is deemed to be adequate. The most common gradient method is the method of steepest ascent which performs experiments at predetermined steps along the vector formed by the gradient of the response surface.20,22,23 The step size is set according to the component that has the largest effect. The method is sufficient for single property models. To handle multiple responses, the method uses weighting functions on the response surface property models to create an overall gradient function.20,24 The weighting factors and values are at the discretion of the experimenter, which is a less than ideal situation. Also, when the number of chemical constituents is large, then graphical representation of the method becomes difficult. Property clustering can overcome both of these shortcomings. Since property clustering converts the properties into conserved surrogate property clusters, the steepest ascent gradient method can be entirely performed in the clustering domain at reduced complexity. The terms used in the gradient expression are the regressors corresponding to each effect direction and remain unchanged as the solution is marched toward the predicted optimum represented as the feasibility region. Once the solution leaves the feasibility region, the method is stopped and a new MDOE may be conducted. In cases where the gradient method is unable to reach the feasibility region, either the adequacy of the model needs to be improved or a more realistic set of targets needs to be specified. If the adequacy of the model cannot be improved enough to be deemed sufficient and, as such, is deemed inadequate, then nongradient optimization is performed. Nongradient optimization searches for the optimum using a step-by-step comparison of obtained property values. One of the most common types of nongradient optimizations is the simplex self-directing method.20 This method works by first conducting a simple mixture design

2252 Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009

Figure 6. Three-component system of polyethylene (x1), polystyrene (x2), and polypropylene (x3) mapped to a cluster space using both Scheffe and Cox property operator models.

around an initial guess. Next the lowest response property value is dropped and a mirror-image experiment opposite of the dropped value is conducted. The procedure continues until the algorithm repeats the same experiments, indicating that the optimum is somewhere in the bounded region. The difficulty with this method is again the inability to visualize the march toward solution for systems with multiple chemical constituents and/or multiple properties. Visualizing the solution in the property cluster domain allows the experimenter to measure the progress of the solution toward the feasibility region regardless of the number of chemical constituents studied for up to three properties. For more than three properties, the method uses an algebraic method.18 Since, by definition, use of the nongradient method is indicative of an inadequate model, then the terms of

the model cannot be used to convert the measured properties to equivalent constituent fractions. Without this mapping ability, property clustering can only be used as a visualization tool. Unique to mixture design are the constraints on the factor space imposed by eqs 6 and 7. Combined with the knowledge of the pure components used in the design, these constraints usually preclude the domain-related optimum searches referenced above. Regardless, once knowledge of the adequacy of the feasibility region has been obtained, then a new MDOE can be performed. Good properties of the MDOE consist of orthogonality, rotatability, and symmetry about an experimental center point.20 The spacing of the experimental design points around the target optimum in the new MDOE is performed in a manner similar to the original MDOE. One exception,

Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009 2253 Table 1. Simplex-Lattice Design for the Design of a Polymer Blend of Spun Yarn chemical constituents

response properties

exp run

x1

x2

x3

x4

y1

y2

y3

1 2 3 4 5 6 7 8 9 10

0.96 0.50 0.50 0.73 0.50 0.73 0.50 0.50 0.25 0.57

0.02 0.48 0.02 0.25 0.25 0.02 0.25 0.00 0.25 0.17

0.02 0.02 0.48 0.02 0.25 0.25 0.00 0.25 0.25 0.17

0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.25 0.25 0.08

11.75 10.69 13.91 11.22 12.30 12.83 10.82 12.57 11.99 12.01

9.70 11.20 10.80 10.45 11.00 10.25 10.66 10.44 11.26 10.64

1.29 1.14 1.18 1.22 1.16 1.24 1.20 1.22 1.14 1.20

Table 2. Scheffe and Cox Model Regression Coefficients Scheffe Models

Cox Models

i

ψ1(βi*)

ψ2(βi*)

ψ3(βi*)

ψ1(βi)

ψ2(βi)

ψ1(βi)

1 2 3 4 s

11.70 9.40 16.40 10.47

9.59 12.85 11.98 10.61

1.30 0.98 1.07 1.20

-0.297 -2.597 4.403 -1.524 11.997

-1.041 2.219 1.349 -0.024 10.631

0.104 -0.216 -0.126 0.004 1.196

however, is the choice of defining the experimental design region in terms of the chemical constituents or properties. Since the success of the design is judged by the ability to arrive at the feasibility region, which is defined in terms of properties, then it would follow that the new experimental design region should be conducted using properties of the feasibility region mapped back to the component space. Using these mixture ratios will most likely result in a loss of symmetry about the experimental design center point. Various optimality algorithms describing the variance profiles associated with each type of design can be used to quantify the effect of the loss of symmetry. It is the experimenter’s discretion whether to use the design points that better describe the feasibility region or the design points that are more symmetric. 6. Case StudysPolymer Blend for Spun Yarn The case study used to highlight this methodology is the development of a polymer blend of spun yarn for marine applications. The three properties evaluated were thread elongation (y1), knot-strength (y2), and density (y3). Four components were evaluated using a simplex-lattice design of 10 experiments. Data for components i ) 1, 2, 3 (low-density polyethylene (LDPE), polystyrene, and polypropylene) were compiled from Cornell.9 Data for a fourth component i ) 4, nylon 6,6, was collected from Rodriguez et al.25 The resulting MDOE is shown in Table 1. Using linear regression, first-order Scheffe and Cox models were developed for the properties thread elongation and knot strength. For the third property, density, a previously developed pure component property operator model was utilized.1 Fitting these models to the data set in Table 1 resulted in significant models with all individual component terms significant. The regression coefficients for the two model types are found in Table 2. To determine the specific product targets, it is important to understand the relationship between the end-use attributes and the measured properties. The product in this design is to be used in the spinnaker sheets and guys on a race sailboat. One of the important properties in this application is that the sheets and guys have some stretch so that the spinnaker will stay filled in a gust of wind, but not so much that it loses its designed

aerodynamic shape. It has been determined that this attribute is best observed with a thread elongation between 12 and 16 kg of force. The sheets and guys are also under an immense load and need a high breaking strength while maintaining some flexibility. This attribute has been determined to be best represented by a knot strength between 12 and 13 lb of force. Finally, during the setting and dousing of this spinnaker on a race boat, the sheets and guys may contact the water surface. If they are too dense, they may ride under the boat and foul the keel. If they are not dense enough, their diameter could change too much when put under load. On the basis of these attributes, the specific volume should be between 1.0 and 1.25 mL/g. Utilizing the regression coefficients found in Table 2 in the models of eqs 1 and 8 results in the response surfaces in the simplex diagrams of Figure 7. The experimental design points and their resulting property values are also plotted in Figure 7. The feasibility region is the region shown in green and yellow. Immediately obvious is the difficulty in the interpretation of the overall design because of the use of multiple charts. Likewise the influence of single chemical constituents is difficult to measure because they may have competing effects for different properties; adversely effecting one property to the benefit of another. Furthermore, should an additive be chosen to supplement the design, additional figures would be required to determine its impact and relationship to the current design. All of these conditions make for a less than ideal situation known as combinatorial explosion. To provide an easier method of which to examine the impact of components on all the properties simultaneously, the design is analyzed using property clustering. Using eq 25 for the Scheffe model and eqs 34 and 35 for the Cox model, a set of dimensionless property operators are created with a set of references chosen to ensure a positive AUP, as shown in Table 3. The dimensionless property operator models are then converted to clusters using eqs 22, 38, and 39. Next, the responses measured at each of the design points are converted to property clusters. Finally, the specified product targets are converted to a target feasibility region. All of these terms are then plotted in property clustering diagrams shown in Figures 8-10 with the vertices representing each of the three properties in their cluster forms. By reducing the complexity of the visualization of the design problem, it is now clear that the third chemical constituent, polypropylene, is closest to the feasibility region, followed by constituent 4 (nylon 6,6), constituent 2 (polystyrene), and constituent 1 (LDPE). The addition of polypropylene to a hypothetical mixture will do little to change the properties of the mixture, suggesting it should be used as a filler. Of the three remaining polymers, LDPE appears to have the largest impact on the mixture properties when looking at Figure 9. However, since the properties of Figures 8 and 9 were derived using Scheffe models, inherent colinearities exist. Converting the Scheffe canonical models to Cox polynomial models with a standard reference mixture at location (0.653, 0.173, 0.173, 0.083) removes the primary colinearity resulting in a better visualization of the effect of component i. Figure 10 illustrates the standard and pseudo clusters. Here it is confirmed that polypropylene has the smallest effect on the combined mixture properties. However, by removing most of the colinearity in the model, the result now clearly shows that polystyrene has the strongest effect on the combined mixture properties. It also shows that polystyrene and LDPE have inverse effects, a result not clear in Figure 7.

2254 Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009

Figure 8. Experimental design points of the case study, a four-component mixture of polyethylene, polystyrene, polypropylene, and nylon 6,6 mapped to property cluster space using the Scheffe model.

Figure 9. Highlighted region of polyethylene, polystyrene, polypropylene, and nylon 6,6 mapped to property cluster space. Table 3. Nondimensionalized Property Operators and References Scheffe models

Figure 7. Various simplex diagrams based on first-order models of the responses (a) thread elongation, (b) knot strength, and (c) specific volume. The product target regions are in green and yellow.

Evaluating the placement of the experimental design points in the property cluster space also offers insights into the design. In Figures 8 and 9 the experimental design points are translated to the property cluster space. A single experiment consisting of an equimolar tertiary mixture falls within the feasibility region.

Cox models

i

Ω1

Ω2

Ω3

Ω1z

Ω2z

Ω3z

1 2 3 4 s ref

0.23 0.19 0.33 0.21

65.13 87.27 81.36 72.03

89.94 67.80 74.03 83.03

15

15

1

0.00 -0.02 0.04 -0.01 0.11 107

-1.18 2.51 1.52 -0.03 12.01 0.8849

1.19 -2.48 -1.45 0.04 13.72 0.0872

Unfortunately, the rest of the design points are outside the feasibility region and none of the candidate mixtures fall within the AUP range. This inference can also be made when investigating Figure 7 but would require much more effort. To obtain the candidate solutions via linear mixing, the property operator models must be partially extrapolated which introduces unneeded error into the solution and suggests an insufficient design. To prevent model extrapolation, the design points should be repositioned so that they cover the entire feasibility region. The procedure for executing the repositioning must take into consideration the increase in accuracy of the property space at the expense of optimality of the component space. When mapping the boundary points of the feasibility region

Ind. Eng. Chem. Res., Vol. 48, No. 4, 2009 2255

presented. The recently introduced property integration framework has been extended to include experimentally derived property operator models: specifically first-order Scheffe canonical and Cox polynomial models. When interpretation of the chemical constituent’s impact on the mixture property is warranted, Cox derived property operator models are utilized such that the location of the pseudo chemical constituent relative to the pseudo feasibility region as defined by the property targets and the standard reference mixture is indicative of its impact on the mixture’s properties. The accuracy of the design is visually observed by placing the design points in the property design space. A significant result of the developed methodology is that for problems that can be satisfactorily described by just three properties, the experimental mixture design problems are analyzed visually on a simplex diagram, irrespective of how many chemical constituents are included in the search space. However, algebraicand optimization-based approaches can easily extend the application range to include more properties.

Figure 10. Clustering diagram for the case study showing the Cox polynomial model in negative cluster space.

back to the component space using the predetermined property operator models, it was found that some mixture solutions violated eq 7. This observation means that the boundaries of the feasibility region are unattainable for the existing response surface and need to be reparametrized. Two methods exist for the reparametrization: (1) increase the complexity of the model to improve its fit or (2) improve the estimates of the component effects by repositioning the effects over the optimum. Since the fit of the first-order models were found to be excellent with a p-value of