
Optimal Selection of Raw Materials for Pharmaceutical Drug Product Design and Manufacture using Mixed Integer Nonlinear Programming and Multivariate Latent Variable Regression Models

Salvador García-Muñoz*,† and Jose Mercado‡

†Pfizer Worldwide Research & Development, Groton, Connecticut 06340, United States
‡Pfizer Global Manufacturing, Vega Baja, 00693, Puerto Rico



ABSTRACT: This work presents a mathematical approach to make the most efficient use of historical data from development experiments (or commercial manufacture) to select the ingredients for the composition of a new product or to select the materials from inventory for the manufacture of a new lot of finished product. The method relies on the construction of a latent variable regression model that serves to predict the result of combining certain materials. This predictive model is then embedded into an optimization framework to find the best combination among the pool of available materials according to a well-defined objective and subject to predefined constraints. The framework is illustrated with two successful applications: the selection of the formulation and process for the design of a new solid oral drug product with high drug concentration, and a continuous improvement project in commercial manufacture seeking to select the optimal set of lots of raw materials from the inventory to be mixed together in the manufacture of a new lot of a controlled release product, given certain targets for dissolution levels and subject to material availability.

1. INTRODUCTION
The complexity of designing a new pharmaceutical solid oral drug product involves making decisions in (at least) three major areas: (i) incoming material physical properties (particle engineering), (ii) selection of ratios for incoming materials (formulation design), and (iii) manufacturing route and processing conditions (process development). The intrinsic dependency of these three areas and the generally large number of unknowns at the beginning of the development cycle make this an even more challenging task. This challenge has driven researchers to develop tools and techniques to accelerate this decision-making process while fulfilling the needs of the business.

In the particle engineering field, Iacocca et al.1 review the state of the art in the practices and techniques related to the engineering of the active pharmaceutical ingredient (API) particle. Their paper comprehensively covers the decision trees that are typically followed to determine processing routes and particle size specifications for the API in order to fulfill constraints on content uniformity, bioavailability, and manufacturability.

The formulation selection area involves other constraints, mostly related to the mechanical properties of the blend of ingredients and their chemical and physical stability in the solid phase when mixed in certain proportions in the final product. The techniques currently used to advance the decisions in this area generally involve small-scale laboratory testing to determine mechanical profiles for the candidate materials or blends of materials2−4 and hence aid the decisions as to which materials (and in what ratio) should be combined to result in an acceptable solid drug product. Stability testing of the final product has also been extensively studied to determine the chemical behavior of the mixture and (a) identify material combinations that will prevent or minimize degradation paths in the product and (b) determine acceptable shelf life under certain packaging conditions.5,6

The third of the major decision-making areas is the task of selecting the manufacturing route and the processing conditions for the selected unit operations. In pharmaceutical practice, this task is traditionally broken into two sequential activities: (i) select the manufacturing route and (ii) select the operating conditions. Iacocca et al.1 present a brief discussion of some of the decision-making criteria used in selecting one of the three common drug product manufacturing routes: direct compression, and dry or wet granulation. The outcome of these decision trees (which are fairly common in the pharmaceutical industry) is generally driven by the concentration of API, its particle size, and the mechanical characteristics of the blend. Once the route is selected, experimentally driven approaches are usually followed to determine operating conditions and envelopes of operation. This last step has seen a burst of research activity driven by the Quality by Design phenomenon in the regulatory arena.7

Of the three aforementioned areas (particle engineering, formulation selection, and process development), process development is perhaps the one that has been the least methodically studied in the pharmaceutical sciences, compared to what has been achieved in the chemical and petrochemical sector. The selection of the manufacturing route and the identification of optimal operating conditions for large and complex chemical processes is nowadays done simultaneously using advanced mathematical tools developed largely by the process systems engineering community.8


Such an exercise is referred to as “process synthesis”9,10 and, as a discipline, is slowly making its appearance in pharmaceutical applications when appropriate models exist for the candidate unit operations.11

This introduction is given to frame the work presented herein, which is concerned with the second of the areas discussed above: the selection of raw materials and their ratios. And although the discussion so far has focused on the scenario of formulating a new product, a similar situation exists in commercial manufacturing when selecting which lots of materials from the available inventory are to be used to manufacture a new lot of drug product. This work presents the application of an overarching mathematical approach to determine optimal material selections and ratios considering two scenarios: (a) a new solid oral drug product in development with specific targets of tablet hardness for an immediate release dosage form (process conditions are also optimized for this case) and (b) the manufacture of a new lot of a drug product that is already in commercial production, aiming to meet the dissolution specifications for this controlled release dosage form.

Our approach uses a latent variable regression model (LVRM)12 coupled within an optimization framework to identify plausible optimal solutions to each problem. The use of optimization techniques coupled with LVRMs is well-established in the literature.13−18 We present two successful industrial cases of applying these methods to complex problems found in the pharmaceutical industry. Only a brief discussion of the major terms of the optimization framework is given herein; the reader is referred to a recent review by Tomba et al.19 for more details on the theory behind this approach. So far, few examples are found in the literature of the successful application of these techniques to the selection of materials for a new product.16,20 To our knowledge, this is the first published pharmaceutical application where these methods are applied to guide the selection of materials for a new product, as well as the selection of materials from the available inventory to achieve target quality properties of a drug product.

The paper is organized as follows. In section 2, we provide a general review of the mathematical components of a computer-aided decision-making system. The details of our choice for the underlying predictive model are given in section 3, and in section 4 we present the use of this model to build the objective function and the constraints of our optimization framework. Sections 5 and 6 present two cases where this framework was successfully applied, and conclusions and final remarks are given in section 7.

2. MODEL-BASED OPTIMAL SELECTION OF MATERIALS AND BLENDING RATIOS
Utilizing the computer to expedite a decision-making process to select the best possible solution to a given problem statement involves four components:
(i) a mathematical expression that can quantitatively relate decision variables with their effects (i.e., a predictive model);
(ii) a quantitative measure of the desirability of a candidate solution (i.e., an objective function);
(iii) a set of constraints to prevent impractical solutions (e.g., the mixture ratios must sum to one); and
(iv) an optimization engine to search among the multiple potential solutions for the one that is optimal according to the established objective function, while adhering to the constraints.

The mathematical model necessary in this case must be able to predict the effect of processing (at defined conditions) a certain set of materials in a given proportion on the quality properties of interest. Some proposals in the pharmaceutical literature involve the use of blind search methods based on algorithms developed in the artificial intelligence field;21−23 other approaches involve the use of multivariate latent variable regression methods (LVRMs), which offer the benefit of interpretability and are often referred to as gray models. And although such LVRM methods were developed to address mixture problems outside the pharmaceutical sector,24,16,25 they have been successfully applied to predict the behavior of powder blends in pharmaceutical applications.26 The use of LVRMs to model mixture data can be traced back to the early work by Kettaneh-Wold;27 these early approaches would only consider the ratios of the ingredients used and the resulting change in the properties of the mixture. The work of Muteki and MacGregor28 introduced the use of the continuum established by the physical properties of the raw materials, the blending ratios, the process conditions, and the resultant quality attributes of the product.

The case studies presented in this work used an LVRM as the backbone model to provide predictions. This involved a somewhat lengthy process of obtaining a coherent data set (illustrated in Figure 1) for each case study. This step represented a significant effort, since most of the information was contained in paper batch records and the necessary data had to be extracted to electronic tables by hand and later verified. And although an overarching mathematical solution is given to both cases, they differ significantly in the nature of the data collected for each case.

Figure 1. Blocks of data matrices from a material mixture scenario with processing conditions.

For the first case study, the objective was to accelerate the design of a new product. This implied making a choice on formulation and process conditions (the route of manufacture had already been selected and fixed). On the formulation side, the set of candidate excipients represented a collection of multiple different materials. As a result, the physical properties matrix (X in Figure 1) spanned a wide range of physical properties (material-to-material variability), the matrix R (Figure 1) represented the different formulations that were experimented on, and the process conditions matrix (Z) exhibited large changes, as it was necessary to experimentally cover the operational range of the process.

The second case discusses a continuous quality assurance exercise for a legacy product that is already in commercial production.


The formulation and the processing conditions are fixed and cannot be changed. To minimize quality variability, a model was built to predict the quality of the product as a function of the lot-to-lot variability in the raw materials. As a result, the physical properties matrix (X in Figure 1) consisted of the physical properties of the multiple lots of the same excipients and the API, and since the process was not changed, there was no need for a Z matrix (Figure 1). And although the formulation was fixed (i.e., the total amount of material for each excipient was fixed by recipe), the amounts to be used from the different available lots of a single material were not. These within-material ratios became the matrix R (Figure 1) and were the primary degrees of freedom in the subsequent optimization exercise.
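To make this data layout concrete, the short Python sketch below (with invented numbers) shows within-material ratios for one excipient across three inventory lots and the blend properties they imply; it is only an illustration of the structure described above, not data from this study.

import numpy as np

# Within-material ratios of three inventory lots of the *same* excipient used
# in three historical batches; each row must sum to one because the recipe
# fixes the total amount of the excipient, only the split across lots is free.
R_within = np.array([
    [1.00, 0.00, 0.00],   # batch 1 used lot 1 only
    [0.40, 0.60, 0.00],   # batch 2 blended lots 1 and 2
    [0.00, 0.25, 0.75],   # batch 3 blended lots 2 and 3
])

# Certificate-of-analysis properties per lot (e.g., particle size d50, bulk
# density); values invented for illustration.
X_lots = np.array([
    [105.0, 0.48],
    [ 98.0, 0.51],
    [112.0, 0.46],
])

assert np.allclose(R_within.sum(axis=1), 1.0)   # within-material ratios sum to one
blend_properties = R_within @ X_lots            # the lot-to-lot variability seen by the process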

Figure 2. Rearrangement of the data using estimated properties of the mixture.

3. MODELING MIXTURE AND PROCESS DATA USING LVRM
The complete data set collected in these scenarios (Figure 1) eventually contained data for L lots of finished product that were processed at P process (and/or environmental) conditions, using M materials that were characterized with N physical properties, resulting in a product that was characterized with Q quality attributes. Given these definitions, the dimensions of the matrices illustrated in Figure 1 are described in Table 1.

Table 1. Descriptions and Dimensions of the Matrices Considered in the Material Mixture Problem
matrix | contains information on | rows | columns
Z | process and/or environmental conditions used for each lot of final product | L | P
R | ratio used of each material per each lot of final product; each element is a number between 0 and 1 | L | M
X | physical/chemical properties for each material used | M | N
RXI | weighted average physical properties over all materials for each lot | L | N
Y | obtained quality per each lot of final product | L | Q

The case studies discussed here followed the approach suggested by Muteki et al.,16 in which the matrices X and R are combined using an ideal mixing rule in order to accommodate all blocks of information into a special type of LVRM called a multiblock partial least-squares model.29 For the case of a new product, it is important to verify that the summation of the ratios for a lot equals one; for the case of the commercial product, the summation of the within-material ratios must equal one for each lot. The ideal mixing rule simply creates a new matrix with the weighted average material physical properties (Figure 2) for each of the L lots of finished product. This weighted average is calculated for each of the N physical properties over the M materials used per lot (eq 1).

RXI(l, n) = \sum_{m=1}^{M} R(l, m)\, X(m, n), \qquad \forall\; l = 1, \ldots, L;\; n = 1, \ldots, N \qquad (1)

This new matrix of weighted average physical properties (RXI), with L rows and N columns, is of course a function of the materials selected for each of the L lots.
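As a minimal illustration, the ideal mixing rule of eq 1 is a single matrix product; the Python sketch below uses invented dimensions and random numbers.

import numpy as np

L_lots, M_mats, N_props = 3, 4, 5
rng = np.random.default_rng(1)

R = rng.dirichlet(np.ones(M_mats), size=L_lots)   # ratios per lot; each row sums to one
X = rng.random((M_mats, N_props))                 # material property database (M x N)

RXI = R @ X              # RXI(l, n) = sum_m R(l, m) * X(m, n), i.e., eq 1 for all lots at once
assert RXI.shape == (L_lots, N_props)
assert np.allclose(R.sum(axis=1), 1.0)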

The final collection of matrices can be fitted using a multiblock partial least-squares model (Figure 3). The parameters estimated for this type of model consist of three loadings matrices (W*, P, and Q) and two matrices of scores (T and U). The resultant LVRM (eq 2) exhibits three important features: (i) it captures any correlation across process conditions and material properties, (ii) it captures any correlations across quality attributes, and (iii) it provides a prediction of the response (in this case Y) as a function of the scores (T) of the regressor (in this case the Z and RXI matrices). For more details on the theory behind LVRMs, the reader is referred to the work by Burnham et al.12 and Hoskuldsson.30

Figure 3. Data structures in the multiblock model, model parameters, and prediction for a new observation vector.

[Z \; RXI] = T P^T + E_X
Y = T Q^T + E_Y
T = [Z \; RXI] W^*
U = Y Q \qquad (2)

Once the model is fitted, its parameters (loadings) can be interpreted to infer the driving forces relating the physical properties of the blend to the quality attributes of the product. The model is also used to provide a prediction of the quality attributes of the product (ŷ) given a completely new set of values for the process conditions and the weighted average physical properties (illustrated in Figure 3 as z_new and rxi) using eq 3.

\hat{y} = Q \tau_{new}
\tau_{new} = W^{*T} [z_{new}^T \; rxi^T]^T \qquad (3)
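For illustration only, the prediction step of eq 3 can be written in a few lines of Python once the loadings are available; the loadings, scalings, and dimensions below are random placeholders rather than the values estimated in this work.

import numpy as np

rng = np.random.default_rng(2)
K, A, QY = 7, 2, 3                      # regressor columns ([Z RXI]), latent variables, quality attributes

Wstar = rng.random((K, A))              # weights W*
Qload = rng.random((QY, A))             # Y loadings Q
x_mean, x_std = rng.random(K), np.ones(K)    # centering/scaling of the training regressor
y_mean, y_std = rng.random(QY), np.ones(QY)  # centering/scaling of the training responses

x_new = rng.random(K)                           # concatenated [z_new, rxi]
x_scaled = (x_new - x_mean) / x_std             # same pretreatment as the training data
tau_new = Wstar.T @ x_scaled                    # scores, eq 3
y_hat = y_mean + y_std * (Qload @ tau_new)      # prediction mapped back to engineering units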


Furthermore, the model also provides two important diagnostics useful to assess the quality of a prediction: the squared prediction error (spe_x) and the Hotelling’s T2 statistic (HotT2) for the regressor (eq 4), where A is the number of latent variables and σ_a is the standard deviation of the ath score in the training data.

spe_x = \sum \left( [z_{new}^T \; rxi^T]^T - P \tau_{new} \right)^2
HotT^2 = \sum_{a=1}^{A} \left( \frac{\tau_{new,a}}{\sigma_a} \right)^2 \qquad (4)

The spe_x metric provides a measure of the mismatch between the correlation structure exhibited across the elements of the new regressor (e.g., across process conditions, physical properties, or both) and that of the data used to fit the model. A low value of spe_x indicates that the correlations across the elements of the new regressor are consistent with those found in the original data set, irrespective of the magnitude of the elements. The Hotelling’s T2, on the other hand, provides a metric of the overall magnitude of the elements of the new regressor. A low value of HotT2 implies that the values of the elements in the new regressor are close to the mean value per column of the data used to fit the model (as all data are mean centered and autoscaled before fitting the model). Together, spe_x and HotT2 provide a useful framework to determine whether a prediction is being performed within the capabilities of the model. This comfort region for the prediction is established by the confidence intervals for these two diagnostics, derived from the original data used to build the model. The methods to determine these confidence intervals are well-established31 and outside the scope of this manuscript.
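A minimal sketch of the two diagnostics in eq 4 follows, assuming the loadings P, the weights W*, and the standard deviations of the training scores are available (random placeholders are used below).

import numpy as np

rng = np.random.default_rng(3)
K, A = 7, 2
Wstar, P = rng.random((K, A)), rng.random((K, A))
sigma = rng.random(A) + 0.5              # std. dev. of each training score column

x_new = rng.random(K)                    # centered/scaled [z_new, rxi]
tau_new = Wstar.T @ x_new
residual = x_new - P @ tau_new           # the part of x_new the latent space cannot reconstruct
spe_x = float(residual @ residual)       # first diagnostic in eq 4
hot_t2 = float(np.sum((tau_new / sigma) ** 2))   # second diagnostic in eq 4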

4. OVERARCHING OBJECTIVE FUNCTION
The next step after a predictive model is fitted and deemed valid (e.g., via cross-validation and analysis of the model parameters) is to incorporate it within an optimization framework. This process is schematically illustrated in Figure 4. The objective for the optimization engine is to determine which of the available materials are to be selected to be part of this new lot and in what ratio these materials should be combined, subject to any process or environmental conditions, in order to obtain the next best lot of finished product.

Figure 4. Historical data and dependency of the mixture properties for a new candidate solution on the materials selected from a superset of available materials.

The numerical criterion used to determine the desirability of a candidate solution was the squared difference between the predicted and the target quality of the product (eq 5). The best solution is the one that results in the least squared difference between prediction and target. This quadratic objective function has many advantages from an optimization theory perspective; among others, it is a continuously differentiable function with a unique global minimum.

\min \left( \hat{y} - y_{target} \right)^2 \qquad (5)

The set of equations formed by the objective function and the predictive model (eqs 5 and 3) already provides enough elements for the optimizer to conduct a search for a solution that minimizes the objective function, since all mathematical dependencies are well-defined. This solution, however, could be infeasible or impractical due to the absence of constraints. For example, a potential solution could very well involve blending an unreasonably high number of materials, or using process parameters outside the operational range of the equipment. Constraints on process parameters (greater than, less than, or equal to) are simple to include (eq 6a), as all process parameters are explicitly considered in the model. Similarly, there might be constraints on the performance of the product that need to be considered (eq 6b).

z_{new}(p) \le z_{new\_max}(p), \quad z_{new}(p) \ge z_{new\_min}(p), \quad z_{new}(p) = z_{new\_fixed}(p) \qquad (6a)

\hat{y}(p) \le y_{max}(p), \quad \hat{y}(p) \ge y_{min}(p), \quad \hat{y}(p) = y_{fixed}(p) \qquad (6b)

Constraining the number of selected materials, or lots of materials, however, is not straightforward, since there is no explicit variable that keeps count of the number of lots chosen in a candidate solution. The inclusion of this constraint requires the definition of a new set of binary variables (i.e., variables that can only be 1 or 0). Let us define a new binary vector r_bin with M_A elements (corresponding to the M_A available materials), such that if r_bin(m_a) is equal to 0, material m_a is not part of the candidate solution, and vice versa, if r_bin(m_a) is equal to 1, then material m_a is chosen to be part of the candidate solution. Furthermore, given the incoherence of having a mixing ratio greater than 0 for a material that is not chosen for the candidate solution, the constraint in eq 7 needs to be added to associate r with r_bin in a consistent manner.

r(m_a) \le r_{bin}(m_a) \qquad (7)

Once these binary variables are defined and consistently associated with the corresponding mixture ratios, the following constraints can be added to limit the total number of materials (eq 8a), or lots of material J (eq 8b), that are selected in a candidate solution.

\sum_{m_a} r_{bin}(m_a) \le max\_number\_of\_materials \qquad (8a)


\sum_{m_a} r_{bin}(m_a, j) \le max\_number\_of\_materials(j) \qquad (8b)

Other important constraints to consider are those pertaining to the concentration of materials. For example, one might want to constrain the concentration of a given material to be above a minimum, below a maximum, or fixed, while keeping the total summation of the ratios equal to one (eq 9).

r(m_a) \ge r_{min}(m_a), \quad r(m_a) \le r_{max}(m_a), \quad r(m_a) = r_{fixed}(m_a), \quad \sum_{m_a} r(m_a) = 1 \qquad (9)

In other scenarios, where the overall formulation is fixed as well as the total quantities required for each of the J necessary ingredients (as in the case of commercial manufacture), it becomes necessary to include calculations (eq 10) to (i) ensure that the candidate solution adheres to the mass required by the recipe for material J, (ii) prevent the optimizer from choosing more material than is available in inventory for a given lot (m_a) of material J, and (iii) ensure that the summation of the within-material mixture ratios equals one.

kg(m_a, j) = r(m_a, j) \, kg_{req}(j), \quad kg(m_a, j) \le kg_{available}(m_a, j), \quad \sum_{m_a} r(m_a, j) = 1 \qquad (10)

Finally, it is desirable for the optimizer to be able to detect when the physical properties of a material are incoherent in magnitude and correlation structure with the data used to fit the model (which is not unusual when there is a provider change, or when the provider transfers the manufacture of a raw material to another facility). An elegant way to establish these constraints is by imposing maximum values on spe_x and HotT2. Doing so has proven to be an effective way to establish multivariate constraints that ensure the solutions are not unreasonably different from the historical data.18

The collection of eqs 3−10 constitutes the final mathematical problem to be optimized. Due to the existence of nonlinear constraints (eqs 3 and 4) and decision variables that are both continuous and integer (z_new, r, and r_bin), this is a mixed-integer nonlinear programming (MINLP) problem, and as such it requires an appropriate optimization solver. The following sections describe two real cases of the application of this mathematical framework. In both cases, the model building and data preprocessing were performed in MATLAB (The MathWorks, Natick, MA) and the optimization calculations were done in GAMS using the BARON solver (GAMS Development Corporation, Washington, DC).
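To make the structure of eqs 5−9 concrete, the sketch below states a toy version of the problem in Pyomo. The authors solved the actual problems in GAMS with BARON; the dimensions, the random surrogate loadings, and the solver named in the commented line are illustrative assumptions only, and the spe_x/HotT2 constraints are omitted for brevity.

import numpy as np
import pyomo.environ as pyo

rng = np.random.default_rng(4)
M_A, N, A, QY = 6, 4, 2, 1            # available materials, properties, latent variables, quality attributes
X = rng.random((M_A, N))              # material property database (autoscaled), invented
Wstar = rng.random((N, A))            # weights W* (no Z block, for simplicity), invented
Q = rng.random((QY, A))               # Y loadings, invented
y_target = np.array([0.8])            # invented quality target

m = pyo.ConcreteModel()
m.mats = pyo.RangeSet(0, M_A - 1)
m.r = pyo.Var(m.mats, bounds=(0.0, 1.0))        # mixture ratios
m.rbin = pyo.Var(m.mats, domain=pyo.Binary)     # 1 if the material is part of the blend

m.sum_to_one = pyo.Constraint(expr=sum(m.r[i] for i in m.mats) == 1.0)          # eq 9
m.link = pyo.Constraint(m.mats, rule=lambda mdl, i: mdl.r[i] <= mdl.rbin[i])    # eq 7
m.cardinality = pyo.Constraint(expr=sum(m.rbin[i] for i in m.mats) <= 3)        # eq 8a

def y_hat(mdl, q):
    # eq 1 chained with eq 3: blend properties -> scores -> predicted quality
    rxi = [sum(mdl.r[i] * X[i, n] for i in mdl.mats) for n in range(N)]
    tau = [sum(Wstar[n, a] * rxi[n] for n in range(N)) for a in range(A)]
    return sum(Q[q, a] * tau[a] for a in range(A))

m.obj = pyo.Objective(expr=sum((y_hat(m, q) - y_target[q]) ** 2 for q in range(QY)),
                      sense=pyo.minimize)        # eq 5

# pyo.SolverFactory("bonmin").solve(m)   # any MINLP-capable solver; none is bundled with Pyomo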

5. CASE STUDY 1: FORMULATION OF A NEW DRUG PRODUCT
The clinical needs for a new compound required the formulation of a tablet with a high concentration of API. A wet granulation route was selected due to the cohesive and poorly compressible characteristics of the drug and the high drug concentration required. At the point of development when considerable scale-up was necessary, there were data from 35 experiments in which processing conditions and formulations were varied circumstantially. The materials, including API and excipients, were all characterized using tests (Table 2) relevant to wet granulation processing.32 And although this exercise concluded in a satisfactory manner, it is important to point out that mixture designs are available in the literature and that a structured mixture DOE could have been implemented a priori.33

Table 2. Physical Properties Considered in the Design of a New Wet Granulated Product
physical properties measured for case 1: surface area; particle size distribution (laser diffraction); porosity; flow; water holding capacity at equilibrium (DVS); surface energy by inverse gas chromatography; contact angle; density
responses measured: crushing strength at 7, 11, 13, and 15 kN of compression force

The objective was to find a formulation that would result in a tablet with acceptable hardness levels (at least 20 kP of crushing strength at 13 kN of compression force) at the maximum possible concentration of API. An LVRM was fitted to the available data, which consisted of scale-independent process descriptors,34,35 formulations, physical properties of the materials, and the resultant hardness compression profile at the tablet press. With six significant latent variables (cross-validated by jack-knifing), the model was able to capture up to ∼85% of the variability in the processing conditions, 94% of the variability in the formulation space, and 81% of the variability in the hardness compression profiles. The predictive ability of the model is given in Table 3.

Table 3. Diagnostics for the Multiblock PLS Model Fitted to the Data from Case Study 1
LV # | R2X (%) [process] | R2X cum. (%) [process] | R2X (%) [formulation] | R2X cum. (%) [formulation] | R2Y (%) | R2Y cum. (%) | Q2Y (%) | Q2Y cum. (%)
1 | 42.56 | 42.56 | 55.16 | 55.16 | 61.96 | 61.96 | 57.91 | 57.91
2 | 14.16 | 56.72 | 14.23 | 69.39 | 4.66 | 66.62 | 3.06 | 60.98
3 | 9.78 | 66.50 | 12.73 | 82.12 | 5.40 | 72.02 | 7.85 | 68.83
4 | 5.11 | 71.61 | 4.88 | 87.00 | 4.59 | 76.62 | 4.68 | 73.51
5 | 6.83 | 78.44 | 3.87 | 90.86 | 2.81 | 79.43 | 3.72 | 77.23
6 | 6.07 | 84.51 | 3.47 | 94.33 | 2.10 | 81.53 | 2.96 | 80.19

Such a model was then used as the backbone for an optimization exercise. The objective for the optimizer was to determine the process conditions and formulation needed to achieve a given target for the hardness compression profile, subject to an API concentration above an established minimum. This was achieved procedurally by setting a hard constraint on the minimum concentration of the API and solving the optimization problem; this minimum hard constraint was lowered until a feasible solution was obtained. For simplicity, this quantity was then rounded to the closest multiple of 5. Figure 5 illustrates the compression profile obtained for the initial deficient “high drug load” formulation and for the new formulation found by the optimizer. The new formulation contained a 65% concentration of API and clearly fulfilled the established hardness requirements. This formulation was thereafter used for the development of this product.

Figure 5. Hardness compression profile before optimization for the high dose formulation (40% API, left) and after optimization (60% API, right).
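The “lower the hard constraint until a feasible solution is obtained” procedure described above reduces to a simple loop. In the schematic Python sketch below, solve_formulation is a hypothetical placeholder for the full MINLP solve; here it simply pretends that anything at or below 65% API is feasible, mirroring the outcome reported for this case.

def solve_formulation(min_api_fraction: float) -> bool:
    """Placeholder for the real optimization: True if a formulation meeting the
    hardness target exists at this minimum API concentration."""
    return min_api_fraction <= 0.65

def max_feasible_api(start_pct: int = 90, step_pct: int = 5) -> int:
    """Lower the minimum-API hard constraint in 5% steps until feasible; the 5%
    step also provides the rounding to the closest multiple of 5 described above."""
    bound = start_pct
    while bound > 0 and not solve_formulation(bound / 100.0):
        bound -= step_pct
    return bound

print(max_feasible_api())   # -> 65 with the placeholder solver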

6. CASE STUDY 2: FROM COMMERCIAL MANUFACTURING
A legacy product manufactured at commercial scale was being subjected to continuous improvement efforts seeking to reduce the lot-to-lot variability exhibited in the final product dissolution profile, specifically the 30 min dissolution. The product is a controlled release dosage form and involves a total of 11 different excipients, manufactured in a complex production train where all process conditions were kept constant throughout the manufacturing history of the product due to regulatory constraints. A data set similar to the one illustrated in Figure 1 was collected for the entire manufacturing history of the product, spanning 5 years of manufacture. Process conditions were ultimately not used since there was no variability in them over the years. The complete data set consists of the within-material mixture ratios for all the ingredients for the 215 lots of final product considered. There were eleven different raw materials and 96 different physical properties in total. The values of the physical properties for all incoming materials (matrix X in Figure 1) were taken from the certificate of analysis for each material, and the quality of the product was characterized by the mean and standard deviation of the dissolution measured at 30 and 180 min.

An LVRM was fitted to these data using four significant latent variables that captured 62% of the total variability in the physical properties of the raw materials and was able to predict 21% of the variability in the dissolution profile of the product (Table 4). Given that this process is already under tight control, having the ability to systematically predict 21% of the variability in the quality attribute implies great potential for improvement. Further study of the model diagnostics, specifically the correlation coefficient R2 for quality (Figure 6), uncovers that the driving forces identified in the third and fourth latent variables explain the majority of the variability in the mean value of dissolution at 30 min, while the first latent variable explains its standard deviation. Since the mean value of dissolution at 30 min is explicitly set in the quality specification, we concentrated our efforts on studying the score space for the third and fourth latent variables (Figure 7) and the ways to drive a new lot into the center region (indicated with an ellipsoid in Figure 7) of that score space.

Table 4. Diagnostics for the Multiblock PLS Model Fitted to the Data from Case Study 2
LV # | R2X (%) | R2X cum. (%) | R2Y (%) | R2Y cum. (%) | Q2Y (%) | Q2Y cum. (%)
1 | 31.08 | 31.08 | 11.78 | 11.78 | 9.95 | 9.95
2 | 16.88 | 47.97 | 5.36 | 17.14 | 3.68 | 13.63
3 | 9.00 | 56.96 | 4.38 | 21.52 | 5.58 | 19.21
4 | 5.64 | 62.61 | 5.03 | 26.56 | 1.87 | 21.08

Figure 6. Captured variance per latent variable and per variable in the response for the commercial product case study.

Figure 7. Third and fourth dual score space for lots of commercial product included in the model.

The objective function for this case study was built to maintain the mean value of dissolution at 30 min on target, while keeping the standard deviation below a threshold. After building the model and the optimization function, the following activity was to verify the ability of the system to provide lot selections that would drive the dissolution to its target. This verification exercise was carried out by designing lots of product whose target mean dissolution value at 30 min was purposefully driven to the high, middle, and low ranges of the acceptance range given in the specification. Each of the designed lots would have specific choices of materials from the inventory in order to achieve its target dissolution. Five lots were designed, out of which three were executed. Not surprisingly, the lots designed to give high, medium, and low values of dissolution fell at the extremes of the t3−t4 score space (Figure 8). Four lots were designed to cover the high and low values of dissolution: two lots were designed attempting to minimize/maximize the dissolution profile within acceptance (marked as HiY and LowY in Figure 8), and two lots were designed attempting to minimize/maximize the value of t4 (points marked as Lowt4 and Hit4 in Figure 8), since it was observed that this latent variable had the highest leverage on this quality attribute, while keeping dissolution within the acceptance range. Three of these five lots (one targeting a low dissolution, one at the center point, and one targeting a high dissolution) were executed at commercial scale.

Figure 8. Newly designed lots for low, medium, and high dissolution in the third and fourth dual score space.

The results (Table 5) confirmed the ability of the framework to control the dissolution of the product to a specified value and reassured the operations personnel of the stability of the system. The optimization framework was put in place to select lots of raw material from the inventory at the end of 2009, and throughout 2010 there was a measurable improvement in the variability of the dissolution of the product (as illustrated in Figure 9); the system remained operational as of the summer of 2011, when this manuscript was prepared.

Table 5. Target and Obtained Dissolution Values from Experimental Verification of the Optimization Framework for Case Study 2
target: 30 min avg | target: 180 min avg | obtained (commercial scale): 30 min avg | obtained (commercial scale): 180 min avg
63.90 | 97.46 | 63.50 | 97.67
57.84 | 95.20 | 55.00 | 96.33
51.60 | 94.73 | 51.17 | 93.04

Figure 9. Mean dissolution at 30 min before and after the optimization system was put in place for the commercial product.
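For this commercial case, the within-material degrees of freedom and the inventory bookkeeping of eq 10 can be sketched for a single ingredient as a small Pyomo model; the lot names, masses, and the linear per-lot dissolution surrogate below are invented, and the real objective also kept the standard deviation of dissolution below its threshold.

import pyomo.environ as pyo

kg_req = 120.0                                              # mass of this ingredient fixed by recipe
kg_available = {"lotA": 80.0, "lotB": 60.0, "lotC": 100.0}  # inventory on hand, invented
diss_effect = {"lotA": 58.0, "lotB": 66.0, "lotC": 52.0}    # per-lot dissolution surrogate, invented
target = 60.0                                               # 30 min dissolution target

m = pyo.ConcreteModel()
m.lots = pyo.Set(initialize=list(kg_available))
m.r = pyo.Var(m.lots, bounds=(0.0, 1.0))                    # within-material ratios

m.sum_to_one = pyo.Constraint(expr=sum(m.r[l] for l in m.lots) == 1.0)                          # eq 10, item iii
m.inventory = pyo.Constraint(m.lots, rule=lambda mdl, l: mdl.r[l] * kg_req <= kg_available[l])  # eq 10, items i-ii

predicted = sum(m.r[l] * diss_effect[l] for l in m.lots)    # blended 30 min dissolution
m.obj = pyo.Objective(expr=(predicted - target) ** 2, sense=pyo.minimize)
# pyo.SolverFactory("ipopt").solve(m)    # with the binaries omitted this reduces to an NLP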




7. CONCLUSIONS
Contextually correct historical data is a critical asset that a corporation can take advantage of to expedite assertive decisions. Herein we present a mathematical framework to support the decision-making process of selecting materials for the manufacture (or design) of a (new) product, making the most efficient use of the historical data available. Given the complexity in size and the ill-conditioned nature of the massive data sets that arise from these scenarios, we believe it is important to make use of data analysis methods that are interpretable and capable of reducing the system to its simplest significant form. Latent variable regression methods provide this unique capability of decoupling the orthogonal sources of variability affecting a given metric, while providing a much simpler set of interpretable parameters that empower the practitioner to understand and troubleshoot complex systems in the most efficient manner.

Building a predictive model can indeed be useful in understanding a system (depending on the modeling technique). However, the exercise of embedding such a model into an optimization framework forces the practitioner to formulate the quantitative objective functions and corollary constraints necessary to solve the problem at hand. This thinking exercise proved to provide clarity on the critical metrics and how they impact the business, justifying the effort invested in developing and deploying these solutions.

With regard to the performance of the framework presented in this work, the quality of the solutions found by the optimizer is a function of the accuracy of the underlying models as well as a strong function of the pool of available materials. In the case of the design of a new product, the quality targets were met with no difficulty due to the large pool of potential materials available and the relatively large decision space given by the freedom in the mixture ratios. For the second case presented, the commercial product, the obtained lot-to-lot variability is by no means as small as it could be; it is only as small as the materials available in inventory at the time of manufacturing each lot allow.

Finally, this work demonstrates how the use of modern MINLP methods for optimization, in combination with an LVRM and a proper mathematical formulation of the material selection process, enables the practitioner to find a good solution to a combinatorial problem where the number of candidate solutions is much too large to be searched by hand.




AUTHOR INFORMATION
Corresponding Author
*E-mail: salvador.garcia-munoz@pfizer.com. Tel.: +1 (860) 715-0578.
Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS
The authors recognize that constructing these systems from collected data required a monumental effort by a large team; we want to thank and acknowledge the team for their hard work and dedication in collecting and analyzing the data used in these case studies: Paul Goulding and Arulsuthan Balasundaram (Pfizer Worldwide R&D, Sandwich, Kent, UK); Ivelisse Colon-Rivera, Avinash Thombre, and Julian Lo (Pfizer Worldwide R&D, Groton, CT, USA); Denise Rivkees (Pfizer Global Technical Services, Morristown, NJ, USA); Israel Cotto, Victor Ruiz, and Denisse Sanchez (Pfizer Global Supply, Vega Baja, Puerto Rico); and Leah Appel, Josh Shockey, and Matt Shaffer (Greenridge Consulting, Bend, OR, USA).

REFERENCES

(1) Iacocca, R. G.; Burcham, C. L.; Hilden, L. R. Particle engineering: A strategy for establishing drug substance physical property specifications during small molecule development. J. Pharm. Sci. 2010, 99 (1), 51−75. (2) Mullarney, M. P.; Hancock, B. C. Improving the prediction of exceptionally poor tableting performance: An investigation into Hiestand’s ″special case″. J. Pharm. Sci. 2004, 93 (8), 2017−2021. (3) Sun, C. C.; Hou, H.; Gao, P.; Ma, C.; Medina, C.; Alvarez, F. J. Development of a high drug load tablet formulation based on assessment of powder manufacturability: Moving towards quality by design. J. Pharm. Sci. 2009, 98 (1), 239−247. (4) Hamad, M. L.; Bowman, K.; Smith, N.; Sheng, X.; Morris, K. R. Multi-scale pharmaceutical process understanding: From particle to powder to dosage form. Chem. Eng. Sci. 2010, 65 (21), 5625−5638. (5) Waterman, K.; Carella, A.; Gumkowski, M.; Lukulay, P.; MacDonald, B.; Roy, M.; Shamblin, S. Improved Protocol and Data Analysis for Accelerated Shelf-Life Estimation of Solid Dosage Forms. Pharm. Res. 2007, 24 (4), 780−790. (6) Waterman, K. C.; MacDonald, B. C. Package selection for moisture protection for solid, oral drug products. J. Pharm. Sci. 2010, 99 (11), 4437−4452. (7) Food and Drug Administration ICH Q8(R2) Pharmaceutical Development. Federal Register, 2009; Vol. 71 (98). (8) Stephanopoulos, G.; Reklaitis, G. V. Process systems engineering: From Solvay to modern bio- and nanotechnology.: A history of development, successes and prospects for the future. Chem. Eng. Sci. 2011, 66 (19), 4272−4306. (9) Westerberg, A. W. A retrospective on design and process synthesis. Comput. Chem. Eng. 2004, 28 (4), 447−458. (10) Barnicki, S. D.; Siirola, J. J. Process synthesis prospective. Comput. Chem. Eng. 2004, 28 (4), 441−446. (11) Cervera-Padrell, A. E.; Skovby, T.; Kiil, S. +.; Gani, R.; Gernaey, K. V. Active pharmaceutical ingredient (API) production involving continuous processes−A PSE-assisted design framework. Eur. J. Pharm. Biopharm. 2012, 82, 437−456. (12) Burnham, A.; MacGregor, J. F.; Viveros, R. Latent variable multivariate regression modeling. Chemom. Intell. Lab. Syst. 1999, 48, 167−180. (13) Yacoub, F.; MacGregor, J. F. Analysis and optimization of a polyurethane reaction injection molding (RIM) process using multivariate projection methods. Chemom. Intell. Lab. Syst. 2003, 65, 17−33. (14) Yacoub, F.; MacGregor, J. F. Product optimization and control in the latent variable space of nonlinear PLS models. Chemom. Intell. Lab. Syst. 2004, 70, 63−74. (15) Garcia-Munoz, S.; MacGregor, J. F.; Kourti, T.; Apruzzece, F.; Champagne, M. Optimization of Batch Operating Policies. Part I. Handling Multiple Solutions. Ind. Eng. Chem. Res. 2006, 45, 7856− 7866. (16) Muteki, K.; MacGregor, J. F.; Ueda, T. Rapid Development of New Polymer Blends: The Optimal Selection of Materials and Blend Ratios. Ind. Eng. Chem. Res. 2006, 45 (13), 4653−4660. (17) Garcia-Munoz, S.; MacGregor, J. F.; Neogi, D.; Latshaw, B.; Metha, S. Optimization of Batch Operating Policies. Part II. Incorporating Process Constraints and Industrial Applications. Ind. Eng. Chem. Res. 2008, 47, 4202−4208. (18) Garcia-Munoz, S.; Dolph, S.; Ward, H. W., II Handling uncertainty in the establishment of a design space for the manufacture of a pharmaceutical product. Comput. Chem. Eng. 2010, 34 (7), 1098− 1107. (19) Tomba, E.; Barolo, M.; Garcia, S. General Framework for Latent Variable Model Inversion for the Design and Manufacturing of New Products. Ind. Eng. Chem. Res. 
2012, 51 (39), 12886−12900. (20) Muteki, K.; MacGregor, J. F. Optimal purchasing of raw materials: A data-driven approach. AIChE J. 2008, 54 (6), 1554−1559. (21) Takayama, K.; Fujikawa, M.; Obata, Y.; Morishita, M. Neural network based optimization of drug formulations. Adv. Drug Delivery Rev. 2003, 55 (9), 1217−1231.



(22) Shao, Q.; Rowe, R. C.; York, P. Comparison of neurofuzzy logic and decision trees in discovering knowledge from experimental data of an immediate release tablet formulation. Eur. J. Pharm. Sci. 2007, 31, 129−136. (23) Shao, Q.; Rowe, R. C.; York, P. Investigation of an artificial intelligence technology - Model trees: Novel applications for an immediate release tablet formulation database. Eur. J. Pharm. Sci. 2007, 31, 137−144. (24) Muteki, K.; MacGregor, J. F.; Ueda, T. Mixture designs and models for the simultaneous selection of ingredients and their ratios. Chemom. Intell. Lab. Syst. 2007, 86 (1), 17−25. (25) Garcia-Munoz, S.; Polizzi, M. WSPLS: A new approach towards mixture modeling and accelerated product development. Chemom. Intell. Lab. Syst. 2012, 114, 116−121. (26) Polizzi, M. A.; Garcia-Munoz, S. A framework for in-silico formulation design using multivariate latent variable regression methods. Int. J. Pharm. 2011, 418, 235−242. (27) Kettaneh-Wold, N. Analysis of mixture data with partial least squares. Chemom. Intell. Lab. Syst. 1992, 14 (1−3), 57−69. (28) Muteki, K.; MacGregor, J. F. Multi-block PLS modeling for L-shape data structures with applications to mixture modeling. Chemom. Intell. Lab. Syst. 2007, 85, 186−194. (29) Westerhuis, J.; Kourti, T.; MacGregor, J. F. Analysis of Multiblock and Hierarchical PCA and PLS Models. J. Chemom. 1998, 12, 301−321. (30) Hoskuldsson, A. PLS Regression Methods. J. Chemom. 1988, 2, 211−228. (31) Nomikos, P.; MacGregor, J. F. Multivariate SPC Charts for Monitoring Batch Processes. Technometrics 1995, 37 (1), 41−58. (32) Vemavarapu, C.; Surapaneni, M.; Hussain, M.; Badawy, S. Role of drug substance material properties in the processibility and performance of a wet granulated product. Int. J. Pharm. 2009, 374 (1−2), 96−105. (33) Eriksson, L.; Johansson, E.; Wikstrom, C. Mixture design–design generation, PLS analysis, and model usage. Chemom. Intell. Lab. Syst. 1998, 43 (1−2), 1−24. (34) Faure, A.; Grimsey, I. M.; Rowe, R. C.; York, P.; Cliff, M. J. Applicability of a scale-up methodology for wet granulation processes in Collette Gral high shear mixer-granulators. Eur. J. Pharm. Sci. 1999, 8 (2), 85−93. (35) Hapgood, K. P.; Litster, J. D.; White, E. T.; Mort, P. R.; Jones, D. G. Dimensionless spray flux in wet granulation: Monte-Carlo simulations and experimental validation. Powder Technol. 2004, 141 (1−2), 20−30.
