Ind. Eng. Chem. Res. 2008, 47, 1967-1974
1967
Process Modeling Based on Process Similarity Junde Lu and Furong Gao* Department of Chemical Engineering, The Hong Kong UniVersity of Science & Technology, Clearwater Bay, Kowloon, Hong Kong
In the processing industries, operating conditions change to meet the requirements of the market and customers. For example, in injection molding, the same machine may be used with different molds and materials to make different parts. For each different mold, material, and machine combination, data-based process modeling must be repeated for the development of a new prediction model. This may involve the repetition of a large number of experiments, if common process characteristics are ignored. Obviously, this is inefficient and uneconomical. Although operating conditions may be different for different processes, certain process behaviors and characteristics are common under these conditions. The effective use and extraction of these common process behaviors and characteristics can allow fewer experiments for the development of new process model, resulting in a savings of time, cost, and effort. With this as the key objective, a modeling method is proposed for process modeling. This includes information extraction from a base model, a design of experiments (DOE) for the new process, assessment of difference, model migration strategy, and verification for the new model. As an illustrative example, this paper demonstrates model development for new molds to predict the width of an injection-molded part, taking advantage of an existing model. 1. Introduction Industrial batch processes, such as injection molding, semiconductor processing, pharmacy, and most bioprocesses, become the increasingly preferred choice in industry to produce highvalue-added products to meet the rapidly changing market. This circumstance leads to a boom of quality-related activities such as the Six Sigma (6σ) campaign and transition in a production system. The production of better-quality products is based on a reliable and prompt prediction of product quality. Generally, product quality is measured off-line, and quality measurement is a costly, cumbersome, and time-consuming practice. Therefore, it is important to develop a quality prediction method that allows us to “measure” product quality in a fast, economical, efficient, and accurate way. Quality prediction is often based on a model built between process conditions and quality variables. In other words, it is a type of regression problem in nature, with the objective of finding a relationship between a set of X-variables (process conditions) and one or several response variables Y (quality properties). Generally, quality prediction models can be built mechanistically and empirically. A mechanistic model, which is built on the first-principle or prior process knowledge, has a better extension than an empirical model. However, it is difficult to develop such a model, because of process high dimensionality, complexity, and process variation, and also because of limited product-to-market time. In an industrial process, operating conditions frequently change to meet the requirements of the market or customer. In other words, different specifications of product from the same family are produced by changing the recipe from the first (or old) process to a second (or new) process. These processes might be of different size, process condition and configuration, or with different equipment, although their intrinsic physical principles are the same. To predict the product quality rapidly and online, it is requisite to develop a quality model whose objective is to find a relationship between the process conditions and the * To whom correspondence should be addressed. Tel.: +852-23587139. Fax: +852-2358-0054. E-mail:
[email protected].
response variable(s). Traditional data-driven approaches, including artificial neural networks (ANNs), fuzzy logic model (FLM), partial least squares (PLS), and support vector machines (SVMs), require a large number of experimental data to build these models.1-4 A practical problem is how to apply/modify the existing black-box model to the new process when the process conditions have been changed. For clarity, we define the existing model describing the old process as the “base model” and the model to be developed for the new process as the “new model”. Changes in the process conditions and environment may make the existing base model invalid for new quality prediction. A possible solution to this model problem is to repeat the experiments, measure the samples again, and rebuild a new model for the new process. To obtain the similar model prediction precision as the base model, a similar large number of data are required for the development. Obviously, this is inefficient, time-consuming, and uneconomical, and this is particularly problematic for processes with expensive material and/or a long cycle. As we know, despite the fact that the operating conditions may be different for different processes, certain process behaviors and characteristics remain common under these different conditions. For example, in injection molding, where plastic granules are processed into different specifications of products with different molds, process conditions, such as barrel temperature, packing pressure, and injection velocity, have similar impacts on the molded part qualities. In more-general cases of making products from the same family, these similar processes might be of different size and configuration; they are all based on the similar physical principles and follow a certain similar process behavior. This characteristic, to a degree, should have been already built in the data-driven model, i.e., the base model. Obviously, extraction and utilization of these common process behaviors and characteristics or process similarities, and taking advantage of the existing base model, can allow fewer data to be required to develop a new process model, as shown in Figure 1. Although there is a tremendous amount of literature available for building process models for quality prediction and optimiza-
10.1021/ie0704851 CCC: $40.75 © 2008 American Chemical Society Published on Web 02/27/2008
1968
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008
Figure 1. Development of the new model.
tion, to the best knowledge of the authors, there is little work published on the data-based model migration for industrial processes with the objective of reducing the number of experimental data. In these very limited literatures that exist on the subject, knowledge of the first-principles or process mechanism is required. For example, hybrid models, which combine prior knowledge or first-principles with traditional datadriven methods, have been proposed for chemical processes.5 Jaeckle et al.6,7 utilized historical process data to find a window of process operating conditions within which a product with a new specified quality can be produced. Similarly, the same concepts are developed using latent variable modeling and joint-Y PLS with historical process data on both sites.8,9 However, the objective of these papers is to determine the new process conditions for a new plant, rather than building a new model. Multivariate calibration model is used to extract chemical information from spectroscopic signals. The changes in spectral variations, which are due to different instrument characteristics or environmental factors, may render the original calibration model invalid for prediction in the new system. Feundale et al.10 reviewed various methods for calibration model transfer to avoid time-consuming full recalibration by taking advantage of the existing model. Most of calibration models are linear and one-dimensional; such methods are not suitable for process modeling. If two processes are of similar complexity, it is reasonable to speculate that a similar number of data is required to build separate models of each. However, with an existing data-based model, if common process information is efficiently explored to design new experiments on new process, one of the following can be expected: (1) Fewer data are required to model the new process than that which is required for modeling the old process, if the two models are developed with similar prediction precision, or (2) A new model can be obtained, with higher accuracy than that of the base model, if using equal or almost equal numbers of data. In this paper, the purposes are (a) to bring attention and enhance the interest of the community for this important modeling work of industrial processes, (b) to state the problem and its challenges, and (c) to give an illustrative solution to a simple problem case. With such a problem, we define it as “Process Modeling Based on Process Similarity” (or PMBPS). This work can be divided into the following steps: (a) information extraction from the base model, (b) design of experiment, (c) assessment of the similarities or differences between the new process and the base model, and (d) model migration and verification of new model, as shown in Figure 2.
Figure 2. Overall framework for developing the new model, based on process similarity.
The remainder of the paper is so organized. Section 2 presents the key requirements and challenges of each major step in the PMBPS. Section 3 outlines several possible cases of difference between the base model and the new process. Section 4 presents exemplary solutions to a simple difference case where input and output linear correction could be used for the model migration. Section 5 gives an illustrative application. Finally, some conclusive remarks are given in Section 6. 2. Approach Overview The key requirements and challenges for PMBPS can be summarized as follows. 2.1. Information Extraction from the Base Model. Despite the fact that it has been developed with input-output data alone, the base model, as noted in the Introduction, should contain certain key process characteristics. Information could be extracted to assist the understanding of the old process and to guide the experimental design for the new process. For example, with a base model, we could determine which process condition (or factor) has a more-significant influence on the response variable; we also could determine if the effect on the response variable is linear or nonlinear, and, if it is nonlinear, what type of nonlinearity and monotonicity it is. Information such as this will be helpful for planning the experiments for the new process. 2.2. Design of the Experiment. Traditionally, training data for modeling is collected through a statistical design of experiment (DOE). Generally, without any process information, an experimental program begins with a screening design that considers only two levels for each factor, followed by an
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008 1969
augmented experiment to develop a more-accurate model.11 Conventional DOE approaches require the training data to be evenly distributed in input space and treat each possible DOE point equivalently, because no prior process knowledge is available. Obviously, this type of design is not an optimal design for this case. With the availability of a base model, we can plan experimental design for the new process by taking advantage of the existence of a base model. This design may not be an optimal for the new process, but it must be a good starting design, because these two processes are assumed to have good similarity. The key challenge of this step is how to develop and identify a guided new DOE method from the base model to discover the two processes’ differences and similarities and to find quickly the features of the new process. 2.3. Assessment of Differences. Ideally, the base model would have been built for one of the processes of interest, and any subsequent models of similar processes would be composed of the original base model plus a simple difference model. However, in practice, the nature of the differences between the processes is not known a priori. Consequently, to obtain a new model, difference information between the processes must be assessed. Furthermore, the objective of this paper is to reduce the number of experiments required to model a new process. The degree of similarity between two processes will have significant impact on the number of experiments that are required: The higher the degree of process similarity, the more experiments can be saved. To the extreme case, when complexity of the difference exceeds that of the process itself, i.e., the process similarity is very small, the proposed approach should not be used. The assessment of the process difference is equivalent to that of the process similarity. Because the knowledge of the new process is limited, the only thing that we know is that the new process is similar to an old process for which we have a good enough model (the base model). In regard to the questions of how similar these two processes are and where and how they are similar, they can be only answered experimentally. The new experiments must be planned carefully. A sufficient number of experiments must be conducted to assess sufficiently information concerning the process difference before a new model can be developed satisfactorily. However, too many experiments result in superfluous work, which defeats the objective of this work. The number of experiments to be planned must be dependent on the complexity of process difference and type of model migration strategies to be adopted. Based on the aforementioned considerations and analysis, it is not possible to give a complete or detailed assessment once for all. For the assessment of the process differences, we suggest that one follow these steps: (i) tentatively formulate hypotheses about the differences; (ii) perform experiments to investigate and verify these hypotheses; and (iii) based on the results, formulate new hypotheses, and so on. This suggests that the process differences assessment is a sequential and iterative learning process, as shown in Figure 2. 2.4. Model Migration and Verification. Having obtained a set of data for the new process, model updating and/or migration strategies should be developed to migrate the base model to the new process, based on assessed process differences. This procedure should (i) retain characteristics that are common to the old process, (ii) remove information that is no longer valid for the new process, and (iii) add, modify, or update the new information evolved in the new process. With the base model, there are two possibilities. One possibility is that the base model is developed by the current user, where, then, the structure of
the base model is known to the user. This is the case where the base model is completely open to the user. In this case, certain model evolution strategies such as neural networks or fuzzy methods may be used to develop the new model.12,13 However, the neural or fuzzy approach required a large number of training data. The other case is more general, where a black-box model (for example, that observed with a commercial software package) is given to the user, so that both the base model structure and the parameters are not known. This black-box model maps only the input-output relationship. This is a most challenge case, which is the one upon which we focus in this paper. Therefore, we develop a method here for model migration, regardless of how the base model has been built. Of course, this poses the biggest challenge for the development of new model. The design of experiment and model migration should be coupled together, to yield a new model with satisfactory prediction accuracy, as illustrated in Figure 2. 3. Possible Cases of Differences In reality, there exist many causes that render the original model invalid. There are many different ways of assessing the differences between the original model and the new process. In this first attempt, we divide the differences in terms of differences complexity. 3.1. Simple DifferencesShift and Scale Difference. The simplest process difference is the new process with just a shift and scale to the old process. The shift and scale difference occurs not only in the output(s), but also in the input(s). The traditional scale-up analysis in the dimensionless group may be helpful to this class of problem; however, it has been limited mostly to the first-principle model. This paper presents a possible solution to this class of problems. 3.2. The Difference is Less Complex than the Process Itself. Because of the changes in process dynamic behavior, the new process cannot be simply described by a shift and scale of the old process. However, the process difference exhibits a relatively simple trend, and this can be corrected by a correction model. 3.3. The Complex Difference. In certain cases, the complexity of the differences between two processes is more complex than that of the process itself. This is a truly challenge case. In this case, it may be possible that it is easier to build a new model rather than focus on model migration. The focus of this paper is on processes with certain similarities. Therefore, this case is not within the focus of the research. 4. Example Solution to a Simple Case With the simplest case, the new process is simply a shift and re-scale of the old process; a systemic procedure has been developed to demonstrate the concept of model migration for reducing the number of experiments required to model the new process. Based on the requirements and challenges discussed in section 2, the overall framework is modified and a possible solution is given for each step, as shown in Figure 3. It begins with the given base model, followed by a DOE step that is guided by the information extracted from the base model. Having collected new data on the new process, experimental data analysis is conducted to determine the type of process differences. As stated previously, the step of process difference assessment is a sequential and iterative learning process and follows the procedure of formulation and verification hypotheses; therefore, this step is changed to formulate and verify the
1970
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008
Figure 4. Unevenly distributed response.
difference between the average yi value of all observations in the experiment at the high (+) level of A and the average yi value of all observations in the experiment at the low (-) level of A. This difference is called the main effect of A, as shown in eq 1. Generally, A+ and A- are used to represent the high and low levels of A, respectively:14
ME(A) ) yj(A+) - yj(A-)
Figure 3. Flowchart of the solution to a simple case.
hypotheses, regardless of whether the new process is a shift and re-scale of the old process when the simple case is the process of interest. If the verification results mean that the process difference is almost a shift and re-scale case, an output slope/bias correction (OSBC) is conducted on the base model to develop the new model. If the performance of the new model is not good enough, a more-sophisticated procedure, which is known as the input-output slope/bias correction (IOSBC), can then be applied to the base model. The case where the new process is not a shift and re-scale of the old process will be discussed in a future paper. Finally, the performance of the new model is evaluated using the new process data. If the new model prediction criterion is not satisfied, the procedure will go back to the DOE step and more experiments are expected to be conducted. The following sections detail the description for each step. 4.1. Process Information Extraction. The base model is a good representation of the old process, and it consists of much process information. This information can be used to guide DOE design for the new process. In a particular process, different factors (inputs) have different impacts on the response variable (output). To determine how much does the response variable changes when a factor is changed, the main effect of each factor is introduced. The main effect of a factor is defined as the change in response produced by a change in the level of the factor. We use the generic notation yi to denote the observation for the ith experimental runs. For example, to measure the average effect of a factor, for example, A, compute the
(1)
where ME(A) represents the main effect of factor A, yj(A+) is the average of the yi values observed at the high level of A, and yj(A-) is similarly defined. Having analyzed the main effect of each factor, more factor levels should be specified on these factors with a significant main effect. New samples designed by such a strategy will have good representation of the new process and can improve significantly the model prediction precision, as we can focus on the important factors and take less consideration for less important factors to reduce the number of experiments. 4.2. Design of Experiment. As stated previously, a good experimentation should be on a sequential or hierarchical basis, so that, in the beginning, some sparse experiments should be conducted to capture the key features of the new process. These sparse experiments should be conducted at influential and critical points which can efficiently reflect the new process’ behavior. Conventional DOE approaches require the experiments under evenly distributed input conditions, and each possible DOE point is treated equivalently, as no prior process knowledge is available. In fact, due to the nonlinearity of input-output relationship, output is unevenly distributed even with evenly distributed inputs. Similar to the theory of robust parameter design,14 more experiments are needed to perform in the region where the output is more sensitive to a change of inputs. To facilitate a more formal discussion, denote the input factor by x, and the relationship between x and the output response y is given by
y ) f(x)
(2)
Obviously, as shown in Figure 4, if function f has a nonlinear effect, evenly distributed factor levels will lead to unevenly distributed y value through f. The factor at low levels will has significant effect on response value. To better describe the change of response and improve model prediction ability, more experiments should be performed at region where input factors have a significant effect on output or at region with high data density if we consider input and output data together. The cluster estimation method, developed by Chiu,15 can group a large data set into several cluster centers. Considering
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008 1971
the main effect of each factor and nonlinear characteristic of output space together, the cluster estimation method is a good choice to design new experiments for the new process, as it can generate more cluster centers to describe the region with high data density and each cluster center is a good representative of a local region. Before we conduct the cluster estimation method, sufficient data set can be readily generated at almost no experimental cost from the base model to describe the old process behavior. Generation of the N datapoints starts with discretization of the input space by gridding X with equidistant lines on the base model, and these input data combined with corresponding y-value to form N input-output data pairs (Z ) [XT; y]), where X denote the inputs, y denotes the output, and Z is the augmented data vector. The distance between data points should be similar to the resolution of corresponding physical variables. By applying the cluster estimation method on the generated dataset, the obtained cluster centers are critical and influential data points that can efficiently reflect the behavior of the old process. This optimal design on the old process then can be migrated to the new process by conducting experiments at each center to capture the new process behavior. The clustering algorithm considers each data point Zi as a potential cluster center, and a measure of its potential is defined as15
(
N
Pi )
exp ∑ j)1
4 2
)
|zi - zj|2
r
(for i ) 1, ..., N)
(3)
where Pi represents the potential of the ith data point, N denotes the number of data points surrounding the potential center, and r is the effective radius of a cluster center, defining a neighborhood; data points outside this radius have little influence on the potential. Thus, the measure of potential for a candidate data point is a function of its distances to all other data points. The closer a data point is to a candidate data point, the more it contributes to the potential. The traditional subtractive clustering uses a uniform radius for each dimension of input-output space, which makes effective ranges of the cluster center on each dimension equal. In fact, the input may have a different impact on the output, as dictated by its main effect calculated from base model. To better describe such input impact on output, a nonuniform effective radius that is based for measure of the potential of data point Zi is proposed: N
Pi )
N
)
(
exp ∑ j)1
[(
exp ∑ j)1
)
4|zi - zj|2 r2
4|zi,1 - zj,1|2
4|zi,2 - zj,2|2
+
r12
‚‚‚
N
)
∑ j)1
(
exp -
m+1
4|zi,k - zj,k|
k)1
rk2
∑
+
r22
)
)]
4|zi,m+1 - zj,m+1|2 rm+12
in Chiu’s paper.15 Estimation of input impact can be done by calculating the main effect of the input variable.14 By properly setting the effective radius of each dimension, the number and location of cluster centers can be found. Obviously, the region where input variables have significant impact on the output will have a greater density of data points and more cluster centers are expected to be found. Each cluster center is, in essence, a prototypical data point that exemplifies a characteristic behavior of the old process. Naturally, new experiments concluded that such cluster centers may potentially describe the key features of the new process best. 4.3. Formulation and Verification of Difference Hypotheses. The assessment of difference is a process of hypotheses formulation and checking, as stated previously. For the case of this paper, we assume that the new process is a shift and rescale of the old process; therefore, the assessment of difference is to determine if there is a linear relationship between predictions (xj) from the base model and observed data from the new process (yj), using linear regression. The degree of linearity can then be checked using the correlation coefficient (R2), which is defined as m
R2 ) 1 -
(yˆ j - yj)2 ∑ j)1 (5)
m
∑ j)1
(yˆ j - yj)
2
where yj is the observed data from the new process, yˆ j the estimation of the observed data from a fitted straight line, yj the sample mean of the observed data, and m is the total number of data sets. If the R2 value is larger than a specified threshold, this confirms that the new process is simply a shift and re-scale to the old process, and a new model can be obtained using OSBC or IOSBC to the base model, as stated later in this paper. However, if the R2 value is smaller than the threshold, this means simple slope/bias correction method is not sufficient to correct the complex difference between two processes. In this case, more-sophisticated methods must be applied. The choice of the threshold for the R2 value is based on the prediction error of the base model and the measurement noise of data collected from the process. The recommended threshold for R2 is defined by one minus the summation of the prediction error of the base model and the measurement noise. 4.4. Slope/Bias Correction (SBC). If the results of the assessment of differences indicate that the new process is a shift and re-scale of the old process, then the new model can be obtained through a slope/bias correction (SBC) to the base model. This correction involves input and output correction. Given the base model of any form,
Ybase ) f(Xbase)
(6)
2
(4)
where k denotes the dimension of the input-output space and the term m + 1 denotes the output space. The selection of effective radii is based on the input impact on output: the larger the impact, the smaller the radius. Thus, the influential space of each center is changed from a hypersphere to a hyperellipsoid. Detailed literature on finding the cluster centers can be found
where Xbase and Ybase are the inputs and output of base model, respectively, and f is any nonlinear function used to describe the old process. If there is only a shift and scale in output space, the new model is obtained through OSBC correction to the base model:
Ynew ) SOYbase + BO ) SO f(Xbase) + BO
(7)
where SO and BO represent re-scale and shift of the old process
1972
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008
Figure 5. Slope/bias correction (SBC).
in output space, respectively. Estimation of the aforementioned parameters can be obtained through ordinary least-squares methods. More generally, shift and re-scale occur in not only output but also input space, thus, inputs of new process Xnew should be transformed to those of the old process (Xbase) by input slope and bias correction (ISBC), as shown in eq 8:
Xbase ) SIXnew + BI
Figure 6. Mold I with flat cavity.
(8)
where SI and BI denote the re-scaling and shifting of the old process in input space, respectively. The new model then is obtained through IOSBC correction to the base model, as shown in Figure 5:
Ynew ) SO f(SIXnew + BI) + BO
(9)
Estimation of the aforementioned parameters can be obtained by optimizing the following equations, together with the training data from the new process:
Min J(SO, BO, SI, BI) ) T
(10)
s.t. i ) yi - [SO f(SIXnew,i + BI) + BO] where yi is the ith observed data on the new process and denotes the predicted error between the observed values on the new process and the predicted values from the base model. We should note that all identified shift and re-scale parameters are vectors determined by the dimensions of input and output space. Generally, OSBC will work well when the differences between two similar processes are simple and consistent in the entire operating region of interest. Because this method is a univariate approach, and only two parameters need to be identified, the time and effort needed to build such a new model are significantly small. However, when the process differences are more complex and differ from sample to sample, procedures such as IOSBC or other approaches may be needed to obtain acceptable prediction results. When we apply IOSBC on the base model, not all factors or inputs are transformed, but only the factors with a significant main effect are considered, which can save new experiments to develop the new model, because the number of new experiments is dependent on the number of parameters to be identified in IOSBC. We should note that factors with low main effects but significant quadratic or other nonlinear impacts on response cannot be ignored. 5. Application The performance of the proposed methodology was evaluated on injection molding with different molds. Injection molding, which is an important polymer processing technique, makes different shapes and types of products with different polymer materials and molds. The product qualities are the results of different combinations of material properties, mold and part geometries, and processing conditions. These qualities can be
Figure 7. Mold II with fan gate. Table 1. Factors and Levels for Full Factorial Design Value factor
level 1
level 2
level 3
packing pressure, PP (bar) injection velocity, IV (mm/s) barrel temperature, BT (°C)
150 8 180
300 24 200
450 40 220
roughly divided into dimensional property, surface property and mechanical or optical property. For simplicity, dimensional properties, such as the width of the injection-molded part, are selected as the product quality in this paper. To illustrate the proposed method, the experiments were performed on an 88-ton Chen Hsong reciprocating-screw injection molding machine (Model No. JM88MKIII) with two different molds. The processes with mold types I and II are considered to be the old process and the new process, respectively. The two mold geometries are shown in Figures 6 and 7, respectively. To focus on the mold influence on product quality, the same material (high-density polyethylene, HDPE) is used in these two processes. In the analysis, three process variables, including the packing pressure, injection velocity, and barrel temperature are chosen to describe the width of the part, because they have a relative significantal effect on part dimension. To build the base model, a full factorial design with 27 experimental runs is conducted, with each factor having three levels, as shown in Table 1. An analysis of variance (ANOVA)
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008 1973 Table 2. Experiments on Old and New Process packing pressure, PP (bar)
injection velocity, IV (mm/s)
barrel temperature, BT (°C)
experiment
old process
new process
old process
new process
old process
new process
1 2 3 4 5 6 7 8 9
293 415 187 150 375 222 375 337 259
395 477 325 300 450 348 450 423 373
24 24 24 10 38 38 10 10 38
32 32 32 25 39 39 25 25 39
200 200 200 185 215 220 180 220 180
210 210 210 203 218 220 200 220 200
is composed of experimental data and a quadratic polynomial regression model is built as the base model with the relative root-mean-square error (R-RMSE) of 0.1403%. R-RMSE is defined as
R-RMSE )
x
N
∑ i)1
( ) Yi - Yˆ i
2
Yi
N
experiment
base model (mm)
corrected model with OSBC (mm)
corrected model with IOSBC (mm)
actual value (mm)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
116.74 116.69 116.64 117.06 117.57 117.47 116.58 116.53 117.00 116.90 117.41 117.36 117.31 116.42 116.37 116.32 116.79 116.74 117.26 117.20
97.42 97.37 97.32 97.74 98.26 98.15 97.26 97.21 97.68 97.58 98.10 98.05 98.00 97.10 97.05 97.00 97.50 97.42 97.94 97.89
97.42 97.35 97.29 97.73 98.30 98.17 97.28 97.21 97.72 97.58 98.16 98.09 98.02 97.13 97.07 97.00 97.51 97.44 98.01 97.95
97.46 97.29 97.08 97.41 98.35 98.22 97.36 97.15 97.74 97.49 98.28 98.18 98.11 97.32 97.01 96.94 97.71 97.62 98.23 98.10
(11)
where Yi is the observed data from the old process, Yˆ i the estimated value from the base model, and N the number of data points. Assuming that we have no idea of what the base model structure should be, with the exception of an input-output relationship, a set of data are generated from the base model to calculate the main effect of each factor according to eq 1. Calculated results show that the packing pressure has a significant impact on the quality of the part, and the injection velocity and barrel temperature have relatively less significant but similar impact on the quality of the part. Furthermore, there are eight possible parameters to be identified with IOSBC correction of three input factors and one output. Consequently, the three effective radii for clustering estimation are selected, as the following equation shows:
[r1, r2, r3] ) [0.35, 0.9, 0.8]
Table 3. Comparison of the Predicted Values from the Base Model and Corrected Models
(12)
A set of generated data points calculated from the base model then are clustered to represent the experimental conditions for the new process. Clustering results indicate that nine experiments should be conducted on the new process in which mold type II is applied. These new experiments cannot be directly conducted on the new process before the initial transformation of inputs, because each process variable has different operating ranges, which is due to the change of the mold. The linear transformation of experimental conditions between old process and new process is applied:
Connew - Connew,min Conold - Conold,min ) (13) Conold,max - Conold,min Connew,max - Connew,min where Conold and Connew represent the experimental conditions of old and new processes, respectively. Conold,max and Conold,min represent the high boundary and low boundary of operating conditions of old process, and similarly definitions are given for Connew,max and Connew,min. Given the new operating ranges (packing pressure ) 300-500 MPa; injection velocity ) 2440 mm/s; and barrel temperature ) 200-220 °C), Table 2 shows the experimental conditions of the old process and the new process. Note that this type of input transformation is different from the SBC of inputs. Clustering results also indicates
that the packing pressure, injection velocity, and barrel temperature have 8, 3, and 5 levels, respectively. This is in accordance with previous analysis results that the packing pressure has a more-significant main effect than other two. Having performed nine new experiments on the new process, an OSBC correction is applied on the base model, using the first four new experimental data points. The verification of the new model shows that the R-RMSE of corrected model is equal to 0.1682%, slightly larger than that of the base model. To improve the prediction ability of new model further, an IOSBC correction is applied on the base model, using all nine new experimental data points. The verification result of new model shows a similar R-RMSE of 0.1435%, as compared with that of base model. The verification results of the corrected model are based on 20 unseen data, which are not used for migrating the base model. Table 3 shows the prediction results from the original base model, the corrected model with OSBC, the corrected model with IOSBC, and 20 actual data. The second column shows there are significant prediction errors with the results predicted directly from the base model without any correction. The third column shows significant improvement of predictions from new model corrected by OSBC is obtained compared with the base model. Further improvement of predictions from new model corrected by IOSBC is also found in the fourth column. Predicted values from the two new corrected models are very close to the actual values. 6. Conclusions Developing an accurate process model for industrial processes is generally a difficult and expensive task. In this paper, we have outlined a model migration method by taking advantage of a base model developed for an old process, using a limited number of experiments. A systematic approach, including information extraction from a base model, design of experiment, assessment of difference, model migration strategy and model verification has been proposed. For the simple case where the new process is a shift and re-scale of the old process, the new model is obtained through a slope/bias correction to the base model, using fewer data than those required for development of base model. This shows the proposed method is effective.
1974
Ind. Eng. Chem. Res., Vol. 47, No. 6, 2008
Acknowledgment This work is supported in part by the Hong Kong Research Grant Council (under Project No. CERG 612906). Literature Cited (1) Lotti, C.; Ueki, M. M.; Bretas, R. E. S. Prediction of the shrinkage of injection molded iPP plaques using artificial neural networks. J. Injection Molding Technol. 2002, 6 (3), 20. (2) Li, E.; Li, J.; Yu, J. A Genetic Neural Fuzzy System and Its Application in Quality Prediction in the Injection Process. Chem. Eng. Commun. 2004, 191 (3), 335-355. (3) Lee, D. E.; Song, J. H.; Song, S. O.; Yoon, E. S. Weighted Support Vector Machine for Quality Estimation in the Polymerization Process. Ind. Eng. Chem. Res. 2005, 44 (7), 2101-2105. (4) Lu, N.; Gao, F. Stage-Based Process Analysis and Quality Prediction for Batch Processes. Ind. Eng. Chem. Res. 2005, 44 (10), 35473555. (5) Psichogios, D. C.; Ungar, L. H. A hybrid neural network-first principles approach to process modeling. AIChE J. 1992, 38 (10), 14991511. (6) Jaeckle, C. M.; MacGregor, J. F. Industrial applications of product design through the inversion of latent variable models. Chemom. Intell. Lab. Syst. 2000, 50 (2), 199-210. (7) Jaeckle, C. M.; MacGregor, J. F. Product Design through Multivariate Statistical Analysis of Process Data. AIChE J. 1998, 44 (5).
(8) Jaeckle, C. M.; MacGregor, J. F. Product transfer between plants using historical process data. AIChE J. 2000, 46 (10), 1989-1997. (9) Garcia, Munoz, S.; MacGregor, J. F.; Kourti, T. Product transfer between sites using Joint-Y PLS. Chemom. Intell. Lab. Syst. 2005, 79 (12), 101-114. (10) Feudale, R. N.; Woody, N. A.; Tan, H.; Myles, A. J.; Brown, S. D.; Ferre, J. Transfer of multivariate calibration models: a review. Chemom. Intell. Lab. Syst. 2002, 64 (2), 181-192. (11) Montgomery, D. C. Design and Analysis of Experiments, 6th Edition; Wiley: New York, 2005. (12) Angelov, P. P.; Filev, D. P. An approach to online identification of Takagi-Sugeno fuzzy models. IEEE Trans. Syst., Man. Cybern., Part B 2004, 34 (1), 484-498. (13) Alexandridis, A.; Sarimveis, H.; Bafas, G. A new algorithm for online structure and parameter adaptation of RBF networks. Neural Networks 2003, 16 (7), 1003-1017. (14) Jeff Wu, C. F.; Hamada, M. Experiments: Planning, Analysis, and Parameter Design Optimization; Wiley: New York, 2000. (15) Chiu, S. L. Fuzzy model identification based on cluster estimation. J. Intell. Fuzzy Syst. 1994, 2, 267-278.
ReceiVed for reView April 3, 2007 ReVised manuscript receiVed September 25, 2007 Accepted January 3, 2008 IE0704851