Article pubs.acs.org/IECR
Quality Control of Batch Processes Using Natural Gradient Based Model-Free Optimization Fei Zhao,†,§ Ningyun Lu,*,† and Jianhua Lu‡ †
College of Automation Engineering, Nanjing University of Aeronautics & Astronautics, Nanjing, P. R. China School of Computer Science and Engineering, Southeast University, Nanjing, P. R. China
‡
ABSTRACT: Model-based optimization (MBO) has been widely applied for quality control of batch processes; however, it is not easy to obtain a globally effective and accurate quality model with affordable effort. Instead of building a quality model, model-free optimization (MFO) uses process data directly, which is more efficient and economic for quality control of batch processes. Considering the complex nonlinearity and dynamics in batch processes, a quality control scheme using natural gradient based optimization is proposed in this paper. Optimization algorithm is developed from the aspect of manifold in nonEuclidean space. An approximation method is derived for the calculation of the natural gradient, and a multivariate iterative sensitivity matrix based on Riemannian geodesic distance is proposed to obtain a novel adaptive stepping strategy. The proposed quality control scheme has been verified in the injection molding process. A set of comparison tests are presented to demonstrate the feasibility and effectiveness of the proposed method.
1. INTRODUCTION Product quality control of batch processes is a challenging research topic due to high nonlinearity, complex batch and/or time dependent dynamics, and time-varying relationships between high-dimensional process variables and end-product qualities. Quite different from process level control strategies which come in a great variety, process optimization based quality control approaches predominate in batch processes. That is, quality control is conducted by searching for the optimal process conditions via an optimization method to meet the end-product quality requirements. Traditionally, accurate quality models are preliminarily required.1,2 Thus, this kind of quality control is called a model-based optimization (MBO) approach. Quality models can be grouped into two types: the firstprinciple models and data-driven models. The first-principle models are generally derived from process mechanisms such as momentum, heat and mass transfer in the fields of physics, chemistry, engineering, and so on.3 However, rigorous firstprinciple models are difficult to obtain in most complex industrial batch processes. Data-driven models, on the contrary, have been widely used in practice, because they rely on only designed experiments or accumulated process data.4 Various types of data-driven quality models have been investigated. Identified models are often used in iterative learning control of product quality in some batch processes.5,6 Multivariable statistical modeling techniques such as principle component regression (PCR) and partial least square (PLS) have been adopted in industry.7,8 Other data-driven modeling techniques such as artificial neural network (ANN) and support vector machine (SVM) are also reported in the literature.9,10 To develop a data-driven quality model, a large number of experiments or complete historical process data are basically required, which is however time-consuming and economically inefficient. Anyhow, it is not easy to obtain globally effective and accurate quality model with affordable effort,11 especially © 2014 American Chemical Society
for the batch processes that involve frequent process changeovers or serious process uncertainties. Instead, model-free optimization (MFO) based quality control is arousing attentions recently. MFO based quality control uses experimental measurements rather than a quality model to search for the optimal process settings. The reasons have been discussed comprehensively in refs 12 and 13 why MFO methods are useful for quality control of batch processes (especially for a type of batch process like injection molding process that has short cycle time and low operational cost). MFO based quality control schemes can be further classified into within-batch control and batch-to-batch control, depending on the type of update used (inter-run or intra-run) and the type of measurements available (batch-end or online).14 Since online measurements of end-product quality are usually not available in most of industrial processes, batchto-batch quality control is the main stream. The proposed quality control scheme in this paper falls within the scope of batch-to-batch optimization. In a batch-to-batch optimization problem, MFO requires a suitable measurement driven approach to search for the optimal direction. The search algorithms can be generally categorized into two types: gradient-free and gradient-based. Gradient-free algorithms are usually called direct search. They can be performed without calculating the gradients; therefore, they can reduce the number of experiments in the MFO. The simplex search15 is a classical gradient-free algorithm, which has been successfully applied to the injection molding process by Kong et al.12 Gradient-free algorithms are simple and effective, but they are suitable only for low-dimensional problems, and they often suffer from the slow convergence speed. Gradient-based Received: Revised: Accepted: Published: 17419
June 11, 2014 September 25, 2014 September 30, 2014 September 30, 2014 dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
The remainder of this paper is structured as follows. In section 2, the formulation of NG based MFO for quality control of batch processes is given at the beginning; then, details on how to calculate the NG and how to determine the step size are presented, followed by a convergence analysis of the proposed quality control scheme; finally, several practical aspects of implementation including data normalization, startup, and termination of the iterative optimization are discussed. An illustrative example is given in section 3 for the application of the proposed quality control scheme to an injection molding process. Results and discussions based on a set of comparison tests are presented to demonstrate its feasibility and effectiveness in section 4. Finally, conclusions are drawn in section 5.
algorithms use gradient to provide the search direction, which can ensure faster convergence than the gradient-free algorithms. However, gradient calculation (or approximation) is a difficult task. Considering that the convergence of product quality is very important as it concerns the production cost and profit, gradient-based optimization is studied in this paper. Concerning different implementations of gradient approximation, gradient-based MFO approaches can be further divided into two types: deterministic gradient based and stochastic gradient based. Deterministic gradient algorithms,16,17 such as the Newton method, the conjugate gradient, and quasi-Newton method, can compute gradient exactly using the derivative of the cost function. However, since these methods are based on quadratic approximation, they work well only in a small neighborhood of the optimal point. In addition, the insufficient understanding of process mechanism and uncertainties limits their applications to large-scale problems. On the other hand, stochastic gradient algorithms are conducted by stochastic approximation of the derivative.18 Variable perturbation19 and finite differencing20 are the regular tools to obtain the gradient approximation. Two-point gradient algorithm21 and simultaneous perturbation stochastic approximation (SPSA)22 are the most widely used. They have in common that the next operating profile is deduced from the current one and the previous ones, so they have super linear convergence speed.23 This property makes them quite attractive for MFO based quality control of batch processes. Unfortunately, stochastic gradient based methods have to endure high optimization cost as they need a lot of perturbation experiments for gradient approximation. In order to reduce the optimization cost of stochastic gradient approximation methods, natural gradient (NG)24 was then proposed and has become an attractive concept in the domain of stochastic search. NG is based on the Riemannian geometry rather than the conventional geometry in the Euclidean space, and it employs the knowledge of Riemannian structure of the parameter space to adjust the gradient search direction. Compared with deterministic gradient and stochastic gradient, it is possible to use NG to obtain a more reasonable direction of steepest ascent.24 Another possible advantage of using NG is that it does not require that the cost function is locally quadratic.25 NG has been widely used in blind source separation,26 motion segmentation,27 medical imaging,28 etc. The application of NG to industrial process optimization is, however, not reported yet. Considering the inherent complex nonlinearity and dynamics of batch processes, better quality control performance can be expected by using the NG based MFO strategy. This paper aims to develop a quality control scheme for batch processes using NG based MFO strategy. Two key contributions are summarized as below. (1). We analyze batch process data from the aspect of manifold in non-Euclidean space, and propose an NG based MFO algorithm to solve the quality control problem for batch processes. (2). An approximation method is derived for calculation of NG, and a multivariate iterative sensitivity matrix is obtained based on Riemannian geodesic distance to realize a novel adaptive stepping strategy, which can improve the convergence speed in MFO and therefore reduce the cost of MFO-based quality control for batch processes.
2. NOVEL QUALITY CONTROL SCHEME FOR BATCH PROCESSES 2.1. MFO Based Quality Control. Product quality control of batch processes can be converted into an optimization problem as12 min J = |E(Q ) − Q tg| x
s. t.
R iL
≤ xi ≤ R iH ,
i = 1, ..., p
x = (x1 , ..., xp)T
(1)
where x is a p-dimensional vector of the operating variables, J represents an objective function (or cost function), Q is the quality index to be controlled, E(Q) represents the expectation of batch-end quality measurements within a quality control period, Qtg is the target product quality, and RLi and RHi define the feasible operating window for the corresponding operating variable. The objective of the above control problem is to search for the optimal process settings over a few batches to meet the end-product quality requirements. In a gradient descent based batch optimization problem, iterative optimization strategy29 is often adopted to determine the optimal process settings for the next batch as follows: x k + 1 = x k − μk ∇J(x k),
k = 1, 2, ...
(2)
where k is the index of iteration, ∇J(xk) is the gradient (i.e., the optimization direction), and μk is the step size along the optimization direction. Computations of ∇J(xk) and μk are the two most important issues which determine the optimization performance. All the existing optimization-based quality control schemes use the ordinary gradient in Euclidean space to provide the searching direction. However, for batch processes which are often characterized by strong nonlinearity and complex dynamics, the operating variable space is not Euclidean but has a Riemannian metric structure, where the ordinary gradient does not give the steepest direction of a target function.24 In a Riemannian space, manifold is an important concept. A manifold is a topological space that resembles Euclidean space near each point in differential geometry, and the relationship between the points can be represented by metric tensor.28 In batch processes, the operating points in the parameter space can be viewed as such points located on a manifold that reflects the relationship between the local Euclidean space and the global Riemannian space. This laid the foundation of modeling and monitoring of batch 17420
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
processes.30 In short, differential geometry based gradient calculation method is more suitable for optimization problems in batch processes. The reasons can be summarized as below: (1) Topological manifold can be defined in a batch process since every point in the operating variable space can have a neighborhood homeomorphic to the open set of a given quality measurement; (2) The analytic manifold of variable space in a batch process is approximately complete if ignoring the process settings on the boundary; (3) The manifold is compact because the variables’ settings in a batch process consist of a bounded closed set. NG is a good choice, which was developed from a perspective of Riemannian geometry aiming to obtain a more accurate search direction than the ordinary gradient in Euclidean space.24 Using NG, the iterative optimization formulated in eq 2 can be modified as x k + 1 = x k − μk ∇̃J(x k),
k = 1, 2, ...
point xk‑1 can be described by the exponential map and the logarithmic map27 (5)
log x (x k) = Δ
(6)
k−1
k−1
Equation 5 is the exponential map, which maps Δ to the point xk (i.e., α(1)) on the manifold reached by the geodesic curve. Equation 6 is the logarithmic map, which is the inverse of the exponential map. Then, the Riemannian gradient at the point xk, ∇JRg(xk), can be obtained by31 ∇JRg (x k) = −log x (x k)
(7)
log x (x k) = x k − 1 log(x k − 1−1x k)
(8)
k‐1
k−1
where
(3)
where ∇J(̃ xk) is NG, defined by ∇̃J(x k) = G−1(x k)∇J(x k)
expx (Δ) = x k
∞
log(x k − 1−1x k) = (4)
∑ i=0
( −1)i − 1 (x k − 1−1x k − I)i i
(9)
and x−1 k−1 is the pseudo inverse of xk−1, I is a p-dimensional identity matrix. With the Riemannian gradient ∇JRg(xk), the Riemannian metric G(xk) can be estimated by32
The NG at the point xk is the product of the inverse of the Riemannian metric G(xk) and the ordinary gradient ∇J(xk). G(xk) is a positive definite matrix, reflecting the local data structure on a Riemannian manifold. In order to realize eq 3, the Riemannian metric, G, should be obtained first for the calculation of the NG, ∇̃ J(xk), and then, the step size, μk, needs to be determined from the perspective of a Riemannian manifold. 2.2. Calculation of NG. Due to the sparse and limited available experiments for the MFO-based quality control scheme of industrial batch processes, NG has to be solved approximately. In terms of eq 4, the NG ∇̃ J(xk) depends on the Riemannian metric G(xk) and the ordinary gradient ∇J(xk). Riemannian metric will be derived from Riemannian distance, and the ordinary gradient will be obtained by the widely used two-point gradient method.21 2.2.1. Riemannian metric. Assuming that xk‑1 and xk are two adjacent iteration points in the operating space of a batch process, they can be viewed as two points lying on a Riemannian manifold M, as shown in Figure 1.
G(x k) = ∇JRg (x k)(∇JRg (x k))T
(10)
2.2.2. Calculation of the Ordinary Gradient. In section 2.1, quality control of batch processes has been converted into a batch-to-batch optimization problem that aims to find the optimal settings of operating variables x* such that the gradient of the objective function J at x* is zero. It is assumed that endbatch quality measurements Q(xk) (simply denoted as Qk) are available under the operating settings xk. The recursive procedure (eq 2 or eq 3) is in the general stochastic approximation (SA) frame. Finite difference (FD) and simultaneous perturbation (SP) are two general forms of gradient approximation. In this paper, the two-point gradient method (a variation of SPSA) is used to compute the ordinary gradient. In terms of SPSA method, all elements of xk are randomly perturbed together to obtain two measurements, Q(xk + ckΔk) and Q(xk − ckΔk). Then, the gradient can be obtained approximately by ∇ J (x k ) ≈
Q (x k + ck Δk ) − Q (x k − ck Δk ) 2ck Δk
(11)
where Δk is a perturbation vector at the kth iteration, ck is the corresponding gain, xk + ckΔk and xk − ckΔk are two adjacent perturbation operating points. For convenience’s sake, suppose the iteration point is ((xk + xk−1)/2), and the perturbation vector ckΔk is ((xk − xk−1)/2), it is easy to get
Figure 1. Illustration of the exponential and logarithmic mapping between xk−1 and xk.
x − xk−1 ⎞ ⎛ x + xk−1 ⎟ Q (x k) − Q (x k − 1) = Q ⎜ k + k ⎝ ⎠ 2 2 x − xk−1 ⎞ ⎛ x + xk−1 ⎟ − Q⎜ k − k ⎝ ⎠ 2 2
Txk−1M is the tangent plane at point xk‑1. For a tangent Δ ∈ Txk−1M, there is a unique geodesic curve α(t) ∈ M (i.e., the dotted line connecting xk‑1 to xk in Figure 1), parametrized by t ∈ [0,1], starting at xk‑1 with an initial velocity Δ = α′(0). Apparently, α(0) = xk−1 and α(1) = xk. The relationship between the two points xk‑1 and xk and the tangent Δ at the
(12)
Then, the gradient at (xk + xk−1)/2 is 17421
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
Q (x k) − Q (x k − 1) ⎛ x + xk−1 ⎞ ⎟ = ∇J ⎜ k x −x ⎝ ⎠ 2 2· k 2 k−1 =
Q (x k) − Q (x k − 1) xk − xk−1
dk = d(x k − 1, x k) =
(13)
(14)
With eqs 10 and 14, it is easy to approximate the NG at the kth iteration using eq 4. 2.3. Determination of Step Size. Different step size rules lead to different optimization convergence performances. With respect to whether or not the step size is deterministic, step size rules can be classified into two branches: regular step size,16 such as constant value, given design formula, one-dimension search (including minimization rule, approximate minimization rule, Armijo rule, Goldstein rule, Wolfe rule, etc.) and Newtontype method; and adaptive step size,33,34 such as simulated annealing, fuzzy control, etc. From the angle of industrial applications, a regular step size might not satisfy both the convergence speed and the tracking performance, while an adaptive step size can accelerate the convergence generally. Thus, adaptive step size is the focus in this paper, because most batch processes are used to produce low-volume and high-value products, and they have high requirement on the convergence speed. An adaptive step size algorithm based on Riemannian geodesic distance and iterative sensitivity matrix is proposed to determine the parameter μk in eq 3. 2.3.1. Riemannian Geodesic Distance. The Riemannian metric that has been introduced in section 2.2.1 cannot be directly used to compute the geodesic distance, because it is a simplified analytic algorithm that approximately represents the neighborhood feature of the local single data point. It is necessary to choose a symmetric positive metric performing Riemannian manifold for calculating the Geodesic curve function. Let M be a Riemannian manifold. The tangent space at x ∈ M to the manifold is denoted by Tx. Denote the metric of Riemannian manifold by ⟨·,·⟩x:Tx × Tx → R. For a tangent vector v ∈ Tx at the point x ∈ M,
⟨v , v⟩x = tr((x−1v)2 )
sk = (Q k − Q k − 1) × (x k − x k − 1)−1 = [sk1 , sk2..., skp]T (19)
where (xk − xk−1)−1 is calculated by the pseudo inverse operation. In order to obtain the accumulative effect of each variable, it is necessary to calculate the sensitivity iteratively. The iterative sensitivity of the ith variable is
Si =
α(t ) = x exp(tx v)
∫0
1
tr(((α(t ))−1α(̇ t ))2 ) dt
|ski| ∑p |sk|
(20)
The iterative sensitivity matrix for all operating variables is given as ⎡ S1 ⎢ ⎢0 Sk = ⎢ ⎢⋮ ⎢⎣ 0
0 ··· 0 ⎤ ⎥ S2 ··· ⋮ ⎥ ⎥ ⋮ ⋱ 0⎥ ··· 0 S p ⎥⎦
(21)
With eqs 18 and 21, the proposed variable step size method, named geodesic stepping matrix, can be computed by Μk = dk
Sk tr(Sk )
(22)
Equation 22 extends the step size from scalar to matrix, so that the variable with a high sensitivity can obtain a relatively high stepping size during the iterative optimization process. Based on the above, eq 3 can be rewritten as
(15)
x k + 1 = x k − Μk ·∇̃J(x k),
(16)
k = 1, 2, ...
(23)
2.4. Convergence Analysis. The conventional two-point gradient calculation method (eq 14) and SPSA (eqs 11 and 13) are based on finite difference essentially. The former is a kind of forward difference, and the latter is a kind of central difference. From eq 14, the two-point gradient calculation method is an approximation of SPSA, so the convergence performance should be compared with SPSA. By Tailor expansion, the gradient at xk can be deduced by a backward difference method as
Then, the geodesic distance between two points x1 and x2 on the manifold is given by d 2(x1 , x 2) =
∑ k
where the symbol “tr” represents trace computation in the matrix operation. Based on this metric, the following geodesic curve function can be derived:35 −1
(18)
2.3.2. Iterative Sensitivity Matrix. From eq 18, the Riemannian geodesic distance dk is a scalar. If multiplying this scalar distance by the obtained NG ∇̃ J(xk), each operating variable in xk will get the same stepping speed. It is unreasonable because different variables may have different effects on the end-batch product quality. In order to overcome this problem, the following iterative sensitivity matrix is developed for determining the step size. Sensitivity analysis can increase the understanding of relationships between the input and output variables of a process. Multivariable sensitivity can be computed by
If the two points are very close, the gradient at xk can be calculated approximately by ⎛ x + xk−1 ⎞ Q k − Q k−1 ⎟ = ∇ J (x k ) ≈ ∇ J ⎜ k ⎝ ⎠ 2 xk − xk−1
tr(log 2(x k − 1−1x k))
(17)
Back to our problem of optimization based quality control of batch processes, the Riemannian geodesic distance between two adjacent iterative operating points xk−1 and xk can be obtained by 17422
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
approximate version of SPSA, while the convergence of SPSA has been proven.22,36 Therefore, it is possible that the NG can obtain convergent directions. As mentioned in section 2.1, eq 1 is an MFO problem. Since there is no explicit expression of derivative, the difference quotient is used instead of the derivative. The consequent problem using difference quotient is whether the search direction is descending to this minimization problem. Since Qtg is used to judge the gradient direction (positive and negative) with the obtained part quality of current batch, it can guarantee the steepest descent direction of the minimization problem. Rigorous proof of the convergence has not been obtained yet, although the application results in section 3 have shown that the proposed quality control scheme has better convergence performance compared with SPSA-based quality control. 2.5. Several Practical Aspects of Implementation. 2.5.1. Data Normalization. All variables’ measurements will be normalized into the range [0, 1] to ensure a unified scale for different operating variables. At the end of iteration, the obtained process settings need to be transformed back into their physical settings (i.e., the denormalization) for manipulating the machine. The physical settings must be within the feasible region. If any of the operating settings exceeds the region, its nearest boundary value should be used to replace the calculated value. 2.5.2. Start-up of the Iterative Optimization. Assume that the tolerance of target deviation is ε, which is chosen according to the customer’s requirements. The quality control procedure is triggered when the quality exceeds the tolerance. By this time, the iteration index k is set to 1. Since the iterative optimization needs at least two initial points, the second point should be determined at the same time, which can be obtained by simultaneous perturbation method as
Q (x k) − Q (x k − 1) xk − xk−1 Q (x k) − Q (x k − (x k − x k − 1)) = xk − xk−1
∇ J (x k ) =
⎡ ⎛ d Q (x k ) = ⎢Q (x k) − ⎜Q (x k) − (x k − x k − 1) · ⎢⎣ d(x k) ⎝ 2 ⎞⎤ d Q (x k ) 1 + (x k − x k − 1)2 + o((x k − x k − 1)2 )⎟⎥ 2 2! d(x k) ⎠⎥⎦ /(x k − x k − 1) ≈
d Q (x k ) d2Q (x k) 1 − (x k − x k − 1) d(x k) 2! d(x k)2 (24)
The gradient at (xk + xk−1)/2 shown in eq 14 can be deduced by a central difference method using Tailor expansion as Q (x k) − Q (x k − 1) ⎛ x + xk−1 ⎞ ⎟ = ∇J ⎜ k ⎝ ⎠ 2 xk − xk−1 x +x x −x Q k 2 k−1 + k 2 k−1 − Q = x −x 2 k 2 k−1
(
( x +2x
)
⎡ xk + xk−1 ⎢ x k − x k − 1 dQ 2 = ⎢2 x + x k k−1 2 d ⎢⎣ 2
( (
)+ )
k
k−1
−
xk − xk−1 2
)
2 ⎛⎜ x k − x k − 1 ⎞⎟3 ⎠ 3! ⎝ 2 ⎤
( x +2x ) + o⎛⎜⎛ xk − xk−1 ⎞3⎞⎟⎥/⎛2 xk − xk−1 ⎞ 3 ⎠ ⎠⎥ ⎝ ⎠ x +x 2 2 ⎝⎝ d( 2 ) ⎥⎦
d3Q
k
k−1
⎜
k
( x +2x ) + x +x d( 2 ) k
dQ ≈
⎟
⎜
⎟
k−1
k
k−1
k−1
(x + x ) ( )
k k−1 2 3 2 (x k − x k − 1) d Q 2 3 x +x 3! 8 d k 2 k−1
x2 = x1 + c Δ
where Δ is a random perturbation vector, the elements in Δ are randomly assigned to be ±1 with equal probability, and c is a small positive scalar. 2.5.3. Termination of the Iterative Optimization. Because of the uncertainties in both production and sensoring, it is better to feedback the average quality measurements in each iteration. Assume that the experiments conducted under the operating condition xk are repeated N times, the feedbacked quality will be
(25)
For analyzing the degree of the approximation of eq 14, the difference between ∇J(xk) and ∇J((xk + xk−1)/2) can be calculated as ⎛ x + xk−1 ⎞ ⎟ − ∇J (x )|| ||∇J ⎜ k k ⎝ ⎠ 2 xk + xk−1 3 2 1 2 (x k − x k − 1) d Q 2 = || + (x k − x k − 1) xk + xk−1 3 2 3! 8 ! d 2
(
d2Q (x k) d(x k)2
(
(27)
)
Q̅ k =
)
||
1 N
N
∑ Q k ,i i=1
(28)
The iterative optimization process is repeated until the average quality measurements Q̅ k under the operating setting xk in the kth process adjustment satisfies the terminal criterion
(x + x ) ( )
3 k k−1 (x k − x k − 1) d Q || x k − x k − 1 || d2Q (x k) 2 = || + || 3 x +x 2 12 d(x k)2 d k 2 k−1
|| Q̅ k − Q tg || ≤ ε
(29)
The operating setting xk is considered as the optimal setting in this iteration, and the optimization process is terminated. 2.6. Procedures of the Proposed Quality Control Scheme. Based upon the above, the proposed quality control scheme for batch processes is depicted in Figure 2. The major steps of implementation are listed as below. (1) Compare the current batch-end product quality with the target value in real time.
(26)
From the above, we can observe that the Euclidean distance between xk−1 and xk restricts the precision of this approximation. As the iteration process continues, it is apparent that ∇J(xk) can be regarded as an approximation of ∇J((xk + xk−1)/ 2)) because the two iteration points are getting closer and closer. As detailed in section 2.2, the calculation of NG is based on the ordinary gradient. The ordinary gradient is an 17423
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
the packing-holding stage, during which additional polymer is “packed” at high pressure to compensate for the material shrinkage associated with the material cooling and solidification. The packing-holding continues until the gate freezes off, which isolates the material in the mold from that in the injection unit. The process enters the cooling stage; the part in the mold continues to solidify until it is rigid enough to be ejected from the mold without damage. Concurrently, with the early cooling phase, plastication takes place in the barrel where polymer is melted and conveyed to the front of barrel by screw rotation, preparing for the next cycle. In injection molding, product quality can be described by mechanical properties (such as part strength and stiffness), dimensional conformity (such as weight, length and thickness), and surface appearance (such as sink mark, record groove and jetting).7 In most applications, product weight (PW) is often selected as a representative quality variable to control for the following reasons: (i) product weight and operating variables are highly correlated; (ii) product weight can be readily measured at the end of each batch; and (iii) weight is a good representative of process stability and is of great commercial interest. Previous studies have shown that, the control of PW involves many process parameters, and significant interactions exist among those parameters. For example, Srinivasan et al.37 have developed a regression model that relates PW to the set-points for mold temperature, melt temperature, packing time, and packing pressure. Sun et al.38 have presented another multiple regression model for PW that involves the injection velocity, injection pressure, packing time, and melt temperature. In authors’ previous work,39 four process variables including melt temperature, injection velocity, packing pressure, and packing time are considered for analyzing their influences on PW. From the above, it can be confirmed that optimal setting of key operating parameters is essential to obtain desirable product quality. The injection molding process requires searching for the optimal process settings to obtain the specified product quality. Quality control of the injection molding process can be viewed as a batch-to-batch iterative optimization process, i.e., an MFO process if ignoring quality model and relying only on batch-end quality measurements. 3.2. Experimental Setup. The verification experiments are conducted using Moldflow Plastic Insight (MPI), a commercial simulation software of injection molding process. The mold is a simple curving assembly part designed by Pro/Engineer 5.0 as shown in Figure 3. Product weight is selected as the quality variable, and its values can be obtained in the log file after simulation. The tolerance level of product weight variation is chosen less than 0.1%. Experimental material is high-density polyethylene (HDPE). Other parameters of injection molding machine and process controllers are set by default. In the MPI simulation environment, PW can be manipulated by optimizing the set-points of mold temperature (x1), melt temperature (x2), injection time (x3), packing time (x4), and packing pressure (x5). The descriptions and the feasible regions of operating variables are shown in Table 1. Other experimental conditions are set by default. In order to illustrate the necessity of using Riemannian geometry based NG for the quality control of the injection molding process, several groups of experiments are conducted to show the nonlinear relationships between the process variables and the product quality. Figure 4a−c shows the fitted response surfaces with different pairs of independent variables (x1 and x3 in Figure 4a; x2 and x4 in Figure 4b; x2 and
Figure 2. Flowchart of the proposed quality control scheme.
(2) The quality control procedure is triggered when the observed quality measurement goes beyond the threshold. By this time, the iteration index k is set to 1, the current operating condition and quality measurement are denoted as the first data pair (x1, Q1). Then, we search the second iteration point (x2,Q2) using simultaneous perturbation method (i.e., eq 27). (3) Calculate the NG ∇̃ J(xk) (k ≥ 3) using eqs 4, 10, and 14; meanwhile, calculate the stepping matrix Mk using eqs 19−22; finally, the new iterative operating point xk+1 is yielded using eq 23. (4) Conduct verification experiments under the operating condition xk+1, resulting in the average product quality Q̅ k+1. (5) If Q̅ k+1 satisfies the termination criterion, terminate the quality control procedure; otherwise, repeat steps (3) and (4) iteratively.
3. ILLUSTRATIVE EXAMPLE 3.1. Injection Molding Process. Injection molding is an ideal batch process for the application and verification of the proposed quality control scheme using NG based MFO. Injection molding is a typical multistage process, among which, injection (or filling), packing-holding, and cooling are the most important phases. During filling, the screw moves forward and pushes the molten polymer into the mold cavity. Once the mold is completely filled, the process then switches to 17424
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
Figure 3. Curving assembly part mold.
Table 1. Operating Variables for the Part variable no.
description
range
unit
x1 x2 x3 x4 x5
mold temperature melt temperature injection time packing time packing pressure
[20−60] [120−255] [0.2−1] [3−10] [75−90]
°C °C s s %
x5 in Figure 4c) and the same response variable (PW). It is clear that the operating space shows complex nonlinear properties. NG based MFO is expected to achieve better quality control performance. To verify the control performance for tracking the set-points and rejecting the disturbances, it is necessary to create the corresponding simulation environment. This is easy for the set point tracking (some step changes may be enough), but a little bit difficult for the disturbance rejection because MPI is not an open-source software and it is not allowed to change any operating settings in the middle of simulation. To simulate quality variations caused by the external disturbances, we change the grid length of the part mold in the MPI software. It is feasible because different grid lengths in MPI correspond to different match ratio, resulting in different part weights.40 In this paper, three kinds of grid length are used under the same condition, x = [50, 185, 0.4, 5, 85]T, and the corresponding match ratios and part weights are listed in Table 2. In the simulation, the first grid length in Table 2 is viewed as a baseline, so that the quality tolerance level is 0.02 g because the weight variation is specified to be 0.1% (that is, ε = 0.02 in eq29).
4. RESULTS AND DISCUSSION Four groups of testing experiments are arranged as following: optimization speed test, tracking performance test, antidisturbance test, and repeatability test. In addition, the proposed NG based MFO method (denoted as method A) is compared with an SPSA based MFO method13 (denoted as method B). The results and discussions are shown below. 4.1. Optimization Speed Test. Optimization cost is proportional to the number of batches needed during the optimization. It can also be considered as the convergence speed in a general optimization problem. Figure 5 shows the
Figure 4. Nonlinear relationships between process variables and product quality for the injection molding process.
Table 2. Grid Lengths and Weights
17425
no.
grid length/mm
match ratio/%
weight/g
1 2 3
2.66 2.99 3.39
88.2 88.6 87.3
19.79 19.74 19.66
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
accelerated obviously in the previous iterations and smoothly converges to the target in the last several iterations. The iterative step sizes obtained from the sensitivities are presented in Figure 7.
Figure 5. Comparison results of optimization speed test.
comparison results for the optimization speed test. The target value of part weight Qtg is set to 19.32 g, and the initial point Q1 is 19.74 g with grid length 2.99 mm. The iteration process consists of a sequence of experimental runs, which contain iteration runs and perturbation runs. For method A, there are no perturbation runs except the second iteration point (See eq 27). However, for method B, there are 10 perturbation runs between the 6 iteration runs. From Figure 5, the two quality control trajectories start from the same initial point and converge to the same target value. However, the optimization costs are different. Method A reaches the target quality within 11 runs, whereas method B needs 16 runs in total. For method B, high optimization cost is mainly caused by the perturbation runs (used for the gradient approximation of SPSA). In order to show the iteration optimization process of the proposed method more clearly, some intermediate results are plotted in Figure 6. As shown in Figure 6, the sensitivities of the
Figure 7. Evolution of iterative step size in optimization speed test of method A.
4.2. Tracking Performance Test. Batch processes are usually flexible for producing various products; thus, they are often operated over a range of conditions. Process engineers should find the new optimal operating condition to meet the new quality requirements for new process configuration as soon as possible. It is a typical target tracking problem, so it is necessary to study the tracking performance of the proposed method. In this test, the initial point Q1 is changed to 19.66 g, which corresponds to no. 3 in Table 2. The initial quality target value is set to 19.79 g and the tracking target is 19.39 g. The comparison results for tracking performance test are plotted in Figure 8. The two methods can achieve similar performance when tracking the first target. After changing the target from 19.79 g to 19.39 g, method A reaches the new quality target within 12 runs, whereas method B needs 22 runs. The proposed method has a quicker tracking speed. 4.3. Anti-disturbance Test. The product quality under the given optimal settings may vary due to process disturbances or
Figure 6. Evolution of iterative sensitivity of the five operating variables in optimization speed test of method A.
five process variables are 0.2 in the first optimization run, because the random perturbation vector Δ in eq 26 is randomly assigned to be ±1 with equal probability. As the optimization process is performed iteratively, sensitivity of each variable is accumulated gradually, and finally approaches 0.2 again. The change tendency of sensitivity is caused by iterative accumulation process. The optimization process can be
Figure 8. Comparison results of tracking performance test. 17426
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
than a quality model are directly used to search for the optimal process setting. This batch-to-batch optimization method involves the determination of NG and adaptive step size. An approximation method for the calculation of NG has been derived in non-Euclidean space. A multivariate iterative sensitivity matrix based on Riemannian geodesic distance has been proposed to obtain a novel adaptive stepping strategy. Adoption of NG and the proposed geodesic stepping matrix can improve the convergence speed of MFO and therefore reduce the cost of MFO-based quality control scheme for batch processes. The proposed quality control scheme has been verified for product weight control in injection molding process using MPI software. The verification experiments consist of optimization speed test, tracking performance test, antidisturbance test, and repeatability test. The results can illustrate the advantages of the proposed method. MFO has good potential in quality control of batch processes. This paper is a first attempt. There remain some problems unsolved yet, such as the rigorous convergence proof and the more comprehensive experimental verification. The authors believe that such a data-driven model-free optimization based quality control scheme will have an extensive application prospect to various industrial batch processes.
uncertainties. In this anti-disturbance test, process disturbance is introduced by changing the mold grid length in MPI software, so that the products have fluctuating weights even under the same operating condition. The purpose of this test is to verify the robustness of the proposed method. In this test, the initial point Q1 is set to 19.74 g, and the target quality is 19.5 g. As shown in Figure 9, in the beginning,
■
Figure 9. Comparison results of antidisturbance test.
AUTHOR INFORMATION
Corresponding Author
*E-mail:
[email protected].
method A needs 5 iteration runs to reach the target, while method B needs 4 iteration runs and 6 perturbation runs. Then, the grid length of the mold is intentionally changed from 2.99 mm to 3.39 mm. By doing so, product weight is changing from 19.51 g to 19.40 g for method A and from 19.51 g to 19.39 g for method B. Weight fluctuations lead to the second optimization task. From the results, method A has faster tracking speed than method B, which also means the optimization cost of method A is lower than method B too. 4.4. Repeatability Test. Repeatability is also a key indicator representing the stability of an optimization method. From eqs 27 and 11, both methods A and B adopt stochastic gradient calculation. The former uses stochastic search only in the first step to determine the second iteration point, while the latter adopts stochastic optimization for estimating the gradient in every iterative run. Since both methods involve stochastic optimization, it is necessary to conduct repeatability test for verifying and comparing the stability of the optimization performance. In the experiments, the initial point Q1 is set to 19.74 g, and the quality target value is 19.5 g. For each method, two groups of experiment are conducted under the same conditions. Method A needs 9 and 10 runs to achieve the desired quality, respectively. Optimization costs are almost the same. Method B needs 13 and 21 runs, respectively. Such a dramatic difference is caused by the strong stochastic nature of SPSA in method B. In summary, the above experimental results show that the proposed quality control scheme can find the new optimal operating settings with faster optimization speed, less optimization cost, stronger robustness, and better repeatability.
Present Address §
Fei Zhao is now a PhD candidate in State Key Laboratory of Industrial Control Technology, Department of Control Science & Engineering, Zhejiang University, P.R. China. Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS This research was supported by the National Science Foundation of China (Nos. 61374141, 61073059, and 61034005) and by the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT1442).
■
REFERENCES
(1) Chen, Z. B.; Turng, L. S. A review of current developments in process and quality control for injection molding. Adv. Polym. Technol. 2005, 24, 165. (2) Lu, N. Y.; Gao, F. R. Stage-Based online Quality Control for Batch Processes. Ind. Eng. Chem. Res. 2005, 45, 2272. (3) Lucyshyn, T.; Kipperer, M.; Kukla, C. A Physical Model for a Quality Control Concept in Injection Molding. J. Appl. Polym. Sci. 2012, 124, 4926. (4) Wan, J.; Marjanovic, O.; Lennox, B. Disturbance rejection for the control of batch end-product quality using latent variable models. J. Process Control 2012, 22, 643. (5) Wang, Y. Q.; Gao, F. R.; Doyle, F. J., III Survey on iterative learning control, repetitive control, and run-to-run control. J. Process Control 2009, 19, 1589. (6) Xiong, Z. H.; Dong, J.; Zhang, J. Optimal iterative learning control for end-point product qualities in semi-batch process based on neural network model. Sci. China Series F 2009, 52, 1136. (7) Yang, Y.; Gao, F. 2006. Injection molding part weight: online prediction and control based on a nonlinear principal component regression model. Polym. Eng. Sci. 2006, 46, 540.
5. CONCLUSIONS An NG based MFO has been presented for quality control of batch processes from the aspect of Riemannian manifold and stochastic gradient optimization. The measurements rather 17427
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428
Industrial & Engineering Chemistry Research
Article
(8) Lu, N. Y.; Gao, F. R. Stage-Based Process Analysis and Quality Prediction for Batch Processes. Ind. Eng. Chem. Res. 2005, 44, 3547. (9) Zhang, J. Batch-to-batch optimal control of a batch polymerisation process based on stacked neural network models. Chem. Eng. Sci. 2008, 63, 1273. (10) Zhang, S. N.; Wang, F. L.; He, D. K.; Jia, R. D. Real-time product quality control for batch processes based on stacked leastsquares support vector regression models. Comput. Chem. Eng. 2012, 36, 217. (11) Bonvin, D. Optimal operation of batch reactors − A personal view. J. Process Control 1998, 8, 355. (12) Kong, X. S.; Yang, Y.; Chen, X.; Shang, Z. J.; Gao, F. R. Quality Control via Model-Free Optimization for a Type of Batch Process with a Short Cycle Time and Low Operational Cost. Ind. Eng. Chem. Res. 2011, 50, 2994. (13) Gattu, G.; Zafiriou, E. Methodology for on-line setpoint modification for batch reactor control in the presence of modeling error. J. Chem. Eng. 1999, 75, 21. (14) Srinivasan, B.; Bonvin, D.; Visser, E.; Palanki, S. Dynamic optimization of batch processes: II. Role of measurements in handling uncertainty. Comput. Chem. Eng. 2002, 27, 27. (15) Nelder, J. A.; Mead, R. A Simplex-Method for Function Minimization. Computer J. 1965, 7, 308. (16) Sun, W. Y.; Yuan, Y. X. Optimization Theory and Methods; Springer: New York, 2006; Chapters 3−4. (17) Park, H.; Amari, S.; Fukumizu, K. Adaptive natural gradient learning algorithms for various stochastic models. Neural Networks 2000, 13, 755. (18) Kiefer, J.; Wolfowitz, J. Stochastic estimation of maximum of a regression function. Ann. Math. Stat. 1952, 23, 462. (19) Stevenson, P. M. Optimized perturbation theory. Phys. Rev. 1981, 23, 2916. (20) Randall, J. L. Finite Difference Methods for Differential Equations; University of Washington: Seattle, 2005; p 6. (21) Dai, Y. H.; Yuan, J. Y.; Yuan, Y. X. Modified Two-Point Stepsize Gradient Methods for Unconstrained Optimization. Comput. Optim. Appl. 2002, 22, 103. (22) Spall, J. C. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Automat. Contr. 1992, 37, 332. (23) Chen, X.; Zhang, L.; Kong, X. S. Automatic Velocity Profile Determination for Uniform Filling in Injection Molding. Poly. Eng. Sci. 2010, 50, 1358. (24) Amari, S. Natural gradient works efficiently in learning. Neural Comput. 1998, 10, 251. (25) Zhang, Z. N.; Sun, H. F.; Peng, L. Y. Natural gradient algorithm for stochastic distribution systems with output feedback. Differ. Geom. Appl. 2013, 31, 682. (26) Liu, J. Q.; Feng, D. Z.; Zhang, W. W. Adaptive improved natural gradient algorithm for blind source separation. Neural Comput. 2009, 21, 872. (27) Subbarao, R.; Meer, P. Nonlinear Mean Shift over Riemannian Manifolds. Int. J. Comput. Vis. 2009, 84, 1. (28) Fletcher, P. T. Riemannian Geometry for the Statistical Analysis of Diffusion Tensor Data. Signal Processing. 2007, 87, 250. (29) Schlegel, H. B. Geometry optimization. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2011, 1, 790. (30) Hu, K. L.; Yuan, J. Q. Batch process monitoring with tensor factorization. J. Process Control 2009, 19, 288. (31) Castano-Moraga, C. A.; Lenglet, C.; Deriche, R.; Ruiz-Alzola, J. A Riemannian approach to anisotropic filtering of tensor fields. Signal Process. 2007, 87, 263. (32) Yang, Z. Y.; Laaksonen, J. Principal whitened gradient for information geometry. Neural Networks 2008, 21, 232. (33) Kalivas, J. H. Optimization using variations of simulated annealing. Chemomet. Intellig. Lab. Sys. 1992, 15, 1. (34) Ozen, A.; Kaya, I.; Soysal, B. Variable Step-Size Constant Modulus Algorithm Employing Fuzzy Logic Controller. Wireless Pres. Commun. 2010, 54, 237.
(35) Simone, F. Learning by Natural Gradient on Noncompact Matrix-Type Pseudo-Riemannian Manifolds. IEEE Trans. Neural Networks 2010, 21, 841. (36) Blum, J. R. Multidimensional stochastic approximation methods. Ann. Math. Stat. 1954, 25, 737. (37) Srinivasan, K.; Srinivasan, T.; Maul, G. P. Improvements in Closed Loop Control of Thermoplastic Injection Molding Processes. 49th Annual Technical Conference, Montreal, Canada 1991, 343. (38) Sun, C. H.; Chen, J. H.; Sheu, L. J. Quality control of the injection molding process using an EWMA predictor and minimumvariance controller. Int. J. Adv. Des. Manuf. Technol. 2010, 48, 63. (39) Lu, N. Y.; Gong, G. X.; Yang, Y.; Lu, J. H. Multi-objective Process Parameter Optimization for Energy Saving in Injection Molding Process. J. Zhejiang Univ.-Sci. A (Appl. Phys. Eng.) 2012, 13, 382. (40) Jay, S. Moldflow Design Guide: A Resource for Plastics Engineers; Moldflow Corporation: MA, 2006; Chapter 5.
17428
dx.doi.org/10.1021/ie502348w | Ind. Eng. Chem. Res. 2014, 53, 17419−17428