Article pubs.acs.org/IECR
Sequential and Orthogonalized Partial Least-Squares Model Based Real-Time Final Quality Control Strategy for Batch Processes Runda Jia,*,†,‡ Zhizhong Mao,†,‡ Fuli Wang,†,‡ and Dakuo He†,‡ †
School of Information Science & Engineering, Northeastern University, Shenyang 110004, China State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110004, China
‡
ABSTRACT: In this work, the problem of driving a batch process to a desired final product quality using data-driven model based midcourse correction (MCC) is described. Specifically, we adapt a sequential and orthogonalized partial leastsquares (SO-PLS) method to calibrate the inferential quality model, which takes into account the serial nature of the input batch data matrices and could retain the reliable information as much as possible when it is used to perform online quality prediction. Since the process variable trajectories that are necessary to predict the final quality are incomplete at a certain decision point, known data regression (KDR) is used to estimate the future trajectories, and the causal relationship of the initial conditions and the future candidate manipulated variables in determining the future process variable trajectories is also considered. Finally, taking the advantage of the latent variable model, the indicators that consider only the degrees of freedom are employed as hard constraints to confine the SO-PLS model only used in a valid region. The efficacy of the proposed approach is demonstrated through a virtual batch process and a cobalt oxalate synthesis process.
1. INTRODUCTION Batch processes have found widespread application in a variety of sectors in the chemical industry because of their flexibility to manage types of products and different grades. Typical examples include crystallization, fermentation, polymerization, etc. Batch processes exhibit a number of characteristics that lead to interesting control problems. They are finite duration processes in which the product quality is usually available by off-line assay after completion of the batch. Furthermore, the absence of equilibrium points, the nonlinear behavior of the chemical reactors, and the time-varying dynamics over a wide range of operation conditions impede the direct application of control strategies designed for continuous processes.1 Numerous approaches have been presented to control the quality or to optimize some economic objective in batch processes. These approaches can be distinguished into two general categories: batch-to-batch control and within-batch control.2 The idea behind batch-to-batch control is the use of data collected from the past completed batches to refine the manipulated variable trajectories (MVTs) of the upcoming batch.3−9 It is crucial for batch-to-batch control to assume that the information gained from the operation of immediately prior batches is useful for predicting the next batch.10 If the above assumption is not accomplished, the batch-to-batch control strategies would be useless. Since the batch-to-batch control only consists of adjusting the MVTs for the next batch, it is an entirely off-line strategy and lacks any real-time feedback mechanism to reject disturbances during the batch evolution. In this case, with-in batch control can be used to detect these disturbances from the process variable © 2016 American Chemical Society
trajectories early in the current batch and compensate for their effects by adjusting the manipulated variables as the batch evolves. There are basically two levels for with-in batch control. The lower level control is the set-point tracking of certain process variable trajectories, which can be achieved by PID controller or model predictive control (MPC),11,12 etc. However, implementing the same reference trajectories does not guarantee that the desired quality will be met unless the variance in the initial conditions for the batch is very little.13 The higher level control utilizes all available measurements to online adjust the MVTs in such a way as to alter the evolution of the remainder of the batch in order to control the final quality for that batch. Since theoretical approaches to the higher level control of product quality are based on the use of the detailed first principle model,14 a large amount of computation effort may not be suitable for agile responsive manufacturing (e.g., injection molding process). Moreover, the development and maintenance of an accurate first principle model are often difficult. Data-driven modeling methods overcome several limitations of the theoretical approaches because they can use the existing information routinely collected and give a chance to readily build and maintain these models. Among data-driven methods, the multiway principal components analysis (MPCA) and multiway partial least-squares (MPLS) proposed by Nomikos and MacGregor15 have been shown to be particularly suitable for Received: Revised: Accepted: Published: 5654
October 16, 2015 April 6, 2016 April 22, 2016 April 22, 2016 DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
Figure 1. Batch-wise unfolding of batch data set for model building.
of the data is not taken into account. When using the data-driven model to perform online prediction, the uncertainty only exists in the future process variable trajectories. Hence, the overlapping information that appears in the process variable data set should be removed during the stage of off-line model building. Motivated by the above considerations, sequential and orthogonalized PLS (SO-PLS)21−23 is used in this work to calibrate the inferential quality prediction model in MCC strategy. The main contributions of this work can be summarized as follows: (1) Although higher prediction performance can be achieved at the batch terminal time, few works have analyzed the model accuracy during the evolution of the batch. Integrating the different matrices to directly calibrate a unified PLS model do not always work well, since the information in these matrices may be partly overlapping and the forecasting error will be propagated to the quality prediction. Thus, instead of using a one-step PLS regression, PLS is used in sequence in this work, and for each step information that is already present in the model is removed from the data matrix. In this case, the reliable information in initial conditions and MVTs is retained in the SO-PLS model as much as possible and the uncertainty in the process variable trajectories is partly removed, which gives us an opportunity to improve the prediction accuracy at a certain decision point. (2) The causal relationship between the future input and output behaviors for the MPCA model based missing data imputation methods has not been of concern before, because unusual operating policy might be a fault signal that was required to be detected as soon as possible. However, the causality should not be neglected in the quality prediction, since better forecasting accuracy of future trajectories can result in better prediction performance. In this work, known data regression (KDR)24 based on augmented input data matrix is used to estimate the future trajectories and the causal relationship of the initial conditions and MVTs in determining the process variable trajectories are also analyzed in this PCA projection based estimator. (3) Most latent variable model based MCC strategies employed the soft constraint to ensure that the datadriven model is used in a valid region. However, the weight is an additional parameter that needs to be tuned by trial and error, and the control performance is highly sensitive to the weight. In this work, the indicators that consider only the degrees of freedom (d.o.f.)25 are employed as hard constraint to confine the data-driven model. And
modeling batch processes. Several research studies have been completed using either MPCA or MPLS.16−18 In those approaches, simple and easily available online measurements with some off-line measurements at a predetermined midcourse point (decision point) or every sampling instant are utilized to predict the final quality. And if the prediction falls out of a statistically defined acceptable region, appropriate remedial action should be taken to correct the current batch. One example of this form of control strategies was proposed by Flores-Cerillo and MacGregor,16 and a PLS model is employed to provide a long-term estimation of the final product quality. The method performs the midcourse correction (MCC) via optimization in the latent variable space when the prediction significantly deviates from the desired value and the MVTs are reconstructed from the optimal solution of the PLS scores. A significant issue while using the latent variable model for final quality prediction and control in the aforementioned methods is that the future online process variable trajectories that are required to predict the final quality are incomplete during the evolution of the batch. More specifically, the data used for model building calls for entire batch trajectory to perform the prediction, while measurements are only available up to the current decision point and the future trajectories are needed to be estimated. Generally, the problem of missing data imputation is solved by using ad hoc approaches or more powerful statistical estimation approaches based MPCA model19 in process monitoring application. However, when the latent variable model is used in quality control strategies, the need to consider the causal relationship between the future input and output behaviors is obvious.13 Hence, different from process monitoring approaches, the causality of the initial conditions and the future candidate manipulated variables in determining the future process variable trajectories should be considered. To eliminate the need for forecasting the future trajectories, Wang and Srinivasan20 built a finite set of PLS models at predetermined decision points and used the data only up to the decision point rather than entire process variable trajectories. Recently, Aumi et al.13 proposed to handle the estimation of future trajectories by multiple local linear autoregression exogenous (ARX) models with an appropriate weighting function, and the causality between the manipulated variables and process variables was accounted for predicting the final quality and computing the future MVTs. Although recognizing that the MVTs determine the process variable trajectories and in turn the final quality, combing the information in the various sources (matrices) to build a single-PLS inferential quality model may not necessarily be the best choice. This is because the information in the different matrices may be partly overlapping, and the serial nature 5655
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
serial nature of the batch data matrices is taken into account in the stage of quality model building. Example 1. Consider the following model
even when the initial conditions and past trajectories lie outside the valid region, a more reliable solution still can be obtained by the proposed strategy. The remainder of this work is organized as follows. A brief review of the features of batch data and the theory of the SO-PLS method is described in Section 2. Section 3 discusses the SO-PLS model for final quality prediction. The proposed MCC strategy is given in Section 4. Application of the proposed strategy to case studies is presented in Section 5, and Section 6 draws some conclusions.
x1̇ = −0.053x1x 2 − ux1/x3
(1)
x 2̇ = −0.053x1x 2 − 0.256x 22 + u(5 − x 2)/x3
(2)
x3̇ = u
(3)
where x1 and x2 are the concentration of species and x3 is the reactor volume. The manipulated variable u is the feed rate of x2, which is divided into 10 intervals wherein it remains constant. The model parameters, operation bounds, and initial conditions can be found in ref 14. Suppose x2 can be measured online, and x1 at the batch termination is taken as the final quality. A total of 60 batches of simulated data were generated to evaluate the prediction performances of the single-PLS model that were used in ref 16. Of the 60 batches of data, 30 batches were utilized to build the data-driven model in which the measurement noises were added, and the remaining 30 batches were used as test data without noises. First, a single-PLS model was calibrated using only initial conditions and batch-wise unfolded manipulated variables. In this case, the future trajectory of the process variable was not required to be estimated. Next, the trajectory of x2 was incorporated to build an augmented single-PLS model, where the input data matrix X for model building includes the initial condition matrix Z, the manipulated variable matrix U, and the process variable matrix V. We found that the predicted root-mean-square-error (RMSE) at batch termination decreased from 0.0032 to 0.0006 as shown in Figure 2.
2. PRELIMINARIES In this section, we first illustrate how the two-dimensional input data matrix is obtained by the batch-wise unfolding approach. And then, the features of the batch data matrices are analyzed. This is followed by an overview of the SO-PLS method, which is a sequential use of the PLS regression in combination with orthogonalization. Finally, the relationship between the input and the output matrices is captured by the SO-PLS model. 2.1. Structure of Batch Data. In most industrial fields, a detailed first principle model of the batch process is not available for product quality control. To understand how to build a datadriven model that can be used during the evolution of the batch to predict the final quality, we consider a generic batch process as shown in Figure 1. A three-way matrix X̲ can be composed of large amount of process data collected from the I batches. There are different approaches for constructing a two-dimensional matrix out of the cube of the data set. Nomikos and McGregor15 suggested the batch-wise unfolding approach as shown in Figure 1 as the most suitable way for modeling differences among batches. After unfolding the three-way matrix X̲ , the batch data can be classified into four types: (1) The matrix Z ∈ RI×L represents the off-line measured initial conditions such as raw material properties before the start of a batch, where L is the number of measured different properties. (2) The matrix U ∈ RI×F represents the MVTs segmented into m intervals, where F = f·m, and f is the number of manipulated variables. Generally, decision points are selected using prior process knowledge among these intervals, and in the limit, control action can be taken at each interval. (3) The matrix V ∈ RI×G represents the online measured process variable trajectories sampled at n sampling instants, where G = g·n, and g is the number of process variables. (4) The matrix Y ∈ RI×M represents the final product quality measured at the end of each batch, where M is the number of quality properties. To identify a data-driven model for quality prediction, the unfolded matrices U and V are concatenated to the initial conditions matrix Z, forming an augmented input matrix X = [Z U V] ∈ RI×N, where N = L + F + G. Then the input matrix X can be regressed onto the output matrix Y using linear regression such as PLS. However, mixing the information from all the input matrices together is not necessarily the best choice, since the information in these matrices may be partly overlapping. Moreover, the forecasting errors in the future trajectories will be directly propagated to the final quality prediction and greatly affect the control performances. Based on the above considerations and recognizing that the process variable trajectories are determined by the initial conditions and MVTs, in this work the
Figure 2. Comparison of prediction performances at every decision point using single-PLS model and augmented single-PLS model for the toy example.
Generally, the matrix Z can be measured before the start of a batch, and the matrix U can be determined by control policy. However, the matrix V can only be taken on during the evolution of a batch. Apparently, the matrix V is influenced directly by matrix [Z U] but also possibly by other phenomena. Thus, the overlapping information that relates to product quality Y arises in matrices [Z U] and V, and the unique information that relates to Y can also exist in matrix V at the same time (as shown in Figure 5656
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
matrix [Z U], which removes the overlapping information in the matrix V. The original Y is again fitted to the orthogonalized Vorth I×A2 ∈ RI×G by PLS regression to obtain the score matrix Torth , 2 ∈R namely, the second PLS model. The parameters A1 and A2 are the number of latent variables retained in the PLS models, respectively. As can be seen, the SO-PLS method is a sequential use of PLS regression in combination with orthogonalization, which only focuses on the additional contribution of a new matrix. Then the full predictive model can be summarized as follows:
3). The important element to improve the prediction performance is a data-driven model that reveals the joint information on
orth ̂ ̂ Ŷ = TQ 1 1 + T2 Q 2
= [ Z U ]Ŵ 1Q̂ 1 + V orthŴ 2 Q̂ 2 Figure 3. Diagram of information that relates to final quality in different input data matrices.
A1 × M
where Q̂ 1 ∈ R and Q̂ 2 ∈ R are the regression coefficient matrices for the score matrices, respectively, and Ŵ 1 ∈ R(L + F ) × A1 and Ŵ 2 ∈ RG × A 2 are the weighting matrices that are needed to compute the PLS score matrices. The detailed algorithm for building the SO-PLS model and estimating new samples is given in Appendix A. Remark 1. The number of latent variables to retain for the SOPLS model is an important issue. Cross-validation can also be used to determine the optimum number, since orthogonalization is a linear prediction process.22 There are two parameters for the number of latent variables to be estimated for matrices [Z U] and V, and the global approach can be used in this work, which is a simultaneously determining method.23 We typically set an upper limit for the number of latent variables in each matrix and then search through all possible combinations. In some cases, different choices will give similar prediction ability, and we will use the more latent variables in the combined matrix [Z U]. Such situation indicates overlapping information in the two input matrices (as shown in Figure 3), which can be included in the first PLS model and can also be included in the second PLS model. In the proposed strategy, it is important to have this reliable information represented in an early stage of the SO-PLS model building. This is because, for online prediction, the unknown parts do not take place in initial conditions and MVTs, and we should extract as much reliable information as possible in the first stage of the SO-PLS model building and try our best to minimize the adverse effects of uncertain information in future process variable trajectories.
all possible factors on the final quality. However, a monolithic model that accounts for each of these factors often fails in practice when the data are ill-conditioned due to the high degree of multiple correlations between variables and the presence of noise. In this case, Flores-Cerrillo and MacGregor16 proposed to use the PLS method to model the relationship between the input and output matrices, so that a latent variable model with low dimension can be achieved. And such a model can also include the information unique for matrix V to improve the prediction accuracy, especially at the batch terminal time. However, it is more interesting to analyze the prediction performances at every interval rather than at the end of the batch, since the final quality prediction is only performed during the evolution of the batch. In Figure 2, it can be seen that, at the first three intervals, the prediction accuracy of the augmented singlePLS model is not better than using only initial conditions and MVTs (RMSE = 0.0026). This is because the forecasting error in the future trajectory of x2 directly propagated to the final quality prediction, and the problem was especially prevalent at the early intervals of the batch when limited data were available. After that, the prediction performances were gradually enhanced and finally converged at 0.0006, which was much better than the single-PLS model. It should be noted that the future trajectory of x2 was estimated by projection to the plane using PLS (PPPLS) as in refs 16 and 19. 2.2. Sequential and Orthogonalized PLS (SO-PLS). The regression situation considered here is the one with an initial condition matrix Z, a manipulated variable matrix U, and a process variable matrix V. Both of the matrices U and V will in most cases have many columns that may be highly collinear. The simplest approach for modeling of this type of data is to use a universal PLS model with some type of weighting for the contributions from the different matrices. As above-mentioned, this may suffer from a number of weaknesses. In this work, SOPLS method is used to describe the relationship between the input and output matrices, which is also a standard linear model: Y = [ Z U ]B1 + VB2 + E
3. MODEL BUILDING In section 3, the SO-PLS model is first calibrated off-line to predict the final quality. However, since the process variable trajectories that are needed to predict the final quality are deficient as the batch evolves, the formulated SO-PLS model cannot be used alone to perform the online quality prediction. To overcome this issue, KDR is used to estimate the future process variables trajectories, and the causal relationship of the initial conditions and MVTs in determining the process variable trajectories is also discussed in this PCA model based estimator. 3.1. Off-Line Model Building. Generally, the proposed MCC strategy uses SO-PLS methodology to build the datadriven model from historical batch data. However, such strategy can be only used in the region where the SO-PLS model is valid due to the hard constraint on the manipulated variables. If permitted, a few complementary identification experiments can be performed to enlarge this region, and the prediction and control performance will be further improved for some special cases. After obtaining the batch data set for model identification,
(4)
where E ∈ R is the residual matrix and B1 ∈ R and B2 ∈ RG×M are the regression coefficient matrices to be identified. Since the unknown parts (future trajectories) only exist in matrix V, the augmented input matrix X is divided into [Z U] and V. And then the first step for the SO-PLS method is to fit the matrix Y to the combined matrix [Z U] by PLS regression, namely, the first PLS model. Next the matrix V is orthogonalized with respect to the score matrix T1 ∈ RI×A1 of the combined I×M
(5)
A2 ×M
(L+F)×M
5657
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research the MVTs are segmented into a large number of intervals, in which the decision points can be selected using prior process knowledge. If we have no prior knowledge about the batch process, every interval can be thought of as a decision point as
Figure 5. Comparison of prediction performances at every decision point using augmented single-PLS model and the proposed SO-PLS model for the toy example. Figure 4. Segmentation of MVTs and process variables.
first two intervals, the prediction accuracy (RMSE = 0.0025) had been already outperformed the single-PLS model using initial conditions and MVTs alone. It means that the SO-PLS model not only enhanced the prediction performance at every interval during the evolution of the batch, but also it can be used at a more early stage of the batch than traditional augmented single-PLS model. The significant issue while using the SO-PLS model for final quality prediction is that the online measurements that form the input vector are incomplete and are required to be estimated. Such estimation inevitably results in additional uncertainty, especially at the early stage of the batch when limited data are available. For the augmented single-PLS model, suppose Δv ∈ RG represents the uncertain information in process variable trajectories, and then the final quality prediction can be calculated as follows:
shown in Figure 4. The MCC strategy usually consists of two stages: (1) At each decision point i (i = 1, 2, ..., n), a final quality prediction using initial conditions, MVTs, and process variable trajectories available up to that time is performed to determine whether or not the quality will fall out of a predetermined acceptable region. (2) Then if needed, the control action is calculated via optimization to obtain the remainder of the MVTs for that batch to yield the desired final quality. The two-stage procedure is repeatedly performed at every decision point i using all available information. The novelty of the proposed strategy in this work is that the SO-PLS method is used to build the data-driven model rather than single-PLS model which is used in most literature. Hence, the prediction accuracy can be improved at each decision point, and the final quality prediction can be described as follows: ⎡z ⎤ ‐ PLS ̂ 1Q̂ )T ⎢ new ⎥ + (Ŵ 2 Q̂ )T v orth ŷ SO ( W = new 1 2 new ⎣ u new ⎦ (6)
ŷ PLS new
⎡ Znew ⎤ ⎢ ⎥ ̂ ̂ ) ⎢ u new ⎥ = (WQ ⎢ v + Δv ⎥ ⎣ new ⎦ T
(7)
where vnew ∈ R is the new available vector of process variable trajectories without uncertainty. Q̂ ∈ RA × M is the regression coefficient matrix for the score matrices, Ŵ ∈ R(L + F + G) × A is the weighting matrix that can be used to compute the PLS score vector of new available input data tnew ∈ RA, and A is the number of latent variables in PLS model. Rearranging eq 7, the following equation can be obtained G
‐ PLS ∈ RM is the predicted final quality, znew ∈ RL and where ŷ SO new unew ∈ RF are the new available vector of initial conditions and G MVTs, and vorth new ∈ R is the new available vector of process variable trajectories orthogonal to the combined score matrix T1. Example 2. Also considering the batch process model in Example 1, the prediction performance of the proposed SO-PLS model was evaluated. The same training data set was used to calibrate the SO-PLS model, and the final quality prediction was performed on the same test data set as well. Note that, to make the comparison fair, the PPPLS was also employed to forecast the future trajectory of x2 in this example. In Figure 5, the prediction performances of different datadriven models were compared, and the prediction accuracy of the proposed SO-PLS model was nearly identical to that of the augmented single-PLS model at batch terminal time. However, since the SO-PLS model is used instead of the single-PLS model, the prediction accuracy at every interval was further improved (the decrease of RMSE at each decision point in Figure 5). At the
ŷ PLS new
⎛⎡ z new ⎤ ⎞ T T ⎜⎢ ⎥ ⎡ 0L + F ⎤⎟ ̂ u ̂ = Q W ⎜⎢ new ⎥ + ⎢ ⎥⎟ ⎜⎢ v ⎥ ⎣ Δv ⎦⎟ ⎣ ⎦ ⎝ new ⎠ T = Q̂ (t new + Δt)
(8)
where ΔtT = [0TL + F Δv T]Ŵ and 0L+F is a vector filled with L + F 0s. Since the regression coefficient matrix Q̂ integrates all the information in [Z U] and V, the uncertain information Δv would influence the whole data-driven model through the latent 5658
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
the SO-PLS model in solving the optimization problem, in which the new MVTs may need to be determined. However, for product quality control, the operation region is always around the nominal conditions and extrapolation is not a critical problem. In this case, we only require limited identification data with relatively small changes to establish the causal relationship between MVTs and final quality, and the disturbances on the system performance would be greatly diminished. Once new data from the optimization stage are obtained, the traditional so-called two-step approach adds these data to the original data set to improve the quality prediction model, and such an approach can also be used in the proposed MCC strategy. However, if the model is updated, there is a conflict between parameter estimation and optimization, since parameter estimation requires persistency of excitation; i.e., the inputs must be sufficiently varied to uncover the unknown parameters, a condition that is usually not satisfied when near-optimal inputs are applied.1 Hence, although the data-driven model is repeatedly refined by new available data, the plant-model mismatch still exists, which cannot be conquered by model-adaption methodology. And we suggest integrating the batch-to-batch control strategy as in refs 26 and 27 to compensate for the model uncertainty. 3.2. Online Final Quality Prediction. The prediction accuracy can be increased by improving either model building or future trajectory estimations or both. At the step of model building, we knew all the complete process variable trajectories for identifying the SO-PLS model and need not to estimate the unknown trajectories. Then the rest of work for online final quality prediction is to estimate the unknown trajectories as accurately as possible. There are numerous efficient estimators for the future trajectories (missing data), e.g., using ad hoc approaches, such as using average trajectories, or using a more powerful statistical estimation approach based on the PCA model.28,29 Aumi et al. also handled future trajectories by multiple local linear ARX models with an appropriate weighting function.13 In this work, we also recognize the causality of the initial conditions and MVTs in determining the process variable trajectories, and the augmented PCA model based estimator is used to describe the causal relationship. It should be noted that since the SO-PLS model is used in this paper to build the quality inferential model, the unified latent variables cannot be obtained to estimate future trajectories of process variables. In this case, another global regression model should be calibrated, and the PLS model can also used here to establish the relationship between process variables and MVTs as in ref 16. However, in such a model we only pay attention to the forecasting ability of future trajectories instead of final quality; thus, the PCA model is more suitable to be used here to extract unified latent variables of augmented input data matrix. For a new batch, assume that k (k ≥ i) sampling instants of process variable trajectories are available at decision point i as shown in Figure 4. Although the implemented MVTs are only available up to i − 1, candidate future MVTs exist in the optimizer that can be used to predict the process variable trajectories up to batch termination. As a result, only the vector of process variable trajectories vnew ∈ RG is incomplete at decision point i. For vector vnew at sampling instant k, we can group all of the unknown elements at the end of the vector. The known elements in the vector vnew are denoted by vnew,k * ∈ Rgk, and the vector v#new,k ∈ RG−gk represents the unknown elements. Thus, we T v#T can use vTnew,k = [v*new,k new,k] to rearrange the unknown elements at the end of the vector. This notation can be applied to other vectors or matrices as well. Hence, for online final quality
variables when it is used to perform online prediction. However, at the stage of SO-PLS model building, the matrix V is orthogonalization with respect to [Z U] that is used for prediction of quality. And V − Vorth has no additional contribution to the prediction, since it is already implicitly incorporated in the first PLS model. Then the second PLS model focuses only on the part of V that contributes in addition to [Z U] for explaining variability in product quality. Thus, for SO-PLS model, using eq A.3, eq 6 can be rewritten as follows ⎡z ⎤ − PLS ̂ TŴ 1T ⎢ new ⎥ + Q̂ TŴ 2T (v orth ŷ SO Q = new + Δv) 1 2 new ⎣ u new ⎦ T T orth = Q̂ 1 t1,new + Q̂ 2 (t orth 2,new + Δt 2 )
(9)
where t1,new ∈ RA1 is the score vector obtained from the first PLS A2 model, torth 2,new ∈ R is the orthogonalized score vector obtained T = Ŵ 2 Δv . Since Q̂ 1 only from the second PLS model, and Δt orth 2 includes the information in [Z U] and the initial conditions znew and MVTs unew can be determined in advance, we can retain as much reliable information as possible in the first PLS model. On the other hand, orthogonalization can be considered a way of filtering out the part of V that is due to [Z U] before interpreting the effect of V. Therefore, Q̂ 2 only integrates part of the information in V in eq 9, and the adverse effects of uncertainty in Δv are confined in the second PLS model. Above all, compared with the augmented single-PLS model, more reliable information can be retained in the SO-PLS model to further improve the online prediction accuracy. Finally, the batch data sets sometimes have very different measurements, a simple way is to multiply the block in the input matrices by weighting factors that make them more influential during the calibration of the data-driven model. Since no more latent variables are required to extract from each of the blocks, we do not need to do any relative weighting of blocks to compensate for different scales used. Hence the SO-PLS model is that they are invariant to the relative weighting of the matrix. Remark 2. At early stages of the batch process as in Figure 5, nearly the whole future trajectories were required to be estimated; thus, the forecasting errors of future trajectories dominated the final quality prediction error, and the prediction performance was not greatly improved by the proposed SO-PLS model. Moreover, Figure 5 can also guide us to select the decision points, when the prior process knowledge is absent. Some intervals with apparent descent of predicted RMSE can be chosen as decision points, since at these intervals, the prediction performances can be significantly improved. For example, the decision point 6 in Figure 5 is more suitable to be selected as the final decision point, since the prediction accuracy at this point greatly improved from the previous ones, and after that point the predicted RMSE decreased relatively slowly. Remark 3. The data used for off-line model building should consist of representative operating data in order to capture the information on most of the disturbances and operation policies. Thus, the data in which some changes in the MVTs are performed at each decision point i are required to establish the causal nature between these changes and the final qualities. Since the information on SO-PLS models comes mainly from the historical data set, they often can have good interpolative capabilities. In this work, the SO-PLS model does not predict the product quality outside the range spanned by the data collected for model building. Obviously, such limitation restricts the use of 5659
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research prediction at decision point i, there is a regressor xnew,i ∈ RN composed of the following vectors T #T x Tnew, i = [ z Tnew u Tnew v *new, k v new, k ]
to estimate the score vector, the forecasting error is always inevitable. Since we do not know the exact future process variables trajectories in advance, we cannot estimate the forecasting error, either. Under this circumstance, the forecasting errors will be directly propagated to the final quality prediction and the MCC strategy with single-PLS model could not solve this issue effectively. However, the first stage for the SO-PLS model building identifies the important information on [Z U] and leaves most of the noise as residuals. Thus, the projection of V is done on to the most reliable part of [Z U]. The rest of V used for regression will be filtered and leave the noise primarily in the residuals. Under this circumstance, only the information unique for the particular set of measurements is included, and the prediction accuracy at each decision point might be enhanced. Since the KDR method is used to estimate the future trajectories in this work, it is desirable to analyze the causal relationship of the forecasting model embedded in the augmented PCA projection. Denote that the column vector υk contains the g process variables measured at sampling instant k, and thus we can rewrite the known elements at decision point i in terms of υk as follows:
(10)
We also denote that = ∈R is the G−gk known elements of the vector = v#T new,k ∈ R represents the unknown elements of this vector. For augmented batch process data X, a PCA model can be built to obtain the loading matrix P ∈ RN×N. The matrix PB ∈ RN×B, which contains the first B loadings of P, can also be divided into two parts: P*B,i ∈ # R (F+L+gk)×B corresponding to x new,i * and P B,i ∈ R (G−gk)×B corresponding to xnew,i * . Hence, the augmented PCA model can be partitioned as follows: T x*new,i
T v*new,k ] xnew,i, and x#T new,i
[zTnew
L+F+gk
uTnew
⎡ x * ⎤ ⎡ P* τ + d*⎤ i ⎢ new, i ⎥ = ⎢ B , i B , i ⎥ ⎢ # ⎥ ⎢ # #⎥ ⎣ x new, i ⎦ ⎣ PB , iτB , i + d i ⎦
(11)
where τB,i ∈ R is the first B elements of the score vector and di* ∈ RF+L+gk and d#i ∈RG−gk are the forecasting errors corresponding to x*new,i and x#new,i, respectively. A lot of methods have been presented to estimate the score vector τB̂ , i from the known elements when a vector contains miss data, and then the # predicted unknown elements x̂ new, i can be calculated by B
# # x̂ new, i = PB , i τB̂ , i
T T T T T T x *new, i = [ z new u new υ1 , υ2 , ⋯ , υk ]
Thus, the j-step-ahead prediction υ̂k + j corresponds to the g # # elements of the predicted future trajectories v̂ new, k (i.e., x̂ new, i ) from (j − 1)g + 1 to jg, which can also be expressed as a function of the loadings P#B,i taking only from the (j − 1)g + 1th to the jgth rows:
(12)
Arteaga and Ferrer state that KDR is statistically superior to the other methods,24 which is equivalent to conditional mean ́ replacement (CMR). And the latter is also advocated by GaraciaMuñoz et al. in ref 19. Hence, KDR is used in this work to estimate the score vector τB̂ , i , which yields T τB̂ , i = Φ̂ i x *new, i
# υk̂ + j = x̂ new, i[(j − 1)g + 1: jg ]
= P#B , i[(j − 1)g + 1: jg ]τB̂ , i
(13)
and Φ̂ i = (P*i SP*i T)−1P*B , k SB
(16)
Substituting eq 13 into eq 16, the following expression can be obtained.
(14)
T υk̂ + j = P#B , i[(j − 1)g + 1: jg ]Φ̂ i x *new, i
where P*i ∈ R(F + L + gk) × N is the loading matrix P corresponding to xnew,i * , S ∈ RN×N is a diagonal matrix of eigenvalues in decreasing order, and SB ∈ RB×B contains only first B eigenvalues of the diagonal matrix S. At any decision point, the data used for prediction calls for the entire batch trajectory. However, the measurements are only available up to the current decision point, and the future trajectories are required to be estimated. Although KDR is used
υk̂ + j
(15)
(17)
T The term P#B , i[(j − 1)g + 1: jg ]Φ̂ i in eq 17 is a matrix of g rows and F + L + gk columns. Assume φjh,l (h = 1, 2, ..., g, l = 1, 2, ..., F + L + gk) T Φ̂ . Hence, we can reis the element of matrix P# B , i[(j − 1)g + 1: jg ]
i
express each element of this matrix and give rise to a multivariate time series prediction as follows:
⎡ z new ⎤ ⎢u ⎥ ⎡⎧ φ j ⋯ φ j ⎫ ⎧ φ j ⋯ φ1,j L + F ⎫ ⎧ φ1,j L + F + 1 ⋯ φ1,j L + F + gk ⎫⎤⎥ ⎢ new ⎥ + L L 1,1 1, 1, 1 ⎢⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎪ ⎪ ⎪⎥ ⎢ υ1 ⎥ ⎪⎪ ⎪ ⎬ ⎢ ⋱ ⋮ ⎬⎨ ⋮ ⋱ ⋮ = ⎢⎨ ⋮ ⋱ ⋮ ⎬ ⎨ ⋮ ⎥ ⎢⎪ ⎪⎪ j ⎪⎥ ⎢ υ2 ⎥ ⎪⎪ j j j j j ⎢⎪ φ ⋯ φ ⎪ ⎪ φ ⎪⎥ ⋯ φg , L + F ⎪ g ,L ⎭ ⎩ g ,L+1 ⎭⎪ ⎩ φg , L + F + 1 ⋯ φg , L + F + gk ⎭⎦ ⎢ ⋮ ⎥ ⎣⎩ g ,1 ⎢ ⎥ ⎣ υk ⎦ φj φj φ j , φ2, ⋯ ,φ j u
z
=
φzj z new
+
φuj u new
+
φ1j υ1
+
1
φ2j υ2
+⋯+
Because znew and unew are included to forecast the j-step-ahead prediction, the causality of the initial conditions and MVTs in determining the future process variable trajectories are embedded in this augmented PCA forecasting model.
φkj υk
j
k
(18)
Although linear models are used to obtain the time-varying nonlinear dynamic of batch processes, since the process variables are generally gathered at high sample rate, mean centering the data automatically removes the average trajectories of all high5660
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
The T2 statistic on the d.o.f. of the first PLS model can be computed by
dimensional online measurements and hence removes the main nonlinearities relating to the absolute level of the measurements.15 Moreover, the order of the forecasting model expands as new measurements are available, and the parameters of this model will adapt as the batch evolves. As a result, the forecasting model is actually an adaptive nonlinear prediction and provides a more powerful prediction than any fixed multivariate time series model.19
̌ iΛ−1t new, ̌ i Tdof, i 2 = t Tnew,
(19)
where Λ ∈ RA1×A1 is the covariance matrix of the scores matrix T1 ̌ i ∈ RA1, which corresponding to the first PLS model, and t new, considers the d.o.f. at decision point i, is defined as follows: ⎡ 0L + f (i − 1)⎤ ⎥ ̌ i = Ŵ 1T ⎢ t new, ⎢ u# ⎥ ⎣ new, i ⎦
4. CONTROL METHODOLOGY Since the SO-PLS model used in final quality control is an approximation of the practical process, the model should be used in a valid region. Using the model outside this region can deteriorate the prediction result and consequently reduce the quality control performance. Fortunately, the Hotelling’s T2 and SPE statistics can be calculated as indicators of validity for the SO-PLS model. Hotelling’s T2 measures the Mahalanobis distance to the origin in the latent variable space, and the SPE accounts for the residual variability. Such indicators determine how tightly the solution is to be constrained to the region defined by the historical data set. In this section, the indicators that consider only the d.o.f. were first discussed to ensure the validity. And then the optimization problem was formulated to alter the remainder of the MVTs in order to steer the final quality back to the desired region. 4.1. Ensuring the Validity of the SO-PLS Model. The common approach to ensure the validity of the latent variable model is by weighting Hotelling’s T2 statistic in the cost function to avoid using the model for extrapolation.16 However, the weight is an additional parameter that needs to be tuned, and the control performances are highly sensitive to this parameter. Small weight allows using the model in extrapolation, whereas large weight changes the results of the future MVTs. There is no tuning method to determine this parameter in any situation. Yacoub and MacGregor30 used hard constraints in the adjustment to the scores; nonetheless, this method was shown to be too restrictive to give the optimization algorithm the freedom to search the optimum. Recently, Lauri ́ et al.31 proposed to use the indicators that consider only the d.o.f. of controller to ensure the validity, i.e., the indictors (Hotelling’s T2 and SPE statistic) that are calculated only by future MVTs. In this work, the Hotelling’s T2 and SPE statistic on the d.o.f. are also considered as indicators to constraint the SO-PLS model to be used in a valid region. At decision point i, the implemented MVTs u*new,i ∈ Rf(i−1) are available up to i − 1, and the future MVTs are u#new,i ∈ Rf(i−1), which will be determined via the procedure of optimization. We *T u#T can also use uTnew,i = [unew,i new,i] to rearrange the unknown elements at the end of the vector, and F − f (i − 1) is the d.o.f. to be decided by the proposed MCC strategy at the decision point i. Constraining validity in terms of the d.o.f. can be used in situations in which the initial conditions and past trajectories of manipulated variables and process variables may lie outside the valid region. In this case, the indicators may provide a value that is above the threshold regardless of what future MVTs are calculated. Consequently, there may be no solution that provides the values of indicators below the threshold, and the evolution of the final quality control strategy will not proceed. Since the initial conditions znew and u*new,i cannot be changed and the process variable trajectories vnew are determined by znew and unew, we consider only the actual d.o.f. of the future MVTs u#new,i in the first PLS model.
(20)
where 0L+f(i−1) is a vector filled with L + f(i − 1) 0s. Similar to the T2 statistic, the SPE statistic on the d.o.f. of the first PLS model can be obtained by
SPEdof, i = ě Tnew, iěnew, i
(21)
and enew, ̌ i ∈ RF considering only the d.o.f. can be yielded by ěnew, i
⎡ 0L + f (i − 1)⎤ T ⎢ ⎥ ̂ = (IF − PA1W 1 ) ⎢ u# ⎥ ⎣ new, i ⎦
(22)
where IF ∈ R is the identity matrix and PA1 ∈ R is the loading matrix corresponding to the first PLS model. The zero vectors in eqs 20 and 22 make znew and u*new,i giving no contribution to the indictors (Hotelling’s T2 and SPE statistic). 4.2. Quality Control Strategy. In this section, the future trajectories estimating approach is used in conjunction with the SO-PLS model in a MCC strategy design. The SO-PLS model captures the effect of the entire batch trajectories on the final quality, and the augmented PCA model takes the causality and nonlinear relationship among the initial conditions, MVTs, and process variable trajectories. During the batch evolution, online trajectories are available up to the decision point i, and a decision is made on whether or not a correction is deemed necessary. If the predicted quality that follows the current operating policy falls out of the acceptable region, a correction is needed to drive the final quality back to this region. At decision point i, the following optimization problem is solved to obtain the future MVTs with the objective of achieving a desired final quality. F×F
F×A1
#T # min( ynew, ̂ i − ysp)T Q(ynew, ̂ i − ysp) + Δu new, iRΔu new, i # u new, i
s.t.
⎡ z new ⎤ T orth ynew, ̂ i = (Ŵ 1Q̂ 1)T ⎢ u ⎥ + (Ŵ 2 Q̂ 2) v new, k ⎣ new, i ⎦
(23)
(24)
T
# # # ̂ * v̂ new, k = x̂ new, i = PB , i Φ i x new, i
(25)
⎡ T 2 ⎤ ⎡θ 2⎤ ⎢ dof, i ⎥ ≤ ⎢ T ⎥ ⎢⎣ SPE ⎥⎦ ⎣⎢ θSPE ⎥⎦ dof, i
(26)
# # # u min , i ≤ u new, i ≤ u max , i
(27)
Δu#new,i
u#new,i
where ysp ∈ R is the desired final quality and = − present F−f(i−1) upresent represents the present operation new,i , where unew,i ∈ R policy kept unchanged until the end of the batch. Q ∈ RM×M and R ∈ R[F−f(i−1)]×F−f(i−1)] are the relative weighting matrices for setpoint tracking and control penalty, θT2 and θSPE are the threshold of the Hotelling’s T2 and SPE statistic on the d.o.f., and u#min,i ∈ RF−f(i−1) and u#max,i ∈ RF−f(i−1) are the lower and upper bounds of M
5661
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research the future MVTs at decision point i, respectively. In fact, the proposed MCC strategy is a higher level for the within batch control methodology. Once the future MVTs have been obtained at the current decision point i, they are implemented only up to the next control decision point i + 1. At the next decision point the remaining MVTs are reoptimized and the whole procedure is repeated in a receding horizon manner. The nice characteristic of the proposed MCC strategy is that, as the batch evolves, the model accuracy improves, which is the result of shortening the prediction horizon. And the arising effect of increasing model accuracy yields a stabilizing effect on the final quality. This improved model available toward the batch termination benefits the final MVTs that are implemented, which in turn affect the final quality. This can be considered by recognizing that there could be multiple MVTs from the initial conditions to the final desired quality, and the initial manipulated variable moves influence which of the remainder of the MVTs is taken. As long as these MVTs do not lead to the final quality becoming unreachable, the final MVTs eventually determine the proximity to the desired target. In this optimization problem, the objective function in eq 23 is composed of a term for minimizing the difference between the desired quality and the predicted quality and moves the suppression factor. Choosing Q = IM and R = 0F−f(i−1) gives the minimum variance controller. Equation 24 is used to obtain the predicted final quality based on the candidate future MVTs # unew,i , and the future trajectories of process variables are estimated in eq 25, which is required in eq 24 to make the final quality prediction. Equation 26 is employed to constrain the optimization solution in the region where the SO-PLS model is valid, and the 95% confidence limits of the Hotelling’s T2 and SPE statistic on the d.o.f. are calculated to determine the θT2 and θSPE in this work.15 Finally, eq 27 is the hard constraint on manipulated variables. As above-mentioned, eq 26 can be applied by weighting them in the cost function eq 23, i.e.,
Figure 6. Detailed diagram of the solving procedure for the proposed SO-PLS model based MCC strategy.
#T #T min( ynew, ̂ i − ysp)T Q(ynew, ̂ i − ysp)Δu new, iRΔu new, i # u new, i
+ λT 2Tdof, i 2 + λSPESPEdof, i
characteristics, the proposed strategy may not be able to cope with this problem. Under this condition, nonlinear models are more appropriate than linear ones to capture the behavior of the batch processes. Xiong and Zhang proposed to utilize neural networks to model highly nonlinear batch processes, and the nonlinear data-driven models are also employed to carry out online reoptimization control33 and batch-to-batch iterative optimal control.34 Hence, in this work, we assume the proposed strategy is used for linear batch processes or nonlinear batch processes that work in a small region around the nominal operation point.
(28)
where λT2 and λSPE are the weighting factor determining how tightly the solution is to be constrained to the region defined by the identification data set. The main advantage of weighting the indicators is that it is easier to solve as the optimization problem is formulated as a quadratic program (QP). However, the drawbacks in the weighting strategy are λT2 and λSPE should be tuned by trial and error in a specific situation, and the validity of predictions cannot be ascertained in general; the shape of the cost function is modified which can lead to a biased end-point result, and more time for implementation is needed provided the weights are tuned by trial and error. The main drawback of hard constraints is the indicators are quadratic in u#new,i, where the optimization problem eqs 23−27 is solved directly. Thus, the optimization problem yields a quadratically constrained quadratic program (QCQP), which is more computationally demanding than a QP. In this work, the CVX toolbox32 is used to solve this problem. Finally, the detailed diagram is shown in Figure 6 to make the proposed strategy more clear. Remark 4. Application of SO-PLS model based MCC strategy is well performed in linear processes because of its linearity assumption. For some complicated cases in industrial chemical and environmental processes, which especially have nonlinear
5. CASE STUDY In this section, the prediction and control performances of the method proposed here were evaluated over simulation study. To make the comparison fair, all the data were classic mean centered and scaled to unit variance before modeling. All the algorithms were implemented in Matlab R2012b, and CVX toolbox is used to solve the QCQP problem. 5.1. Case 1: Simulated Batch Process. A simulated batch process is utilized to test the proposed MCC strategy for the problem of the batch product quality control. Chin et al.35 presented a nonlinear model of a semibatch reactor. This process model was originally proposed by Von Watzdorf et al.36 as a case 5662
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
to maintain such final product quality around their desired value (QC = 42 mol) in the face of variations in process variables. In our simulation study, the process model in Appendix B is used to generate the data for training the SO-PLS model. Since the reactant B starts to be fed at t1 = 30 min, the rest of the trajectory of the feed rate of B is segmented into seven intervals wherein it remains constant. In order to develop the training data for the SO-PLS model, a pseudo-random binary signal (PRBS) with a magnitude of ±0.2 L/min is appended to excite the process dynamics and generate realistic batch-to-batch variations, and the sampling time used for the PRBS signal is 10 min. The 5% white noise is added to the measurement of final quality. A total of 90 batches of simulated process operational data are generated. Of the 90 batches of data, 60 batches are used to calibrate the SO-PLS model and the remaining 30 batches are used as test data. The solid curves in Figure 7 are the trajectories of state variables and manipulated variable under nominal operation conditions, and the dashed curves are the trajectories for one of these identification batches. In Figure 8, the predicted RMSE at each of 7 decision points for the 30 validation batches is displayed. For the augmented single-PLS model, 14 latent variables are selected by 10-fold cross-validation,37 and the predicted RMSE gradually decreases from 2.3613 to 0.5111. For the proposed SO-PLS model, 6 and 11 latent variables are retained in the first and second, respectively. Cross-validation is also used to determine the optimum numbers in this work, and the most latent variables in the combined [Z U] are considered. Similar prediction results can be found as in Example 2, and the prediction performance at every decision point is greatly improved by the proposed SO-PLS model. The only difference is that since the first decision point is chosen at t1 = 30 min instead of at the start of the batch, the predicted RMSE decreases to 2.2070 at that time. According to Figure 8, three decision points are selected at t1 = 30 min, t2 = 50 min, and t3 = 70 min, and the predicted RMSE declines obviously at these intervals. Next we consider further evaluating the performances of the different forecasting strategies for the future trajectories at certain decision points. The predicted trajectories, made using the available data up to the first decision point and second the decision point for reactor temperature in one of the test data sets, are shown in Figures 9 and 10, respectively. In this example, at
study for deterministic and stochastic simulation of batch process. The model for the process is presented in Appendix B. The consecutive homogeneous liquid phase reaction of A and B first led to the desired product C that reacts further to byproduct D. The feed components and the products are stored in a set of tanks. Due to the strongly exothermic nature of the reactions, a pure batch operation cannot be used as it leads to a risk of reaction runaway.35 Therefore, A is charged initially, and the heat-up is followed until B starts to be fed at t1 = 30 min. The reaction starts at this point and continues until the batch terminal time tf = 100 min. The feed rate of B (FB) is selected as the only manipulated variable, while the concentration of product C (CC) is assumed to be the only controlled variable that is measured after the batch is completed. The control objective for each batch is to make a specified number of moles of product C. Although the concentration of product C is only available after the finish of the batch, other measurements such as the reactor temperature (T) and the jacket temperature (Tj) can be obtained for monitoring and control purposes. These online measurements are collected every 1 min. In order to maintain the constant temperature of reaction, a PI controller is used in this reactor. The initial material concentrations (CAI and CBI), temperatures (TA and TB), set-point of reactor temperature (Tsp), and feed flow rate of B (FB) for a normal case are nominal values. Table 1 Table 1. Nominal Operating Conditions and Variation Ranges of Initial Conditions and Manipulated Variables symbol CAI CBI TA TB FB Tsp
description initial concentration of A (mol/L) initial concentration of B (mol/L) temperature of A (K) temperature of B (K) flow rate of B (L/min) set-point of reactor temperature (K)
lower bound
nominal value
upper bound
0.95
1.00
1.05
0.85
0.90
0.95
293.15 303.15 0.50 298.15
298.15 308.15 1.00 308.15
303.15 313.15 1.50 318.15
lists the nominal values of these variables and their possible variation ranges. Since final quality V(tf) CC(tf) is mainly affected by reactant concentration and within-batch process conditions, they could vary significantly. Therefore, the control objective is
Figure 7. Trajectories of state variables and manipulated variable used for data generation with the nominal trajectories. 5663
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
Figure 10. Forecasting the future process variable trajectory at the 2nd decision point (50 min) using the proposed KDR strategy, ARX time series model, and classic KDR strategy.
Figure 8. Comparison of prediction performances at every decision point using augmented single-PLS model and the proposed SO-PLS model for the case study.
variable trajectories. In order to make the superiority of KDR more convincing, the future prediction mean square errors (FPMSEs)19 of different methods were compared in Table 2, Table 2. Comparison of Forecasting Performances with Different Methods 1st decision point 2nd decision point
SCP
TSR
KDR
ARX
1.4596 1.2768
0.7567 0.6052
0.6879 0.4517
1.8237 1.6927
where SCP is the single-component projection method and TSR is the regression on trimmed scores.24 And higher forecasting accuracy was achieved by the KDR method. It should be noted that estimating performance of the ARX time series model can be further enhanced by multiple model strategies as in ref 13. Given the good prediction performances of the proposed SOPLS model, it can be used for MCC based strategy of the batch product quality control. In the first scenario, a variation was first introduced to the concentration of reactant A, leading to a more concentrated raw material CAI = 1.05 mol/L instead of the nominal case. Then the flow rate of reactant B increases to FB = 1.1 L/min instead of 1.0 L/min in the nominal case from the first decision point that simulated human error in the flow rate setting. If the manipulated variables continue as this set-point, it will result in a bad product quality with QC = 44.04 mol. Thus, the corrective action can only be taken from the second decision point with Q = 1 and R = 05 in this work. For augmented single-PLS model based MCC, after adjusting the set-point of the flow rate of reactant B at the second decision point, the batch can be recovered, and the final quality resulted in QC = 43.90 mol following the trajectory obtained at this decision point. After that, the remainder of the MVT is recalculated at the third decision point, and the final quality was improved with QC = 43.01 mol. Next the proposed SO-PLS model based MCC was used, and at the second decision point, the final quality can be decreased to QC = 43.58 mol, which can be further improved with QC = 42.66 mol. Both of them are much better than that of the augmented single-PLS model based MCC, and this is because the prediction accuracy was enhanced by the SO-PLS model at each decision point and in turn the control performance. Finally, the MVTs calculated at different decision points are compared in
Figure 9. Forecasting the future process variable trajectory at the 1st decision point (30 min) using the proposed KDR strategy, ARX time series model, and classic KDR strategy.
the first decision point (the second decision), the future trajectory of the reactor temperature is determined by initial concentrations of A and B, temperatures of A and B, and the future trajectory of the feed rate of B. These forecasting trajectories were obtained by using the proposed KDR strategy, ARX time series model, and classic KDR strategy that used only the process variables to build the PCA model. As judged from this example, the variation trends of the forecasting trajectories in Figures 9 and 10 obtained by the two estimators (proposed KDR strategy and ARX time series model) all coincided with the actual values. This is because the initial conditions and future MVTs are both considered in the PCA model and ARX model, respectively, and the causal relationship is embedded in the two estimators. Hence, the future temperature after the first decision point (the second decision) can be effectively forecasted, and the forecasting ability is further enhanced by the proposed KDR strategy. However, the variation trend of the forecasting trajectory obtained by classic KDR strategy did not follow the actual values, since the PCA model considered only process 5664
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
extend to the most extreme data points. From Figure 12, we found that the median of the control results for the proposed SOPLS model based MCC was 41.95 mol, which was closer to the desired value than that of the single-PLS model based MCC (41.91 mol). Moreover, the variations of final quality after implementing the correction were reduced by the proposed strategy, and more stable final quality can be obtained in this scenario. Finally, the control performance of the SO-PLS model based MCC that uses indicators considering full MVTs (SO-PLS model (b) in Figure 12) was also illustrated. Although the median of the control results (41.92 mol) was better than that of the augmented single-PLS model due to the high prediction accuracy of the SO-PLS model, it did not outperform SO-PLS (a). This is because the indictors that consider full MVTs sometimes would give an over conservative solution. 5.2. Case 2: Cobalt Oxalate Synthesis Process. To further analyze and evaluate the SO-PLS model based real-time final quality control strategy, a more complex model of a real plant (cobalt oxalate synthesis process) is considered in this work. The synthesis process has been utilized in refs 9 and 38 to test the batch-to-batch optimization methods, which is a liquid phase reaction of cobalt chloride and ammonium oxalate, and further leads to the desired cobalt oxalate crystals. Cobalt oxalate and ammonia chloride are produced:
Figure 11, and the trajectory was refined at the last decision point and resulted in better final quality.
Figure 11. Comparison of MVTs in the first scenario.
In order to further assess the control performance of the proposed strategy in the presence of variation in raw material purity, we consider another 30 new initial conditions that are not in the training or test data sets. If the batches are allowed to progress following the nominal profile, the final product qualities for most of these batches will significantly deviate from the desired value. In this case, assume at the three decision points that the new control actions are calculated by the proposed strategy using eqs 23−27 and maintain unchanged until the next decision point. Figure 12 first displayed the control performances of the augmented single-PLS model and SO-PLS model that use
CoCl 2 + (NH4)2 C2O4 → C2O4 → CoC2O4 ↓ + 2NH4Cl (29)
The process consists of a dissolver and a crystallizer, and the evolution of cobalt oxalate crystal is carried out in the crystallizer operated with continuous stirring. The industrial crystallizer and the cobalt oxalate crystals are shown in Figure 13. A heating
Figure 13. Industrial crystallizer and the cobalt oxalate crystal.
jacket is mounted to maintain the temperature of reaction. Due to the complex nature of the crystallization process, a fed-batch mode is applied. A fixed volume of cobalt chloride is first charged to the crystallizer, and ammonium oxalate is fed during the batch. The moment model of the cobalt oxalate synthesis process is given by the following set of ordinary differential equations:
Figure 12. Control performances for 30 batches evaluating augmented single-PLS model and SO-PLS model based MCC strategies in the second scenario.
indicators only considering the d.o.f. of MVTs (single-PLS model and SO-PLS model (a) in Figure 12), and each one corresponds to a different MCC strategy. Each strategy was evaluated for the 30 new batches and is shown as a box plot. The median of the 30 batches was represented as a red line. The edges of the box were the 25th and 75th percentiles, and the whiskers
dV = FB dt
dμ0 dt 5665
=B−
(30)
FBμ0 V
(31) DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research dμi dt
= iGμi − 1 −
FBμi V
i = 1, 2, 3
FBC BIVAI FC dC = − 3ρc kvGμ2 − B dt V (VAI + FBt )2
this circumstance, the new control profiles are calculated by the proposed strategy and are maintained until the next decision point. Figure 14a illustrated the boxplot that was obtained by
(32)
(33)
F ΔHr dT UA =− (T − Tj) + B (TB − T ) − dt MCp V ρCp FBC BIVAI 2
(VAI + FBt )
−
ΔHc ·3ρ k vGμ2 ρCp c
(34)
where V is the suspension volume, μi is the ith moment of the crystal size distribution, C is the solution concentration expressed in mass of crystal, and T is the temperature in the crystallizer. The feed rate of ammonium oxalate FB is the only manipulated variable. VAI is the initial volume of cobalt chloride, and CBI is the initial concentration of ammonium oxalate. ρc is the density of crystal, and kv is the volumetric shape factor. Tj and TB are the temperatures in the jacket and dissolver, respectively. U is the overall heat-transfer coefficient, A is the total heat-transfer surface area, M is the mass of solvent in the crystallizer, Cp is the heat capacity of the solution, and ρ is the density of the solution. ΔHr and ΔHc are the heat of reaction and crystallization, respectively. The calculation of crystal nucleation rate B and growth rate G are given in Table 3.
Figure 14. Control performances for 30 batches evaluating augmented single-PLS model and SO-PLS model based MCC strategies in the cobalt oxalate synthesis process.
using the augmented single-PLS model and SO-PLS model based MCC strategies. Similar to the control performances in the last case study, the median of the proposed SO-PLS model based MCC was closer to the desired value than that of the single-PLS model based MCC, and the variations of final mean crystal size were apparently reduced by the proposed strategy. To further evaluate the ability of the proposed strategy to handle uneven-length data, a total of 30 batches, whose terminal times range from 660 s to 720 s, are simulated to build the quality model. The ith input vector can be illustrated by [zTi uTi vTi ], where i = 1, 2, ..., 30. Since the initial conditions are all measured before the start of a batch, the matrix Z ∈ R30×5 can be first formed. However, the proposed strategy cannot be directly used in this case, due to the uneven-length characteristic of MVTs and process variable trajectories. Under this circumstance, the online measured data should be equalized. Work has been carried out into various methods for equalizing batch lengths.39 Recently, Corbett and Mhaskar also used subspace identification methods and data-driven model to cope with the batch data with varying durations.40 In this case study, we assume that all the timing differences between trajectories appear at the last stage of the batch process, and the SO-PLS model can be derived by cutting the batches to a minimum length41 as shown in Figure 15. Then the ith equalized MVT and process variable trajectories can be
Table 3. Expressions of the Parameters in the Cobalt Oxalate Synthesis Process Modela
a
parameter
value
nucleation rate B growth rate G solubility CS (T in K)
B = 6.31 × 1031 exp(−1.3621 × 104/T)N2.13ΔC2.775 G = 2.80 × 1014 exp(−1.584 × 104/T)ΔC CS = 0.0001(T − 273)2 + 0.001(T − 273) + 0.1
ΔC = C − CS is the supersaturation.
In this work, the objective for each batch is to make the final mean crystal size with a specific value, which can be calculated as follows
Ln =
μ1(tf ) μ0 tf
(35)
where tf is the batch terminal time. Although the mean crystal size is only available after the finish of the batch, other measurements such as temperatures and feed rate can be collected every 5 s online for monitoring and control purposes. And the initial concentrations can also be measured before the start of a batch. First, 30 even length batch data with terminal time tf = 660 s are generated to calibrate SO-PLS model. The feed rate is segmented into 11 intervals wherein it remains constant, and two decision points at 120 s and 360 s are selected based on the prior process knowledge.38 To satisfy the linearity assumption, the PRBS with a magnitude of ±0.5 L/s is appended to the feed rate of ammonium oxalate to excite process dynamics and generate realistic batch-to-batch variations. The 0.5% white noise is also added to the measurements, and different initial conditions are randomly selected. To illustrate of efficacy of the proposed strategy, we considered another 30 new initial conditions that are not in the training data set. If these batches follow the nominal operation policy, the mean crystal sizes for most of these batches will significantly deviate from the desired value (1.50 μm). Under
Figure 15. Align the uneven length to be a shortest length. 5666
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
Step 2. Make V orthogonal to the score matrix T1, and this is best viewed as estimating the process variables from the initial conditions and MVTs and subtracting the residuals:
illustrated by ǔ i and vǐ , and the corresponding matrices can also b e f o r m e d a s Ǔ = [u1̌ , ǔ 2 , ..., ǔ30]T ∈ R30 × 11 a n d V̌ = [v1̌ , v2̌ , ..., v30 ̌ ]T ∈ R30 × 264 , respectively. Then the SOPLS model can be calibrated by using matrices [ZǓ ] and v̌, and the predicted final mean crystal sizes ŷ ∈ R30 are ŷ = [ZǓ ]b1̂ + V̌
orth
b2̂
orth B̂ 1 = (T1TT1)−1T1TV
(A.1)
̂ orth V orth = V − TB 1 1
(36)
(A.2) orth
Step 3. Perform PLS with V against Y, estimate the new score matrix Torth and weighting matrix Ŵ 2 . Estimate also the 2 regression coefficient matrix Q̂ 2 . Step 4. The full predictive model can be obtained in eq 5. The step for predicting the new sample based on the SO-PLS model can be summarized as follows: Step 1. Center and scale the new initial conditions, MVTs, and process variables as for the training data set. Step 2. Find the new vnew orthogonal to the score matrix T1:
where b1̂ ∈ R16 and b2̂ ∈ R264 are the regression coefficient orth ∈ R30 × 264 is vectors identified by SO-PLS algorithm, and V̌ ̌ the orthogonalized matrix corresponding to V . In Figure 14b, the control performances were draw for unevenlength training data set. We found that the control results were also improved by the proposed MCC strategy, and the median of the mean crystal size decreased from 1.52 to 1.51 μm. This is because, even if the batch data used for model calibrating is with uneven length, the prediction accuracy can still be enhanced by the proposed SO-PLS model. It should be noted that a strong assumption is made by the equalizing method that is employed in this work (cutting the batches to a minimum length), and future work will address the situation where this assumption is relaxed (necessitating a more stochastic approach based on the SO-PLS model to quality prediction and control with more general uneven-length batch data).
̂ orth Tt1,new v orth new = vnew − B1
(A.3)
T where t1,new = [z Tnewu Tnew]Ŵ 1 is the score vector obtained from the first PLS model. Step 3. Simply add the predicted final quality from each of the ‐ PLS znew, unew, and vnew to obtain the predicted quality ŷ SO in eq 6. new
■
APPENDIX B. PROCESS MODEL FOR PRODUCT QUALITY CONTROL CASE STUDY The process under consideration in this paper is a jacketed batch reactor where an exothermic series-parallel first-order reaction takes place (Chin et al.35)
6. CONCLUSIONS A SO-PLS model based MCC for the problem of the batch product quality control has been proposed. The proposed strategy used the SO-PLS model instead of an augmented singlePLS model to drive the batch to a desired final quality by batch termination. Since only the information unique for the particular set of measurements is included to calibrate the data-driven model, SO-PLS methodology can improve the prediction accuracy at a certain decision point and in turn the control performances. Based on the SO-PLS model, KDR is used to estimate the future trajectories of the process variables that are required to perform online final quality prediction. An augmented input data matrix is utilized to build the PCA model for KDR, and the causal relationship of the initial conditions and MVTs in determining the process variable trajectories is also embedded in this estimator. Taking the advantages of latent variable methods, the indicators that consider only the d.o.f. are used as hard constraints to confine the SO-PLS model used in a valid region. Finally, the proposed strategy was demonstrated via simulation of a virtue batch process and cobalt oxalate synthesis process, and it significantly improved on the final quality variance over that obtained by using the single-PLS model.
A+B→C
(B.1)
B+C→D
(B.2)
The following equations describe the batch reactor: V ΔH1 d(VT ) UA (T − Tj) + FBTB − k10e−E1/ RT CA =− dt ρCp ρCp CB −
V ΔH2 k 20e−E2 / RT C BCC ρCp
(B.3)
dV = FB dt
(B.4)
d(VCA ) = −Vk10e−E1/ RT CAC B dt
(B.5)
d(VC B) = C BFFB − Vk10e−E1/ RT CAC B − Vk 20e−E2 / RT C BCC dt
■
(B.6)
APPENDIX A. THE ALGORITHM FOR BUILDING THE SO-PLS MODEL AND ESTIMATION OF NEW SAMPLES Assume that the training data sets are classic mean centered and scaled to unit variance and the detailed step for SO-PLS model building can be illustrated as follows: Step 1. Estimate the PLS estimates for the combined matrix [Z U] and quality matrix Y, and obtain the score matrix T1 and weighting matrix Ŵ 1. Estimate also the regression coefficient matrix Q̂ 1.
d(VCC) = Vk10e−E1/ RT CAC B − Vk 20e−E2 / RT C BCC dt
(B.7)
where ⎧0 t ≤ 30 min FB = ⎨ ⎩Q B(t ) t > 30 min ⎪
⎪
(B.8)
and T(0) = TI, V(0) = VI, and CA(0) = CAI. The parameters of the above model are given in Table 4. 5667
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research
(13) Aumi, S.; Corbett, B.; Clarke-Pringle, T.; Mhaskar, P. Data-driven model predictive quality control of batch processes. AIChE J. 2013, 59, 2852−2861. (14) Srinivasan, B.; Palanki, S.; Bonvin, D. Dynamic optimization of batch process I. Characterization of the nominal solution. Comput. Chem. Eng. 2003, 27, 1−26. (15) Nomikos, P.; MacGregor, J. F. Multiway partial least squares in monitoring batch processes. Chemom. Intell. Lab. Syst. 1995, 30, 97− 108. (16) Flores-Cerrillo, J.; MacGregor, J. F. Control of batch product quality by trajectory manipulation using latent variable models. J. Process Control 2004, 14, 539−553. (17) Wan, J.; Marjanovic, O.; Lennox, B. Disturbance rejection for the control of batch end-product quality using latent variable models. J. Process Control 2012, 22, 643−652. (18) Jia, R.; Mao, Z.; Wang, F.; He, D. Self-tuning final product quality control of batch processes using kernel latent variable model. Chem. Eng. Res. Des. 2015, 94, 119−130. (19) Garacía-Muñoz, S.; Kourti, T.; MacGregor, J. F. Model predictive monitoring for batch processes. Ind. Eng. Chem. Res. 2004, 43, 5929− 5941. (20) Wang, D.; Srinivasan, R. Multi-model based real-time final product quality control strategy for batch processes. Comput. Chem. Eng. 2009, 33, 992−1003. (21) Næs, T.; Tomic, O.; Mevik, B. H.; Martens, H. Path modelling by sequential PLS regression. J. Chemom. 2011, 25, 28−40. (22) Jørgensen, K.; Mevik, B. H.; Næs, T. Combining designed experiments with several blocks of spectroscopic data. Chemom. Intell. Lab. Syst. 2007, 88, 154−166. (23) Næs, T.; Tomic, O.; Afseth, N. K.; Segtnan, V.; Måge, I. Multiblock regression based on combinations of orthogonalisation, PLSregression and canonical correlation analysis. Chemom. Intell. Lab. Syst. 2013, 124, 32−42. (24) Arteaga, F.; Ferrer, A. Dealing with missing data in MSPC: several methods, different interpretations, some examples. J. Chemom. 2002, 16, 408−418. (25) Laurí, D.; Lennox, B.; Camacho, J. Model predictive control for batch processes: Ensuring validity of predictions. J. Process Control 2014, 24, 239−249. (26) Chen, J.; Lin, K. C. Integrated batch-to-batch control and withinbatch online control for batch processes using two-step MPLS-based model structures. Ind. Eng. Chem. Res. 2008, 47, 8693−8703. (27) Hermanto, M. W.; Braatz, R. D.; Chiu, M. Integrated batch-tobatch and nonlinear model predictive control for polymorphic transformation in pharmaceutical crystallization. AIChE J. 2011, 57, 1008−1019. (28) Zhang, Y. W.; Qin, S. J. Fault detection of nonlinear processes using multiway kernel independent analysis. Ind. Eng. Chem. Res. 2007, 46, 7780−7787. (29) Zhang, Y. W.; Qin, S. J. Improved nonlinear fault detection technique and statistical analysis. AIChE J. 2008, 54, 3207−3220. (30) Yacoub, F.; MacGregor, J. F. Product optimization and control in the latent variable space of nonlinear PLS models. Chemom. Intell. Lab. Syst. 2004, 70, 63−74. (31) Laurí, D.; Sanchis, J.; Martínez, M.; Hilario, A. Latent variable based model predictive control: Ensuring validity of predictions. J. Process Control 2013, 23, 12−22. (32) Grant, M.; Boyd, S. CVX: Matlab software for disciplined convex programming. Available http://cvxr.com/cvx/. (33) Xiong, Z.; Zhang, J. Neural network model-based on-line reoptimization control of fed-batch processes using a modified iterative dynamic programming algorithm. Chem. Eng. Process. 2005, 44, 477− 484. (34) Xiong, Z.; Zhang, J. A batch-to-batch iterative optimal control strategy based on recurrent neural network models. J. Process Control 2005, 15, 11−21. (35) Chin, I. S.; Lee, K. S.; Lee, J. H. A technique for integrated quality control profile control and constraint handling for batch processes. Ind. Eng. Chem. Res. 2000, 39, 693−705.
Table 4. Constant Parameter in the Simulated Batch Model
■
parameter
value
TI VI UA/ρCP VΔH1/ρCP VΔH2/ρCP k10 k20 E1/R E2/R
298.15 (K) 50 (L) 0.375 (L/min) −28.50 (K·L/mol) −20.50 (K·L/mol) 5.0969 × 1016 (L/(mol·min)) 2.2391 × 1017 (L/(mol·min)) 12035 (K) 13450 (K)
AUTHOR INFORMATION
Corresponding Author
*Tel.: +86 24 83689525. Fax: +86 24 83687434. E-mail:
[email protected] (R.D. Jia). Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS The authors would like to acknowledge the anonymous reviewers for their helpful comments. This work was supported by the National Natural Science Foundation of China (Nos. 61203103, 61374147, 61533007, and 61503384) and the Fundamental Research Funds for the Central Universities (Nos. N140404019 and N150402001).
■
REFERENCES
(1) Bonvin, D.; Srinivasan, B.; Hunkeler, D. Control and optimization of batch process. IEEE Contr. Syst. Mag. 2006, 26, 34−45. (2) Flores-Cerrillo, J.; MacGregor, J. F. Within-batch and batch-tobatch inferential-adaptive control of semibatch reactors: A partial least squares approach. Ind. Eng. Chem. Res. 2003, 42, 3334−3345. (3) Srinivasan, B.; Bonvin, D.; Visser, E.; Palanki, S. Dynamic optimization of batch process II. Role of measurements in handling uncertainty. Comput. Chem. Eng. 2003, 27, 27−44. (4) Doyle, F. J., III; Harrison, C. A.; Crowley, T. J. Hybrid model-based approach to batch-to-batch control of particle size distribution in emulsion polymerization. Comput. Chem. Eng. 2003, 27, 1153−1163. (5) Paengjuntuek, W.; Kittisupakorn, P.; Arpornwichanop, A. Batchto-batch optimization of batch crystallization processes. Chin. J. Chem. Eng. 2008, 16, 26−29. (6) Flores-Cerrillo, J.; MacGregor, J. F. Iterative learning control for final batch product quality using partial least squares models. Ind. Eng. Chem. Res. 2005, 44, 9146−3345. (7) Zhang, Y.; Fan, Y.; Zhang, P. Combining kernel partial leastsquares modeling and iterative learning control for the batch-to-batch optimization of constrained nonlinear processes. Ind. Eng. Chem. Res. 2010, 49, 7470−7477. (8) Sanzida, N.; Nagy, Z. K. Iterative learning control for the systematic design of supersaturating controller batch cooling crystallization processes. Comput. Chem. Eng. 2013, 59, 111−121. (9) Jia, R.; Mao, Z.; Wang, F.; He, D. Batch-to-batch optimization of cobalt oxalate synthesis process using modifier-adaptation strategy with latent variable model. Chemom. Intell. Lab. Syst. 2015, 140, 73−85. (10) Wang, Y.; Gao, F.; Doyle, F. J., III Survey on iterative learning control, repetitive control, and run-to-run control. J. Process Control 2009, 19, 1589−1600. (11) Golshan, M.; MacGregor, J. F.; Bruwer, M. J.; Mhaskar, P. Latent variable model predictive control (LV-MPC) for trajectory tracking in batch processes. J. Process Control 2010, 20, 538−550. (12) Aumi, S.; Mhaskar, P. Integrating data-based modeling and nonlinear control tools for batch process control. AIChE J. 2012, 58, 2105−2119. 5668
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669
Article
Industrial & Engineering Chemistry Research (36) Von Watzdorf, R.; Naef, U. G.; Barton, P. I.; Pantelides, C. C. Deterministic and stochastic simulation of batch/semicontinuous processes. Comput. Chem. Eng. 1994, 18, S343−S347. (37) Xu, Q. S.; Liang, Y. Z. Monte Carlo cross validation. Chemom. Intell. Lab. Syst. 2001, 56, 1−11. (38) Zhang, S. N.; Wang, F. L.; He, D. K.; Jia, R. D. Batch-to-batch control of particle size distribution in cobalt oxalate synthesis process based on hybrid model. Powder Technol. 2012, 224, 253−259. (39) Lu, N.; Gao, F.; Yang, Y.; Wang, F. PCA-based modeling and online monitoring strategy for uneven-length batch processes. Ind. Eng. Chem. Res. 2004, 43, 3343−3352. (40) Corbett, B.; Mhaskar, P. Subspace identification for data-driven modeling and quality control of batch processes. AIChE J. 2016, 62, 1581−1601. (41) Kourti, T. Multivariate dynamic data modeling for analysis and statistical process control of batch processes, start-ups and grade transitions. J. Chemom. 2003, 17, 93−109.
5669
DOI: 10.1021/acs.iecr.5b03863 Ind. Eng. Chem. Res. 2016, 55, 5654−5669