Data Driven Modeling for Monitoring and Control of ... - ACS Publications

Dec 9, 2013 - A key basis for advances in batch process control is the availability of a model for batch operation. The basis for such a model can be ...
0 downloads 5 Views 3MB Size
Article pubs.acs.org/IECR

Data Driven Modeling for Monitoring and Control of Industrial Fed-Batch Cultivations Dennis Bonné,†,‡ María Antonieta Alvarez,†,§ and Sten Bay Jorgensen*,† †

CAPEC, Department of Chemical and Biochemical Engineering, Technical University of Denmark, Søltofts Plads, 2800 Kongens Lyngby, Denmark ABSTRACT: A systematic methodology for development of a set of discrete-time sequence models for batch control based on historical and online operating data is presented and investigated experimentally. The modeling is based on the two independent characteristic time dimensions of batch processing, being time within the batch and the batch number. The model set is parsimoniously parametrized as a set of local, interdependent models which are estimated from data for as few as half a dozen batches. On the basis of state space models transformed from the acquired input−output model set, the asymptotic convergence of iterative learning control is combined with the closed-loop performance of model predictive control to form an optimal controller aiming to ensure reliable and reproducible operation of the batch process. This learning model predictive controller may also be used for optimizing control through optimization of the bioreactor operations model. The modeling and preliminary control performance is demonstrated on an industrial fed-batch protein cultivation production process. The presented methods lend themselves directly for application as Process Analytical Technologies.

1. INTRODUCTION Semibatch and batch processes are periodic processes which are used thanks to their logistic flexibility and modest capital requirement when managing many different product grades. Other considerations, such as the ease of handling substrate inhibition in fermentations and of traceability in pharmaceutical industry, also favor application of periodic operating forms. These processes have the common trait of a repeated operation which starts from nearly the same initial conditions. In this paper batch processes also will be used as a common denomination. For periodic processes the time within the period or batch t and the period or batch number k are the two characteristic independent time variables. The abovementioned operational benefits of batch processing are however in open loop operation counteracted by a large susceptibility to disturbances. The encountered disturbances occur from batch to batch, i.e., interbatch disturbances such as changes in raw material properties or variations in startup initialization, and as disturbances occurring during the individual batch execution, i.e., intrabatch disturbances. As a consequence of these disturbances significant variations in product quality and quantity are often observed between batches in industrial practice. Compensating for these disturbances have been difficult in the past due to the nonlinear and time-varying behavior of batch processing and to the fact that reliable on- or in-line sensors for monitoring final product quality rarely are available. Consequently there is a significant interest in developing systematic methods both for improving the reproducibility of batch processing and for facilitating optimization of the batch operations trajectory and tracking thereof by using suitable forms of closed loop control. Thus, the batch control problem becomes that of rejecting both intra- and interbatch variations while following timevarying reference trajectories through imposing suitable actions on the manipulated variables. The key characteristic of the batch control problem is the time-varying characteristics of batch © XXXX American Chemical Society

processing which together with the multivariable nature of many control problems in the chemical and biochemical batch process industries apparently did preclude theoretical progress within batch control for many years. However, the development of fundamental understanding of multivariable control and the increase in computational power have enabled development of feasible solutions1 which subsequently has been further developed.2 Thereby significant benefits have been made potentially available to reduce the effects of uncertainties embedded in batch operation. With the environmentally dictated increased focus on biologically based raw materials, an increased focus on batch processing may be expected. There is therefore a need for modeling of such processes in order to investigate whether the significant benefits achieved by advanced control of continuously operating processes also may be within reach on batch processes. A key basis for advances in batch process control is the availability of a model for batch operation. The basis for such a model can be a first principles model, a data driven model, or a hybrid between the two types. For many processes the first principles model type may be readily obtained and identified; however, in several cases it may be difficult to obtain first principles models. In particular for bioreactors reliable first principles models are still not available partly due to lack of understanding of the complex regulatory networks in a microorganism and to the many interacting phenomena in bioreactors. The development of advanced projection methods have revealed that multivariable batch processes may in fact be modeled from data3 as also shown for batch bioreactors.4 Special Issue: John Congalidis Memorial Received: December 3, 2012 Revised: December 5, 2013 Accepted: December 9, 2013

A

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

To simplify notation, define the input uk, output yk, one sample shifted output y0k , and disturbance vk profiles as

These advances indicated that data driven time series modeling of these processes should be feasible. That aspect is reported in this work exemplified on an industrial fermentation process. Other methods such as neural network models5 and support vector regression models6 have been investigated and applied often for the purpose of introducing a midcourse correction control signal. However, during, e.g., bioreactor operation, such a correction may easily occur too late. Instead it is important to be able to control it carefully, i.e., as soon as possible during the batch execution. For this reason development and application of a specific type of data driven models that approximate the behavior of batch chemical processes is presented in this paper. The modeling methodology was preliminarily presented in refs 7 and 8 and in more detail by Bonné.9 This paper presents methods for development of a model set for batch operation which can ensure reliable reproducible operation and which may enable optimizing operation. The contribution comprises data driven time series modeling of batch processes and a learning model predictive control methodology. The modeling methodology produces both a linear time-invariant state space model representation for interbatch prediction and a linear time-varying state space model representation for intrabatch prediction. The modeling approximates the nonstationary and nonlinear behavior of batch processes with a set of local but interdependent linear regression models parametrized as autoregressive moving average models with exogenous inputs (ARMAX). Tikhonov Regularization is applied to estimate the parameters of this model set. Learning model predictive control is presented for control of repeated operation of stochastic linear time-varying systems with finite time horizons. The methods have been implemented as a MATLAB toolbox (the toolbox is available for research purposes) called Grid of Linear Models (GoLM).9 The purpose of this paper is also to present an experimental evaluation of this data driven modeling methodology for monitoring and multivariable control of batch processes. The data driven modeling, identification, and control methodologies are presented in Section 2 with emphasis on the model identification. The industrial bioreactor case study is introduced in Section 3. A data driven model set intended for control is developed using data from a few open loop operated batches with perturbations of the variables which are intended to be used as control actuators. The implementation of the learning model predictive control is shown together with results from the first industrial test. The developed methodology is discussed, and conclusions are drawn in Section 4.

uk = [uk′ ,0uk′ ,1...uk′ , N − 1]′

yk = [yk′,1yk′,2 ...yk′, N ]′

yk0 = [yk′,0 yk′,1...yk′, N − 1]′

vk = [vk′ ,1vk′ ,2...vk′ , N ]′

(1)

Batch processes are modeled as a set of N LTI models for each output, where each low order model describes the causal behavior of one variable from one measurement point to the next. Such a set can also be referred to as one LTV batch model. The LTI models can be parametrized in a number of ways, but in the present contribution an ARMAX parametrization was chosen. This choice of parametrization offers a simple multivariable system description with a moderate number of model parameters. The objective of the model set is to quantify the causal correlations from the inputs uk and disturbances vk to the outputs yk. The GoLM toolbox models the differences between two successive batches with the ARMAX model, as detailed in the appendix: Δyk = yk − yk − 1 = −AΔyk0 + BΔuk − Cvk

Δy0k

(2)

where Δuk = uk − uk−1 and = − The matrix dimensions and the subsequent model transformations are given in the Appendix, where also disturbance modeling is discussed in more detail. The batch ARMAX model 2 may be converted into different representations dependent on the particular application task. 2.1.1. Interbatch Representations. If the task at hand is to predict (or simulate) the behavior of an entire batch before it is started, the following form is convenient y0k

y0k−1.

Δyk = H Δyk ,0 − GΔuk + Fvk

(3)

where Δyk,0 = yk,0 − yk−1,0. The change in the initial conditions Δyk,0 can be considered as either an input/control variable or a disturbance variable. In the latter case a model for the disturbance is to be included as detailed in the Appendix. The form 3 above is also convenient for the task of classification (e.g., normal or not) of a batch after it has been completed. Furthermore, this form can be used to determine open-loop optimal recipes in the sense of optimizing an objective for the batch. If the objective is to minimize the deviations ek = [e′k,1, e′k,2, ..., e′k,N]′, ek,t ∈ ny, from a desired batch operations model y,̅ then eq 3 can be modified into ek = y ̅ − yk = ek − 1 − H Δyk ,0 + GΔuk − Fvk

(4)

In the present paper the batch operations model y ̅ is assumed constant. However if optimization of the batch operations model is attempted then this may vary between batches. The two representations 3 and 4 of the batch ARMAX model above are applicable to off-line or interbatch type applications. 2.1.2. Intrabatch Representations. For online estimation, monitoring, feedback control, and optimization it is convenient to use a state space realization of the batch ARMAX model. In an observer canonical form, which is structurally a minimal realization, the state space realization is given as

2. METHODS The methods behind the GoLM toolbox are briefly presented below. The data driven modeling and the model types are described in Section 2.1. The model identification is described with emphasis on the treatment of the online data and on model assessment for a given modeling purpose in Section 2.2. The control design with state and output estimation is described in Section 2.3. 2.1. Datadriven Modeling. The available data includes data from a set of NB batches. Within each batch the data are sampled N times during the nominal batch length. The raw data needs a pretreatment prior to modeling, as described under data preparation in Section 2.2. The data sequences from batch k constitutes the process outputs yk,i ∈ ny, inputs uk,i−1 ∈ nu, and disturbances vk,i ∈ ny, for i = 1, ..., t, at times t = 1, ..., N.

xk , t = ( t xk , t − 1 + )t Δuk , t − 1 + ,t vk , t Δyk , t = yk , t − yk − 1, t = *xk , t

(5)

with Δuk,t−1 = uk,t−1 − uk−1,t−1 and the initial condition xk ,0 = *′Δyk ,0 . Just as in eq 3, the state space model form 5 is B

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

measurement noise, and (3) reconstruct missing data sequences. In the present work the following three step data preparation procedure is proposed: 1. Data outliers and complete batches that are inconsistent with normal operation are removed either by visual inspection of the data or after a classification analysis. If data from several identical vessels is used, then incompatible sensor calibrations and/or scalings are corrected if possible. 2. The data is resampled with a Kernel smoother without reconstructing missing data sequences. In the same operation, most of the high frequency measurement noise is removed. Care should be taken, however, to avoid erroneously removing system dynamics when removing measurement noise. Any remaining measurement noise is taken care of in the model estimation procedure described below. 3. Missing data sequences are reconstructed by resampling the difference profile between a resampled batch with missing data sequences and the previous or subsequent resampled batch with no missing data sequences, with a Kernel smoother. Missing data sequences are assumed to originate from sequences of outliers and/or temporary failures in the data acquisition system. After Kernel smoothing, the noise free observation zk̅ = [z′k̅ ,0, z′k̅ ,1, ..., z′k̅ ,N]′, with zk̅ ,t ∈ (ny+nu), is zk̅ = z ̂k + ωk, where z ̂k is the estimated observation and ωk = [ω′k,0, ω′k,1, ..., ω′k,N]′, ωk,t ∈ (ny+nu), is a sequence of estimation errors. The estimation error ωk will consist of both random and systematic errors such as the height of a characteristic peek being underestimated due to excessive smoothing due to too low bandwidth and/or trimming the hills and f illing the valleys due to too low local regression order. The estimation error ωk can thus be modeled with a random walk with respect to the batch index k:

convenient for prediction, monitoring, and optimization type applications, and it facilitates online implementation of such applications. Furthermore, the state space model form 5 is particularly well suited for closed-loop or feedback control applications. For tracking control applications the state space model form 5 can be modified to xk , t = ( t xk , t − 1 + )t Δuk , t − 1 + ,t vk , t ek , t = yt ̅ − yk , t = ek − 1, t − *xk , t

(6)

2.2. Model Identification. With the basic batch ARMAX model 2 derived above, the parametrization of the batch model is in place; however, the model orders and the model parameters still need to be determined from process data. A major drawback of the proposed parametrization is the immense dimensionality of the resulting set of models; this important issue is specifically dealt with in the following. The first step in any modeling work should clearly be the definition of the modeling objective: What problems are to be solved, and are they to be solved through monitoring and/or control and/or optimization of the batch operation trajectory? How is the most value added with the least effort? etc. Once the modeling objective is well-defined, the appropriate inputs and outputs can be chosen as basis functions (e.g., exp(z1) or z1z22 of variables z1 and z2) of either existing process variables or new process variables introduced as new actuators or sensors. If the modeling objective is monitoring, control, and/or optimization of product quality variables only measured sparsely off-line, then these can, once resampled according to the desired sampling intervals, be included as online measurements. If all the required inputs and outputs are chosen among the existing process variables, then historical process data can be used for the model identification. Otherwise, the necessary new actuators and/or sensors are installed on the process and a few subsequent batch runs perhaps with designed input perturbations when these variables are planned to be used as control actuators. The obtained process data usually has outliers, missing data, and varying length of the different batches. Therefore, pretreatment of the raw data is most desirable, e.g., as described next. 2.2.1. Data Preparation. Having selected the inputs and outputs and obtained the raw data, these should be inspected and prepared before model estimation is attempted. This preparation is achieved by filtering and resampling in accordance with the modeling objective, e.g., monitoring, control, and/or optimization of product quality variables. It is noteworthy to point out that the sample intervals can be placed in time, where they do the most good in terms of approximating the process behavior. If the modeling purpose is monitoring, then it may be advantageous to (re)sample more frequently around shifts between constraints during the batch processing. If the purpose is control then obtaining reliable information around specific quality related events during the batch may imply a desire for more frequent sampling where these are expected to occur. Thus, according to the batch dynamics, the data is resampled sufficiently densely to comply with the modeling objective. The nominal batch length of N samples is selected as the length of the shortest representative batch. If the batch length differs from the nominal length then the data is shortened (or augmented) to the nominal batch length. Once outliers have been removed from the data, a Kernel smoother is used since it provides an efficient statistical method for approximating the response of the individual variables. Kernel smoothing10 can be used to (1) resample the data, (2) remove most of the high frequency

ωk = ωk − 1 + vk

(7)

where vk = [v′k,0, v′k,1, ..., v′k,N]′, vk,t ∈  , represents a sequence of batchwise nonpersistent estimation errors that are assumed to be zero-mean. In the above paragraph the number of inputs and outputs has been assumed constant throughout the batch ny = ny(t), nu = nu(t) ∀t. This assumption is used in the remainder of this paper unless otherwise indicated. The expected difference between two successive batches is then given as the difference between their respective estimates. E{Δzk̅ } = E{zk̅ − zk̅ −1} = Δz ̂k. Furthermore, as the disturbance profile ν̃k,l = wk − wk−l is zero-mean for any l, the batch ARMAX model 2 can describe the difference between any two batches. Thus the difference profiles between any two batches can be used for identification. 2.2.2. Parameter Estimation. Several suggestions to how (sets of) LTI or LTV models should be identified from data can be found in the literature.11−13 In a modeling framework similar to the one employed here, Dorsey and Lee14 proposed to estimate both a set of finite impulse response (FIR) models and a LTI SS model using principle component analysis (PCA) and N4SID,13 but such a set of nonparametric FIR models is obviously nonparsimonious and would exhibit poor predictive capabilities.12 Others15 have suggested estimating a set of independent, overlapping local LTI SS models using canonical variant analysis (CVA). Instead the present contribution proposes to estimate a set of interdependent grid-point LTI (ny+nu)

C

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

models. Enforcing model properties, however, inevitably introduces bias into the model parameter estimates. There will thus be a trade-off between the bias and variance of the model parameter estimates, and this trade-off will determine the predictive capabilities of estimated models. One coefficient shrinkage based parameter estimation method that can incorporate model properties into wLS estimates is Tikhonov Regularization (TR).

ARMAX models using a novel interpretation of Tikhonov Regularization (TR)10 as described below. The batch ARMAX model 2 can be formulated as a pseudolinear regression model Δyk = Δxkθ + vk

(8) p

where Δxk = Δxk(Δy0k , Δuk, vk) ∈ ny ,nθ is a structured regressor matrix with past outputs, inputs, and disturbances and θ = θ(A, B, C) ∈ nθ is a column parameter vector with the model parameters from the batch ARMAX model. Taking the expectation of the linear regression model 8 gives Δyk̂ = E{Δyk } = Δxk̂ θ

̂ (V̂ , W , Λ) = arg min[J TR ] θTR wLS θ

s.t .

with Δx̂k = Δûk, v̂k) and where v̂k is an estimate of the disturbance or one-step-ahead (OSA) prediction error profile. This means that, if the process variable estimation error model 7 is a valid approximation, then estimation of model parameters from the pretreated data will give unbiased model parameter estimates. A multivariate weighted least squares (wLS) estimate of the model parameters in the linear regression model 8 is found by solving the following minimization problem min[JwLS ] θ

N Best

JwLS =

∑ (||Δyk̂

2 − Δxk̂ θ||W )

k=1 N Best

=

∑ ((Δyk̂

− Δxk̂ θ)′W (Δyk̂ − Δxk̂ θ))

k=1 N Best

=

N

∑ (∑ ((Δyk̂ , t k=1

− Δxk̂ , t θt )′Wt(Δyk̂ , t − Δxk̂ , t θt )))

t=1

(10)

where ||·|| denotes the 2-norm, W is a block diagonal weighting matrix with symmetrical, positive definite, block elements Wt ∈ ny(t),ny(t), and Nest B is the number of batches for model parameter estimation. The linear estimation problem 10 is in general structurally underdetermined, i.e., npy < nθ, and will thus in general have infinitely many solutions. The minimum norm (or variance) and unique estimator θ̂ is however found by solving 10 with principle component regression (PCR) (also referred to as truncated singular value decomposition (TSVD)) or ridge regression (RR) see, e.g., refs 10 and 16−18. To reduce the variance of model parameter estimates without introducing bias, as much data of sufficiently high quality as possible should be used for the model parameter val estimation. Typically, the available data set of NB = Nest B + NB est + Ntest batches is split up into sets of N batches for model B B parameter estimation, Nval B batches for model validation, and Ntest B batches for model testing. The linear system 9 is thus augmented as Y = [Δy1′̂ Δy2′̂ ...ΔyN̂ ′ est ]′ = [Δx1̂ ′Δx2̂ ′...Δx ̂N′ Best]′θ = Xθ B

2 = ||Y − Xθ||W + ||Lθ||2Λ2

(12)

where the overall estimation error V̂ = [v̂1′ , v̂2′ , ..., v̂NestB ′]′, W is defined in connection with eq 10, and the structured penalty matrix L maps the parameter vector θ into the desired parameter differences while the diagonal weighting matrix Λ weights the parameter differences, such that the penalty ΛLθ is a column vector of weighted differences between parameters in neighboring grid-point models. The penalty matrix L consists of five submatrices Lm: L = [L1′ L2′ L3′ L4′ L5′ ]′ which are individually weighted by block diagonal weighting matrices Λm: Λ = diag([Λ1 Λ2 Λ3 Λ4 Λ5]). To reduce the complexity associated with design of the penalty matrix L and the weighting matrix Λ, each of the submatrices Lm for m = 1, ..., 5 is penalizing violations of specific model (parameter) properties: L1θ Approximates the first order time derivative of the parameters θ. It thus incorporates local model interdependency by penalizing the model parameters time evolution. L2θ Approximates the second order time derivative of the parameters θ. It thus incorporates local model interdependency by penalizing nonsmoothness of the model parameters time evolution. L3θ Approximates the first order time derivative of the impulse response of the local models θt. It thus enforces dampened impulse responses onto the local model parameter estimates by penalizing the time evolution of the impulse responses. L4θ Approximates the second order time derivative of the impulse response of the local models θt. It thus enforces smooth impulse responses onto the local model parameter estimates by penalizing nonsmoothness of the impulse responses. L5θ Penalizes the variance of the parameter estimates θ̂ and thus enforces minimum variance estimatesL5 is simply an identity matrix. The estimated parameter vector θ̂TR(V̂ , W, Λ) is a function of the weighting matrix Λ that determines the coefficient shrinkage and hence the trade-off between bias and variance. This means that the regularization matrix Λ can be used to tune the predictive capabilities of the model estimate to the specific modeling purpose. Through the particular choice of penalty matrix L, the regularization matrix Λ also determines the interdependency between the grid-point models in the model grid. Note that if L′Λ2L is designed to be positive definite, then a unique solution to the estimation problem 12 is guaranteed. In general, if the Hessian matrix / = X ′WX + L′Λ2 L is positive definite, then the estimation problem 12 has a unique solution. Note, when used alone, the penalty matrix L5 reduces the TR problem 12 to a standard form TR problem or a RR

(9)

Δxk(Δŷ0k ,

s.t .

TR JwLS

(11)

The linear system 11 will, however, most likely still be rankdeficient, and solving it in a wLS sense would still produce model parameter estimates with excessive variance. One approach to reduce the variance of model parameter estimates is to enforce desired model properties. One such model property could be that neighboring grid-point models are analogous in the sense that they exhibit similar behavior. In fact, without this property, the model would constitute a set of independent models and not an entangled grid of interdependent D

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

generalization error . is supplemented (see below) with the fitting error - from the model parameter estimation

problem. Furthermore, with the proper choice of weighting matrix Λ52 (which for unbounded uncertainties will have to be negative definite8,19), this penalty formulation also gives the solutions to total least squares (TLS) problems,16−20 i.e., to estimation problems with uncertain inputs. In fact, it is for this reason that LS is applied instead of TLS in the modeling methodology presented here, although the inputs (X) are indeed uncertain. 2.2.3. Implementation Issues. By reformulating the TR 2 TR objective function JTR wLS in eq 12 into JwLS = ||Y̅ − Xθ̅|| , it is obvious that if the Hessian matrix / is positive definite, then the optimal model parameter estimate θ̂TR, eq 12, is the unique solution to the linear normal equations ̂ (V̂ , W , Λ)) = 0 X̅ ′(Y ̅ − X̅ θTR

-(θ )̂ =

NBval

∑ || Δyk̂

(13)

k=1

∑ || Δyk̂

2

− Δyk̂ (θ )̂ ||

k=1

(15)

- ρ(θ )̂ = (1 − ρ)- ps(θ )̂ + ρ - osa(θ )̂ .ρ(θ )̂ = (1 − ρ). ps(θ )̂ + ρ . osa(θ )̂

(16)

where superscripts indicate the type of model predictions (i.e., Δŷps or Δŷosa). The modeling purpose ρ should be chosen such that a model suitable for the intended model application is selected based on the model assessment. In practice, values in the interval 0.382 < ρ < 0 0.618, seems to work well for general purpose models. 2.3. Control Design. In order to use 6 for tracking control, it is necessary to estimate the states and outputs based on the noisy output observations. This estimation problem is presented before the learning model predictive control algorithm. The output profile and state estimation is formulated as a combination of a Kernel Smoother and a Kalman filter. Smooth estimates of output profiles are obtained with a Kernel Smoother between batch operations while state estimates are obtained with a Kalman filter during batch operation. This combination is suggested in part to allow for multiple sampling rates and in part to reduce the covariance of the estimates. If the output profiles were not estimated separately as smoothened estimates with a Kernel smoother but instead

2

− Δyk̂ (θ )̂ ||

N Best

i.e., the mean batch prediction error when the model θ̂ is fitted to the estimation data set. Thus far the nature of the model predictions Δŷ has not been discussed. The most general model prediction for model assessment is the τ-step-ahead prediction Δŷk,t+τ given the input profiles of batch k and the outputs up to and including time t in batch k. The number of steps ahead τ should be chosen such that a model suitable for the particular application is selected based on the model assessment. Most often, OSA prediction (i.e., τ = 1) is the method of choice. Another computationally tractable choice of prediction horizon is pure-simulation (PS), which means that the output profiles are predicted given the initial conditions, the input profiles, and for the estimation data ̂ set, the estimated OSA prediction error profiles (v̂k): Δŷps k (θ) = Ĥ (θ̂)Δŷk,0 − Ĝ (θ̂)Δûk + F̂(θ̂)v̂k. PS prediction represents the most demanding application of a batch process model. Thus, if a models quality is assessed to be good in terms of the estimated . ps (generalization error based on PS model predictions), the model will have significant credibility in any application. In general, OSA model prediction error based model assessment and selection has a tendency to undersmooth, while PS model prediction error based model assessment and selection has a tendency to oversmooth. However, as OSA and PS are the limit points for τ-step-ahead prediction, a compromise between these two errors provides a computationally tractable alternative to traditional τ-step-ahead prediction error based model assessment and selection. The compromise is selected by the modeling purpose ρ which is scalar between 0 and 1, where 0 favors PS while 1 favors OSA. This alternative to τ-step-ahead prediction error based model assessment uses the following weighted fitting and generalization errors

The geometric interpretation of the normal equations is that LS estimates are obtained when the OSA prediction errors (Y̅ − X̅ θ̂TR(V̂ , W, Λ)) are orthogonal to the linear subspace spanned by the regressors (X̅ ). The solution θ̂TR(V̂ , W, Λ) to the estimation problem 12 is obviously dependent on the estimated OSA prediction error profiles V̂ , which until now have been assumed known. These OSA prediction error profiles are, however, neither measurable nor known. This means that the OSA prediction error profiles have to be estimated from data simultaneously with the model parameters and consequently that the estimation problem is a pseudo-linear estimation problem. Here a numerically efficient explicit solution to the linear model parameter estimation problem 12 is used to solve the pseudo-linear estimation problem by solving a sequence of linear LS problems. Thereby the model parameters in a pseudolinear regression (PLR) model are estimated with a method which belongs to a class of PLR algorithms,21 that gives fast, but less accurate, approximations of Prediction Error Method (PEM)22 estimates. Note, however, that the algorithm does not require monitoring as the regularization guarantees bounded parameter estimates and hence bounded OSA prediction error profile estimates. 2.2.4. Model Assessment and Selection. Models are in general assessed in terms of their quality to provide a basis for model selection and to determine their validity. The quality of a model is most often evaluated by how well the model generalizes to an independent data set. That is, how well the model predicts the regressors/outputs given the predictors/ inputs, in a data set unseen by the model. The simplest and probably most widely used method for quantitative assessment of the quality of a model is cross-validation, which estimates the generalization error . when a model θ̂ (θ̂ is notation for θ̂TR(V̂ , W, Λ)) is applied to an independent data set 1 G(θ )̂ = val NB

1 N Best

(14)

where Δŷk is the prediction of the regressors in batch k of the independent validation data set. However, when it comes to assessment of ARMA(X) models, the drawback of cross-validation is that the accuracy of the model parameters in the MA part of a model is not properly assessed as the stochastic signals driving the MA part are unknown for the independent data set. This means that the estimated generalization error . on it is own is intractable for a selection of ARMA(X) models.10 To overcome this intractability in the assessment of ARMA(X) models, the estimated E

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

estimated as filtered estimates with a Kalman filter, the output error integrator built into the batch models that have been proposed throughout this paper would inadvertently integrate the state and hence output profile and their covariance, which would obviously not be attractive. As a prerequisite it is assumed that the observer canonical form of the LTV model 5 is observable; i.e., the initial condition can be estimated from a given input profile.9 Assume that during batch (k), observations zk,t of the outputs yk,t are collected at times t = 0, 1, ..., N and let the optimal estimate of the state xk,t2 in batch k at time t2 given data up to and including time t1 ≤ t2 be given as the conditional mean: x̂k,t2t1 = E{xk,t2|0k , t1} where the information 0k , t 0k , t = {zk , t , Δuk , t , 0k , t − 1}, 00, −1 = {y−1 , u−1 , z −1}

A controller designed to reject disturbances with respect to time within a batch will, hence, also, due to the integration of output and input errors into the model framework, asymptotically reject the effects of batchwise persistent disturbances with respect to batch index. Consequently, to utilize available information during a batch to obtain the best possible tracking performance, the following learning model predictive control formulation N

{Δuk , l , t }lN=−t 1 = arg

minN − 1 [

{Δuk , i}i = t



ek′̂ , i|t Q iek̂ , i|t + Δuk′, i − 1R iΔuk , i − 1]

i=t+1

s.t . xk̂ , i|t = ( ixk̂ , i − 1|t + )iΔuk , i − 1 ek̂ , i|t = ek̂ − 1, i|N − *xk̂ , i|t umin i − 1 ≤ Δuk , i − 1 + uk − 1, i − 1 ≤ umax i − 1

0k , −1 = 0k − 1, N ,

ymin i ≤ yi ̅ − ek̂ , i|t ≤ ymax i

(19)

(17)

is solved at times t = 0, 1, ..., N − 1 in batch k and the control sequence

Then the tracking error ek,t2 in batch k at time t2 given data up to and including time t1 is estimated as êk,t2|t1 = êk−1,t2|N − *xk̂ , t2 | t1 where the smoothened estimate of the error profile in batch k − 1 is

Δuk* = [Δuk′ ,0,0 Δuk′ ,1,1 ... Δuk′ , N − 1, N − 1]′

(20)

(18)

then approximates a closed-loop optimal control sequence. That is, at time t in batch k, eq 19 is solved based on updated state estimates, and the input uk,t = Δuk,t + uk−1,t is implemented on the process.

and ŷk−1|k−1 is the smoothened output profile estimate from batch k − 1, e.g., obtainable with a Kernel Smoother.9 The control design methodology used here for batch processing has been preliminarily presented.23 Before presenting the specific methodology there are two important control related points to be made about the trajectory tracking model form 4. First of all, as the error profile ek in batch k depends on the error profile ek−1 from batch k − 1, the effects of the batchwise persistent disturbances are integrated with respect to batch index. This means that a properly designed controller will be able to reject the effects of the batchwise persistent disturbances asymptotically with respect to batch indexe.g., removing the effects of recipe and model bias. Second, given the above-mentioned asymptotic behavior and as the control actions generated by such a controller are deviations from the control/input profile realized in the previous batch, the control actions (Δuk) due to batchwise persistent disturbances will converge asymptotically to zero with respect to batch index. In the literature24 it is said that the controller learns to reject the batchwise persistent disturbancesi.e., the resulting controller is an iterative learning control (ILC) scheme. A more precise formulation would be that both output and input errors are modeled using integrators with respect to the batch index. The trajectory tracking model representation 4 is similar to that of Lee et al.,1 but the model representation differs significantly since eq 4 includes the effect of initial conditions (HΔyk,0) and disturbance propagation (Fvk). Another important difference is that eq 4 does not include double dependence on the batchwise persistent disturbancesi.e., the trajectory tracking model representation 4 only includes the batchwise persistent disturbances as represented by ek−1 and not as both the part of ek−1 caused by the batchwise persistent disturbances and the batchwise persistent disturbances themselves. Following the discussion above, a multivariable feedback controller designed using the trajectory tracking SS model form 6 will be able to reject the effects of the batchwise persistent disturbances asymptotically with respect to batch index.

3. INDUSTRIAL BIOREACTOR The application of the presented methodology is evaluated on a fungal cultivation in an industrial pilot plant bioreactor producing an α-amylase protein called Fungamyl using the microorganism Aspergillus oryzae. A key modeling challenge stems from the fact that the rheology of the cultivation is complicated by the growth of filaments wherein the main activity resides during microorganism growth and protein production. The fermentation process follows a standard recipe at the Novozymes A/S Pilot Plant with the online and offline variables specified in Table 1. The physical connections between the process and the computer system are sketched in Figure 1. All sensor signals are connected to a DeltaV System that communicates with a DeltaV Server over a local network. The PI Database Server stores all the data acquired from the DeltaV Server using a “swinging door” algorithm. To avoid any influence from the “swinging door” algorithm the host PC collects data directly from the DeltaV Server every 10 s, and this data set is further prepared for analysis and model identification as detailed below. The fermenter feed line flowsheet is also shown in Figure 1. Two feed lines are available during the cultivations to excite or actuate the total inlet feed flow rate (FT) and the inlet substrate concentration (Sf). The latter is the total amount of substrate per mole of total feed. The cultivation is initiated by transferring a carefully designed and sterile batch substrate to the reactor. Subsequently the inoculum is transferred from the seed tank to the reactor where microorganism growth is initiated. During the following batch phase the abiotic variables reactor temperature and back pressure are controlled automatically while the reactor pH is manually controlled. After the inoculum is transferred to the reactor the seed tank is sterilized and now filled with sterile water. After the batch phase fed-batch operation is initiated with supplying substrate of concentration Sd and flow rate Fd to the bioreactor through control valve V1. During fedbatch operation the water flow rate from the “seed tank” is

ek̂ − 1|k − 1 = [ ek′̂ − 1,1|N ek′̂ − 1,2|N ... ek′̂ − 1, N |N ]′ = y ̅ − yk̂ − 1|k − 1

F

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

flow rate are (nearly) independently perturbed. The two mass balances give

Table 1. On-Line and Off-Line Pilot Plant Variables variables Online Output Variables fermentation time NH3 flow rate accumulated NH3 dissolved O2 tension (DOT) pH back pressure O2 uptake rate (OUR) accumulated O2 CO2 evolution rate (CER) accumulated CO2 evolution weight viscosity refractive index (RI) respiratory quotient (RQ) temperature air flow rate Off-Line Output Variables enzyme activity biomass, CTS ammonia concentration laboratory pH Online Input Variables feed dosing flow rate (Feedd), measured accumulated dosing Feedd flow dosing Feedd flow rate, set point water feed flow rate (Feedw), measured accumulated water Feedw flow Feedw flow rate, set point

unit

⎛ F ⎞ FT = Fd⎜1 + w ⎟ Fd ⎠ ⎝

h g NH3/h g % of saturation bar gauge mol/h mol mol/h moles kg cP no unit no unit °C NL/min

Sf = Sd

(21)

1 1+

Fw Fd

(22)

Therefore, the aim is to design correlated time sequences Fd and Fw, such that the reactor variables FT and Sf will be nearly independently perturbed. This is achieved by linearizing the mass balances and decoupling the variables. Defining the steady state flow rates [Fsd, Fsw] and a = Sd/((Fsd + Fsw)2), the linearized static model becomes s ⎛ FT ⎞ ⎛ FTs ⎞ ⎛ 1 ⎛ ΔFw ⎞ 1 ⎞⎛ ΔFw ⎞ ⎛ FT ⎞ ⎜ ⎟ ⎜⎜ ⎟⎟ = ⎜ s ⎟ + ⎜ ⎜⎜ ⎟⎟ ⎜ ⎟ = + A ⎟ s s ⎜ ⎟ ⎜ s⎟ ⎜ ⎟ ⎝ ΔFd ⎠ ⎝ Sf ⎠ ⎝ S f ⎠ ⎝−aFd aFw ⎠⎝ ΔFd ⎠ ⎝ S f ⎠

(23)

This defines the inlet perturbation A matrix. Defining the covariance matrix P for [FT, Sf ]T, the Lyapunov equation when assuming no process noise, becomes P = ARAT = diagonal[σ21 σ22]. Here σ21 and σ22 are the variances of Fw and Fd, respectively. The square root matrix H of R = HHT is determined from P. Thereby Fw and Fd become

FAU/g g/L mg/L L/h L L/h L/h L L/h

⎛ Fw ⎞ ⎛ Fws ⎞ ⎛ e1 ⎞ ⎜⎜ ⎟⎟ = ⎜⎜ s ⎟⎟ + H ⎜ ⎟ ⎝e2 ⎠ ⎝ Fd ⎠ ⎝ Fd ⎠

(24)

where e1 and e2 are the perturbed signals which are uncorrelated PRBS signals with a clock period of 4 sample intervals for e1 and 8 sample intervals for the e2 signal. The reference signals for FT and Sf are ramp signals, which start at the beginning of the fed-batch phase. 3.1. Bioreactor Model Development. The aim is to illustrate estimation and model assessment using the presented methodology. A preliminary model development was previously

manipulated using control valve V2. The two flows are mixed to obtain the desired perturbation in the inlet substrate concentration and the total flow rate. As seen in Figure 1, it is important to suitably design the perturbation signals to the two valves to ensure that substrate concentration and total

Figure 1. Flowsheet for bioreactor pilot plant feed line with a sketch of the data communication between the pilot plant and the computer system. G

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Table 2. Measured Variables for Each Batcha variable

type

1082

1098

1099

1108

1110

1111

input perturbations RI viscosity NH3 flow rate accumulated NH3 biomass enzyme activity feedback control calculated C feed conc

on-line on-line on-line on-line on-line off-line off line on-line C mol/L

X X X X 12.48

X X X X 11.08

X X X X 15.68

X X X X X X 14.20

X X X X X X -

X X X X X X X -

a Carbon concentration in dosing flow is shown in the bottom line. The concentration is considered constant during each fed-batch phase, except when the carbon concentration is perturbed during the fed-batch phase.

presented.8 First, the available data sequences for identification are presented. Then, variable selection is illustrated and models are estimated and validated. A total of 11 batches were available from production of α-amylase in Aspergillus oryzae cultivations on the same bioreactor from the Novozymes Pilot Plant. A Classification analysis25 revealed that only the six batches shown in Table 2 seem to represent the normal operational pattern for the process. To use the batches for development of a model set for normal operation, the bottom line in Table 2 shows the calculated carbon concentration in the dosing except for the batches wherein the dosing is perturbed. The data preparation for the variables used for model estimation is carried out by first removing outliers, i.e., data that is inconsistent with the normal operation by visual inspection is removed. Also, dead time produced by sensor delay is removed. A kernel smoother is used to resample data to reduce the effect of noise and of long sampling interval for the off-line analyzed variables. Thereby high frequency measurement noise is removed, and missing data may be reconstructed. Considerable care must be exercised, filtering the measurement noise to avoid removing system dynamics information. Table 2 illustrates that not all the measurements are available for all the batches. The ammonia flow rate measurements are not directly available for batches 1082 and 1098; hence, this flow rate is estimated through differentiation of measured accumulated ammonia. 3.1.1. Bioreactor Model Estimation. The input variables to the process are feed flow rate FT and substrate feed concentration Sf. In batches up to and including batch 1109 (not shown in Table 2), the feed trajectories prescheduled by the batch operations model were applied. But in the two subsequent batches the two input variables were perturbed both to improve the information content of the data and to prepare the resulting model for usage in feedback controlled bioreactor operation. Here the decoupling input variable design 24 was used in batch 1111, while correlated perturbations were introduced in batch 1110. A correlation analysis for the resulting FT and Sf in batch 1111 shows that there is a crosscorrelation of 0.0646, which indicates almost independency as desired. Initially a perturbation design25 was started with a perturbation magnitude of 10% of the actual scheduled trajectory values with a clock period of four or eight sample intervals as specified previously to leave sufficient time for the process to react to the perturbations. However, the operational personnel wanted larger perturbations such that the introduced variations were more clearly visible in the output variables. Therefore the perturbation magnitude for total flow rate was increased up

25% of the nominal value as clearly seen in the input sequences in Figure 2. For model development it is important to select output variables for model estimation that contain relevant dynamic information. The back pressure, temperature, and pH are all controlled; therefore, only their control actuator signals may be used. The air flow rate is kept constant during the cultivation and may be constrained by the capacity of the central compressor system. The remaining process variables do contain relevant information and may be used to estimate a model with the presented methodology. The model development followed the below sequential methodology: 1. Initial selection of output variables, of local model type and of maximumbut lowlocal model order. 2. Initial selection of the scalar value for model purpose (ρ) 3. Selection of batches used for estimation, validation, and insofar as possible an independent set for testing. 4. Selection of additional measurements to include into the modeling. The output variable data sequences shown in Table 2 were initially NH3 flow rate, CER, DOT, and enzyme activity. Later also refractive index (RI) and viscosity became available. Several different models have been estimated with different combinations of output variables, different sets of batches, and different values of the modeling purpose ρ in order to obtain a model with reasonable predictive power such that the model may be used to evaluate a control implementation on the pilot plant. After several model development attempts it was necessary to conclude that batch 1082 does not represent the normal process behavior despite the results of the classification analysis. With only five batches available for modeling the normal behavior it is relevant to investigate the influence of the total feed flow rate and the substrate feed concentration perturbation signals on the achievable model quality. For this purpose two initial models were estimated using two inputs (FT and Sf) and the four traditional outputs (NH3, DOT, CER, and enzyme concentration). The ARX model structure is used with a maximum local model order of two and with a model purpose value of ρ = 0.618. This value is selected due to the limited experimental data set available; hence, greater emphasis is placed on one step ahead than on pure simulation prediction. The first model, pertb0, was estimated using batches 1098 and 1099 without added perturbationsfor identification and batches 1108, 1110, and 1111 for validation. For testing of the model the same batch set was used as for validation. It should be noted that the estimation batches did have a few operator induced perturbations but only in the total feed flow rate. Table 3 shows H

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 2. Industrial bioreactor input data. Designed inlet perturbations applied in batches 1110 and 1111. In batch 1111 the nearly uncorrelated perturbation design was applied. Top: Total feed flow rate with 4 sample interval clock period. Bottom: Feed substrate concentration with 8 sample interval clock period.

the resulting fits for each output variable and the overall model fit. The second model, pertb1, was estimated used batches 1110 and 1111both with input perturbationsfor identification and batches 1098, 1099, and 1108 for validation and testing. The most pronounced effect is seen on the DOT fit which is poor when there are no deliberate perturbations of the input variables for estimation in model pertb0. The overall fit is also significantly better for model pertb1 which has perturbations on both inlet flow variables in the estimation set. Consequently it

was decided to include the batches with perturbations in the estimation set in the sequel for investigating models based upon different combinations of output variables. When estimating a data driven model, it is necessary to use data that contains the most relevant information about process behavior in order to obtain a model with good predictive capabilities. Consequently model sets were estimated with two additional measured outputs: refractive index RI and viscosity as soon as these measured variables became available. I

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Table 3. Two Models Developed from Batches without or with Designed Inlet Perturbations and All with a Local ARX Order of 2a output fits

batch sets

a

model

inputs

Nest

Nval

Ntest

NH3

DOT

CER

Enz.

total fit

pertb0

Ft

1098 1099

0.3407

0.1124

0.1322

0.1947

Ft Sf

1110 1111

1108 1110 1111 1098 1099 1108

0.195

pertb1

1108 1110 1111 1098 1099 1108

0.0806

0.1464

0.1168

0.1429

0.1367

Model pertb0: without designed perturbation around input trajectory. Model pertb1: with designed perturbations around input trajectory.

Table 4. Bioreactor Models Estimated with Different Combinations of Four to Six Measurementsa output fits model

inputs

NH3

DOT

CER

Enz.

Visc.

total fit

biomod1

Ft Sf Ft Sf Ft Sf Ft Sf

0.0837

0.1012

0.074

0.1158

-

-

0.095

0.0896

0.1078

0.085

0.1382

0.107

-

0.1052

0.0754

0.0841

0.0788

0.0625

-

0.1083

0.0832

0.0755

0.0873

0.0666

0.0873

0.0409

0.0838

0.0718

biomod2 biomod3 biomod4

RI

a All models have a local ARX order of 2. Enzyme activity (Enz) is measured off-line, but Kernel smoothed data is used for model estimation. The remaining variables are measured on-line when available. For all models the batch sets where selected as follows. Nest: 1099, 1110, and 1111. Nval: 1098 and 1108. Ntest: 1098 and 1108.

batch 1098 around the shift from batch to fed-batch operation as would be expected since the shift depends upon a pH increase toward the end of the batch phase. Otherwise the main deviations occur toward the end of the cultivation where some more variations are seen, especially for ammonia, DOT, and viscosity. For viscosity the model prediction variations seem to stem from measurement variations due to the perturbations in the input variables. The batch increment deviation between batches 1098 and 1108 measured and modeled with biomod4 shown in the lower part of Figure 3 clearly illustrates that even though the fit is quite reasonable for biomod4 there are minor deviations in the estimated enzyme activity. It must be noted that in these experiments only four measurements were obtained during batch 1098 of the off-line measured enzyme activity. The other variables actually fit reasonably well. The grid point modeling approach provides reasonable modeling flexibility toward the variable timing of the shift from batch to fed-batch. In summary the grid point modeling approach enables obtaining a model set with a reasonable predictive power throughout the batch duration. Investigating the performance of this model during a L-MPC controlled experiment is desirable as described in the following. 3.1.2. Bioreactor Control Experiment. The estimated biomod4 is used for an L-MPC control design as detailed in Section 2.3 and further specified below. The control objective is batch reproducibility, and the batch used as reference for the control implementation is batch 1111. This batch was chosen because the DOT is kept high with a low viscosity, which enables increasing the inputs without the risk of inducing a lack of oxygen, and furthermore the enzyme activity is relatively high. The minimum input constraints are zero while the maximal input constraints are umaxi = [8 15]T. The minimum output constraints are all zero while the maximal constraints are ymaxi = [400 100 70 15 20 700]T for i = 1, ..., ny, where y(1) is NH3

Both measurements provide information about biomass and product concentration during the cultivation, while RI also provides information about substrate concentration. These two measurements were attempted combined with the four more traditional measurements used initially. Table 4 shows the fits for four estimated models. The five batches available for the estimation are divided into three batches, 1099, 1110, and 1111 for estimation, and two batches, 1098 and 1108 for validation. Therefore, there is not an independent set of batches for testing; hence, the batch set for validation is also used for testing. Calculation of the generalization error 16 is used to continue the iteration until convergence when the error is less than the user specified value. This procedure is supplemented with a visual inspection validation. The second model biomod2 in Table 4 which is also based on the RI (refractive index) does not fit as well as biomod1. The main difference is a poorer fit to the enzyme activity. However, biomod3 which instead includes the viscosity measurement clearly is overall better than the two above, mainly due to better fits of enzyme activity and DOT. Including both the RI and the viscosity measurement when estimating biomod4 improves the overall fit significantly. In particular the fits of RI, CER, and viscosity are improved, while enzyme activity is not fitted as well as by biomod3. These results clearly indicate that including both RI and viscosity provide complementary information probably on substrate breakdown as well as on the influence of the microorganism filaments on the rheology and enzyme production. This information clearly is relevant for the overall fit obtained with biomod4. To simulate a batch using a GoLM model set, it is necessary to use a previous batch as reference, since GoLM estimates a model set for the difference between two batches, as yk = Δŷk + yk−1. In Figure 3 the validation for biomod4 when simulating batch 1098 is shown. The model displays some deviations from J

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 3. Validation simulation of the developed model biomod4 for batch 1098. Top: Full measurement values. Bottom: Increment between batches 1098 and 1108.

flow rate (g/h), y(2) is DOT (% of saturation), y(3) is CER (mol/h), y(4) is RI, y(5) is the viscosity (cP), and y(6) is the enzyme activity (FAU/g). Following the control objective the weighting matrices were designed as constant diagonal matrices: Qt,diag = [100 100 10 10 10 100]T and Rt,diag = [1.1 1.1]T. This controller is applied in a subsequent batch 1112 which has perturbations on the inlet variables as batch 1111. During the batch phase of run 1112, shown in Figure 4, the growth proceeds as usual in open-loop since there is no feed stream to actuate on. At the end of the batch phase, the

viscosity and enzyme activity for batch 1112 are both higher than in batch 1111. The feed supply starts at time 30 in batch 1112. At time 34, the biotic control loops are closed. The batch phase terminates earlier for batch 1112 than for the reference batch; however, fairly soon after the start of the feedback operation the measurements in Figure 4 follow those of the reference batch, except that DOT rises a little higher than in the reference batch. Around time 57 the viscosity starts to be clearly lower than for the reference batch while DOT rises even higher and RI starts to increase above the reference. At time K

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 4. Online variables measured for experiment AFF1112 and AFF1111 (the reference batch).

Figure 5. Inlet variables for experiment AFF1112 and AFF1111 (the reference batch). Top: Total flow rate FT. Bottom: Substrate feed concentration.

considering the modeled behavior. Consequently the cultivation was stopped at time 88 after the inoculation start. Later the analysis of the off-line measurements confirmed the abovementioned inference developed during the cultivation that this run did not follow the intended behavior well after around time 60. Figure 6 shows the off-line measured biomass concentration and enzyme activity for batches 1112 and 1111. Immediately after the substrate inflow initiates the fedbatch phase, the biomass concentration in batch 1112 decreases a little, but from time 62 onward the biomass concentration remains low. The enzyme activity also become low around time 70 and onward.

67 also the CER decreases below the reference. At time 76 the NH3 flow rate starts increasing significantly above the reference while DOT remains high and viscosity is even lower. All these deviating signs indicate that biomass is not as high as for the reference batch. The actuator signals Sf and FT in Figure 5 generally follow the reference with minor modifications until time 65 where the feed concentration is temporarily decreased and especially around time 85 where the feed flow rate is significantly increased compared to the reference flow rate. Clearly the fermentation behavior now deviates from the intended trajectories even though the actuator moves are sensible L

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 6. Off-line variables measured for experiments AFF1112 and AFF1111 (the reference batch).

After a closer inspection it turned out that the dosing tank unfortunately was contaminated with a bacillus infection. Thus once substrate feeding started it subsequently slowly contaminated the cultivation. The above results demonstrate that the designed model based controller is able to control the cultivation reasonably well up until around time 60 whereafter the infecting strain starts taking over due to faster growth. The above results also demonstrate that the developed model may be used by operators for monitoring to detect infection of a cultivation such that it may be stopped as early as possible.

applications in control and potentially also for optimization of batch chemical processes. The methodology produces both a linear time-invariant (LTI) state space model set capable of interbatch prediction as well as a linear time-varying (LTV) state space model set capable of intrabatch prediction. The methodology furthermore presents a computationally efficient algorithm for estimation of pseudo-linear regression (PLR) model approximations of rank-deficient systems based on an innovative application of Tikhonov Regularization. The methodology finally presents a novel scheme for model quality assessment. The output profile and state estimation has been formulated as a combination of a Kernel Smoother and a Kalman filter. Batch processes are typically sampled much more frequently than necessary for their model representations to be usable for design of Kalman filters and ILC or L-MPC controllers. When the output profiles are estimated separately with a Kernel Smoother, which is model-free, all available process data can be utilized resulting in better estimates with lower covariances. The experimental modeling results demonstrate that a predictive cultivation model set indeed may be developed from a limited set of production experiments whereof a few batches have suitably designed input perturbations of the future actuator variables. In fact when the experienced industrial personnel saw the responses from the input perturbations and realized that they did not impair the cultivations they were enthusiastic and wanted to employ these for other purposes as well. The experimental input perturbation design for the industrial cultivation uses an amplitude of up to 25% of the nominal value which indeed yields informative measurement data, as realized during two cultivations. However, clearly lower amplitudes also can provide most useful information; e.g., the amplitude may be further reduced as process constraints are approached. Concatenating the perturbed batches to previous unperturbed batches, it is possible to obtain sufficient data for successful GoLM model identification for control with half a dozen batches. From the previous batches it is most important to eliminate batches with a qualitatively different behavior

4. DISCUSSION AND CONCLUSIONS When modeling batch processes for monitoring and control, the methodological choices are dominated by three functional requirements. The first is the desire to enable rejection of inter- and intrabatch production process disturbances. The second is the limited available knowledge for the particular process, in the presented case of the microbial physiology where the regulatory networks of microorganisms still are largely unknown. Consequently a data driven modeling methodology has to be selected. The third requirement originates from the control aspect where it is considered mandatory to enable execution of control actions fairly often during cultivation to follow a desired trajectory and to prevent the cultivation from entering into operating regions where irrecoverable microbial stress responses may be triggered which might lead to opening of new pathways where substrate may be lost to other purposes than producing the desired product. This third overriding consideration lead to the choice of a discrete time sequence modeling framework. The discrete time model set is developed using the data driven GoLM approach from sets of existing batch operating data complemented with a few runs with designed input perturbations in order to obtain informative data for the modeling for control. The presented methodology develops a batch auto-regressive moving average model set with exogenous inputs (ARMAX) and its transformation into state space representations designed for specific M

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

it also can be most relevant to use the data driven methods to design controlled experiments, e.g., for improving specific aspects of the first principles batch processing knowledge. Thereby improved batch reactor design and batch operational design may be developed. Given the presented methods it may be expected that batch monitoring and control can enable significant improvements in reproducibility of batch operations and also in their optimization for cases where reliable first principles models are not directly available. Such improvements have been exploited in multivariable control of many continuous plants, where these plants are no longer competitive if they are not implemented with advanced model predictive control which most often is based on data driven linear models. However for batch chemical processes which are extensively operated with open loop implementation of the batch operations trajectory, it can be expected that the achievable benefits of closed loop optimization may be substantially larger than seen for continuous processes.

mainly using classification. Furthermore, the importance of using measurement variables that contain most relevant information about the process is also clearly demonstrated. The experimental control results demonstrate that the developed controller actually is able to track the desired trajectories for the key reactor variables, such as dissolved oxygen tension (DOT), carbon dioxide evolution rate (CER), viscocity, refractive index (RI), and enzyme activity at least until an infecting strain starts taking over the substrate consumption approximately one-third into the cultivation due to a faster growth rate. Since the infection later was detected to originate from the dosing tank, then it can be stated that the infection develops slowly in the cultivation, but unfortunately faster than the growth of the desired microorganism. The experimental cultivation control results also indicate that the developed estimator may be combined with the actuator signals to uncover potential infections through monitoring earlier than normally would be possible, by detecting when the bioreactor behavior deviates significantly from the behavior represented in the model set. Thereby this study illustrates the potential importance of developing monitoring of the modeled batch process for unfalsification of process performance relative to the behavior represented in the developed plant model set used for control design. In practice such unfalsification could be carried out at every sample instant before more advanced control action is permitted to be executed. The data driven approach applied in the presented modeling method obviously restricts the usability of such a model set to similar cultivations of the same microorganism; however, if the product is a bulk product which is produced relatively often, then many batches will be able to benefit from an estimator and control design, such that there may be a sound economical basis for such an investment into advanced batch process control. The benefit is obviously even larger if a high value product is produced sufficiently often. Unless otherwise stated, the methodologies presented in the present contribution are based on the assumptions: (1) that batch processes can be represented by an LTI model set describing the batch process evolution from batch to batch and an LTV model set describing the batch process evolution both from batch to batch and in time within a batch and (2) that optimal solutions to optimal control problem formulations are independent of the replacement of stochastic variables with their conditional means (i.e., the certainty equivalence principle applies), that optimal control formulations can be separated into an optimal regulator problem and an output and state estimation problem, and that this separation is optimal (i.e., the separation theorem for linear systems with quadratic costs,26 which is a manifestation of the certainty equivalence principle, applies). These two general assumptions are most likely not valid in practical implementations., If the closed-loop performance of a batch process is superior to its open-loop performance, it is, however, of less importance whether or not the certainty equivalence principle applies. Several issues remain open for further investigation, e.g., related to batch modeling, including development of GoLM models for each phase of the batch and/or varying the sampling interval and perturbation amplitude around the shifts between the different batch phases. Also, development of optimization of the batch operations trajectory is a relatively low hanging fruit of the developed technology. With such a methodology



APPENDIX: MODEL MATRICES, DISTURBANCE MODELING, AND MODEL TRANSFORMATIONS For batch modeling it is convenient to define the following variables and references for each discrete time instant t during batch k of length N: • Output variable yk,t ∈ ny(t) with reference yt̅ ∈ ny(t) • Input variable uk,t ∈ nu(t) with reference u̅t ∈ nu(t) • Disturbance variable wk,t ∈ nu(t) Using an ARX model parametrization, the output deviation from the reference yt̅ − yk,t at time t in batch k may be given as a weighted sum of nA(t) past output deviations and nB(t) past input deviations yt ̅ − yk , t = −at , t − 1(yt −̅ 1 − yk , t − 1) − ... − at , t − nA(t ) (yt −̅ n (t ) − yk , t − n (t )) + bt , t − 1(ut̅ − 1 − uk , t − 1) + ... A

A

+ bt , t − nB(t )(ut̅ − nB(t ) − uk , t − nB(t )) + wk , t

(25)

where nA(t), nB(t) ∈ [1, ..., t] are the grid-point ARX model orders and ai,j ∈ ny(i)×ny(j) and bi,j ∈ ny(i)×nu(j) are the structured grid-point ARX model parameter matrices. The model parameter matrices are structured due to the causality. The model orders nA(t ) = max{nA(i , j , t )|i , j = 1, ..., ny(t )} nB(t ) = max{nB(i , j , t )|i = 1, ..., ny(t ), j = 1, ..., nu(t )} (26)

where the order for connection j to i in matrix m at time t nm(i,j,t) is the highest model order at time t, for notational simplicity. Note that as the grid point intervals are modeled with individual grid-point models, the sample points do not have to be equidistantly spaced in time. Let N be the nominal batch length in terms of number of samples, and define the input uk, output yk, one sample shifted output yk0, and disturbance wk profiles: y = [y1′y2′ ...yN′ ]′ u = [u0′u1′...uN′ − 1]′

y 0 = [y0′ y1′...yN′ − 1]′ w = [w1′w2′...wN′ ]′

(27)

Note that the model variables y and u may well be basis functions of the process variables and that not all initial N

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

conditions y0 are measurable and/or physically meaningful e.g., off-gas measurements. Thus, the ARX model set may be expressed in matrix form y ̅ − yk = −A(y ̅ 0 − yk0 ) + B(u ̅ − uk) + wk

Considering the difference between two successive batches the model form 2 given in the paper results in Δyk = yk − yk − 1

(28)

= −A(yk0 − yk0− 1) + B(uk − uk − 1) − wk + wk − 1

where A,B are structured lower block triangular matrices with the dimensions N

dim A =

dim B =

= −AΔyk0 + BΔuk − Cvk

N−1

A batch ARMAX model that is independent of the reference profiles (y,̅ u)̅ and batchwise persistent disturbances has been obtained. With such a batch ARMAX model the path is prepared for multivariable, model-based monitoring, control, optimization, and of course simulation. During the model derivation above it was assumed that the outputs are known. This is however not the case in practice, where only a sequence zk = [zk,0 ′ , zk,1 ′ , ..., zk,N ′ ]′, zk,t ∈ ny(t) of noisy observations of the outputs is available

∑ n y (t ) × ∑ n y (t ) t=1

t=0

N

N−1

∑ ny(t ) × ∑ nu(t ) t=1

(29)

t=0

To exemplify, if it is assumed that nA(t) = nA, then A has the following structure ⎡ a1,0 ⎤ ⎢ ⎥ ⎢ ⋮ ⋱ ⎥ ⎥ A = ⎢ anA ,0 ··· anA , nA − 1 ⎢ ⎥ ⋱ ⋱ ⋱ ⎢ ⎥ ⎢⎣ aN , N − nA − 1 ··· aN , N − 1⎥⎦

zk = [ yk′,0 yk′]′ + εk

where εk = [ε′k,0, ε′k,1, ..., ε′k,N]′, εk,t ∈  is a sequence of measurement noise terms that are assumed to be zero-mean, independent, and identically distributed. Interbatch Model. The above batch ARMAX model 2 may be converted into different representations dependent on the particular application task. If the task at hand is to predict (or simulate) the behavior of a batch before it is started, the form 3 in the paper may be developed as follows:

The profile w is a sequence of disturbance terms caused by several effects, which include bias in the reference input profile u̅ (i.e., the reference input profile does not produce the reference output profile), the effect of process upsets, modeling errors from linear approximations, and errors due to bias in transition times between sets of active constraints. Hence, the disturbance w contains contributions from both batchwise persistent disturbances, such as recipe/input bias, model bias, and erroneous sensor readings, as well as from random disturbances, which occur with no batchwise correlation. It thus seems reasonable to model the disturbance profile w as a random walk model with respect to the batch index k wk = wk − 1 + Δwk

Δyk = −AΔyk0 + BΔuk − Cvk = −ÁΔyk ,0 − [Ǎ 0]Δyk + BΔuk − Cvk = −E−1ÁΔyk ,0 + E−1BΔuk − E−1Cvk = H Δyk ,0 − GΔuk + Fvk

where Á and Ǎ are partitions of A corresponding to y0 and [y′1, y′2, ..., y′N−1]′, respectively, and

(30)

E = I + [Ǎ 0]

where the increment disturbance profile Δwk is modeled with a moving average (MA) model with respect to time Δwk , t = vk , t + ct , t − 1vk , t − 1 + ··· + ct , t − nC(t )vk , t − nC(t )

nC (t ) = max{nC (i , j , t )|i , j = 1, ..., ny(t )}

(31)

(32)

In matrix form the disturbance model is expressed as Δwk = Cvk

(33)

where the dimension of the structured lower block triangular matrix C is N

N

wk ,0 = wk − 1,0 + vk ,0

∑ n y (t ) × ∑ n y (t ) t=1

(36)

The matrix E is obviously lower-triangular and positive definite, and its inverse can thus always be efficiently computed by forward substitution. Note that the disturbance matrix F models the propagation of batchwise nonpersistent disturbances including batchwise nonpersistent model−plant mismatch. The initial condition Δyk,0 can be considered either as an input/control variable or as a disturbance. The distinction between the two possibilities depends on the information on and control of the outputs prior to a batch. If the initial conditions are considered disturbances, it is necessary to model them. Then the initial output deviation from the reference is also modeled as a random walk with respect to batch index

with model order nC(t) ∈ [0, ..., t − 1]

dim C =

(35) ny(t)

y0̅ − yk ,0 = wk ,0

(34)

t=1

and the sequence vk = [v′k,1,v′k,2, ..., v′k,N]′, vk,t ∈  represents batchwise nonpersistent disturbances that are assumed to be zero-mean, independent, and identically distributed. ny(t)

(37)

where vk,0 is assumed zero-mean, independent of, and distributed identically with vk. This means that the initial condition Δyk ,0 = yk ,0 − yk − 1,0

Assumption 1

The model approximation errors are partly due to persistent (from batch to batch) modeling and batch operations model bias and partly due to zero-mean random disturbances.

= −wk ,0 + wk − 1,0 = −vk ,0 O

(38)

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

is given as a zero-mean random variable. Intrabatch Models. For online estimation, monitoring, feedback control, and optimization, it is convenient to use a SS realization of the batch ARMAX model. To achieve a SS realization in a simple manner, it is desirable to simplify the batch ARMAX model structure with the assumption that the number of outputs is constant ny(t) = ny for t = 0, 1, ..., N. The SS realization of eq 5 in the paper results: xk , t = ( t xk , t − 1 + )t Δuk , t − 1 + ,t vk , t Δyk , t = *xk , t

(39)

with the SS model dimension nx = ny max(ni(t)|1 ≤ t ≤ N, i = A, B, C) and initial condition

xk ,0 = *′Δyk ,0

(40)

The SS model matrices ( t ∈ nx × nx , )t ∈ nx × nu(t ), ,t ∈ nx × ny , and * ∈ ny × nx contain the corresponding block columns in the batch ARMAX model (A, B, C). To exemplify, assume that nA(t) = nB(t) = nC(t) + 1 = 3 and the SS model matrices will be given as ⎡ − at , t − 1 I 0 ⎤ ⎢ ⎥ ( t = ⎢−at + 1, t − 1 0 I ⎥ , ⎢ ⎥ ⎢⎣−at + 2, t − 1 0 0 ⎥⎦ ⎡ bt , t − 1 ⎤ ⎢ ⎥ )t = ⎢bt + 1, t − 1⎥ , ⎢ ⎥ ⎢⎣bt + 2, t − 1⎥⎦



REFERENCES

(1) Lee, J. H.; Lee, K. S.; Kim, W. C. Model-based iterative learning control with a quadratic criterion for time-varying linear systems. Automatica 2000, 36, 641−657. (2) (a) Wang, Y.; Gao, F.; Doyle, F. J. Survey of Iterative Control, Repetitive Control and Run-to-Run Control. J. Process Control 2009, 19, 1589−1600. (b) Lee, K.; Lee, J. Iterative Learning Control-Based Batch Process Control Technique for Integrated Control of End Product Properties and Transient Profiles of Process Variables. J. Process Control 2003, 13, 607−621. (c) Flores-Cerrillo, J.; MacGregor, J. F. Control of Batch Product Quality by Trajectory Manipulation Using Latent Variable Models. J. Process Control 2004, 14, 539−553. (d) Bonné, D.; Jørgensen, S. B. Systematic Methodology for Reproducible Optimizing Batch Operation. Comput.-Aided Chem. Eng. 2006, 21B, 1275−1280. (e) Lee, J.; Lee, K. Iterative Learning Control Applied to Batch Processes: An Overview. Control Eng. Pract. 2007, 15, 1306−1318. (3) Nomikos, P.; MacGregor, J. F. Monitoring batch processes using multiway principal component analysis. AIChE J. 1994, 40, 1361−75. (4) Gregersen, L.; Jørgensen, S. B. Supervision of Fed-Batch Fermentations. Chem. Eng. J. 1999, 75, 69−76. (5) (a) Qin, S.; McAvoy, T. J. Nonlinear PLS Modelling Using Neural Network. Comput. Chem. Eng. 1992, 16, 379−391. (b) Ahmad, Z.; Zhang, J. Combination of Multiple Neural Networks Using Data Fusion Techniques for Enhanced Nonlinear Process Modelling. Comput. Chem. Eng. 2006, 30, 295−308. (6) Zhang, S.; Wang, F.; He, D.; Jia, R. Real-Time Product Quality Control for Batch Processes Based on Stacked Least Squares Support Vector Regression Models. Comput. Chem. Eng. 2012, 36, 217−226. (7) Bonné, D.; Jørgensen, S. B. Data-Driven Modeling of Nonlinear and Time-Varying Processes. Proc. SYSID 2003, 1655−1660. (8) Alvarez Villanueva, M. A.; Stocks, S. M.; Jorgensen, S. B. In Bioprocess Modelling for Learning Model Predictive Control (L-MPC); Nicoletti, M. C.; Jain, L., Eds.; Studies in Computational Intelligence; Springer-Verlag: Germany, 2009; Chapter 9, pp 237−280. (9) Bonné, D. Ph.D. thesis, Technical University of Denmark, Department of Chemical Engineering, 2005. (10) Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: New York, 2001. (11) (a) Akaike, H. Markovian representation of stochastic processes by canonical variables. SIAM J. Control 1975, 13, 162−73. (b) Larimore, W. E. Canonical Variate Analysis in Identification, Filtering, and Adaptive Control. Proc. 29th IEEE Conf. Decision Control 1990, 596−604. (12) (a) Larimore, W. E. Statistical optimality and canonical variate analysis system identification. Signal Process. 1996, 52, 131−144. (b) Verhaegen, M. Identification of the deterministic part of MIMO state space models given in innovations form from input-output data.

⎡ −I ⎤ ⎢ ⎥ ,t = ⎢−ct + 1, t ⎥ , ⎢⎣−ct + 2, t ⎥⎦

* = [ I 0 0]

For tracking control applications the SS model form 5 can be modified directly to form 6 in the paper: xk , t = ( t xk , t − 1 + )t Δuk , t − 1 + ,t vk , t



ILC iterative learning control L-MPC learning model predictive control LS least squares LTI linear time-invariant LTV linear time-varying MA moving average ρ model purpose MPC model predictive control OSA one step ahead PCR principle component regression PRBS pseudo random binary signal PS pure simulation RR ridge regression SS space state TLS total least squares TR Tikhonov Regularization TSVD truncated singular value decomposition wLS weighted least squares

ek , t = yt ̅ − yk , t = yt ̅ − yk − 1, t − Δyk , t = ek − 1, t − *xk , t

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected] (S.B.J.). Present Addresses ‡

Novozymes, Bagsværd, Denmark (D.B.). Facultad de Ingenieriá en Electricidad y Computación, Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador (M.A.A.). §

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS The authors acknowledge the commitment of both Dr. Henrik Steen Jørgensen and Dr. Stuart Stocks to having the data driven methodology investigated on one of their industrial pilot plant cultivations at Novozymes A/S.



ACRONYMS ARMAX autoregressive moving average models with exogenous inputs ARX autoregressive models with exogenous inputs Gρ estimated generalization errors GoLM Grid of Linear Models P

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Automatica 1994, 30, 61−74. (c) Verhaegen, M.; Yu, X. A Class of Subspace Model Identification Algorithms to Identify Periodically and Arbitrarily Time-varying Systems. Automatica 1995, 31, 201−216. (13) (a) van Overschee, P.; de Moor, B. N4SID: subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica 1994, 30, 75−93. (b) Wenfu, K.; Storer, R. H.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179−196. (c) Chen, G.; McAvoy, T. J. Predictive on-line monitoring of continuous processes. J. Process Control 1998, 8, 409−420. (14) Dorsey, A. W.; Lee, J. H. Building Inferential Prediction Models of Batch Processes Using Subspace Identification. J. Process Control 2003, 13, 397−406. (15) Simoglou, A.; Martin, E. B.; Morris, A. J. Statistical performance monitoring of dynamic multivariate processes using state space modelling. Comput. Chem. Eng. 2002, 26, 909−920. (16) Huffel, S.; Vandewalle, J. The Total Least Squares Problerm; SIAM: Philadelphia, 1991. (17) Hansen, P. C. Rank-Deficient and Discrete Ill-Posed Problems; Polyteknisk Forlag: Lyngby, 1996; pp 53−57. (18) Kailath, T.; Sayed, A. H.; Hassibi, B. Linear Estimation; Prentice Hall: Englewood Cliffs, NJ, 2000. (19) Golub, G. H.; Hansen, P. C.; O’Leary, D. P. Tikhonov Regularization and Total Least Squares. SIAM J. Matrix Anal. Appl. 1999, 21, 185−194. (20) Fierro, R. D.; Golub, G. H.; Hansen, P. C.; O’Leary, D. P. Regularization by Truncated Total Least Squares. SIAM J. Scientific Comput. 1997, 18, 1223−1241. (21) Spliid, H. A Fast Estimation Method for the Vector Autoregressive Moving Average Model With Exogenous Variables. J. Am. Stat. Assoc. 1983, 78, 843−849. (22) Ljung, L. System Identification: Theory for the User; Prentice Hall: Englewood Cliffs, NJ, 1999. (23) Bonné, D.; Jørgensen, S. B. Iterative Learninng Model Predictive Control of Batch Processes. BatchPro - Symposium on Knowledge Driven Batch Processes; Thessaloniki: 2004; pp 67−72. (24) (a) Moore, K. L.; Xu, J.-X. Special Issue on Iterative Learning Control. Int. J. Control 2000, 73, 819−823. (b) Longman, R. W. Iterative Learning Control and Repetitative Control for Engineering Practice. Int. J. Control 2000, 73, 930−954. (25) Petersen, N. Multivariable Modeling for Control of Industrial Fedbatch Cultivations M.Sc. thesis, Department of Chemical Engineering, Technical University of Denmark, 2006. (26) Bertsekas, D. P. Dynamic Programming and Optimal Control, 2nd ed.; Athena Scientific: Belmont, MA, 2000; Vol. 1.

Q

dx.doi.org/10.1021/ie402532t | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX