7312
Ind. Eng. Chem. Res. 2008, 47, 7312–7322
Constrained Bayesian State Estimation Using a Cell Filter Sridhar Ungarala,* Keyu Li, and Zhongzhou Chen Department of Chemical and Biomedical Engineering, CleVeland State UniVersity, CleVeland, Ohio 44115
Constrained state estimation in nonlinear/non-Gaussian processes has been the domain of optimization based methods such as moving horizon estimation (MHE). MHE has a Bayesian interpretation, but it is not practical to implement a recursive MHE without assumptions of Gaussianity and linearized dynamics at various stages. This paper presents the constrained cell filter (CCF) as an alternative to MHE, requiring no linearization, jacobians, or nonlinear program. The CCF computes a piecewise constant approximation of the state probability density function with support defined by constraints; thus, all point estimates are constrained. The CCF can be more accurate and orders of magnitude faster than MHE for problems of a size as investigated in this work. 1. Introduction We consider the problem of state estimation in nonlinear dynamic process and measurement systems, which include uncertainties characterized by non-Gaussian probability density functions (pdfs). In addition, the state and noise variables may be subject to constraints such as multivariate algebraic equality and inequality relationships or simple upper and lower bounds. The general solution to the constrained nonlinear/non-Gaussian state estimation problem is the conditional a posteriori pdf of the states p(xk|yk), where xk is the state vector and yk is the measurement vector. The support of the conditional pdf is characteristic of the applicable constraints on the random variables. Typically optimal point estimates are made by choosing appropriate summary statistics such as the mean or the mode of the conditional pdf. It is well-known that the state estimation algorithm is a recursive Bayesian update of p(xk|yk).1 The difficulties of following the evolution of this time-varying multimodal function are also well documented.2 Many suboptimal estimation methods are available for unconstrained nonlinear systems which include the extended Kalman filter (EKF),3 unscented Kalman filter (UKF),4 Gaussian sum filter,5 sequential Monte Carlo (SMC) filter,6 and, more recently, the cell filter.7 The EKF is based on the evolution of mean and covariance subject to linearized dynamics, which is computationally efficient but prone to divergence. The UKF follows the evolution of a few sigma points of Gaussian pdf due to nonlinear dynamics, but the sigma points limit the flexibility needed to model a transient pdf. The Gaussian sum filter is a weighted average of EKFs, which is generally difficult to tune for the number of Gaussians. The SMC filter employs samples in state space to follow the evolution of the conditional pdf and density estimation techniques are available to compute the summary statistics. It is completely recursive and no simplifying assumptions about nonlinearity or non-Gaussianity are necessary; however, SMC can incur significant realtime computation. Physically meaningful constraints may be imposed during the selection of samples in state and input spaces by incorporating appropriate screening steps in the SMC algorithm.8 Methods based on nonlinear optimization have been the main recourse to dealing with constraints, most of which attempt to * To whom correspondence should be addressed. Mailing address: 2121 Euclid Avenue, SH 455, Cleveland State University, Cleveland, OH 44115. Phone: (216) 687 9368. Fax: (216) 687 9220. E-mail:
[email protected].
solve for the constrained mode of the conditional pdf. The origins of these methods are purely numerical and are not necessarily motivated by probabilistic considerations. The sum of errors in the measurements and model predictions are generally minimized in a least-squares sense.9–11 In order to bound the size of the problem, optimization is performed in a moving window of data or horizon. A probabilistic interpretation of moving horizon estimation (MHE) was discussed in detail by Findeisen12 using Bayes rule. Several papers in the open literature introduced the MHE as a generalized nonlinear/non-Gaussian Bayesian estimator under constraints.13–15 A joint density is used in MHE instead of the conditional density to draw the mode estimates. Since a general Bayesian filter is infinite dimensional, practical MHE formulations often use simplifying assumptions to pose optimization problems that can be conveniently solved by convex optimization algorithms. Relaxation of some assumptions means that global optimization may be necessary. There are several shortcomings to the commonly implemented MHE formulation. First, the realtime computational load for repeatedly solving a large optimization problem can be high.6 Book-keeping techniques such as offline preparation and feedback16,17 and in situ adaptive tabulation18 are shown to alleviate the optimization load by avoiding some repetition. Second, the MHE approach includes a paradoxsthe conditional density is needed to recursively summarize past information14 and if it is available, MHE is not needed. The summary is typically assumed to be a shape invariant uniform or Gaussian pdf. Assumption of shape invariance can be an oversimplification as shown in Figure 1, where the pdfs of concentrations in a batch reactor simulation evolve in shape (section 6.2). On the other hand, long horizons rely less on past information and also offer robustness to unknown disturbances and plant-model mismatch. Third, the mode estimate is assumed to be universally optimal.19 The choice of a point estimate, according to a loss function, is user specific and must not influence future estimates.7 However, MHE can contain this feedback due to approximate arrival cost. Since the conditional pdf is not available, other statistics or error estimates such as covariance are also not available. Recently, we showed that for low to moderate dimensional problems, the cell filter provides accurate state estimates without heavy realtime computations.7 This paper extends our previous work with the cell filter to include constraints. A Markov chain is built in discretized state space defined by constraints, which
10.1021/ie070249q CCC: $40.75 2008 American Chemical Society Published on Web 08/27/2008
Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 7313
density function of the current states p(xk|yk). In order to solve the estimation problem, one must determine the temporal evolution of this conditional density at each instance in the measurement sequence. The formulation for recursive state estimation in unconstrained Markov processes has been available for decades.20,1 Using Bayes rule, the conditional pdf is expressed as p(xk|yk) ∝ p(yk|xk)p(xk|yk-1)
(3)
where the measured evidence, represented by the likelihood function p(yk|xk), transforms the a priori knowledge, represented by p(xk|yk-1), into the a posteriori knowledge, which is the pdf p(xk|yk). The recursive formulation is written by using the Chapman-Kolmogorov equation as follows, p(xk|yk) ∝ p(yk|xk) Figure 1. Marginal a posteriori densities of concentrations in a nonlinear batch reactor at time instances k ) 1, 2, and 3.
models the probabilistic dynamics. Recursive evolution of a piecewise constant approximation of the conditional density is shown to be a computationally feasible linear transformation. Since the supports of all pdfs and the transition density of the Markov chain are defined by constraints, the statistical inferences are constrained. The cell filter is a recursive Bayesian method with no restrictions on models and probability densities. The purpose of this communication is twofold: (1) to show a comparison between MHE and constrained cell filter (CCF) in a probabilistic framework and outline the simplifying assumptions in MHE that are not needed in CCF and (2) to demonstrate the effectiveness of CCF via simulation examples used in MHE literature, where we show that CCF can serve as an alternative to MHE for low dimensional problems. Due to the brute force nature of the method, the use of CCF will be limited by storage needs, which scale poorly with increasing state dimensions.
∫ p(x |x
k k-1)p(xk-1|yk-1)
xk ) f(xk-1, wk-1)
(1)
where f: (R × R ) f R is a nonlinear function and wk is a stochastic noise process representing unmeasured disturbances and model errors. The noise is assumed to be independent and identically distributed (iid) with a probability density function pw(wk), which is generally a non-Gaussian density. The states are related to the process measurements via, n
m
n
yk ) h(xk, νk)
(2)
where h: (R × R ) f R is a nonlinear function and νk is an iid stochastic process representing measurement errors with a probability density function pν(νk), which is generally asymmetric and non-Gaussian. Note that the noise processes do not necessarily need to be additive terms in the process model and measurement equations. It is assumed that historical information about the states is represented by a pdf of the initial condition p(x0), which is also a non-Gaussian in general. Bayesian state estimation means drawing point estimates as optimal inferences from the a posteriori conditional probability n
p
p
(4)
The integral in the above equation represents the evolution of the a posteriori conditional pdf p(xk-1|yk-1) at time instance k - 1 into the a priori conditional pdf p(xk|yk-1) at time instance k, due to the state transition probability density p(xk|xk-1). The recursion may also be described through the action of a linear integral operator Pf, called the Foias operator, which operates on general Lebesgue measures,21 p(xk|yk) ∝ p(yk|xk)Pfp(xk-1|yk-1)
(5)
The Foias operator is implicitly written as
∫P
f
p(xk-1|yk-1) )
∫ ∫ {p (w w
f-1
k-1)
dwk-1}p(xk-1|yk-1) dxk-1 (6)
where it is assumed that the process model f, is nonsingular. The likelihood function may be explicitly written for nonsingular and continuously differentiable measurement function h, as follows3
| |
p(yk|xk) ) Lh(xk, yk) ) pν(h-1(xk, yk))
2. Bayesian Estimation 2.1. Problem Formulation. The objective of state estimation is to recover measured and unmeasured states of a dynamic process xk, given the process measurements yk at time instance k, in an optimal and preferably recursive manner. Let the discrete-time dynamic process be modeled as a vector difference equation of the form,
dxk-1
∂h-1 ∂yk
(7)
Now the recursion equation for the evolution of the conditional density is reformulated as p(xk|yk) ) ηLh(xk, yk)Pf p(xk-1|yk-1)
(8)
where the normalizing constant η is η)
(∫ L (x , y )P h
k
k
f
-1
)
p(xk-1|yk-1 dxk)
(9)
The mean, the mode, and the median of the conditional pdf are common choices for representative point estimates of the states. Various criteria are used to compare the possible estimates by choosing appropriate loss functions such as convex functions and symmetric convex functions.3 For instance, the conditional mean is the minimum variance estimate if the loss function is symmetric positive definite. Clearly, if p(xk|yk) is available, then other statistical parameters such as higher moments and probabilities such as Pr|xk| e M can be readily computed. For example, estimation error is commonly represented by covariance and region of highest posterior density. The mode, which is obtained by maximizing the conditional pdf, is known as the maximum a posteriori (MAP) estimate. In summary, recursive state estimation is performed in three stages at each time instant: (1) a prediction stage, where the prior pdf is generated using the Foias operator, (2) an update stage, where the posterior pdf is computed using the prior and the likelihood function, and, finally, (3) an inference stage where the estimate, xˆk, is drawn from the a posteriori pdf. It is important
7314 Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008
to note that the inference stage is not part of the recursion; hence, the estimated state xˆk has no influence on the subsequent estimate xˆk+1. More detailed discussion and equations for each stage may be found in ref 7. 2.2. Constrained Estimation. The Bayesian framework treats all variables as stochastic in nature, and we are specifically concerned with continuous random variables. Generally, the density functions of random variables xk, yk, wk, and νk have their domains defined on the entire real number axis. However, the domains of the pdfs in practice may be restricted to some portions or subsets of R. Incorporating physically meaningful constraints on the descriptions of the noise processes wk and νk adds clarity to their representation by their respective densities. For instance, one could constrain the domains of pw(wk) and pν(νk) such that, wk ∈ Wk and ν ∈ Vk
(10)
where the sets Wk and Vk may be closed and convex. It is common to choose upper and lower bounds via polyhedral convex sets such as14 Wk ) {wk : wl e Wwk e wr}
(11)
Vk ) {νk : ν e Vνk e ν }
(12)
l
r
Such constraints clearly place more restrictive and complicated geometries for integration in the evaluation of the Foias operator along f-1, as well as the domain of the inverse operator h-1, in the likelihood function. Physically meaningful and relevant state variables are typically bounded in some ways. Hence, the support of the a posteriori conditional density p(xk|yk) must be restricted as a consequence of the physical limitations on noise processes, saturation limits, and multivariate relationships among state variables. For instance, saturation limits may yield a closed set for xk, not necessarily convex, such that Xk ) {xk : xl e g1(xk) e xr}
(13)
It is not uncommon to encounter further restrictive multivariate algebraic equality relationships in the state space, which may originate from constitutive relationships and material and energy balances as g2(xk) ) 0
(14)
Once again the constraints lead to altered domains for the Foias operator and the likelihood function. Figure 2 depicts several possibilities for the support of the conditional density in eq 8. Equality constraints restrict the valid state trajectories and the probabilities over a hypersurface of dimension n - 1 in an n-dimensional state space. Inequality constraints allow the states values above or below such a hypersurface, which may also be bound by upper and lower limits. Thus, the inferences or estimates drawn from the a posteriori conditional pdf with support defined by the constraints will naturally obey the constraints. Unfortunately with such wide ranging restrictions and the generalities attributed to the process model and measurement equations, it is impossible to find analytical forms for the desired conditional density most of the time. In the following sections we briefly summarize two different approaches to approximate the state estimation problem: the widely used moving horizon estimation and the proposed constrained cell filter. 3. Moving Horizon Estimation As an alternative to constructing the a posteriori conditional density of the current state, a conditional joint density may be
Figure 2. Constraints on states.
constructed for a sequence of the discrete state trajectory and the joint density may be maximized. The optimization problem becomes larger since the joint probability is a function of the state trajectory in a horizon instead of a single point. Here we describe a probabilistic interpretation of moving horizon estimation in order to highlight the practical difficulties.12 The state trajectory of a dynamic system is estimated by solving an optimization problem with the following objective, max p(xk-H,...,k|y0,...,k)
ˆxk-H,...,k
(15)
subject to appropriate constraints on the decision variables. The function p(.) is the joint probability of a sequence of discrete trajectory over the most recent time horizon H, conditioned on all the available measurements. Using Bayes rule, p(xk-H,...,k|y0,...,k) ∝ p(yk-H,...,k|xk-H,...,k)p(xk-H,...,k|y0,...,k-H-1) (16) If xk and yk sequences are Markovian in nature, the above equation is written as p(xk-H,...,k|y0,...,k) ∝
k
k-1
∏ p(y |x ) ∏ p(x
j+1|xj)p(xk-H|yk-H-1)
j j
j)k-H
j)k-H
(17) where p(xk-H|yk-H-1) is the a priori prediction based on the summary of information p(xk-H-1|yk-H-1) just prior to the beginning of the horizon, p(xj+1|xj) is the state transition probability density and p(yj|xj) is the likelihood function. In general, the conditional joint pdf is a time varying, infinite dimensional, multivariate function. It is not possible to obtain an analytical form of the pdf for the optimization problem. In order to establish a more tangible objective function, the MHE formulation requires several simplifying assumptions for practical implementation. Some of the most common assumptions and the resulting simplifications are discussed below. A1. Noise Processes Are Gaussian or Variants of a Gaussian. The densities pν and pw are most commonly assumed to be Gaussians. When inequality constraints are imposed on noise processes, they yield truncated Gaussian pdfs. Several asymmetric densities have also been proposed by piecing together portions of Gaussians of different parameter sets.15
Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 7315
A2. Process Model and Measurement Equations Are Nonsingular and Twice Differentiable. If the process model f and measurement function h are invertible, continuously differentiable, and monotonic, the conditional joint pdf can be explicitly written as p(xk-H,...,k|y0,...,k) ∝ k
∏ p (h ν
| |∏
-1)
j)k-H
∂h-1 ∂yj
k-1
| |(
pw(f-1)
j)k-H
∂f-1 p xk-H|yk-H-1)(18) ∂xj
Using the above definition of conditional joint probability, the problem formulation in eq 14 is known as general moving horizon estimation. A3. Noise Processes Are Independent and Identically Distributed Additive Random Variables. If f(xk-1, wk-1) ) jf(xk-1) + wk-1 and h(xk, νk) ) hj(xk) + νk, then eq 17 is simplified into p(xk-H,...,k|y0,...,k) ∝ k
∏
k-1
pν(yj - h(xj))
j)k-H
∏
pw(xj+1 - f(xj))p(xk-H|yk-H-1)(19)
j)k-H
The assumption of additive noise processes is not entirely necessary because f and h are assumed to be invertible in assumption A2.14 A4. Past Information Is Summarized in the Predicted xk-H and the Associated Covariance Matrix. If the a priori density p(xk-H|yk-H-1) is a multivariate Gaussian it is completely described by the mean vector and the covariance matrix. This is an inherent assumption necessary for the ease of implementation of MHE. At worst, the prior is not known, which corresponds to a uniform pdf, and at best it is represented by the first two moments of variants of Gaussians such as truncated or half-Gaussian pdfs due to bounds imposed on the support.15 A5. Conditional Joint pdf Is Obtained As a Product of Exponential Functions. By virtue of assumptions A1 and A4, maximizing the logarithm of the product of exponentials is replaced by minimizing a cost function written as a sum of the arguments of the exponentials, which are classified into two types as follows,22,14 k
min
ˆxk-H,w ˆk-H-1,...,k-1,ν ˆk-H,...,k
∑
Lj(wj, νj) + Γ(xk-H)
(20)
j)k-H
The function Lj is known as the stage cost derived from the likelihood function and the transition probability. The cost assigned due to uncertainties in the measurements and the process model are penalized here. The function Γ is known as the initial penalty or arrival cost, which represents the uncertainty in the a priori information at the beginning of the horizon. Although any pdf may be used to represent arrival cost, a poor choice may lead to loss of accuracy and divergence.14 Assumption A5 is valid for pdfs such as Gaussian, exponential, and Laplacians which are log-concave and may give rise to convex optimization problems depending on the functions jf and hj.23 A6. Random Variables x, w, and ν Belong to Closed and Convex Sets Xk, Wk, and Vk, Respectively. Considering the simplifying assumptions A1–A6, we now summarize the MHE problem statement in its most popular format:
wk-1 ) xk - f(xk-1) νk ) yk - h(xk) xk ∈ Xk, wk ∈ Wk,
νk ∈ Vk
where µ is the estimate of xk-H based on measurements y0,...,k-H-1 and P is the covariance matrix. The weighting matrices Q and R are typically chosen as the covariance matrices of Gaussian process and measurement noises, respectively. It is common to use inequality constraints on the noise variables and states.14,15,13 A7. Arrival Cost Is Updated Using Linear-Gaussian Dynamics. As the horizon is moved forward, the arrival cost must be updated at each time instance at the beginning of the horizon as follows, p(xk-H|yk-H-1) )
∫ p(x
k-H|xk-H-1)p(xk-H-1|yk-H-1)
dxk-H-1 (22)
Note that this involves the a posteriori density of eq 4, which if known would render the MHE unnecessary. Generally, the arrival cost is recursively approximated by relying on the shapeinvariance of the Gaussian pdf subject to linearized dynamics. The extended Kalman filter or smoother are a common choice. The problem of accurately summarizing the past information as arrival cost remains an open issue in MHE.24 The MHE is a procedure to approximately compute the mode estimates. The ability to impose constraints has been a most useful feature of MHE; on the other hand, nonlinearity and nonGaussianity are not necessarily the prime motivations to consider the use of MHE.14 Since the actual state pdf is not available, other statistics such as the mean and the covariance are not computable. The performance of MHE is dependent on the horizon size H and the weights P, Q, and R, which are used as tuning parameters. Recently parametric programming approach has been proposed for reducing the online computational load.16 4. Constrained Aggregate Markov Chain The Foias operator is a Markov operator also and may be associated with a stationary Markov chain in discrete time and continuous space. We focus on determining an approximate aggregate Markov chain in discrete space containing a finite number of states. A finite region of continuous n dimensional state space X ⊂ Rn, is chosen such that constraints on states confine the temporal evolution of the state vector to X. In Figure 3, simple upper and lower bounds on each of the three state variables define X as a cuboid. More generally, X may be an (n - 1)-dimensional hypersurface including the region above or below the hypersurface.
k-1
min
ˆxk-H,w ˆk-H,...,k-1
-1
(xk-H - µ)TP (xk-H - µ) +
∑
wTj Q-1wj +
j)k-H k
∑
νTj R-1νj(21)
j)k-H
subject to
Figure 3. Generalized cell mapping on constrained state space.
7316 Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008
Each state variable is treated as a collection of intervals, which makes the state space as a collection of n-dimensional cells instead of an n-dimensional continuum. Such a state space is called a cell state space comprising indivisible intervals or cells as a finite set of discrete states. Let X be partitioned into a connected set {zi: i ) 1, 2,..., N}, where z is an n-tuple position identifier of a cell. For instance, Figure 3 shows the constraint surface partitioned into one hundred cells. One would have ten times more number of cells of the same resolution in the cuboid if only bounds are considered as constraints. Therefore the presence of restrictive constraint surfaces aids in defining a more meaningful cell space and reduces the possible number of cells. j ) Rn - X, is a single infinite sized cell called the The space X 0 sink cell z . The process model f is a point mapping of xk-1 to xk in j continuum state space. Analogously, a transition from cell zk-1 i to cell zk may be defined by a cell mapping, j zik ) F(zk-1 , wk-1)
i, j ) 0, 1, ..., N
(23)
where F: (Z × R ) f Z . Transitions from the set {z , i ) 1, 2, ..., N}, into the sink cell z0 are terminal due to constraints. A Markov chain defined over the cell space represents coarsegrained dynamics of the system, and a cell trajectory converges weakly to the state trajectory as the cell interval tends to zero. Let the probability of the state of the system being in a cell zi at time k be mik. Hence, the cell probability mass vector (pmv) is p(zk) with elements mik, i ) 0, ..., N. The conditional probability pij is the transition probability of the system being in cell zi knowing that the system is currently in cell zj, n
m
n
i
pij ) prob{xk is in zi|xk-1 is in zj}
i, j ) 0, 1, ..., N (24)
The (N + 1) × (N + 1) stochastic matrix P, with elements pij, is the transition probability matrix. The matrix form of the discrete Chapman-Kolmogorov equation may be written as the following linear transformation, p(zk) ) Pp(zk-1)
(25)
P is the discretized approximation of the linear integral Foias operator Pf occurring in eq 5. The elements of P can be approximately computed using Monte Carlo integration based on the theory of generalized cell mapping (GCM).25 The general class of numerical algorithms known as cell-to-cell mapping have been developed for global analysis of nonlinear dynamic systems.25 They have been extensively used in the search for invariant probability densities or eigenfunctions of a wide range of state transition operators, including ordinary differential equations (ODEs). Since many chemical process models exist in state space in the form of ODEs, the cell mapping approach is suited for exploring the dynamics of state pdfs. According to GCM, the probability of transition from cell zj to cell zi is
density is specified by the number and location of nodes at which the function values are indicated. If the nodes {xik: i ) 1, 2, ..., N} are defined subject to constraints, the propagation of the state pdf at each node is N
p(xik) )
∑ p(x |x
i j k k-1
j )p(xk-1 )(xj - xj-1)
i ) 1, ..., N (27)
j)1
where the numerical integration is performed by rectangular rule. If the process model f is nonsingular and twice differentiable as in assumption A2 of MHE, the transition probability density function for nodes is explicitly computed as
| |
j j p(xik|xk-1 ) ) pw(f-1(xik, xk-1 ))
∂f-1 ∂xik
(28)
See ref 2 for details on updates to nodes using the EKF and numerical integration using the trapezoidal rule. The transition probability in eq 25 is computed using f and Monte Carlo integration, whereas the transition probability in eq 26 is computed using f-1 and numerical integration. The advantage of the cell mapping approach is the flexibility accorded to the choice of the mapping function f. There are few restrictions on the nature of the process model in eq 1 except that it must be computable. The process noise term remains an argument of f, i.e., not necessarily additive, with the stipulation that samples may be generated according to its pdf in order to compute f. Constraints on wk can be readily incorporated in the definition of pw(w), and the samples are thus generated. Cell mapping is an exercise in large scale simulation to precompute many possible state transitions. Although small cell sizes are the key to obtaining high resolution information, the main purpose is to adequately capture the shape of the state pdf as it is evolving in time. Hence, the benefits of small cells will plateau after a certain resolution. The computational burden grows exponentially with the dimensionality of the system. Storage and computational burdens generally limit the approach to low to moderate dimensions similar to NIF. Unlike NIF, the computational cost is a one time offline burden. The relative efficiency of the proposed cell filter stems from the fact that the computationally intensive probabilistic modeling problem is solved offline. 5. Constrained Cell Filter We start with the prior information on the initial condition in cell space, i.e., the prior pmv p(z0), which is a piecewise approximation of p(x0). The probability mass of a given cell z is assigned to its geometric center point for convenience, N
p(xk) ≈
∑ m δ(x - x ) i k
i
k
(29)
i)0
si (26) sj where sj are the number of sampled initial conditions in a cell, zj and si are the number of forward mapped images in the image cell, zi, where the mapping is performed by the state space model f. In Figure 3, six initial conditions are uniformly sampled in cell z1 and mapped forward for one time step. The final conditions are located at the end of short state trajectories into cells z2, z3, and z4, which form the set of image cells of z1. The results show that the transition probabilities are p21 ) 1/6, p31 ) 1/3, and p41 ) 1/2. Previously, density functions were approximated by piecewise linear functions in numerical integration filter (NIF).26 The pij ≈
where δ is the Dirac delta function and jx represents the coordinates of the cell center. Naturally, the support of the prior pmv is restricted to the cells confined to X only. Analogous to the likelihood function Lh in eq 7, when a measurement y is available, one must evaluate the likelihood of that datum being a measurement of any given state cell z. In a benign case of invertible and differentiable measurement function h, one can readily find a likelihood mass vector (lmv) l(zk, yk) as follows,
| |
l(zk, yk) ) pν(h-1(x, yk))
∂h-1 ∂yk
(30)
Previously, the same approach has been used by numerical integration filter for computing the likelihood over nodes.2 The
Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 7317
shape and support of pν must reflect the applicable constraints on the measurement noise. Once again the support of the likelihood function is rightly restricted to the cells in X. More generally h is considered simply as a map from state and noise spaces to measurement variable space. Consider a region of interest in the measurement space Y ⊂ Rp, where measurements of x ∈ X, i.e., y ∈ Y are likely to be obtained. Y is discretized into a finite set of measurement cells forming {di: i ) 0, 1,..., L}, where d0 is a sink cell. The likelihood of an output cell di being the measurement of state cell zj is lij ) prob{yk is in di|xk is in zj}
i ) 0, 1, ..., L; j ) 0, 1, ..., N(31)
The (L + 1) × (N + 1) likelihood mass matrix L contains the elements lij, which is a discretized approximation of the likelihood function. It is computed by implementing a cell mapping like procedure between state cell and cells defined in the measurement space, lij ≈
ri sj
(32)
where sj are number of samples of states in cell zj and ri are number of measurement samples mapped into cell di. It is only required that h is computable and samples of measurement noise ν are available to implement the mapping. Given the measurement yk, which falls in the measurement cell dik, the lmv is presented by the ith row in L. Given the posterior probability mass vector p(zk-1|yk-1) at time instance k - 1, the current measurement yk and the associated likelihood mass vector l(yk, zk), the current posterior probability vector p(zk|yk) is recursively constructed using Bayes rule, p(zk|yk) )
l(zk, yk) . Pp(zk-1|yk-1)
∑ (l(z , y ) . Pp(z k
(33)
k-1|yk-1))
k
where . is the element by element multiplication of two vectors known as the Haddamard product. The expectation of any real valued function φ(x), is approximately obtained from the a posteriori pmv with a simple dot product, E{φ(x)} ≈ φ(x) · p(zk|yk)
(34)
where φ(xj) is a vector of function values at cell centers. The conditional mean provides the minimum variance estimate xˆMV k , and the associated error covariance may be subsequently computed. If one desires a MAP estimate, the maximization of the one-dimensional p(zk|yk) is straightforward, zMAP ) arg max p(zk|yk) k
(35)
and the MAP estimate xˆMAP is taken as the center point of the k cell where the mode is located. We point out that, although eq 33 bears strong similarity to the following NIF recursion equation for the a posteriori conditional density at a node,2 N
p(yk|xi) p(xik|yk) )
∑ p(x |x
i j k k-1
j)1 N i
N
∑ p(y |x )∑ p(x |x k
j)1
j |yk-1)(xj - xj-1) )p(xk-1
i j k k-1
j |yk-1)(xj - xj-1)2 )p(xk-1
j)1
(36) the most significant difference is that the likelihood function
and transition density are explicitly evaluated by NIF using h-1 and f-1 with numerical integration at each time instant whereas the cell filter employs h and f with Monte Carlo integration offline. In contrast to MHE’s common assumptions A1–A7, the constrained cell filter requires only computable process model and measurement equations and that the pdfs and/or samples of noise processes are available. The approach is completely recursive on the constrained conditional pdf. Since an approximation of the state pdf is computed, all relevant summary statistics can be readily obtained in contrast with MHE. While the online computations in CCF are essentially limited to a matrix-vector product and a Haddamard product, these scale very poorly with the dimension of the state vector. As in the case of NIF, a large number of cells are necessary to approximate the density adequately and avoid biases in estimates. This significant drawback limits the use of CCF to low dimensional problems, where it may provide a computationally efficient alternative to MHE without requiring simplifying assumptions. 6. Simulation Examples The simulation examples presented here are intended to illustrate the benefits of the constrained cell filter over online optimization based methods on examples from the literature. Constraints on states and noise processes are considered for systems of one to three states. The current focus is mainly on constrained estimation because examples involving nonlinearity in system model and measurement equations have been presented before.7 The estimation accuracy is measured by the following term, K
MSE )
n
∑∑
1 (x (i) - ˆxk(i))2 Kn k)1 i)1 k
(37)
where K is the number of measurements and n is the length of the state vector. In each example a typical sample path is shown in plot while the MSE and computational time are tabulated from averages of 100 realizations implemented in MatLab on a 3 GHz Intel Linux computer. 6.1. Constraints on Noise. 6.1.1. Measurement Noise. Consider the following random walk system with zero mean, unit variance Gaussian process noise, wk ∼ N(0, 1), and additive measurement noise, xk ) xk-1 + wk-1
(38)
yk ) xk + νk
(39)
The measurement noise is sampled from an asymmetrical pdf composed of left- and right-handed half-Gaussian pdfs facing each other at their common mean µ with different variances, p(ν) ) ce-
(ν
- µ)2/2σ2
(40)
where c is a normalizing constant, µ ) -3.5, σ ) 0.5, if ν < µ, and σ ) 5, otherwise. The parameters of the half-Gaussians are chosen such that the noise has a mean of zero and the variance equals 10.15 A sequence of 200 data points are simulated from an initial condition of zero. It is assumed that the prior information about the initial condition is available as p(x0) ∼ N(0, 1). Moving Horizon Estimation. A standard Kalman filter may use the noise statistics of zero mean and variance of ten. In MHE this results in a quadratic penalty function with fixed weight. However, the double half-Gaussian pdf suggests that
7318 Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 Table 1. Average MSE and CPU Time for Random Walk System estimator KF MHE CCF
horizon/cells 1 2 3 20 40 60
modeling (CPU s)
estimation (CPU s)
MSE
0.02 12.7 21.8 0.04 0.04 0.05
2.7 2.7 2.7 2.2 2.1 2.0
n/a n/a n/a 0.07 0.08 0.10
the penalty function must be asymmetric. The MHE is formulated in a horizon of 2, with Q ) 1 and the prior statistics of the state, P ) 1. The weight on the penalty function corresponding to the measurement error is R ) 0.25, if ν < µ, and R ) 25, otherwise. A standard Kalman filter is used to approximate the arrival cost in MHE. Constrained Cell Filter. An arbitrarily chosen state space, x ∈ [-20, 20], is discretized into 60 uniform cells. The cell transition probability matrix P, is computed by GCM with 25 uniformly sampled initial conditions per cell. Essentially, the linear system is simulated one step forward for 1500 random initial conditions with process noise sampled from N(0, 1). Table. 1 lists the computational cost of GCM for increasingly higher resolution in cell space, all with 25 samples per cell. The measurement equation is easily invertible, and the following likelihood function is used to compute the likelihood mass at the center of each cell with σ ) 0.5, if x > y - µ; σ ) 5, otherwise; and (41) p(y|x) ) ce- y - x - µ /2σ The estimation results from the three methods are shown in Figure 4. The estimation error of MHE is 2.2, which is lower than the estimation error of 3.3 for the Kalman filter. Switching the tuning parameter R based on the shape of the noise pdf is beneficial; however, the improvement is gained at a higher computational cost due to online optimization. The estimation error of the CCF is 1.8, which compares favorably to that of MHE. See Table 1 for the average estimation error and CPU time for all three estimators. For the sample path in Figure 4, the MHE produced significant improvement over the Kalman filter; however, on the average, the MHE is no more accurate than Kalman filter (Table 1). The CCF showed better accuracy even for large cell sizes at a computational cost comparable to that of a Kalman filter. The non-Gaussian nature of this simple linear problem is seen in Figure 5. At time instances, k ) 15 and 19, respectively, the location of the measurements and the asymmetric lmvs are shown. Although the prior pmvs are fairly symmetric and close (
)2
2
Figure 4. Estimation results for random walk system.
Figure 5. A priori probability, a posteriori probability, and likelihood mass vectors at time instances k ) 15 and 19 for random walk system.
to a Gaussian in shape, the shape of the non-Gaussian posterior pmv depends on the overlap between the likelihood and prior, which shifts at each time instance. In Figure 5, the means and the location of the modes of the posterior pmv are shown along with the correct value of the state. Note that multiple modes in the pmv are artifacts of insufficient resolution and sampling. 6.1.2. Process Noise. Consider the following discrete-time linear dynamic system taken from the work of Rao and Rawlings,14 xk )
[
]
[
]
0.9962 0.1949 0.03393 x + w -0.1949 0.3815 k-1 0.19490 k-1 yk ) [1 -3 ]xk + νk
(42) (43)
where the process noise wk ) |zk|, zk ∼ N(0, 1), and the measurement noise νk ∼ N(0, 0.12). The mean value of wk is (2/π)1/2 and the variance is 1 - 2/π with the inequality wk g 0. The system is simulated for 100 time steps from the initial condition [2.5 0]T. The prior information on the initial conditions is assumed to be p(x0) ∼ N([2.5 0]T, 0.32I2), where Ii is an identity matrix of size i. Moving Horizon Estimation (Case 1). The assumption of zero mean and unit variance for process noise will result in poor estimation. However, for a fair comparison zero mean and unit variance are set for wk, both for a Kalman filter (KF1) and moving horizon estimation (MHE1) formulated in a horizon of 5 with weights Q ) 1, R ) 0.01, P ) 0.09I2, and the nonnegativity constraint on noise is imposed as reported previously.14,16 A sample path realization is shown in Figure 6. The MHE1 estimation results have an estimation error of 2.68 × 10-2, which is superior to the performance of the Kalman filter with an estimation error of 19.4 × 10-2. While KF1 is biased due to incorrect statistics of the process noise, the inequality constraint in MHE1 aided in partially compensating for the incorrect statistics. Darby and Nikolaou16 reported improved accuracy and reduced variance of the estimates with larger horizons. Moving Horizon Estimation (Case 2). Figure 7 shows the estimation results of moving horizon estimation (MHE2) and Kalman filter (KF2) with the correct mean and variance of wk. MHE2 showed only small improvement over MHE1 with estimation error of 2.55 × 10-2. The performance of KF2 is a big improvement with an estimation error of only 1.14 × 10-2. Note that KF2 did not use the inequality constraint wk. Even
Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 7319 Table 2. Average MSE and CPU Time for Linear System with Nonnegative Process Noise estimator
horizon/cells
modeling (CPU s)
KF1 KF2 MHE2
1 1 3 5 7 30 × 20 60 × 40 90 × 60
n/a n/a n/a n/a n/a 0.5 4.8 31.5
CCF
Figure 6. Kalman filter and moving horizon estimation with zero mean and unit variance for wk.
estimation (CPU s)
MSE × 102
0.01 0.01 8.26 16.7 28.2 0.03 0.15 0.29
23.2 1.21 3.11 2.68 2.39 0.63 0.45 0.43
generating the transition probability matrix P, for various cell sizes all with 400 samples per cell is shown in Table 2. Since the state space is densely sampled for cell mapping, the CPU time is on the order of minutes. The estimation results of CCF are shown in Figure 8 with an estimation error of 0.43 × 10-2. The average mean squared error and CPU time for 100 realizations are shown in Table 2 for all three estimators. The results indicate that CCF can be accurate even for relatively coarse discretization of the cell space. More importantly, greater accuracy is achieved at a fraction of the online computational cost of MHE. Figure 9 shows the marginal posterior pmvs at time instants 16, 17, and 18. The pmvs display a tendency to be one sided due to the half-Gaussian nature of process noise, while multiple modes are artifacts of coarse cell sizes.
Figure 7. Kalman filter and moving horizon estimation with correct mean and variance for wk.
with an implicit assumption that the first two moments adequately represent the non-Gaussian pdf of wk, using the correct statistics is beneficial to reduce the bias in the Kalman filter estimates. Rao and Rawlings14 concluded that use of inequality constraint and proper statistics of wk explained the superior performance of MHE2 over KF1. A more appropriate comparison can be made between MHE2 and KF2, both of which use correct noise statistics, which has not been reported in the literature. The poor performance of MHE can be attributed to the choice of the tuning parameters. The arrival cost from a Kalman filter (such as KF1) in MHE1 is biased due to incorrect noise statistics. The bias is partially compensated for by the use of the inequality constraint for the specific tuning parameters. On the other hand, using correct noise statistics (such as KF2), the arrival cost in MHE2 is far more accurate and the constraint does little to improve the estimates. Recognizing the greater accuracy of the arrival cost with KF2, the estimator must be tuned such that Q . 1 to improve the performance of MHE2 over KF2. Constrained Cell Filter. The region of state space containing x(1) ∈ [0, 3] and x(2) ∈ [-1, 1] is discretized into 60 × 40 uniform cells. Four hundred initial conditions are uniformly sampled per cell. In total, 0.96 million initial conditions are mapped forward for one time step. The computational cost of
Figure 8. Constrained cell filter estimation of linear system with nonnegative process noise.
Figure 9. Marginal a posteriori probability mass vectors at time instances k ) 16, 17, and 18.
7320 Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 Table 3. Average MSE and CPU Time for Nonlinear Batch Reactor (Modeling Performed by Compiled C Code Using CVODE Libraries) estimator EKF MHE2 CCF
modeling (CPU min)
horizon/cells 1 2 3 4 20 30 40 50 60
× × × × ×
20 30 40 50 60
× × × × ×
n/a n/a n/a n/a 8.20 23.4 43.2 94.1 128
40 50 60 70 80
estimation (CPU s)
MSE × 103
0.78 602 3430 4189 0.63 1.98 4.31 8.07 13.7
300 2.25 2.31 2.22 2.25 1.81 1.56 1.50 1.44
6.2. Constraints on States. 6.2.1. Nonlinear Batch Reactor. Consider the following gas-phase reversible reactions in a constant volume isothermal batch reactor19 k1
A y\z B + C
(44)
k2 k3
2B y\z B + C
(45)
k4
where k ) [0.5 0.05 0.2 0.01]. Unsteady-state mass balances yield the following set of nonlinear ODEs, dCA ) - k1CA + k2CBCC dt
(46)
dCB ) k1CA - k2CBCC - 2k3CB2 + 2k4CC (47) dt dCC ) k1CA - k2CBCC + k3CB2 - k4CC (48) dt where Cj is the concentration of species j in moles per liter. The measurements are generated by sampling at a constant time interval ∆t ) 0.25 according to the following reactor pressure measurement equation y ) RT(CA + CB + CC) + ν
(49)
with RT ) 32.84 mol · atm/L and ν ∼ N(0, 0.25 ). The system is simulated for 120 time steps from the initial conditions CA(0) ) 0.5, CB(0) ) 0.05, and CC(0) ) 0. The state vector is defined as x ) [CA CB CC]T, and the following discrete-time process model is available to the state estimators, 2
∫
dx dτ + wk-1 (50) dτ where wk ∼ N(0, 0.0012I3). It is assumed that the information about the initial condition is poorly known, which is summarized by the Gaussian prior p(x0) ∼ N([0, 0, 4], 0.52I3). To be physically meaningful, the concentrations cannot be negative, i.e., the states must obey the inequality constraint x g 0. Moving Horizon Estimation. For most data sets the EKF results in negative concentrations for components A and B and converges to wrong steady states. On the average the EKF has a large estimation error of 0.3. The MHE is formulated in a horizon of 2 with parameters R ) 0.252, Q ) 0.0012I3, P ) 0.52I3, and the constraints. The model is simulated by Runge-Kutta integration with ode45 function in MatLab. The arrival cost is approximated by the EKF. Even with poor a priori information, the MHE quickly converged to the true concentrations, with an average estimation error of 2.25 × 10-3. Larger horizon sizes only slightly improved the accuracy, but dramatically increased the computation time (see Table 3). Numerical xk ) xk-1 +
k∆t
(k-1)∆t
Figure 10. Constrained cell filter estimation of nonlinear batch reactor.
integration accounts for a large fraction of the load. Euler integration is faster but causes buildup of discretization errors. More efficient computation was reported with Octave and compiled code in C for optimization.19 Constrained Cell Filter. The state space bounded by CA ∈ [0, 0.5], CB ∈ [0, 0.5], and CC ∈ [0, 0.7] is discretized into 40 × 40 × 60 cell space. The lower limits are imposed by the constraints and upper limits are chosen to avoid region where the concentrations are unlikely to occur. Using 500 samples per cell GCM requires 48 million computations of the model. Interpreted code such as MatLab is impractical for such large scale simulation. Numerical integration is performed with Adams method using CVODE libraries in C, and the transition probability matrix is imported into MatLab for CCF. Table 3 lists the computational time required for generating P, all with 500 samples per cell. The plots in Figure 10 demonstrate that CCF converges to the correct states swiftly from the bad initial condition. Since both the a priori pmv and the lmv assign zero probability for the region outside the cell space, inequality constraints are not violated by the a posteriori pmv; hence, the mean and mode are constrained. The online computational load is a fraction of that of MHE. Trends in Table 3 indicate that very small cell sizes are not essential for lowering the estimation error. Cell size has little effect on the mean and mode when the shape of the a posteriori pmv is captured adequately as shown in Figure 1 6.2.2. Linear Batch Reactor. Consider a batch reactor system with the following first order reactions, adapted from the work of Haseltine27 k1
A y\z B 98 C k2
(51)
k3
where k ) [0.06 0.03 0.001]. The total number of moles remain constant in the reactor. A set of linear ODEs is used to describe the dynamics of the mole fractions, dx ) dt
[
]
-k1 k2 0 k1 -k2 - k3 0 x k3 0 0
(52)
where x ) [xA xB xC]T is the mole fractions vector, which must obey 0 e xi e 1 and Σxi ) 1. Noisy measurements of the mole fractions of only species A and B are available online. The objective is to filter the measurements and estimate the unmeasured mole fraction. The system is simulated for 50 time
Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008 7321 Table 4. Average MSE and CPU Time for a Linear Batch Reactor estimator KF QP CCF
horizon/cells (constrained) 1 1 50 × 50 (148) 100 × 100 (491) 200 × 200 (1765) 300 × 300 (3823)
modeling estimation MSEAB MSEC (CPU min) (CPU s) × 105 × 105 n/a n/a 0.2 0.6 2.3 5.3
0.005 0.147 0.009 0.018 0.051 0.104
9.9 4.9 3.4 2.2 1.6 1.4
22 3.2 4.8 3.2 2.0 1.4
1 min Φ ) xTWx + hTx 2 x˜ s.t. Ax ) b xg0
(55) T -1
Figure 11. Constraint plane in a unit cube.
Figure 12. Estimation results for a linear batch reactor.
steps from the initial condition x(0) ) [1 0 0]T and sampled at ∆t ) 1 according to the following equation
[
]
1 0 0 x + νk (53) 0 1 0 k where ν ∼ N(0, 0.022I2). A discrete-time linear dynamic model based on Euler numerical integration is available for state estimation as follows, yk )
[
]
1 - k1 k2 0 1 - k2 - k3 0 xk-1 + wk-1 k1 (54) k3 1 0 where wk is zero mean Gaussian noise with covariance matrix Q ) diag([0.012 0.012 0.00012]). The prior mean is µ ) [1 0 0]T with covariance matrix P ) diag([12 12 0.012]). Kalman Filter. The system is not observable because the measurement matrix is not of full column rank. One may estimate xA and xB using a Kalman filter with a two-dimensional model with Q ) 0.012I2, R ) 0.022I2, and P ) I2 and subsequently compute xC ) 1 - xA - xB. The bottom panel of Figure 12 displays the estimate of xC, where the non-negativity constraint is violated and the estimates show large variance. Quadratic Programming. At each sampling time, the Kalman filter for all three sates can be posed as a quadratic programming problem with additional constraints as follows, xk )
-1
where the weighting matrices are W ) 2(C R + P ), h ) -2(yR-1 + µP-1), and A ) [1 1 1], b ) 1. Note that constraints are not imposed on covariance prediction and correction. The estimates of xC in the bottom panel of Figure 12 are non-negative and the variance is also reduced. Constrained Cell Filter. Bounds on the mole fractions naturally limit the physically meaningful state space to a unit cube at the origin (see Figure 11). However, the region of interest may be more effectively constrained by choosing the upper limits to be more practical, for instance the value of xC is mostly close to zero than one. Cell space is defined over xA ∈ [0.3 1], xB ∈ [0 0.7], and xC ∈[0 0.03]. The algebraic equality constraint places additional restrictions inside this cuboid. The allowable mole fractions exist only on a triangular plane, i.e., a two-dimensional state space (see Figure 11). One need only to define regular cells through which the constraint plane passes, and the remaining state space is labeled as the sink cell. Since a cell is still three-dimensional, not every point in the cell satisfies the constraint. Hence, the sampled initial conditions in a cell are chosen such that they are close to the constraint plane within (10-6 tolerance. When the range of xA and xB is discretized into 300 × 300 cells, the number of constrained cells is only 3823. See Table 4 for modeling CPU times and the number of constrained cells for different resolutions with 400 samples per cell. A majority of this time is consumed for identifying the constrained cells. Figure 12 plots the CCF estimates showing noticeably less variance for xC. The results are summarized in Table 4. Estimation errors are separately shown for xC and the results indicate improved estimates for xA and xB also. 7. Conclusions The state of a nonlinear/non-Gaussian dynamic system can be inferred from its conditional probability density function (pdf) whose support must be delineated by realistic constraints or bounds on the random variables. This paper proposed an extension to our work on cell filter to incorporate constraints and recursively generate a piecewise constant approximation of the conditional pdf. State equality and inequality constraints are imposed by restricting the probabilistic dynamics of the system to a discretized constraint surface or set of cells. Constraints on process inputs are naturally included in the computation of the transition probability matrix. Thus, any point estimate drawn from the transient conditional pdf will naturally obey the constraints. The constrained cell filter (CCF) is a departure from the traditional online optimization based approach of moving horizon estimation (MHE) for constrained processes. The CCF can be implemented in about thirty lines of code without linearization, jacobians or nonlinear programming. In the problems investigated, the online computational
7322 Ind. Eng. Chem. Res., Vol. 47, No. 19, 2008
time was several orders of magnitude smaller than MHE. The method is limited to similarly low dimensional problems, because memory and storage needs for offline computations rapidly increase with dimension. The simulation examples in this paper provide the typical computational cost of GCM required for one to three-dimensional systems when the process models are simple difference equations as well as numerically integrated differential equations. Acknowledgment This material is based upon work supported by the National Science Foundation under Grant Nos. CTS-0433527 and CTS0522864. Support from the EFFRD program at CSU is gratefully acknowledged. Literature Cited (1) Ho, Y. C.; Lee, R. C. K. A Bayesian approach to problems in stochastic estimation and control. IEEE Trans. Auto. Control 1964, 9, 333– 339. (2) Tanazaki, H. Nonlinear Filters: Estimation and Applications; Springer: New York, 1996. (3) Jazwinski, A. H. Stochastic Processes and Filtering Theory; Academic Press: New York, 1970. (4) Julier, S.; Uhlmann, J. Unscented filtering and nonlinear estimation. Proc. IEEE 2004, 92, 401–422. (5) Alspach, D. L.; Sorenson, H. W. Nonlinear Bayesian estimation using Gaussian sum approximations. IEEE Trans. Auto. Control 1972, AC-17 (4), 439–448. (6) Chen, W. S.; Bakshi, B. R.; Goel, P. K.; Ungarala, S. Bayesian estimation via sequential Monte Carlo sampling: unconstrained nonlinear dynamic systems. Ind. Eng. Chem. Res. 2004, 43, 4012–4025. (7) Ungarala, S.; Chen, Z. Z.; Li, K. Bayesian state estimation of nonlinear systems using approximate aggregate Markov chains. Ind. Eng. Chem. Res. 2006, 45, 4208–4221. (8) Lang, L.; Chen, W. S.; Bakshi, B. R.; Goel, P. K.; Ungarala, S. Bayesian estimation via sequential Monte Carlo sampling - constrained dynamic systems. Automatica 2007, 43 (9), 1615–1622. (9) Jang, S.; Joseph, B.; Mukai, H. Comparison of two approaches to on-line parameter and state estimation of nonlinear systems. Ind. Eng. Chem. Process Des. DeV. 1986, 25, 809. (10) Ramamurthi, Y.; Sistu, P. B.; Bequette, B. W. Control-relevant dynamic data reconciliation and parameter estimation. Comp. Chem. Eng. 1993, 17 (1), 41–59.
(11) Liebman, M. J.; Edgar, T. F.; Lasdon, L. S. Efficient data reconciliation and estimation for dynamic processes using nonlinear programming techniques. Comput. Chem. Eng. 1992, 16, 963. (12) Findeisen, P. K. Moving horizon estimation of discrete time systems. MS thesis, University of Wisconsin, Madison, WI, 1997. (13) Robertson, D. G.; Lee, J. H.; Rawlings, J. B. A moving horizonbased approach to least squares estimation. AIChE J. 1996, 42 (8), 2209. (14) Rao, C. V.; Rawlings, J. B. Constrained process monitoring: Moving-horizon approach. AIChE J. 2002, 48 (1), 97–109. (15) Robertson, D. G.; Lee, J. H. On the use of constraints in least squares estimation and control. Automatica 2002, 38, 1113–1123. (16) Darby, M. L.; Nikolaou, M. A parametric programming approach to moving horizon state estimation. Automatica 2007, 43, 885–891. (17) Diehl, M.; Bock, H. G.; Schloder, J. P. A real time iteration scheme for nonlinear optimization in optimal feedback control. SIAM J. Control Optim. 2005, 43 (5), 1714–11736. (18) Hedengren, J. D.; Edgar, T. F. In situ adaptive tabulation for realtime control. Ind. Eng. Chem. Res. 2005, 44, 2716–2724. (19) Haseltine, E. L.; Rawlings, J. B. Critical evaluation of extended Kalman filtering and moving horizon estimation. Ind. Eng. Chem. Res. 2005, 44, 2451–2460. (20) Lee, R. C. K. Optimal Estimation, Identification, and Control; MIT Press: Cambridge, MA, 1964. (21) Lasota, A.; Mackey, M. C. Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics; Springer-Verlag: New York, 1994. (22) Rao, C. V.; Rawlings, J. B.; Mayne, D. Q. Constrained state estimation for nonlinear discrete-time systems: Stability and moving horizon approximations. IEEE Trans. Auto. Control 2003, 48 (2), 246–258. (23) Schon, T. F.; Gustafsson, T. S.; Hansson, A. A note on state estimation as a convex optimization problem. In Proceedings of ICASSP; Hong Kong, April 6–10, 2003. (24) Haseltine, E. L. Systems Analysis of Stochastic and Population Balance Models for Chemically Reacting Systems. PhD thesis, Department of Chemical Engineering, University of WisconsinsMadison, 2005. (25) Hsu, C. S. Cell-to-Cell Mapping: A Method of Global Analysis for Nonlinear Systems, Vol. 64 of Applied Mathematical Sciences; SpringerVerlag: New York, 1987. (26) Kitagawa, G. Non-Gaussian state-space modeling of nonstationary time series. J. Am. Stat. Assoc. 1987, 82 (400), 1032–1063. (27) Haseltine, E. L. Applications of moving horizon estimation. Texas Wisconsin Modeling and Control Consortium, Madison, WI, Fall 2000.
ReceiVed for reView February 15, 2007 ReVised manuscript receiVed May 6, 2008 Accepted June 10, 2008 IE070249Q