Bayesian State Estimation of Nonlinear Systems Using Approximate

The conditional probability density function (pdf) is the most complete statistical representation of the state from which optimal inferences may be d...
0 downloads 0 Views 470KB Size
4208

Ind. Eng. Chem. Res. 2006, 45, 4208-4221

Bayesian State Estimation of Nonlinear Systems Using Approximate Aggregate Markov Chains Sridhar Ungarala,* Zhongzhou Chen, and Keyu Li Department of Chemical and Biomedical Engineering, CleVeland State UniVersity, CleVeland, Ohio 44115

The conditional probability density function (pdf) is the most complete statistical representation of the state from which optimal inferences may be drawn. The transient pdf is usually infinite-dimensional and impossible to obtain except for linear Gaussian systems. In this paper, a novel density-based filter is proposed for nonlinear Bayesian estimation. The approach is fundamentally different from optimization- and linearization-based methods. Unlike typical density-based methods such as probability grid filters (PGF) and sequential Monte Carlo (SMC), the cell filter poses an off-line probabilistic modeling task and an on-line estimation task. The probabilistic behavior is described by the Foias or Frobenius-Perron operators. Monte Carlo simulations are used for computing these transition operators, which represent approximate aggregate Markov chains. The approach places no restrictions on system model or noise processes. The cell filter is shown to achieve the performance of PGF and SMC filters at a fraction of the computational cost for recursive state estimation. Introduction The problem of state estimation from noisy measurements in dynamic systems was posed in a Bayesian framework at least four decades ago.1,2 It is also a nested problem in the field of optimal control of stochastic systems.3 The conceptual solution is to construct a probability density function (pdf) of the states based on all available information, both historical and present. This conditional pdf embodies the most complete statistical information about the state, on which any optimality criterion may be used to draw a representative state estimate. The functional forms of the system dynamic model and the measurement equation, as well as the type of density functions of the uncertainties, determine the evolution of this conditional pdf. It is translated, distorted, and spread over the state space at each time instance in the observation sequence. A model of the temporal evolution of the state pdf is necessary to formulate a recursive estimation algorithm. Generally, it is impossible to physically realize the conditional pdf since it is infinitedimensional, except for special cases such as linear Gaussian systems and truly discrete state systems.4 The state-estimation problem continues to receive significant attention for nonlinear systems. A common theme among many estimation methods found in chemical engineering literature is a functional approximation of the nonlinear system and/or fixedshape state pdfs represented by the first two moments.5-7 Linearization allows the formulation of a tractable least-squares optimization problemsoften not appropriate, but more convenient to solve. Because of the shape invariance of the Gaussian pdf subject to linear dynamics, it is possible to use recursive relationships for the transient mean and covariance. Consequently, the Kalman filter and its variants based on Taylor series approximations are popular approaches for recursive estimation.8 The pitfalls of linear Gaussian approximations are widely recognized for nonlinear/non-Gaussian systems.5,9,10 For instance, bias in the residuals, correlation of states and nonGaussian residuals, correlation of the residuals in system and measurement equations, etc., yield nonrobust filters, causing divergence.11 Improved suboptimal methods are advocated for * Corresponding author. Phone: (216) 687-9368. Fax: (216) 6879220. E-mail: [email protected].

problems where nonlinearity and non-Gaussianity are sufficiently severe to render the Taylor series approximations of nonlinear functions invalid. Optimization-based estimation methods have become more practical with advances in optimization software and computing. The moving-horizon estimators (MHE) pose nonrecursive batch least-squares problems subject to nonlinear models.5,7,12-16 The optimization is performed using nonlinear or quadratic programming, which allows constraints yielding improved performance. A probabilistic interpretation of the least-squares objective function still requires the Gaussian assumption (or truncated Gaussian under constraints).15,17 Gaussian series approximations have been proposed to handle non-Gaussian densities, but tuning the series is a formidable task because of the transient nature of the underlying density.10 The MHE are computationally demanding and require careful formulation and initialization. Instead of approximating functional nonlinearity, a different class of suboptimal filters are proposed, which represent the state pdfs with a finite set of quantities. These density-based methods are not yet common in the chemical engineering field. Among the earliest, the Gaussian sum filter approximates the non-Gaussian pdf as an expansion of Gaussian densities.18,19 The filter basically involves a collection of extended Kalman filter (EKF) equations and follows the weighted average of EKFs. The number of Gaussians and their summary statistics are difficult to tune, and the filter is still biased. Recently, the use of the unscented transform has been shown to improve the performance of the EKF. The unscented Kalman filter (UKF) employs a set of deterministically selected points from the Gaussian approximation of the state pdf.20,21 Since the evolution of the Gaussian is followed by propagating these points through the nonlinear model, the approach is both more accurate and more complex than the EKF. However, the Gaussian approximation will fail if the true pdf is multimodal or highly skewed. During the past two decades, several numerical density-based approches have been proposed, collectively known as grid methods. The pdfs of interest are represented as piecewise linear or discrete values on a grid of points in state space. Notable among these are the numerical integration filter (NIF)22,23 and probability grid filter (PGF).24,25 The grid filters follow the evolution of state pdfs as a collection of lines on the grid at

10.1021/ie050362l CCC: $33.50 © 2006 American Chemical Society Published on Web 05/06/2006

Ind. Eng. Chem. Res., Vol. 45, No. 12, 2006 4209

each time instance. NIF is applicable to a wide class of nonsingular and differentiable nonlinear systems but suffers from rapidly increasing memory and computational demands for highdimensional systems. The approach is optimal if the state space is discrete and finite. The PGF is more efficient since it uses an adaptive grid, but it is limited to monotonic nonlinearities. Ten years ago, the grid filter was proposed for nonlinear batch reactors citing the then-available computing power.25 It appears that the large computational load is still a major contributor for the lack of attention to grid filters in the chemical engineering area. The Monte Carlo density-based methods abandon the quest for finite parameter approximations of the conditional pdf in the state space. Instead, they rely on a set of samples distributed according to the unknown pdf to estimate its properties. These filters are known by various names such as particle filters, weighted bootstrap filters, and sequential Monte Carlo (SMC) filters.26-28 They were introduced for state estimation more than three decades ago.29 More recently, they have received attention in many areas including statistics,30,31 forecasting,32 and chemical processes.28 The renewed interest in Bayesian methods is attributed to the simplicity of Monte Carlo implementations and the availability of cheap computing power.33,34 The SMC filters have practically no restrictions on the models and types of densities and are effective alternatives to nonlinear optimizationbased batch-estimation methods.28 Generally, all the density-based algorithms tend to be rather slow numerical processes requiring large amounts of computation for on-line estimation tasks. This disadvantage is manifested in the efforts of these methods to capture the transition of the state pdf repeatedly at every time instance. For instance, the forward propagation of a set of sigma points in UKF, a set of random samples in SMC, and all the grid nodes in PGF entails the solving of an increasingly larger number of initial-value problems on-line. Although they are computationally more demanding, the density-based algorithms are known to be less biased and more robust than the methods based on function approximations.9 In this paper, we submit a novel density-based state estimation approach called the cell filter, which is fundamentally different from the optimization-based methods and the Kalman filter variants. The cell filter approximates the continuous state space as a finite collection of discrete hypervolumes called cells; hence, it is a grid method. The system dynamics are represented as coarse-grained cell-to-cell stochastic transition sequences. This is an important distiction from grid filters, which use state trajectories growing from the grid nodes. The transition probabilities between the cells are computed using Monte Carlo sampling and simulation; hence, the cell filter is also a Monte Carlo-based density filter. Unlike SMC, the Monte Carlo simulations are performed off-line. The cell filter splits the implementation of the generic densitybased filter into two tasks: (1) a modeling task and (2) an estimation task. A class of linear integral operators are known to exist that describe densities (or more general measures) in transition due to discrete-time state transition maps.35 These Markov operators, describing the transition probability in the state space, have been extensively studied in measure theory36 and global analysis.37 In the modeling task of the cell filter, a linear operator is developed for the transient cell probabilities in the form of an aggregate Markov chain. The modeling is a one-time task performed off-line for a given discretization. The resulting transition probability matrix is a discretized approximation of the Foias or Frobenius-Perron operators. This

appears to be the first work to utilize these operators in Bayesian estimation. The cell filter places no restrictions on the functional nonlinearities or the types of densities, unlike the traditional grid filters. The only requirement is that the nonlinearities are computable and samples can be generated from the noise pdfs. Finally, the estimation stage, performed on-line, requires significantly less computation in comparison with grid filters or SMC, because the on-line propagation of state trajectories is completely eliminated. The main motivation for the cell filter is to speed up the density-based filter for on-line applications. The outline of the paper is as follows: the formulation of the Bayesian estimation problem is laid out in state space along with the relevant discussion on the difficulties of realizing the solution. The grid filter and SMC are briefly explained, since the cell filter bears a close resemblance to them. Estimation in cell space is detailed with the cell filter algorithm. The development of the Markov chain model is shown using generalized cell mapping. The cell filter is compared with the SMC filter and PGF using simulation examples of a benchmark nonlinear problem, the quadratic map, a continuous stirred tank reactor (CSTR), and a batch reactor. In each case, the computational demands and estimation accuracy are compared. 2. Bayesian Estimation in State Space Consider the following discrete-time stochastic dynamic system model and measurement equation,

xk ) f(xk-1,wk-1)

(1)

yk ) h(xk,νk)

(2)

where f: (Rn Rm) f Rn and h: (Rn Rp) f Rp are nonlinear vector-valued functions of states and noise processes. The state vector is xk ∈Rn, and the system noise wk ∈Rm is an independent and identically distributed (i.i.d.) white noise process with a known pdf pw(wk). The measurement vector is yk ∈ Rp, and the measurement noise νk ∈ Rp is an i.i.d. white noise process distributed according to the pdf pν(νk). Noise processes wk and νk are independent of each other. It is assumed that the initial condition x0 is characterized by a known pdf p(x0). There are no restrictions on the functional forms, and all pdfs are generally non-Gaussian. The central issue in the problem of state estimation is the construction of a pdf for the unknown state xk, based on all available information at time instance k. Bayesian estimation aims to construct the conditional pdf of the state by fusing historical knowledge and measurement information according to the Bayes rule,

p(xk| yk) )

p(yk| xk)p(xk| yk-1) p(yk)

(3)

where the a priori knowledge about the state, represented by the conditional pdf p(xk|yk-1), is modified into the a posteriori conditional pdf p(xk|yk), in light of the current measurement information represented by the likelihood function p(yk|xk). The denominator is a normalizing constant independent of xk. Recursive Bayesian estimation of dynamic systems is performed in three stages at each time instance: (1) a prediction stage, where the prior pdf is generated from the previous posterior pdf; (2) an update stage, where the posterior pdf is constructed from the prior pdf and the likelihood function; and (3) an inference stage, where the appropriate moments or the largest mode of the posterior pdf are computed for optimal point

4210

Ind. Eng. Chem. Res., Vol. 45, No. 12, 2006

used for propagating the posterior pdf conditioned on measurements up to time instance k - 1 into a conditional prior pdf at time k,

p(xk| yk-1) ) Pf p(xk-1| yk-1)

Figure 1. Recursive Bayesian estimation.

estimates of the state. Figure 1 shows the three stages of Bayesian estimation at any time instance. Recursive Bayesian estimation is intuitively appealing, yet in practice it poses formidable difficulties for nonlinear systems. 2.1. Prediction Stage. In stochastic dynamic systems such as eq 1, the pdf of the initial condition p(x0) is distorted, translated, and spread with time by virtue of the nonlinear map f and the characteristics of the stochastic excitation due to w. In principle, given the current pdf of the state p(xk-1), the dynamic model may be used to predict the future state pdf p(xk). The temporal evolution of the probability density is given according to the Chapman-Kolmogorov equation,

p(xk) )

∫p(xk| xk-1)p(xk-1) dxk-1

(4)

which involves the transition probability density function p(xk|xk-1). Under the restrictive assumptions of a nonsingular, continuously differentiable, and monotonic nonlinear map f, the density evolution may be explicitly written as follows,8

p(xk) )



|| ||

pw(f-1(xk,xk-1))

∂f-1 p(xk-1) dxk-1 ∂xk

where ||‚|| is the absolute value of the determinant of the Jacobian matrix. This formulation, with the restrictive assumptions, is used in the grid filters to compute the probabilities at grid nodes.22,25 More generally, for an arbitrary nonsingular map f, the evolution of the state pdf is described by a linear integral operator Pf, known as the Foias operator in measure theory,35

p(xk) ) Pˆf p(xk-1)

(6)

where the operator is implicitly represented as

∫Pf p(xk-1) ) ∫ ∫{pw(wk-1) dwk-1} p(xk-1) dxk-1 f-1

Unfortunately, Pf is difficult to obtain analytically, since the inverse map of f may have complicated geometry even for relatively simple nonlinearities. Nonmonotonic and discontinuous functions further complicate the problem. Except in the case of linear Gaussian dynamic systems, very little can be done, either analytically or numerically, to obtain a general solution to Foias and Frobenius-Perron operators in order to follow the evolution of the state pdf due to an arbitrary state transition map. 2.2. Update Stage. The update stage requires that the measurement noise pdf, pν(νk), be known to compute the likelihood function. If the measurement map h and its inverse map h-1 are continuously differentiable, the likelihood function is explicitly obtained as follows,8

p(yk| xk) ) pν(h-1(xk,yk))

|| ||

∂h-1 ) Lh(xk,yk) ∂yk

(7)

Although the focus is limited to probabilistic measures here, the Foias operator describes the evolution of general Lebesgue measures. If wk-1 ) 0 the operator Pf is known as the Frobenius-Perron operator, a special case of the Foias operator for deterministic systems. Both the Foias and the FrobeniusPerron operators are particular Markov operators and, thus, inherit the Markov properties. Since Pf is Markov, it may be associated with a stationary Markov chain. All the information about the evolution of the stochastic nonlinear system is encoded into the linear operator Pf, on the space of probability measures. This operator acts on p(xk) in the same manner as the nonlinear map f acts on points xk, in state space. The transition operator Pf provides the system model for the prediction stage of recursive Bayesian estimation to generate the prior information. An extension of eq 6 may be

(9)

Except for linear measurement maps or for scalar systems, it is generally not possible to analytically compute the likelihood function Lh(xk,yk), since the inverse map, h-1, may not be wellbehaved. If the prior pdf and the likelihood function are available according to eqs 8 and 9, respectively, the recursive Bayesian estimation solution is

p(xk| yk) ) ηLh(xk,yk)Pf p(xk-1| yk-1) (5)

(8)

(10)

which obtains the posterior pdf of state xk conditioned on measurement yk, given the likelihood function at time k and the previous posterior pdf of state at k - 1. The normalizing constant η is



η ) ( Lh(xk,yk)Pf p(xk-1| yk-1) dxk)-1

(11)

2.3. Inference Stage. The state of the nonlinear estimator is represented by the entire conditional pdf; hence, it is infinite in size. The generally non-Gaussian posterior pdf cannot be characterized by a finite set of summary statistics. Although the whole pdf is needed to make any conclusions about the states, point estimates are often used as descriptive of the state. The estimated state, xˆ k, is, thus, an inference drawn from the a posteriori pdf. An optimal inference is typically drawn by computing the maximum or a conditional expectation of the posterior,

xˆ k )

{∫

arg maxxˆ k p(xk|yk) φ(xk)p(xk|yk) dxk

or (12)

where φ(‚) is any suitable real function. The conditional mean, mode, and median are commonly used as optimal inferences according to minimum variance (MV), maximum a posteriori (MAP), and minimax (MM) optimality criteria, respectively.38 If the prior pdf is assumed uniform, the mode of the likelihood function yields the maximum likelihood (ML) estimate. In the case of linear Gaussian systems, the pdfs are characterized by mean vectors and covariance matrices. The optimal filter, which is two conditional expectations, is finite in size. Therefore, recursive computation of the conditional pdf in eq 10 reduces

Ind. Eng. Chem. Res., Vol. 45, No. 12, 2006 4211

to iterative relationships for conditional mean and covariance quantities, which is the Kalman filter. On the contrary, the nonlinear/non-Gaussian case does not lend itself to analytical treatment. The computational burden of finding the entire pdf at each sampling time can be formidable and places severe load on recursive density-based methods such as PGF and SMC. Furthermore, the pdf could be defined over a multidimensional state space Rn. As a result, the computation of its moments in eq 12 is a nontrivial task requiring multidimensional integrations. Last, the reader should note that the inference stage and the recursion operation (prediction and update in Figure 1) are completely independent of each other. Since the recursion is on the entire pdf, rather than on moments chosen as state estimates, approaches such as the extended Kalman filter lead to divergence. 3. Density-Based Filters 3.1. Probability Grid Filter. The grid filters represent the posterior pdf on a finite set of grid nodes in state space {xik, i ) 1, ‚‚‚, NG}, NG

p(xk| yk) ≈

i mk|k δ(xk - xik) ∑ i)1

(13)

i where mk|k is the probability mass at the grid node xik and δ is the Dirac delta function. The filter is a recursive algorithm to predict and update NG probability masses. The following is a short description of the PGF algorithm reported in chemical engineering literature.25 The PGF is based on several simplifying assumptions: (1) The noise processes w and ν are additive. (2) They are Gaussian. (3) The functions f and h are invertible, continuously differentiable, and monotonic. The filter algorithm is summarized in the following steps (refer to Figure 1 for the stages): (1) Initialize: discretize the continuous state space into NG i regularly spaced grid nodes xk-1 and assign probability masses i mk-1|k-1. (2) Predict: pass each grid node through system model to generate a new deformed grid of nodes xik. Compute each i on the new grid according to probability mass mk|k-1

NG

i mk|k-1

)

j j mk-1|k-1 p(xik|xk-1 ) ∑ j)1

(14)

where the transition probability is obtained by numerically integrating eq 5 at the grid nodes using rectangular rule. (3) Update: using the measurement yk, update the predicted i probability masses to mk|k i mk|k

)

i mk|k-1 p(yk|xik) NG

(15)

j mk|k-1 p(yk|xjk) ∑ j)1

where the likelihood mass is obtained by numerically integrating eq 5 at the grid nodes using rectangular rule. (4) Infer: compute the weighted average of grid nodes with the probability masses. Return to step 2. The PGF involves only numerical approximations related to grid size and integration. Under the simplifying assumptions mentioned before, it is the exact theoretical formulation. The more general NIF relaxes the assumptions of Gaussian noise

and monotone nonlinearities at the expense of increased computation due to fixing the grid size a priori.22 The number of nodes necessary to approximate the density increases rapidly with the dimension of the state. The computational load quickly becomes unreasonable for on-line implementations, because grid deformation is performed by solving NG initial-value problems. However, for low-dimensional problems, the grid filter is shown to offer superior performance compared to the EKF.25 3.2. Sequential Monte Carlo Filter. Sequential Monte Carlo methods are recursive algorithms, where NMC random samples (or particles) of the state xk-1 at time k - 1 are transformed into representative samples of xk. With a large number of samples, it is possible to provide a faithful description of the state pdf at every time instance. The summary statistics of the pdf can be obtained from the particles as follows,



φ(xk)p(xk) ≈

1 NMC

NMC

φ(xki) ∑ i)1

(16)

where xik are samples drawn according to p(xk). Kernel density estimators may be used to fit functional forms to the pdf in order to locate the mode for the MAP estimate. In this work, we use the weighted bootstrap filter from ref 26, since it is intuitive, simple to implement, and also fairly representative of the computational cost incurred by Monte Carlo density-based methods. The filter algorithm is summarized in the following steps (refer to Figure 1 for the stages): (1) Initialize: start with NMC random particles distributed according to the known posterior p(xk-1|yk-1). (2) Predict: pass each particle through the system model along with samples drawn from noise pdf pw(wk-1), to generate particles distributed according to the prior pdf p(xk|yk-1). (3) Update: using the datum yk, evaluate the likelihood of each particle and assign weights. Resample NMC times from the discrete distribution of the weights to generate particles according to the posterior pdf p(xk|yk). (4) Infer: compute Monte Carlo averages and the associated confidence intervals. Return to step 2 The great advantage of this simple algorithm is that no restrictions are placed on the model or the pdfs of noise processes. The reader is referred to refs 26-28 and 39 for more details on alternate update techniques and practical issues regarding number of samples, convergence, and effects of degeneracy. Similar to the grid filters, the vexing problem with all SMC methods is the need to solve NMC initial-value problems at every time instance of the measurement sequence to generate prior information. If the system model (an ODE instead of a difference equation) requires numerical integration between long sampling times, it exacerbates the computational burden. Moreover, the information gleaned from NMC simulation results is immediately discarded after the update stage. If need arises to predict in the same region of the state space at another time instance, the simulations must be performed all over again. 4. Preliminaries 4.1. Space-Time Discretization. We consider the nonlinear dynamic system in eq 1 as the approximation of the dynamics in continuous time, shown below as a general integral map,

xk ) xk-1 +

k∆t x˘ dt ∫(k-1)

(17)

Equation 17 is a point mapping in state space parametrized by the discrete time interval ∆t, called the mapping time (possibly

4212

Ind. Eng. Chem. Res., Vol. 45, No. 12, 2006

corresponding to the sampling time in the observation sequence). In practice, one might be forced to abandon point mapping in a continuum of state space beyond a certain resolution in space. There are limits to our capacity to observe the state space at a fine scale along with limits placed by numerical roundoffs during computation. One would then be faced with a large set of discrete state intervals at the resolution of interest. Each interval serves as the birth place of an ensemble of trajectories, which may be distinguished from each other only in the long term. With this motivation, we will consider a discretized approximation of the continuum state space parametrized by a small volume of state space or a cell. We refer to this discretized state space as cell space. We consider the case of finite state spaces only, with the justification that many estimation problems of practical interest are posed with bounds on state variables arising because of physical limitations and safety concerns. 4.2. State Cells. Consider a finite region of continuous n-dimensional state space R ⊂ Rn, where system dynamics of practical interest are likely to occur. Let R be partitioned into a finite collection of connected sets called cells, {zi, i ) 1, 2, ..., N}. z is typically a vector of n integers identifying its position in R. Cell zi represents a small hypervolume of the state space such that

ψil(zi,∆x) e x < ψir(zi,∆x)

(18)

where ψil and ψir define cell bounds and ∆x ∈ Rn is the corresponding cell size vector. All the state space outside the region of interest R h ) Rn - R is considered a single infinitesized cell called sink cell z0. The continuous state space Rn is N . The approximated by the discrete cell space Z ) {zi}i)0 discrete approximation is parameterized by the cell size vector ∆x, such that Z f Rn as ∆x f 0 or N f ∞. 4.3. Data Cells. In practice, values of the measurement variable separated by less than the precision of the measurement device, say ∆y, cannot be resolved and must be treated the same. Considering the physical limitations on obtainable measurement accuracy, the discretization of measurement space is a realistic way of representing the measurement variables. Consider a region of interest in the continuous space of measurements S ⊂ Rp, where measurements of x ∈ R are likely to be obtained. Let S be discretized into a finite set of measurement cells, {di, i ) 1, 2, ..., M}, where d ∈ Z p, with the corresponding measurement cell size vector ∆y ∈ Rp. A measurement sink cell d0 represents the infinite measurement space Sh ) Rp - S. M The collection D ) {di}i)0 represents the set of values likely to be obtained as measurements of z. 4.4. Cell Transition. Consider a point mapping in state space from xk-1 to xk described by eq 17. There is an analogous transition in cell space from cell zj containing xk-1 at time k 1 to cell zi containing xk at time k. The evolution of the system as a finite-state aggregate Markov chain over the cell space represents coarse-grained dynamics of the system, parametrized by the cell size vector ∆x. The cell transition may be described by the following dynamic relationship known as cell mapping,

zik

)

j F(zk-1 ,wk-1),

i, j ) 0, 1, ..., N

(19)

where F: (Zn Rm) f Zn. A discrete-time cell trajectory converges weakly to the discrete-time state trajectory as ∆x f 0. Transitions from cells {zi, i ) 1, 2, ..., N} into the sink cell z0 are considered as terminal, which is useful for focusing on system dynamics limited to R. The cell measurement equation

corresponding to the state measurement map h may be written as

dik ) H(zjk,νk), i ) 0, 1, ..., M; j ) 0, 1, ..., N

(20)

where H: (Zn Rp) f Zp. 4.5. Probability Mass Vector. Let the pdf of the state vector in state space p(x) be approximated as a piecewise constant function in cell space representing the probabilities of the cells p(z). At time k, if the probability mass associated with a cell zi is represented as mik; then the cell probabilities are represented by the probability mass vector (pmv),

[]

m0k m1 p(zk) ) · k ·· mNk

N

with

mik ) 1 ∑ i)0

(21)

The state pdf can be approximately retrieved from cell pmv by using, N

p(xk) ≈

mikδ(xk - jzi) ∑ i)0

(22)

where δ is the Dirac delta function and jzi is the geometric center of the cell zi. 5. Bayesian Estimation in Cell Space 5.1. Prediction Stage. Given the current pmv p(zk-1), it is desired to predict the future pmv p(zk). The relationship between j and the final mik is obtained by the initial probability mass mk-1 a discrete analogue of the Chapman-Kolmogorov equation, N

mik )

j pijmk-1 ∑ j)0

(23)

where pij is the probability of transition from cell zj to zi,

pij )

∫ψ ex