Reconciliation of Process Flow Rates when Measurements Are

Jun 11, 1999 - Crowe's iteration formula for the outer iteration loop when there are no detection limits is given by where n is the vector of unknown ...
0 downloads 0 Views 117KB Size
Ind. Eng. Chem. Res. 1999, 38, 2861-2866

2861

Reconciliation of Process Flow Rates when Measurements Are Subject to Detection Limits: The Bilinear Case V. G. Dovı`* and A. Del Borghi ISTIC “G. B. Bonino”, Universita` di Genova, Via Opera Pia 15, 16145 Genova, Italy

In this paper we generalize the fundamental methods of process data reconciliation developed by Crowe (AIChE J. 1986, 32, 616-623), to allow for the presence of censored concentration measurements in the bilinear mass balance equations. The resulting statistical analysis leads to the maximization of a general likelihood function subject to linear constraints. The difference with respect to both Crowe’s analysis and a previous algorithm that considered only linear mass balances (Dovı` et al. Chem. Eng. Sci. 1997, 52 (17), 3047-3050) is considered, and an example is examined for illustration purposes. Introduction The chemical industry is presently under severe pressure for the production of high-quality goods at low cost. Frequently, this means detection and removal of impurities whose concentrations are close to the detection limits of industrial on-line samplers. Similarly, a more and more stringent environmental legislation requires the control of polluting substances down to levels which, again, are hardly detectable by industrial samplers. Furthermore, the accurate measurement of low-concentration compounds that have been monitored only recently in the chemical industry can be difficult because of inadequate analytical techniques. Thus, censored quality measurements (i.e., measurements subject to detection limits) are present, to an everincreasing extent, in the reconciliation of process data. The error distributions of censored data are generally assumed to be equal to

[

{

1/Tξ 0 e  e Tξ  g Tξ 0 p() ) p(ξ′-ξˆ ) ) 2 if ξ′ g Tξ N(0,σξ ) if ξ′ ) 0

(1)

where ξ′ is the experimental observation, ξˆ is the unknown exact value, Tξ is the detection limit, and σξ2 is the variance of the error distribution when the measurement is above Tξ. In other words, we assume a uniform distribution between zero and Tξ if a zero concentration value has been observed and a normal distribution otherwise. An alternative definition has recently been proposed in reconciliation problems by Dovı` (in press), but it will not be considered further because the focus of this paper is the presence of censored measurements in bilinear mass balance equations and not the particular form of the error distribution of censored variables. Neglecting the censoredness of measurements implies either their rejection and, if observable, their reestimation with the other unmeasured streams, according to the techniques first developed by Vaclavek (1969) and then by Mah et al. (1976), Crowe et al. (1983), Crowe (1986), Sanchez and Romagnoli (1996), and Kretsovalis and Mah (1987) or the introduction of bounds as * Corresponding author. Tel. and Fax: +39-010-3532921. E-mail: [email protected].

proposed by Narasimhan and Harikumar (1993). In both cases the functional form of p() would be overlooked. Similarly, the maximum entropy approach proposed by Crowe (1996) would neglect the statistical information available. We shall consider first the classification proposed by Crowe (1986) for an arbitrary distribution of measurements. It was shown that only three categories of variables need to be considered for each flow rate: (1) the total flow rate and concentration measured; (2) the concentration measured; (3) no variable measured. This classification makes it possible to rewrite, in a more convenient way, the material balances

Bx + STη ) 0 where x is the vector of true component and total flow rates in all streams, B is the relevant incidence matrix, S is the stoichiometric matrix, and η is the extents of reactions. The columns of B are permuted and partitioned so that

B f [B0|B1|B2|B3] where the columns of B0 correspond to exactly known flow rates k and those of Bi (i ) 1, ..., 3) to components in category 1, 2, or 3, respectively. If we set P ) [B3|ST] and all unknown flow rates and extents of reaction are indicated by v, we obtain the relation

B0k + B1(x + a) + B2N(d + δ) + Pv ) 0

(2)

where N is a diagonal matrix of unknown flow rates, with as many entries as there are measured concentrations in the stream considered, x and d are the measured component flow rates and concentrations, and a and δ represent the unknown measurement errors. The reconciliation problem is to determine the values of a and δ that satisfy eqs 2 and are optimal according to some convenient statistical criterion. If the statistical criterion used is the maximization of the likelihood function and a and δ are distributed normally as N(0,Va) and N(0,Vδ), we can define the reconciliation problem as

10.1021/ie980726s CCC: $18.00 © 1999 American Chemical Society Published on Web 06/11/1999

2862 Ind. Eng. Chem. Res., Vol. 38, No. 7, 1999

Min aTVa-1a + δTVδ-1δ a,δ,N,v

(3)

subject to eq 1. The aim of this paper is to modify eq 3 and reformulate eqs 2 when some components a and/or δ are not distributed normally due to measurement censoredness. The theory will be developed in the next section, whereas an application will be examined in the last paragraph.

whereas the two matrices Γ1 and Γ2 are given by

[

0 0 0 0 -1 0

where the first submatrix represents the first 14 columns, the second submatrix represents column 15, and the third submatrix represents the last three columns and

[

Theoretical Development Let us rewrite eqs 2 as

˜ 1 + ν)ξ1 + B0k + B′1(x + a) + B′2N(d + δ) + Γ1(N Γ2N2ξ2 + Pv ) 0 (4) B′1, B′2, Γ1, and Γ2 are derived from B1 and B2, whereas N2 is a subset of N. ξ1 and ξ2 are concentrations subject to detection limits, and N ˜ 1 is the experimental value of the flow rates. While N ˜ 1 and N2 can be set up by selecting the flow rates that include censored variables, the matrices B′1, B′2, Γ1, and Γ2 can be built using the following procedure. 1. Set up the matrices B0, B1, and B2 as described in the introduction. 2. Set B′1 ) B1 and B′2 ) B2. Set Γ1 equal to a matrix with the same dimensions of B1 and with all elements equal to zero. Set Γ2 equal to a matrix with the same dimensions of B2 and with all elements equal to zero. 3. If a variable in both B1 and B2 is censored, set the corresponding element ((1) to zero and move (1 to the same position in Γ1 or Γ2. As an example, consider the process of Figure 1 and suppose each stream contains three components. If the third component is censored in streams 5 and 6 and if only concentrations are measured in stream 6, the original matrix B1 + B2

[ [

1

-1

1 1

-1

1 1

-1

1 -1

-1

1 -1

-1

1 -1

-1

1

-1

1

-1 -1

1

-1 -1

1

] ]

-1

(where the first three rows represent the reactor and the other six rows represent the two separation units and blank entries are zero) is changed to

1

-1

1 1

-1

1 1

-1

1 -1

-1

1 -1

-1

1 -1

-1

1

-1

1

-1 -1

1 1

-1 0

0

]

0 0 0 -1

]

where the first submatrix represents 17 columns and the last submatrix represents the last column. The first step for the solution of the reconciliation problem given by eq 4 coincides with the procedure described in Crowe (1986). Thus, we have to determine the two matrices Y and Z such that

PTY ) 0 DTZ ) [B′21d21|B′22d22|...]TYZ ) 0

(5)

As demonstrated by Crowe

YT ) [-P2P1-1|I]

(6)

where

[

P P P ) P1 P2 3 4

]

(7)

with P1 equal to the maximum square matrix contained in P and

ZT ) [-D2D1-1|I]

(8)

with D1 and D2 being defined analogously. Thus, eq 4 becomes

˜ 1ξ1 + ZTYT[B0k + B′1(x + a) + B′2Nδ + Γ1N Γ1(νξ1) + Γ2N2ξ2] ) 0 (9) The reconciliation problem is given by the optimization of a suitable statistical function of the random variables a, δ, and νξ subject to eq 9. As a preliminary step for the optimization procedure, let us determine the probability distribution of νξ, i.e., the probability distribution of the product of a uniform and a Gaussian random variable. Errors in the measurements of censored variables will be assumed to be uncorrelated. Under these assumptions the probability distribution for each censored measurement can be written as (Papoulis, 1984)

p(νξ) )

∫0τw1 e-ν ξ /2w σ

1

2 2

2 2

x2πστ

dw

(10)

where τ is the detection limit for ξ and σ its standard deviation when it is not censored. By a suitable change of variable

p(νξ) )

1

-s

∫ν∞ξ /2τ σ es

2x2πστ

2 2

2 2

ds

(11)

While it can easily be proved that the normalization ∞ p(νξ) d(νξ) does exist and is, correctly, integral ∫-∞

Ind. Eng. Chem. Res., Vol. 38, No. 7, 1999 2863

Figure 1.

equal to 1, the function p(νξ) tends to ∞ as νξ tends to zero. This makes it impossible to use the method of maximum likelihood, because the likelihood function would have no maximum. However, any integral of the type ∫bap(νξ) d(νξ) does exist even if the interval a-b straddles zero. Thus, we have chosen to modify the objective function (3) by maximizing the generalized likelihood function

∫z ∏ k

-aTva-1a+δTvδ-1δ

e

zk,max k,min

p(νξ) d(νξ)

T T -1 -1 Min a Va a + δ Vδ δ a,δ,N,νξ

zk,max k,min

p(νξ) )

-s

∫ν∞ξ /2τ σ es

1

2 2

2x2πστ

2 2

ds

(14)

(12)

max or, equivalently, by subtracting the term ln[∫zzmin p(νξ) d(νξ)] from eq 3, i.e.

∑κ ln[∫z

to an optimization procedure. In fact, this procedure enables us to eliminate the discontinuity of the likelihood function involving censored measurements by taking an integral over a finite, if small, integral. It remains to indicate how the integral ∫bap(νξ) d(νξ) is computed numerically. Because

p(νξ) δ(νξ)] (13)

for every component flow rate k subject to detection limits. The values of zk,max and zk,min can be evaluated as functions of a and δ by setting, in eq 9 that contain the ˜ 1ξ1 and Γ2N2ξ2 equal to 0 variables νξ, the values of Γ1N ˜ 1τ and Γ2N2τ, respectively, and computing the or to Γ1N relevant values for Γ1(νξ1). This provides the values for zk,max and zk,min. The equations of the system (9) containing the variables νξ are then ignored, and the objective function is computed subject only to the equations that do not contain them. In other words, the information contained in the mass balances that include censored measurements, which should be used as inequality constraints by setting ξ to its upper or lower values, is used instead to modify the objective function and make it amenable

is an even function of νξ, we can limit our analysis to b > a g 0. If ν2ξ2/2τ2σ2 g 1, p(νξ) can be computed accurately using the Laguerre integration formula (Davis and Rabinowitz, 1984). Thus, if both a and b are greater than x2στ, ∫bap(νξ) d(νξ) can be evaluated using standard integration techniques, with the value of the integrand function being provided by the Laguerre integration formula. If a < x2στ

p(νξ)) )

1

[ [∫

2x2πστ 1

2x2πστ

-s

∫ν1ξ /2τ σ es 2 2

2 2

1

1

ν2ξ2/2τ2σ2 s

-s

2 2

2 2

∫∞1es

]

ds

ds +

∫ν1ξ /2τ σ e



-s

ds +

-1 ds + s

[ [ ]

-s

∫1∞es

]

ds

1 νξ -2 ln + x2στ 2x2πστ 1 e-s - 1 ds + 1.537 (15) ν2ξ2/2τ2σ2 s



]

The integrand function has been replaced by a cubic polynomial, so that approximately

2864 Ind. Eng. Chem. Res., Vol. 38, No. 7, 1999

[ [ ] (

(νξ)6 νξ p(νξ) ≈ -2 ln - 0.08255 2 2 3 + (2σ τ ) x2στ 2x2πστ (νξ)2 (νξ)4 + 1.537 ... (16) 0.234859 2 2 2 + (2σ τ ) (2σ2τ2) Consequently, if b < x2στ, ∫bap(νξ) d(νξ) can be ap1

)

]

proximated by the analytical integral of this function; otherwise

∫abp(νξ) d(νξ) ) ∫ax2στp(νξ) d(νξ) + ∫xb 2στp(νξ) d(νξ) (17) where each of the two integrals can be computed using the two approximations described above. If, as in the method proposed by Crowe (1986), the unknown values of N are determined in an outer loop, each optimization task becomes a nonlinear optimization problem subject to linear equality constraints, which is computationally equivalent to the optimization of unconstrained problems. Unconstrained convex programming is generally considered to be a comparatively simple problem, for the solution of which a considerable number of methods is available. Crowe’s iteration formula for the outer iteration loop when there are no detection limits is given by T

Dn ) Y [B0k + B1(x + a) + B2Nδ]

(18)

where n is the vector of unknown flow rates. However, in our case, it cannot be applied unless the elements of Γ2 are all equal to zero; i.e., there are no censored measurements in the streams in which the total flow rates are not measured. On the other hand, even when applicable, this iteration scheme, which corresponds to a direct substitution method, is not very efficient anyway. Thus, we have chosen to use a gradientless optimization with respect to n in the outer loop. However, the overall structure of the algorithm, consisting of an outer and an inner loop, has not been changed in any significant way. Thus, the method has been completely outlined. The main difference from Crowe’s algorithm is the replacement of a quadratic programming problem with a general nonlinear one in each optimization task, which makes the problem not amenable to an analytical solution. A similar difference had been encountered in the linear case (Crowe et al., 1983; Dovı` et al., 1997), where a quadratic problem subject to linear equality constraints had been replaced by a lower dimensional quadratic problem subject to bounds on the variables. The presence of inequality constraints had made it impossible, also in this case, to express the solution in closed form. At convergence we can solve the mass balance equations (used in the optimization algorithm as inequality ˜ 1ξ1 + Γ(νξ1) + Γ2N2ξ2. As in the constraints) for Γ1N linear case (Dovı` et al., 1997), we can estimate the values of N ˜ 1ξ1 + νξ1 and N2ξ2 provided there is at most one such variable in each mass balance equation involved. Otherwise, only linear combinations of these variables can be estimated. Numerical Example A simple process, which is presently being implemented on a pilot-plant scale, has been considered for illustration purposes. Three components (reagent, prod-

Table 1.a stream 1 stream 2 stream 3 stream 4 stream 5 stream 6

mol/s mol/dm3 mol/s mol/s mol/s mol/dm3

component 1

component 2

component 3

0 0.329 9.49 7.086 7.015 0.0286

9.597 0.683 10.44 2.891 0.0702 0.924

0.0845 0.013 0.159 0.08 0 0.0161

a For each data point the variance was assumed to be equal to 10-4 + 10-2 times the experimental value. The detection limit was assumed to be equal to 10-3 mol/dm3. The flow rates of streams 2 and 6 have not been measured.

Table 2. Rectified Values of Component Flow Rates (Censoredness Neglected) stream 1 stream 2 stream 3 stream 4 stream 5 stream 6

mol/s mol/s mol/s mol/s mol/s mol/s

component 1

component 2

component 3

0.00022 3.64 10.36 6.72 6.63 0.086

9.73 6.90 9.91 3.01 0.070 2.94

0.04 0.135 0.175 0.04 0.00098 0.039

Table 3. Rectified Values of Component Flow Rates (Censoredness Considered)a stream 1 stream 2 stream 3 stream 4 stream 5 stream 6 a

mol/s mol/s mol/s mol/s mol/s mol/s

component 1

component 2

component 3

0.0006 2.76 9.77 7.01 6.97 0.043

9.73 7.15 9.87 2.72 0.059 2.66

0.063 0.093 0.156 0.063 0.021 0.042

The entry of component 3 in stream 5 is given by N5ξ + ν5ξ.

uct, and catalyst) are present in each stream. The homogeneous catalyst is fed to a reactor whose output undergoes two separation processes. The top product of the first separation unit is recycled to the reactor, whereas the bottom product is fed to the second separation unit. The flowsheet of the process is represented in Figure 1, where all of the streams have been labeled. The matrices B′1 and Γ1 are equal to those evaluated in the previous paragraph, whereas B′2 ) B2 and the elements of Γ2 are all equal to zero. The component flow rates for streams 1 and 3-5 are measured, whereas only concentrations are measured in streams 2 and 6. The matrices Y, D, and Z, which are necessary to set up the constraints (9) that do not include censored variables, are evaluated in the appendix. We have considered the experimental data reported in Table 1. The concentration of component 3 in stream 5 is censored. The data have been rectified according to the traditional method (in this case the value zero for d35 is supposed to be a sample of a normal distribution) and according to the procedure developed in this work. The estimates of the component flow rates are reported in Tables 2 and 3. The data rectification obtained by neglecting the censoredness of the measurement of the third component concentration in stream 5 (Table 2) gives rise to a number of apparent gross errors. In fact, the mass balance around unit 3 reduces the value of the flow rate of the third component due to the insufficient contribution of stream 5. Similarly, the mass balance around units 1 and 2 sets the flow rate of the third component of stream 1 equal to that of stream 5. Thus, the corresponding rectified values appear to be gross errors. On the other hand, the global mass balances, which

[]

Ind. Eng. Chem. Res., Vol. 38, No. 7, 1999 2865

include strongly biased values of the flow rates of component 3, influence the reconciliation of the flow rates of the other two components. Thus, the flow rates of component 1 in streams 2 and 3 are gross errors. The reconstruction obtained using the algorithm described in this work removes the presence of gross errors in the estimated flow rates of the third component in streams 1 and 4. In fact, the presence of a much larger flow rate of component 3 in stream 5 makes it possible for the flow rates in streams 1 and 4 to have higher values. Another consequence is the reduction of the values of the flow rates of component 1 in streams 2 and 3, although the gross error in stream 2 is not removed. The rectification appears to be improved in that the number of gross errors has been reduced. Furthermore, they are the result of a procedure that, according to the theory developed in this work, appears to be statistically more accurate. Conclusions We have proposed a general algorithm that includes bilinear mass balances for the reconciliation of process measurements when some data can be below the detection threshold. Even in the simple case considered, nonnegligible differences between the results obtained using our approach and those provided by procedures that disregard detection limits have been encountered. Nomenclature a ) component flow rate measurement errors (B0, B1, B2, B3)(B′0, B′1, B′2, B′3)(Γ1, Γ2) ) incidence matrices for general, uncensored, and censored variables D, P, Y, Z ) auxiliary matrices d ) concentration measurements k ) known component flow rates N, n ) unmeasured total flow rates (diagonal matrices and vectors) p ) probability distribution T ) detection limit v ) vector of unknown flow rates and extents of reactions x ) component flow rates zk,min ) lower value for the variable (νξ)k zk,max ) upper value for the variable (νξ)k Greek Symbols  ) experimental error η ) extents of reactions ξ ) censored variables σ ) standard deviation τ ) detection limit Superscripts ′ ) with respect to uncensored variables ˆ ) exact value T ) transpose

Appendix In this appendix we evaluate the matrices D, D1-1, D2, DT, and ZT for the example considered. It follows from the definition of P that

1 -1 0 0 P) 0 0 0 0 0

[ ]

and from PTY ) 0 or YT ) [-P2P1-1|I] that

1 0 0 0 T Y ) 0 0 0 0

1 0 0 0 0 0 0 0

0 1 0 0 0 0 0 0

0 0 1 0 0 0 0 0

0 0 0 1 0 0 0 0

0 0 0 0 1 0 0 0

0 0 0 0 0 1 0 0

0 0 0 0 0 0 1 0

0 0 0 0 0 0 0 1

Similarly, from the definition DT ) [B′21d21|B′22d22|...]TY, it follows that

[

]

1 d12 + d22 0 -1 D1 ) -1 D2 ) 0 d16

[ ] d32 -d12 -d22 -d32 0 0

0 0 0 0 -d26 0

[ ]

and from the definition ZT ) [-D2D1-1|I]

-d32 d12 + d22 d12 d12 + d22 d22 ZT ) d12 + d22 d32 d12 + d22 0

0

0

1 0 0 0 0 0

0

0 1 0 0 0 0

0

0 0 1 0 0 0

0

0 0 0 1 0 0

-d26 0 0 0 0 1 0 d16 0 0 0 0 0 0 1

Literature Cited

Crowe, C. M. Reconciliation of Process Flow Rates by Matrix Projection: Part II: The Nonlinear Case. AIChE J. 1986, 32, 616-623. Crowe, C. M. Formulation of linear data reconciliation using information theory. Chem. Eng. Sci. 1996, 51 (16), 3359-3366. Crowe, C. M.; Garcia Campos, Y. A.; Hrymak, A. Reconciliation of Process Flow Rates by Matrix Projection: Part I: Linear Case. AIChE J. 1983, 29, 881-888. Davis, P. J.; Rabinowitz, P. Methods of Numerical Integration; Academic Press: Boston, 1984; p 223. Dovı`, V. G.; Reverberi, A. P.; Maga, L. Reconciliation of Process Measurements when Data are Subject to Detection Limits. Chem. Eng. Sci. 1997, 52 (17), 3047-3050. Dovı`, V. G. Reconciliation of Censored Measurements in Chemical Processes: an Alternative Approach, (in press).

2866 Ind. Eng. Chem. Res., Vol. 38, No. 7, 1999 Kretsovalis, A.; Mah, R. S. H. Observability and redundancy classification in multicomponent process networks. AIChE J. 1987, 33, 70-82. Mah, R. S. H.; Stanley, G. M.; Downing, D. M. Reconciliation and rectification of process flow and inventory data. Ind. Eng. Chem. Process. Des. Dev. 1976, 15, 175-183. Narasimhan, S.; Harikumar, P. A method to incorporate bounds in data reconciliation and gross error detection: IsThe bounded data reconciliation problem. Comput. Chem. Eng. 1993, 17, 1115-1128. Papoulis, A. Probability, random variables and stochastic processes; McGraw-Hill: Auckland, 1984; p 352.

Sanchez, M.; Romagnoli, J. A. Use of orthogonal transformations in data classification-reconciliation. Comput. Chem Eng. 1996, 20, 483-493. Vaclavek, V. Studies on system engincering. III. Optimal choice of the balance measurements in complicated chemical systems. Chem. Eng. Sci. 1969, 24, 947-955.

Received for review November 18, 1998 Revised manuscript received March 29, 1999 Accepted March 30, 1999 IE980726S