Ind. Eng. Chem. Res. 2000, 39, 2336-2344

Loss Accounting and Estimation of Leaks and Instrument Biases Using Time-Series Data

Donald J. Chmielewski and Vasilios Manousiouthakis*

Chemical Engineering Department, University of California at Los Angeles, Los Angeles, California 90095

Bruce Tilton and Brad Felix Simulation Sciences Inc., 601 South Valencia Avenue, Suite 100, Brea, California 92823

In this work we consider the subject of loss accounting, particularly as it applies to the refining industry. The goal of loss accounting is to evaluate a plant's efficiency through the utilization of all process measurements. Central to this activity is the extraction of emissions information from indirect measurement sources (i.e., process flow and hold up measurements). In this work we present a time-series based approach to the leak estimation problem. It is shown that, in a refinery setting, the proposed time-series approach can achieve the precision required to meaningfully estimate the low flow rates associated with process emissions. The subject of integrating the time-series leak estimates with the more traditional direct emission measurements is also discussed.

1. Introduction

In recent years, the hydrocarbon process industry has shown an increased interest in the monitoring of process emissions. Part of this interest has been driven by recent governmental requirements with regard to monitoring practices. An indirect motivation stems from the fact that a high-quality monitoring system is required to achieve cost effectiveness in emissions control. To this end, the Institute of Petroleum recently developed a manual for the accounting and control of hydrocarbon losses.1 In this document, "loss accounting" was defined as the action of determining the location of every barrel of material that has been received by the plant. This definition is somewhat larger in scope than the traditional "emissions monitoring" process (i.e., the measurement of emission rates at targeted sources). The key difference is that the latter relies heavily on specialized measurement equipment, while the former combines a variety of data sources, including emission measurements, to obtain an overall picture of plant efficiency. The advantage of the accounting viewpoint is that emission estimates will tend to be more accurate due to the corroboration of multiple measurement sources.

Loss accounting consists of a number of distinct activities: collection of relevant data; extraction of loss information from each data source; integration of data/information from different sources; and interpretation of results. Typically, data will be obtained from two sources, direct and indirect loss measurements. Extraction of emission information from the indirect measurements is, in general, a challenging task. As such, the bulk of our discussion will be the development and analysis of extraction/estimation methods. In the final section, we address the remaining loss accounting activities, with a particular focus on how these activities are influenced by the chosen extraction method.

* To whom correspondence should be addressed. Tel: (310) 825-9385. Fax: (310) 206-4107. E-mail: [email protected].

Table 1. Typical Loss Sources from an Oil Refinery

real losses:      fired fuel, leakage to water, storage tank vapor, flares, leakage to ground, evaporation
apparent losses:  slops/recycles, metering errors, density errors, line fills
2. Background

A mass balance around an entire refinery will typically yield a net loss of material due to spills, leaks, evaporation, fired fuel, and flares. If one tries to identify the sources and rates of these "real" losses through standard measurements and internal mass balances, the calculations will be significantly corrupted by a variety of measurement errors and "apparent" losses. Table 1 identifies many of the real and apparent losses expected in a refining facility. In this work we consider an apparent loss to be any process or phenomenon that causes a mass balance not to close. As seen in Table 1, metering errors are a special class of apparent losses. These errors typically consist of two components: random errors and systematic biases. The goal of the standard data reconciliation algorithm is to remove the random errors from the process measurements. Additionally, exploitation of measurement redundancies, as well as gross error detection schemes, allows for the estimation and removal of many other apparent losses. However, it should be stressed that many apparent losses (such as line fills, slops, and recycles) can be reduced or even eliminated by simply keeping better track of plant operations.

Evaporative and leakage losses can occur from most of the common equipment used in a refinery, particularly storage tanks, exchangers, valves, flanges, pump seals, etc. While the loss at each source is small, the large number of sources will result in significant emissions. Present methods to estimate evaporative and leakage losses include the application of empirical models, direct measurements, and data-driven gross error/bias detection algorithms.

10.1021/ie990573o CCC: $19.00 © 2000 American Chemical Society Published on Web 06/14/2000


Semiempirical emission models for a variety of storage tanks have been developed by the American Petroleum Institute (API).2 In addition, the U.S. Environmental Protection Agency (EPA) has published a number of reports2 concerning emissions from other equipment typically found in the refining industry. Although both the API and EPA models can predict the average emissions over all units of the same design, neither approach is capable of detecting the presence of faulty equipment on-line.

Direct emission measurements are usually performed in the context of a leak detection and repair (LDAR) program. For small equipment (such as pumps, valves, and flanges), direct measurements include bag tests and sniff tests.3 For larger equipment (such as tanks and exchangers), a tracer test will allow the equipment to stay in service, while full inspections (i.e., API 653 for tanks) will require downtime.4 The common thread among all of these tests is that they are labor-intensive and costly. The goal of many LDAR programs is to strike a balance between measurement costs and detection effectiveness, which often results in a policy of testing each piece of equipment at some periodic interval.3,5 For small equipment, such as pumps and valves, these inspections may be as frequent as monthly or quarterly. Large equipment, such as storage tanks and exchangers, might be inspected only once a decade.4

Many data-driven approaches to leak and bias detection are specific applications of a more general gross error detection scheme. A number of authors6-9 suggest the detection of biases and leaks through gross error detection schemes designed for dynamic data reconciliation problems. Tong and Crowe10 advocate a principal component analysis approach for the detection of "persistent" gross errors. Other approaches have been specifically designed for the loss estimation problem. Miles and Jelffs11 and Jelffs and Harrison12 both apply statistical process control methods for the detection of leaks from both networks of storage tanks and isolated storage tank systems. Liou13 suggests the direct application of mass balances to pipeline systems, and volumetric tank tests (volume balances) have found wide usage in testing the integrity of isolated underground storage tanks.14

3. Loss Estimates from Indirect Measurements

The estimation of loss rates based solely on material movement and tank hold up data (and mass balance equations) has many advantages over existing methods, the largest being its minimal cost. Since movement and hold up measurements are required for the usual operation of the plant, no additional measurement equipment is needed. In addition, the approach will be capable of identifying, on-line, the loss rate at each source location, thus becoming a valuable leak detection tool. The fundamental question concerns the accuracy of the loss estimates. Specifically, "Does there exist enough accuracy in the indirect measurements to infer loss rates with sufficient accuracy?" The following order-of-magnitude error analysis addresses this question for a single storage tank.

3.1. Loss Measurements. Consider a single storage tank operating in isolation. A mass balance around the tank defines the amount of lost product in one time period:

L = m_op - m_cl + m_io    (1)

Table 2. Typical Loss Rates from Storage Tanks

                                            loss (gal/day) for given tank diameter
tank type                                        10 m        30 m
fixed roof tank                                  36.0       348.0
internal floating roof                            2.7        12.3
external floating roof                           11.0        33.0
external floating roof with secondary seal        0.5         1.6

where L is the loss rate (per day) and m_op, m_cl, and m_io are the opening, closing, and input/output masses. Using this identity, a loss measurement L+ can be defined (a superscript + indicates a measurement, and the n terms are the random errors associated with each measurement):

L+ = m_op+ - m_cl+ + m_io+ = L + n_op - n_cl + n_io    (2)

The variance of the loss measurement is defined as σ² = σ_op² + σ_cl² + σ_io². Measurements of the volume of liquid in the tank are obtained via a height measuring component. Assume the accuracy of such a device is Δh = ±1/16 in. (we assume constant liquid densities for this and the examples to follow). The diameter of a typical aboveground storage tank found in a refinery is about 10 m. Thus, the error in the volume measurement is Error = ±π(D/2)²Δh ≈ ±33 gallons. Similarly, for a 30 m diameter tank, the volume measurement error is ≈ ±300 gallons. Assuming a best-case scenario with regard to the input/output measurements (σ_io = 0), the error in the loss measurement, L+, is ±47 gallons for a 10 m tank and ±425 gallons for a 30 m tank. As a means of comparison, Table 2 (from ref 15) shows the average evaporative emissions, as predicted by the API models, for storage tanks containing gasoline at 10 psi RVP. Comparing the above calculations with Table 2, it is clear that, even for the least efficient tank (fixed roof), the loss from the tank will at best (σ_io = 0) be on the order of the error associated with the loss measurement L+. Thus, a single day of measurements is not sufficient to estimate the low flow rates associated with evaporative emissions.

3.2. Time-Series Approach to Loss Estimation. Motivated by the above observation, a time-series loss parameter estimation scheme has been developed to achieve the level of accuracy required. A fundamental assumption of the scheme is that the loss parameters vary slowly with time (a reasonable assumption, since the deterioration of equipment is expected to be a slow process). As loss measurements of the form of eq 2 are available each day, we define the loss measurement on a given day, k, to be

L_k+ = L + n_k^(op) - n_k^(cl) + n_k^(io) = L + n_k

where n_k = n_k^(op) - n_k^(cl) + n_k^(io).
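The order-of-magnitude analysis of section 3.1 is easy to reproduce numerically. The following is a sketch (the gauge accuracy and tank diameters are the example values used above; the function names are ours):

```python
import math

GAL_PER_M3 = 264.172  # US gallons per cubic meter

def volume_error_gal(diameter_m, dh_in=1.0 / 16.0):
    """Volume uncertainty (+/- gallons) from a level-gauge accuracy of dh inches."""
    dh_m = dh_in * 0.0254
    return math.pi * (diameter_m / 2.0) ** 2 * dh_m * GAL_PER_M3

for d in (10, 30):
    e = volume_error_gal(d)
    # the loss measurement combines opening and closing levels (sigma_io = 0),
    # so the independent errors add in quadrature:
    loss_err = math.sqrt(2.0) * e
    print(f"{d} m tank: volume error ~ +/-{e:.0f} gal, loss error ~ +/-{loss_err:.0f} gal")
```

For the 10 m tank this reproduces the ±33 gallon volume error and ±47 gallon loss error quoted above; for the 30 m tank it gives roughly the ±300/±425 gallon figures.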
The most straightforward approach for estimating L is to average the measurements over time:

L̂_k^(av) = (1/k) ∑_{i=1}^{k} L_i+

A fundamental drawback of this scheme is that each loss measurement, L_k+, is weighted equally, regardless of its accuracy. It is clear that day-to-day variations in plant operation result in day-to-day variations in the accuracy of the loss measurements (see Appendix).
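The equal weighting is evident in a direct implementation of the running average (a sketch; the variable names are ours):

```python
def running_average(measurements):
    """Equal-weight running average: each day's loss measurement counts the
    same, regardless of how accurate that day's plant operation made it."""
    estimates, total = [], 0.0
    for k, L_plus in enumerate(measurements, start=1):
        total += L_plus
        estimates.append(total / k)
    return estimates

# a constant 25 gal/day loss seen through noise whose size varies day to day;
# the accurate day 2 reading is given no more weight than the noisy day 1 one
print(running_average([30.0, 20.0, 26.0]))
```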


As a result, an estimator that automatically compensates for changes in accuracy will be advantageous. A more obvious drawback of the averaging method is that, as k becomes large, the estimate L̂_k^(av) will become less and less influenced by new data. As equipment will continually deteriorate with time (or be improved through repairs), the assumption of a time-invariant loss rate is clearly poor. Defining a moving average over a finite number of data points, M, will allow the estimator to track a time-varying loss rate. However, this yields a tuning parameter M that has very little physical significance. Many of the above difficulties can be alleviated by the introduction of a time evolution model for the loss parameter. A simple model of the time evolution of the loss rate is that the rate will change by a random amount, d_k, during each time period (i.e., L_{k+1} = L_k + d_k). Using this model, a physically meaningful tuning parameter emerges, namely σ_d, the standard deviation of d_k. Combining this time evolution model and the loss measurement equation, the following dynamic model is proposed:

L_{k+1} = L_k + d_k    (3)

L_k+ = L_k + n_k    (4)

The optimal estimation problem to be solved is formulated as follows. Let L̂_k(L̄_k+) be the estimation function to be optimized, where L̄_k+ = {L_0+, L_1+, ..., L_k+}. A theoretically attractive and physically meaningful objective function is the mean squared error. Thus, the minimum mean squared error (MMSE) estimation problem is stated as

min_{L̂_k(·)} E[(L_k - L̂_k(L̄_k+))²]    (5)

It is well-known that the optimal solution to this problem is given by the much celebrated Kalman filter.16,17 For the above single tank example, the Kalman filter is given as

L̂_{k+1} = L̂_k + (P_k/σ_k²)(L_k+ - L̂_k)    (6)

where

P_k = (P_{k-1} + σ_d²)σ_k² / (P_{k-1} + σ_d² + σ_k²)    (7)
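A minimal sketch of this scalar filter, written directly from eqs 6 and 7 (the function and variable names are ours; σ_k may vary from day to day):

```python
def kalman_loss_filter(measurements, sigmas, sigma_d, L0=0.0, P0=1e6):
    """Scalar Kalman filter for L_{k+1} = L_k + d_k, L_k+ = L_k + n_k.
    measurements[k] is the day-k loss measurement; sigmas[k] is its std dev."""
    L_hat, P = L0, P0
    estimates = []
    for L_plus, sigma in zip(measurements, sigmas):
        # covariance update, eq 7: the prior grows by sigma_d^2, then is
        # reduced by the day's measurement
        Pp = P + sigma_d ** 2
        P = Pp * sigma ** 2 / (Pp + sigma ** 2)
        # state update, eq 6: the gain P_k / sigma_k^2 is small on days when
        # the measurement is inaccurate (large sigma)
        L_hat = L_hat + (P / sigma ** 2) * (L_plus - L_hat)
        estimates.append(L_hat)
    return estimates
```

Note that the gain P_k/σ_k² automatically down-weights the measurements from days with poor accuracy, which is exactly the compensation the simple average lacks.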

Unfortunately, the Kalman filter can only provide a loss estimate with MMSE if the dynamic system (eqs 3 and 4) is an accurate representation of the true loss process. While the dynamic model of eq 3 will certainly capture slow changes in the loss rate, it will have difficulty reflecting sudden changes or jumps. As these sudden changes in loss rate are expected to occur in the true process, although rather infrequently, we must analyze the filter's ability to track them. Additionally, in cases for which an appropriate value for σ_d is unclear, the following analysis will provide a systematic design strategy independent of the statistics of L_k. Before this analysis, we will need the following preliminaries. In the following analysis we will assume σ_k is constant with respect to k (i.e., σ_k = σ for all k). As we have illustrated above, this is an unlikely scenario. However, one can always choose a nominal value to perform the analysis, or do the calculations twice using the maximum and minimum values expected for σ_k and thus provide bounds on the performance of the filter. If we assume σ and σ_d are both nonzero, then it can be shown16,17 that, regardless of the initial P_0, P_k will converge to a positive value P satisfying

P² + σ_d²P - σ_d²σ² = 0

However, if we choose σ_d to be approximately zero (which may be physically motivated for certain processes), then P will be close to zero, resulting in a filter that is incapable of responding to step changes in any reasonable time frame. Thus, as a design method, we assume the following form for σ_d: σ_d² = σ_do² + σ̃_d², where σ_do² is the actual variance of d_k during nominal operation (i.e., with all jumps removed) and σ̃_d² is a tuning parameter used to improve tracking performance. The two performance criteria of interest to us are the mean squared error and the ability of the estimator to track step changes in the loss rate. To understand the meaning of the mean squared error, first consider the error signal, e_k = L_k - L̂_k, and its time evolution equation:

e_{k+1} = (1 - P/σ²)e_k - (P/σ²)n_k + d_k

The mean squared error is defined as S_k = E[e_k²]. Physically, S_k represents the variance of the error signal e_k. Thus, S_k^(1/2) indicates the amount of deviation one should expect between the loss estimate and the actual loss rate. It is easily shown that S_k will converge to a positive value S satisfying

S = (1 - P/σ²)²S + (P/σ²)²σ² + σ_do²

If σ̃_d = 0, then S and P will coincide, and S will be the MMSE. If σ̃_d ≠ 0, then S < P and neither will achieve the MMSE. However, as mentioned above, a nonzero σ̃_d will provide advantages with regard to tracking. To quantify tracking ability, we will need to borrow a concept from linear system theory. The time constant, τ, is loosely defined as the time required for the plant trajectory to catch up to a step input (τ ≈ 0 implies good tracking; τ large implies poor tracking). For a discrete time system, x_{k+1} = ax_k + bu_k, the time constant is identified through the relation a = exp{-1/τ}. The Kalman filter can be rewritten (neglecting both n_k and d_k, since they are expected to be small with respect to a step change in L_k) as

L̂_{k+1} = (1 - P/σ²)L̂_k + (P/σ²)L_k

Thus, the relation between P, σ, and τ is P = σ²(1 - exp{-1/τ}). The relation between S, τ, σ, and σ_do is

S = [(1 - exp{-1/τ})²σ² + σ_do²] / (1 - exp{-2/τ})    (8)
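When choosing an operating point, relations 8 and 9 (the latter appears below) are easy to tabulate. The following sketch uses σ = 1 and the illustrative value σ_do² = 0.1 from the figures; both are assumptions for illustration only:

```python
import math

def S_of_tau(tau, sigma, sigma_do2):
    """Steady-state mean squared error S as a function of the time constant
    tau (relation 8)."""
    a = math.exp(-1.0 / tau)
    return ((1.0 - a) ** 2 * sigma ** 2 + sigma_do2) / (1.0 - math.exp(-2.0 / tau))

def sigma_d_tilde2(tau, sigma, sigma_do2):
    """Tuning-parameter variance corresponding to a chosen tau (relation 9);
    a negative result marks a tau that is invalid under the design scheme."""
    return (2.0 * sigma * math.sinh(1.0 / (2.0 * tau))) ** 2 - sigma_do2

for tau in (1, 5, 25):
    print(tau, S_of_tau(tau, 1.0, 0.1), sigma_d_tilde2(tau, 1.0, 0.1))
```

Scanning τ this way makes the trade-off concrete: small τ (fast tracking) requires a large σ̃_d² and yields a larger S, while large τ eventually drives σ̃_d² negative.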

This relation (shown in Figure 1 with σ_do² = 0.1) indicates that for small τ the two performance criteria are in fact competing. The rapid increase in S for larger τ is a result of the filter's inability to track even slow changes (i.e., due to σ_do). In fact, the τ values in this region correspond to σ̃_d² values that are less than zero and are thus invalid under our design scheme.

Figure 1. Trade-off between signal tracking and estimate accuracy.

Figure 2. Relation between signal tracking and loss rate variations.

Figure 3. Raw loss measurements.

Figure 4. Filtered loss measurements.

The relation between τ and the tuning parameter σ̃_d is

σ̃_d² = (2σ sinh{1/(2τ)})² - σ_do²    (9)

which is shown in Figure 2 (again assuming σ_do² = 0.1). From the two figures, it is evident that as the tuning parameter σ̃_d increases, so will the performance criterion S. Thus, the suggested tuning scheme is as follows: (1) choose nominal values for σ and σ_do (if there is a physically motivated value for σ_do, use it; otherwise, choose zero); (2) using relation 8, choose an operating point (S, τ) that provides a reasonable compromise between mean squared error and tracking ability; (3) using relation 9, determine the value of σ̃_d that corresponds to the selected value of τ; (4) reconstruct σ_d² = σ_do² + σ̃_d² and employ the Kalman filter as usual.

Example 1: Loss Estimation for an Isolated Tank. Consider an isolated tank, 10 m in diameter, with a 3 day operation cycle (in gallons/day), as shown in Table 3. Assume the loss rate from the tank is 25 gallons/day for 100 days, and then jumps to 50 gallons/
Table 3. Operation Cycle for Example 1

            day 1     day 2     day 3
flow in     200 000   0         0
flow out    150 000   50 000    0

day. The height measurement of the tank hold up has an accuracy of ±1/16 in., giving a standard deviation of ≈16 gallons. The flow rate measurements have an accuracy of ±1%. In addition, the rate meter is sampled once a minute, with an average flow rate of 900 gallons/min when the valve is open. Using these data and the results of the Appendix, we find the standard deviation of the total mass flow measurements to be ≈0.15(m_k+)^(1/2), where m_k+ is the total mass flow measurement. Thus, the total variance of the loss measurement L_k+ is given as

σ_k² = 2(16)² + (0.15)²(m_k^(in)+ + m_k^(out)+)

Figure 3 shows the raw loss measurements for this situation. To compare the moving average scheme with the optimal estimator, each was tuned to have a time constant of about 25 days (assuming σ_k to be constant, with a value equal to that of day 1). Figure 4 shows the output of each. Clearly, the optimal estimator outperforms the averaging scheme. The reason for this improvement is the ability of the Kalman filter to find the optimal estimates in the face of time-varying measurement noise. If the operation cycle of the tank were the same from day to day, we would expect the performance


Figure 5. Simple network configuration.

of the two estimators to be similar. As we will see in the next section, the advantage of the Kalman filter formulation goes beyond the ability to handle time-varying noise.

3.3. Loss Estimation in Process Networks. While the time-series approach will provide significant improvements to the accuracy of the loss estimates, the low flow rates of leaks and fugitive emissions (as compared with the indirect measurement devices) may require additional improvements in estimate accuracy. The strength of the above formulation is that it can be generalized to a network of process nodes (i.e., tanks and units). This feature will allow the estimation scheme to be generalized in the following directions: exploitation of measurement correlations; ability to account for apparent losses; incorporation of operation dependent loss mechanisms.

The most significant correlations in the measurements are due to the mass flow interconnections. Consider the example of two tanks connected to a blending unit (Figure 5). The loss measurement equations for this configuration are as follows:

L_k^(1)+ = m_k^(op1)+ - m_k^(cl1)+ - m_k^(1)+ = L_k^(1) + n_k^(op1) - n_k^(cl1) - n_k^(1)

L_k^(2)+ = m_k^(op2)+ - m_k^(cl2)+ - m_k^(2)+ = L_k^(2) + n_k^(op2) - n_k^(cl2) - n_k^(2)

L_k^(3)+ = m_k^(1)+ + m_k^(2)+ - m_k^(3)+ = L_k^(3) + n_k^(1) + n_k^(2) - n_k^(3)

Since the noise sequences {n_k^(1)} and {n_k^(2)} are present in multiple equations, the measurements L_k^(i)+ will be correlated.

Another advantage of the network perspective is the ability to account for both real and apparent losses simultaneously. This is done by considering the apparent losses in the same manner as would be done in a data reconciliation approach (i.e., consider apparent losses as unmeasured streams that are to be estimated using measurements of other streams). However, one must be careful not to destroy the observability of the network. Namely, the assumption of two unmeasured streams on the same process node means that, in the absence of other information, it will be impossible to decide how much flow should be assigned to each stream. For example, if we assume that a particular tank possesses a leak stream and an unmeasured slops stream, then unless we can utilize the facility's maintenance records (and possibly the hold up measurements of the slops holding tank), the proposed estimation problem will be ill-defined. A rigorous derivation detailing the conditions under which apparent losses can be estimated is the subject of future research. However, a simple test to determine whether a dynamic estimation problem is not ill-defined is to calculate the steady-state value, or limit, of P_k (see eq 13 below, and assume Σ_k is constant). If a positive definite P∞ exists, then the problem is well-defined; if not, then the network may be unobservable (although there could be other reasons for the nonexistence).

An apparent loss that appears to destroy the dynamic observability of the network is instrument bias. However, the strong dependence of these systematic metering errors on the unsteady operation of the plant will allow them to stand out from other losses. It can be shown (see the Appendix) that, for measurements derived from mass flow rate data, one can describe both random and systematic errors as m+ = m + bm + n ≈ m + bm+ + n. Using this mechanism, the following measurement equation is obtained, which accounts for both real losses and instrument bias (assume a single flow into the tank):

L_k+ = m_k^(op)+ - m_k^(cl)+ + m_k^(in)+ = L_k + b_k m_k^(in)+ + n_k

If we assume m_k^(in)+ to be fixed at some nominal value for all k and then try to calculate P∞, we will quickly find that P_k is an unbounded sequence. However, if we force m_k^(in)+ to fluctuate from day to day, we will find that, although P_k is fluctuating, it remains bounded for all k. This phenomenon is a simple illustration of the concept of persistent excitation (see ref 18, pp 70 and 266-267, for further details). In example 2, presented below, it is shown that normal fluctuations in refining operations are of sufficient magnitude to provide persistent excitation to the estimation scheme.

Each of the above generalizations can be captured by a vector version of the dynamic model presented in eqs 3 and 4:

L̄_{k+1} = L̄_k + d̄_k^(L),   b̄_{k+1} = b̄_k + d̄_k^(b),   ā_{k+1} = d̄_k^(a)    (10)

ȳ_k+ = C_k[L̄_kᵀ b̄_kᵀ ā_kᵀ]ᵀ + Gn̄_k    (11)

where L̄_k, b̄_k, and ā_k are vectors containing the real losses, instrument bias terms, and apparent losses, respectively, and the d̄_k^(i) are the respective changes in the loss parameters. Notice the following characteristics of the model of eqs 10 and 11. The apparent losses are modeled such that they are not correlated with the apparent losses of the previous day (i.e., ā_{k+1} = d̄_k^(a)). If a temporal correlation of the apparent losses is justified, then the above model can easily be adapted to reflect this case. The measurement eq 11 contains a row for each node in the network. The structure of the matrices C_k and G is determined solely by the network configuration, and the matrix C_k is time-varying due to the fact that C_k contains the raw flow measurements, m_k^(i)+, as coefficients to the bias terms, b̄_k. Finally, the matrix G contains all the information on the correlations in the flow measurements.

The optimal estimator for this system is given by the following. Let x_k = [L̄_kᵀ b̄_kᵀ ā_kᵀ]ᵀ and d_k = [d̄_k^(L)ᵀ d̄_k^(b)ᵀ d̄_k^(a)ᵀ]ᵀ; then eqs 10 and 11 can be written compactly as (we drop the bar notation) x_{k+1} = Ax_k + d_k and y_k+ = C_kx_k + Gn_k. Thus, the optimal estimator is given by the


Figure 6. Network configuration for example 2.

Figure 7. Filtered loss measurements without instrument bias.

Table 4. Operation Cycle for Example 2

                        day 1     day 2     day 3     day 4
into crude tanks 1-4    0         0         0         500 000
out of crude tank 1     500 000   0         0         0
out of crude tank 2     0         500 000   0         0
out of crude tank 3     0         0         500 000   0
out of crude tank 4     0         0         0         500 000
into jet tank 1         250 000   0         0         0
into jet tank 2         0         250 000   0         0
into jet tank 3         0         0         250 000   0
into jet tank 4         0         0         0         250 000
out of jet tanks 1-4    0         0         0         250 000

Table 5. Operation Cycle for Example 2

                        days 1-5   day 6
into waste oil tank     50 000     50 000
out of waste oil tank   0          250 000
gas oil flow            200 000    200 000

Figure 8. Filtered loss measurements with instrument bias.

vector equations:17

x̂_{k+1} = Ax̂_k + P_kC_kᵀ(GΣ_kGᵀ)⁻¹(y_k+ - C_kAx̂_k)    (12)

P_k = (I + H_{k-1}C_kᵀ(GΣ_kGᵀ)⁻¹C_k)⁻¹H_{k-1}    (13)

where H_k = AP_kAᵀ + Σ^(d), and Σ_k = E[n_kn_kᵀ] and Σ^(d) = E[d_kd_kᵀ] are the covariance matrices of the noise sequences.

Example 2: Loss Estimation in a Process Network. Consider the distillation network in Figure 6. The operation cycle for the plant (in gallons/day) is given in Tables 4 and 5. The height measurement devices in the tanks have the same accuracy as in example 1, ±1/16 in. The diameters of the tanks are 15, 10, and 10 m for the crude, jet, and waste oil tanks, respectively. The accuracy of meters 1, 4, and 7 is 1/2% relative error, while that of meters 2, 3, 5, and 6 is 1% relative error. All process streams were assumed to flow continuously throughout the 24 h day, except for receipts of crude oil and shipments of jet fuel, both of which were performed over an 8 h period. The sample period of the flow rate meters was once per minute. Two optimal estimation schemes were implemented, one with instrument bias detection and one without. Figure 7 shows the performance of each when there are no biases present; the performance of each is satisfactory. Figure 8 shows the estimates in the presence of a 1/2% bias in meter 3. Clearly, the simple scheme fails in this situation, while the bias detection scheme is unaffected. To provide sufficient excitation to the instrument bias detection scheme, fluctuations of about

1% of the nominal flow rate had to be added to the operation cycle. Clearly, fluctuations at or above this level are expected in the operation of an actual plant.

4. Loss Accounting

As discussed in the Introduction, loss accounting is comprised of a number of distinct activities. Having developed a method for the extraction of loss information from the indirect measurements, we now highlight a number of potential difficulties surrounding its interaction with the other loss accounting activities. In this section, we illustrate that each of these issues can be overcome by simple application of existing technologies.

4.1. Data Collection and Preprocessing. The largest difficulty in implementing a loss accounting program is the collection and classification of the vast amounts of data required. Fortunately, much of the data required for the loss estimation algorithm is typically already being collected for inventory purposes. Historically, material movement and hold up data would be recorded with pencil and paper. This information would then be sent to the accounting office, where it would be compiled into financial reports. In many cases, only feed and product inventories would be tracked, while data on intermediates would be ignored. However, many facilities are beginning to automate the data collection process using process information systems and data historians. These systems electronically collect and store process measurements in a database, which is
2342

Ind. Eng. Chem. Res., Vol. 39, No. 7, 2000

accessible by many of the computer systems at the facility. Furthermore, data consolidation methods, which convert raw process data into usable information, are beginning to emerge. A recently developed product, called Open Yield (Simulation Sciences Inc.), has been designed to perform this task specifically at refinery sites. It combines the interconnection diagram of the plant with movement and hold up measurements to calculate inventory and process yield information. This information is then stored in a self-contained, substantially smaller, database for eventual generation of financial reports. It is expected that automated data collection systems will become standard features of most refinery operations in the near future. With any data collection system, there is the possibility of gross misinformation getting into the system. Clearly, any system that includes manual input of data has the possibility of human error (transposing of numbers, miscategorization of values). Even a completely automated system is susceptible to measurement equipment failures. The presence of these outliers in the measurement data can significantly degrade the estimates. Clearly, these potentially large apparent losses are of the gross error class and can be efficiently detected and removed by any one of a number of gross error detection schemes found in the literature. The usual operating procedure for these outlier removal schemes is to use them as preprocessors to the main estimation algorithm. If a gross error is detected in a particular measurement, then unless the problem can be corrected (such as in miscategorization), the measurement will be unavailable to the main algorithm. Here the time-varying aspect of the Kalman filter will be of great utility. If a usually available measurement is unavailable on a particular day, then its absence can be mimicked through an inflation of the respective variance for that day. 
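The variance-inflation device for a missing or rejected measurement can be sketched with the scalar filter of section 3.2 (a hypothetical illustration; the numerical values are ours, and 1e12 stands in for "large enough"):

```python
def filter_step(L_hat, P, L_plus, sigma2, sigma_d2):
    """One scalar Kalman step (eqs 6-7).  A missing or outlier-flagged
    measurement is handled by passing an inflated variance sigma2, which
    drives the gain toward zero."""
    Pp = P + sigma_d2
    P_new = Pp * sigma2 / (Pp + sigma2)
    gain = P_new / sigma2
    return L_hat + gain * (L_plus - L_hat), P_new

L_hat, P = 25.0, 4.0
# normal day: the (possibly noisy) reading of 40 pulls the estimate upward
L1, P1 = filter_step(L_hat, P, 40.0, sigma2=16.0, sigma_d2=0.1)
# day the meter is flagged by the gross-error preprocessor: inflate variance
L2, P2 = filter_step(L_hat, P, 40.0, sigma2=1e12, sigma_d2=0.1)
print(L1, L2)  # L2 stays ~25: the faulty reading is effectively ignored
```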
This artificial inflation, if large enough, will cause the filter to automatically ignore the faulty measurement, with no change required in the filter structure. However, if the fault continues for many consecutive days, then the process may become unobservable, resulting in the P_k matrix becoming poorly scaled and difficult to calculate.

A subtle but important preprocessing step is the conversion of all measurements into mass units. This step is essential due to the fact that masses balance, while volumes, in general, do not (the use of volumes in the above examples was purely to simplify the development and should not be followed in practice). An accurate conversion from volumetric to mass units will require many additional data sources, such as temperature, pressure, and specific gravity measurements, tank strapping tables, and API correction factors. While these types of conversions are common in the chemical processing industry, the proposed loss estimation scheme will require an additional level of analysis. In particular, each data source should be analyzed and incorporated into the error analysis that drives the time-varying optimal filter (i.e., the determination of Σ_k). Clearly, additional research is needed to develop and verify the appropriate analysis techniques.

4.2. Integration with Direct Loss Measurements. As specified by the overall philosophy of loss accounting, an important step in the analysis is to combine the time-series loss estimates with the results of direct measurements of process emissions. There are two major difficulties with integrating the two data sources. The first

Figure 9. Integration of different measurement sources.

is due to the vast difference in sample times (i.e., indirect measurements are available daily while field tests are performed monthly at best). The second problem is with respect to the objects for which emission measurements are available. In particular, the indirect approach can only identify losses around a network node that, in general, consists of a number of loss sources, each of which has been field tested separately (see Figure 9).

One approach to the problem is to develop a more comprehensive dynamic model of the process, which includes all measurements (both direct and indirect) and all loss sources, and then apply the Kalman filter as usual. While this appears to be a reasonable solution, the impact of the additional loss sources on the dynamic observability of the process is unclear. Ensuring observability may require additional modeling of the real loss mechanisms. As such, we consider this approach to be a topic for future research. In the following, an alternative approach is presented that is significantly simpler to implement, yet retains much of the statistical rigor.

The first step in this approach is to consider the output of the time-series estimation algorithm to be a fictitious loss measurement for the entire process node. Then, consider all the direct measurements to be measured flows into a mixing unit whose output is the time-series estimate. This imaginary mixing unit represents the correlation between the individual field-tested losses and the total loss rate for the entire node (see Figure 9). If we assume each of these flow measurements has an associated uncertainty, then it is straightforward to apply the data reconciliation algorithm to our mixing unit. The time-series algorithm provides uncertainty information for each estimate, namely, the diagonal elements of Pk (or Sk, whichever is more appropriate). Unfortunately, the variance of the direct measurements is less clear.
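The mixing-unit reconciliation can be sketched as a standard weighted least-squares data reconciliation step. The two-source layout and all numbers below are hypothetical; the node-total "measurement" plays the role of the time-series estimate, and the other entries play the role of the field-tested losses:

```python
import numpy as np

# Hypothetical data: two field-tested loss sources feed the imaginary
# mixing unit; the third entry is the time-series node-total estimate.
y = np.array([1.0, 2.2, 3.5])        # [source 1, source 2, node total]
var = np.array([0.04, 0.09, 0.25])   # corresponding variances

# Balance constraint for the mixing unit: source1 + source2 - total = 0
A = np.array([[1.0, 1.0, -1.0]])
S = np.diag(var)

# Classical weighted least-squares reconciliation:
#   x_hat = y - S A^T (A S A^T)^-1 A y
adj = S @ A.T @ np.linalg.solve(A @ S @ A.T, A @ y)
x_hat = y - adj

print(x_hat)       # reconciled losses; they now satisfy the balance
print(A @ x_hat)   # residual of the balance constraint (near zero)
```

Measurements with small variances are moved the least, so a precise time-series estimate (or a fresh field test) dominates the reconciled values, which is the intended behavior.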
If the direct measurements were taken on the day of interest, then we should be able to determine some initial uncertainty value, σo, based on the quality of the testing equipment. However, as the days progress, the validity of the direct measurement will diminish due to the potential deterioration of the process equipment. Consequently, we will need to model the uncertainty of the direct measurements such that the variance of measurements increases with time. A possible uncertainty description for the direct measurements is

σk² = σo² + f(k - ko)

where ko is the day of the last field test, σo² is the initial variance of the field test, and f (with f(0) = 0) is some


monotonically increasing function. Clearly, the function f(k) should depend on the reliability (or average failure rate) of the particular piece of equipment being considered.

4.3. Interpretation of Loss Estimates

The final task of the loss accounting activity is the interpretation of the results. In general, this will be the most labor intensive of all the tasks. The crudest approach is to search for rising trends in the various time series of loss estimates. However, due to the random nature of the estimates (and the low flow rates of emissions), it will be difficult to distinguish between the identification of an actual leak and a false alarm. Fortunately, the statistical foundation of our earlier methods provides some meaningful guidance.

The methods and philosophy of statistical quality control (SQC) are particularly applicable to this task. The basic method of SQC is to graph the time series around a center line, xo, and search for trends with respect to its standard deviation, σ. In light of the previous section, the choice of σ is clear, while that of xo is not. There are two basic ways to arrive at a value for xo. The first is to use a prespecified value, which could be determined either from the equipment documentation or through the semiempirical models provided by the EPA/API. The second approach is to derive xo directly from the time history of the estimates.

Specific to the leak detection application, a threshold line can be appended to the control chart. This line would identify the maximum allowable loss rate for a specific piece of equipment. Used in the context of the other SQC lines, the user will be capable of making rational decisions about the leak status of a particular loss source. In addition, many of the SQC “rules” can be automated so that only loss sources that have been flagged as possible leaks will require further investigation by plant personnel.
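One simple automated SQC rule can be sketched as follows. The `flag_losses` helper and all numbers are ours (hypothetical); it flags any day whose loss estimate exceeds either the n-sigma control limit around the center line xo or the hard maximum-allowable threshold:

```python
def flag_losses(estimates, sigmas, x_o, threshold, n_sigma=3.0):
    """Return indices of days whose loss estimate exceeds either the
    n_sigma control limit around the center line x_o or the hard
    maximum-allowable-loss threshold."""
    flagged = []
    for k, (x, s) in enumerate(zip(estimates, sigmas)):
        if x > x_o + n_sigma * s or x > threshold:
            flagged.append(k)
    return flagged

# Hypothetical daily loss estimates and their filter-supplied sigmas.
estimates = [0.9, 1.1, 1.0, 4.2, 1.2]
sigmas = [0.3, 0.3, 0.3, 0.3, 0.3]
print(flag_losses(estimates, sigmas, x_o=1.0, threshold=3.0))  # [3]
```

Only day 3 is flagged for follow-up, so plant personnel would field test only that source; richer SQC run rules (e.g., several consecutive points above a lower limit) could be added in the same style.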
The final aspect of the loss accounting activity is interaction with the LDAR program. Clearly, the identification of leaks through the indirect measurements will warrant additional field tests of specific equipment. By the same logic, the absence of a leak status should allow for a reduction in test frequency without sacrificing the effectiveness of the program. A quantitative investigation of this interplay between loss accounting and LDAR programs will be the subject of future research.

5. Conclusions

In this work, we identified the concept of loss accounting as a more comprehensive approach to leak detection as compared to the usual field-test-based emission monitoring activity. Toward the implementation of a loss accounting program, we developed a statistically significant, time-series based method for the extraction of loss information from indirect measurement sources. The method was successfully tested under operating/noise conditions expected in a refining facility. In addition, the method was shown to be versatile enough to account for a number of apparent losses, particularly instrument biases. In the final section, we discussed the interaction of the time-series estimation scheme with the other aspects of the loss accounting activity.

Acknowledgment

Financial support from the National Science Foundation Award No. GER-9554570 and Simulation Sciences, Inc., is gratefully acknowledged.

Appendix

In this appendix, we consider the accuracy of total mass flow measurements. Consider a typical mass flow rate measurement $\dot{m}^+$. This measurement contains the actual flow rate, $\dot{m}$, a random error term, $dn$, and a systematic error term that is typically a function of the actual rate, $f(\dot{m})$ (with $f(0) = 0$). It is well-known that the accuracy of a flow rate meter is a function of the rate (approximately 1-3% relative error). This is reflected in the standard deviation of the random error term: for a relative error of $\pm a\%$, the standard deviation is $\sigma_{dn} = (a\%/2)\,\dot{m}$. The measurement of the total mass passing through a meter in a day, $m^+$, will then be calculated by numeric integration of the flow rate:

\[ m^+ = \int_0^1 \dot{m}^+(t)\, dt \cong \sum_{i=1}^{N} \dot{m}_i^+ \,\Delta t = m + n + g(m) \]

where $\dot{m}_i^+$ $(= \dot{m}^+(i\Delta t))$ is in units of kilograms per day, $N \Delta t = 1$ day, and $m$, $n$, and $g(m)$ are defined as

\[ m = \sum_{i=1}^{N} \dot{m}_i \,\Delta t, \qquad n = \sum_{i=1}^{N} dn_i \,\Delta t, \qquad g(m) = \sum_{i=1}^{N} f(\dot{m}_i)\,\Delta t \]

Random Errors. Consider the random error term $n$. Assume that $\Delta t$ is large enough such that the random error sequence, $\{dn_i\}$, has no autocorrelation (i.e., $E[dn_i\, dn_j] = 0$ for all $i \neq j$). As a result,

\[ \sigma_n^2 = \sum_{i=1}^{N} (\sigma_{dn_i} \Delta t)^2 = \sum_{i=1}^{N} \left( \frac{a\%}{2}\, \dot{m}_i \,\Delta t \right)^2 \]

To facilitate a simple expression for the standard deviation of $n$, we assume the flow rate through the meter to be constant for a portion of the day and zero for the rest. In other words, $\dot{m}_i = \dot{m}_o$ for $M$ time samples and $\dot{m}_i = 0$ for the remaining $N - M$ samples. As a result of this assumption, the following identity must hold: $\dot{m}_o M \Delta t = m$, where $m$ is the actual total flow. Thus,

\[ \sigma_n^2 = M \left( \frac{a\%}{2}\, \dot{m}_o \,\Delta t \right)^2 = \left( \frac{a\%}{2} \right)^2 \Delta m_o\, m \]

where $\Delta m_o = \dot{m}_o \Delta t$ is the total mass flow through the meter during one sample period, $\Delta t$. From this expression, it is clear that the size of the random error in the total mass flow measurement will depend on the total mass flow during the day and the mass flow during one sample period. To reduce the error of a total flow measurement, one should either sample the rate more frequently or reduce the average flow rate.

Systematic Errors. Now consider the systematic error term $g(m)$. If the flow rate is assumed to be constant for part of the day and zero the rest (as it was assumed above), then the following simple expression for $g(m)$ is found:

\[ g(m) = f(\dot{m}_o) M \Delta t = \left( \frac{f(\dot{m}_o)}{\dot{m}_o} \right) m \]

Overall Measurement Model. Thus, a simple mechanism is found for the total mass flow measurement

\[ m^+ = m + bm + n \cong m + bm^+ + n \]

where $b = f(\dot{m}_o)/\dot{m}_o$ is a constant and the standard deviation of the random error, $n$, is proportional to the square root of the total mass flow. That is,

\[ \sigma_n = \frac{a\%}{2} (\Delta m_o)^{1/2} m^{1/2} = C_n m^{1/2} \cong C_n (m^+)^{1/2} \]
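The square-root scaling of $\sigma_n$ can be checked numerically. The following Monte Carlo sketch (all values illustrative, not from the paper) quadruples the number of active samples $M$, and hence the total mass $m$, and observes the integrated error standard deviation roughly doubling:

```python
import numpy as np

rng = np.random.default_rng(0)
a = 0.02                    # 2% relative error band (illustrative)
mdot_o, dt = 100.0, 1.0     # constant active flow rate and sample period

def total_error_std(M, trials=50_000):
    """Std of the integrated random error n when M samples are active,
    each carrying zero-mean noise with sigma = (a/2) * mdot_o."""
    noise = rng.normal(0.0, (a / 2.0) * mdot_o, size=(trials, M))
    return (noise.sum(axis=1) * dt).std()

s1 = total_error_std(64)    # total mass m  = mdot_o * 64 * dt
s2 = total_error_std(256)   # total mass 4m; error should roughly double
print(s2 / s1)
```

This agrees with the closed-form result above: quadrupling $m$ at fixed $\Delta m_o$ doubles $\sigma_n$.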

Literature Cited

(1) Guide to Hydrocarbon Loss Accounting and Control in Petroleum Refinery Operations: Petroleum Measurement Manual, Part XVII; Institute of Petroleum: London, October 1995.
(2) Compilation of Air Pollution Emission Factors; Publication AP-42, 5th ed.; U.S. Environmental Protection Agency: Washington, DC; Vol. 1 with Supplement A.
(3) Allen, D. T.; Rosselot, K. S. Pollution Prevention for Chemical Synthesis; John Wiley & Sons: New York, 1997.
(4) Rapaport, D. Improve Above Ground Storage Tank Management. Hydrocarbon Process. 1997 (Oct), 50.
(5) Siegell. Improve your Leak Detection and Repair Program. Chem. Eng. Prog. 1997 (Jul), 50.
(6) Bagajewicz, M.; Jiang, Q. Gross Error Modeling and Detection in Plant Linear Dynamic Reconciliation. Comput. Chem. Eng. 1998, 22, 1789.

(7) Bascunana, M. B.; Rollins, D. K. New Strategies for the Detection of Measurement Biases in Processes Under Dynamic Conditions and Having Bilinear Constraints. AIChE Annual Meeting, Miami Beach, FL, November 1998; Paper 240C.
(8) Carpentier, P.; Cohen, G. State Estimation and Leak Detection in Water Distribution Networks. Civ. Eng. Syst. 1991, 8, 247.
(9) Kao, C.; Tamhane, A. C.; Mah, R. S. H. Gross Error Detection in Serially Correlated Process Data. 2. Dynamic Systems. Ind. Eng. Chem. Res. 1992, 31, 254.
(10) Tong, H.; Crowe, C. M. Detecting Persistent Gross Errors by Sequential Analysis of Principal Components. AIChE J. 1997, 43, 1242.
(11) Miles, J.; Jelffs, P. A. M. Computer Aided Loss Investigation and Monitoring. In Proceedings of the Second Oil Loss Control Conference; Jelffs, P. A. M., Ed.; John Wiley & Sons: New York, 1988.
(12) Jelffs, P. A. M.; Harrison, P. S. Service Station Monitoring. In Proceedings of the Third Oil Loss Control Conference; Jelffs, P. A. M., Ed.; John Wiley & Sons: New York, 1990.
(13) Liou, J. P. Leak Detection by Mass Balance Effective for Norman Wells Line. Oil Gas J. 1996 (Apr 22), 69.
(14) Maresca, J. W.; Starr, J. W.; Roach, R. D.; Naar, D.; Smedfjeld, R.; Farlow, J. S.; Hillger, R. W. Evaluation of Volumetric Leak Detection Methods Used in Underground Storage Tanks. J. Hazard. Mater. 1991, 26, 261.
(15) Control Techniques for Volatile Organic Compound Emissions from Stationary Sources; 453/R-92-018; U.S. Environmental Protection Agency: Washington, DC, 1992.
(16) Anderson, B. D. O.; Moore, J. B. Optimal Filtering; Prentice Hall: Englewood Cliffs, NJ, 1979.
(17) Balakrishnan, A. V. Kalman Filtering Theory; Optimization Software, Publications Division: New York, 1987.
(18) Ljung, L.; Soderstrom, T. Theory and Practice of Recursive Identification; MIT Press: Cambridge, MA, 1983.

Received for review August 2, 1999
Revised manuscript received April 10, 2000
Accepted April 17, 2000

IE990573O