Gross error detection and data reconciliation in experimental kinetics

Ind. Eng. Chem. Res. 1993, 32, 2530-2536


Ailene G. Phillips, Union Carbide Chemicals and Plastics, Inc., South Charleston, West Virginia 25303

Douglas P. Harrison*, Department of Chemical Engineering, Louisiana State University, Baton Rouge, Louisiana 70803

A formal gross error detection and data reconciliation analysis based upon the modified iterative measurement test (MIMT) has been applied to pilot plant kinetics data. Raw and reconciled data were used to determine reaction rate equations using linear regression. The reactor was an isothermal CSTR, and the reacting system involved competitive catalytic reactions with catalyst deactivation. The rates of the competitive reactions were approximately 1 order of magnitude different, and, as a consequence, regression results for the slower secondary reaction were poor. The “best” rate equations for both reactions resulted when data sets identified as containing gross errors were eliminated and regression was performed using reconciled data from the remaining data sets which contained no gross errors.

Introduction

Laboratory and pilot plant data are required to develop the kinetic models needed for the design of a process involving chemical reactions. Today’s highly automated pilot plants often provide such an extensive array of measurements that redundant methods are available for checking material balance consistency. Due to the unavoidable presence of random errors and the possible presence of gross errors, the redundant material balances are rarely self-consistent, and calculated reaction rates, selectivities, and yields may vary widely, especially when conversions are low. Engineering judgment, experience, and “loose” statistics have traditionally been used to select data subsets which produce believable reaction rates and selectivities, while the remaining data are ignored.

When only random errors are present, least-squares data reconciliation may be used to adjust the measurements so that the data set becomes consistent with the material balance constraints. The undetected presence of gross errors, however, may skew the reconciliation and lead to incorrect rate equations. When gross errors exist, it is desirable to identify the measurement(s) containing the gross error(s), eliminate the value(s) from the matrix of measured values, and limit the data reconciliation to measurements which contain only random errors. A complete description of data reconciliation and gross error detection methods has been presented by Mah (1990). In this paper we have applied the modified iterative measurement test (MIMT) of Serth and Heenan (1986) to analyze nonlinear pilot-scale reaction data. Kinetic rate equations determined by linear regression analysis provide a basis for evaluating the results.

Modified Iterative Measurement Test

Serth and Heenan (1986) compared the performance of seven gross error detection algorithms applied to a steam metering network using 100 randomly generated data sets. The MIMT method represented the best combination of computational speed and effectiveness. Serth and Heenan later applied the MIMT method to nonlinear problems involving metal grinding (1987) and steady-state distillation (1988).

Only a brief outline of the method is presented here. Readers interested in additional details are referred to the original paper (Serth and Heenan, 1986) or the thesis of Phillips (1991), which also contains a listing of the computer program written for the pilot plant reaction problem.

The MIMT method begins with a constrained least-squares minimization using all measured values to calculate an initial set of adjusted measurements. A test statistic, p, based on a normalized difference between the measured and reconciled values, is then calculated for each measurement. The measurement with the largest p greater than p-critical is deleted from the measurement vector and added to the computed vector. p-critical, as defined by Mah and Tamhane (1982), is a function of α, the probability of erroneously identifying a gross error where none exists. The data are then rereconciled. If any of the adjusted or computed values fall outside preselected upper and lower bounds, the last variable deleted is returned to the measurement vector and the variable with the next largest p greater than p-critical is deleted. The data are again rereconciled, and the results again checked against the lower and upper bounds. After the bounds requirements are met, the test statistic is recalculated for the last set of reconciled values to determine if another gross error exists. When the test statistic for each measurement is less than p-critical, or when removing the offending measurement(s) causes results to be out of bounds, the process is complete.

The maximum number of gross errors is usually limited to some reasonable subset of the number of degrees of freedom of the problem. No strict guidelines exist for establishing the maximum number of gross errors. The limit is subjective and depends on the specific characteristics of the problem such as the degree of nonlinearity and the level of noise in the data. A maximum of five was used in this study, although the number of gross errors identified was always less than the maximum. The widest possible bounds on each of the measured variables were specified, and the effect of varying the bounds was not studied. Flow rates were required to be equal to or greater than zero while mole fractions were limited to values between 0 and 1. These bounds were not violated in any of the tests using either computer-generated data or real data.
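The procedure outlined above can be sketched for a purely linear problem. This is a minimal illustration under stated assumptions, not the program used in the study: the four-stream splitter/mixer network, the constraint matrix `A`, the threshold `z_crit = 2.23`, and all function names are hypothetical, and a deleted measurement is approximated by assigning it a very large variance rather than by formally moving it to the computed vector.

```python
import numpy as np

BIG_VAR = 1e12  # assumed stand-in for "unmeasured": a huge variance removes a measurement's influence


def reconcile(y, sigma, A, removed):
    """Weighted least-squares reconciliation subject to the linear balances A @ x = 0."""
    var = sigma ** 2                      # new array, so sigma itself is not modified
    for i in removed:
        var[i] = BIG_VAR
    S = np.diag(var)
    V = A @ S @ A.T                       # covariance of the balance residuals
    e = S @ A.T @ np.linalg.solve(V, A @ y)      # adjustments, y - x_hat
    W = S @ A.T @ np.linalg.solve(V, A @ S)      # covariance of the adjustments
    z = np.abs(e) / np.sqrt(np.diag(W))   # normalized adjustment per measurement
    return y - e, z


def mimt(y, sigma, A, z_crit, lower, upper, max_gross=5):
    """Iteratively delete the most suspicious measurement, subject to a bounds check."""
    removed = []
    x_hat, z = reconcile(y, sigma, A, removed)
    while len(removed) < max_gross:
        # Still-measured variables above the critical value, largest statistic first
        cands = [int(i) for i in np.argsort(-z)
                 if int(i) not in removed and z[i] > z_crit]
        accepted = False
        for i in cands:
            trial = removed + [i]
            x_try, z_try = reconcile(y, sigma, A, trial)
            if np.all(x_try >= lower) and np.all(x_try <= upper):
                removed, x_hat, z = trial, x_try, z_try
                accepted = True
                break                     # bounds satisfied: keep this deletion
        if not accepted:
            break                         # nothing exceeds z_crit, or every deletion violates bounds
    return x_hat, removed


# Hypothetical network: stream 0 splits into 1 and 2, which merge into 3.
A = np.array([[1.0, -1.0, -1.0, 0.0],
              [0.0, 1.0, 1.0, -1.0]])
sigma = np.array([0.10, 0.06, 0.04, 0.10])
y = np.array([12.0, 6.0, 4.0, 10.0])      # measurement 0 carries a gross error (true value 10)
x_hat, flagged = mimt(y, sigma, A, z_crit=2.23,
                      lower=np.zeros(4), upper=np.full(4, np.inf))
print(flagged, np.round(x_hat, 3))
```

In this toy case the first reconciliation spreads the error over all four streams, which is the smearing effect the method is designed to counter; once the flagged measurement is released, the remaining adjustments become negligible.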

0888-5885/93/2632-2530$04.00/0 © 1993 American Chemical Society

Ind. Eng. Chem. Res., Vol. 32, No. 11, 1993

Table I. Typical Experimental Material Balance Illustrating the Variation in Reaction Rates and Selectivity

Figure 1. Schematic of the pilot reactor system.

The calculation of p-critical required that the value of α (the probability of erroneously identifying a gross error when none exists) be specified. Following Serth et al. (1987), a value of α = 0.10 was used in all tests involving pilot plant data and in most tests involving computer-generated data. α = 0.05 was used in some tests with computer-generated data to examine the direct effect of α.
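How α translates into a critical value can be illustrated with a short calculation. The sketch below assumes a Šidák-type adjustment over the number of measurements followed by a two-sided standard normal quantile, which is one common form of the measurement-test threshold; the function name and this exact form are assumptions and are not taken from the program used in the study.

```python
from statistics import NormalDist


def critical_value(alpha, n_measurements):
    """Two-sided standard normal critical value with a Sidak-type adjustment (assumed form)."""
    # Per-measurement error rate that gives an overall rate of alpha across n tests
    beta = 1.0 - (1.0 - alpha) ** (1.0 / n_measurements)
    return NormalDist().inv_cdf(1.0 - beta / 2.0)


z_10 = critical_value(0.10, 23)   # 23 measurements per run, alpha = 0.10
z_05 = critical_value(0.05, 23)   # a stricter alpha gives a larger threshold
print(round(z_10, 2), round(z_05, 2))
```

Under this assumed form, lowering α raises the threshold, so fewer measurements are flagged and fewer false identifications are expected.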

Finally, percent standard deviations of the random error for each of the measured variables were required as input. In general, the standard deviations were based upon engineering judgment and knowledge of the performance of the instruments used for chemical analysis and flow control and measurement in the pilot plant. In some tests using computer-generated data the effect of a change in the standard deviation was specifically examined.
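Because the standard deviations are supplied as percentages, it is convenient to express a gross error as a multiple of the standard deviation. The helper below is a small illustration; the 1.5% standard deviation and the 3.5 mol/h flow rate are hypothetical inputs chosen for the example.

```python
def absolute_sigma(mean_value, sigma_pct):
    """Convert a percent standard deviation into the units of the measurement."""
    return mean_value * sigma_pct / 100.0


def error_in_sigmas(error_pct, sigma_pct):
    """Size of an error, given as a percent of the mean, in standard deviations."""
    return abs(error_pct) / sigma_pct


# Hypothetical flow of 3.5 mol/h with a 1.5% standard deviation and a +25% gross error
sigma_abs = absolute_sigma(3.5, 1.5)
n_sigma = error_in_sigmas(25.0, 1.5)
print(sigma_abs, round(n_sigma, 1))
```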

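Experimental System

The reaction stoichiometry is

$$\mathrm{A} + \tfrac{1}{2}\mathrm{B} + \mathrm{C} \rightarrow \mathrm{D} + \mathrm{E} \qquad \text{(primary reaction)}$$

$$\mathrm{A} + 3\mathrm{B} \rightarrow 2\mathrm{E} + 2\mathrm{F} \qquad \text{(secondary reaction)}$$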

Both reactions are irreversible and occur in the gas phase in the presence of a solid catalyst which deactivates with time. D is the desired product while neither E nor F has sales or fuel value.

A schematic of the pilot plant reactor is shown in Figure 1. Operating conditions in the Berty autoclave were designed to eliminate mass transfer and pore diffusion resistances and to ensure isothermal, perfectly mixed reaction conditions. Pure components A, B, F, and I (inert) were obtained from high-pressure gas cylinders with flow rates controlled by mass flow controllers. Two liquid mixtures, the first containing components C and E and the second containing C-E, were fed using positive displacement pumps. Reactor product was separated into three streams, a gaseous overhead containing only A, B, F, and I and two liquid mixtures containing different proportions of components C-E. The composition of all feed and product mixtures was determined using gas chromatography. The flow rates of all inlet and outlet streams except the overhead product were measured. The flow rate of this stream was calculated via material balance on component I. Twenty-three process measurements identified by the number codes in Figure 1 were made for each run. The material balance constraints consist of mole balances for the six components and mole fraction sums for the five feed and product streams involving two or more components.

A single batch of catalyst was used for the entire study. The experimental plan consisted of 16 experiments using a fractional factorial statistical design based on reactor inlet conditions, seven centerpoint runs (using the same reactor temperature, pressure, and feed conditions) to evaluate catalyst deactivation, and a series of augment runs designed to make the final data set orthogonal in the reactor outlet conditions. The goal of the overall experimental plan was to minimize the amount of confounding of the independent variables in the final data set following the approach of Cropley and Burgess (1986). Eight independent variables were studied: temperature, partial pressures of A-F, and time. Products D-F were included in the reactor feed to check for product inhibition. Temperature was controlled within ±0.1 °C of the set point. Total pressure was held constant at 140 ± 1 psia with nitrogen serving as the inert. The reactor was operated continuously for more than 30 days. Temperature, feed rates, and feed compositions were changed in the afternoon, and the reactor was allowed to line out overnight. Two-hour data collection runs were carried out the following morning. Complete data sets for all runs are available (Phillips, 1991).

The rate of consumption or formation of each component is the difference between the inlet and outlet component flow rates. By combining the component rates and reaction stoichiometry, the rate of each reaction can be calculated in several different ways. If the data contained no errors, each calculation method would yield the same value. The rate of the primary reaction, expressed as mol of D/h, can be calculated from either the rate of appearance of D or from the rate of disappearance of C. The rate of the secondary reaction, expressed as mol of F/h, can be calculated directly from the rate of formation of F or from any of six combinations of the other reacting components: (1) A and C, (2) B and C, (3) E and C, (4) A and D, (5) B and D, and (6) E and D. In a similar manner, redundant methods exist for calculating the selectivity of A to product D.

Table I illustrates the problem which arises using raw data from an experimental run which, from subsequent MIMT analysis, is believed to contain only random errors. Each calculation basis for the reaction rates and selectivity provides a different numerical value. In a number of cases, unreasonable results involving negative reaction rates and selectivities greater than 100% are calculated.

Table I

| component | inlet flow rate (mol/h) | outlet flow rate (mol/h) | change (mol/h) |
|---|---|---|---|
| A | 3.498 | 3.379 | -0.119 |
| B | 0.739 | 0.589 | -0.150 |
| C | 1.635 | 1.432 | -0.203 |
| D | 0.020 | 0.168 | 0.148 |
| E | 0.213 | 0.348 | 0.135 |
| F | 0.479 | 0.500 | 0.021 |

Primary Reaction Rate

| calcn basis | calcd rate (mol of D/h) |
|---|---|
| C | 0.203 |
| D | 0.148 |

Secondary Reaction Rate

| calcn basis | calcd rate (mol of F/h) |
|---|---|
| F | 0.021 |
| A, C | -0.168 |
| B, C | 0.033 |
| E, C | -0.068 |
| A, D | -0.058 |
| B, D | 0.051 |
| E, D | -0.013 |

Selectivity (A to D)

| calcn basis | % |
|---|---|
| A, C | 170.6 |
| B, C | 92.6 |
| E, C | 120.1 |
| F, C | 95.0 |
| A, D | 124.4 |
| B, D | 85.3 |
| E, D | 104.6 |
| F, D | 93.3 |
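The redundant calculations behind Table I can be reproduced in a few lines. The sketch assumes the stoichiometry A + ½B + C → D + E and A + 3B → 2E + 2F, with the primary rate in mol of D/h and the secondary rate in mol of F/h; the function and variable names are illustrative only.

```python
inlet = {"A": 3.498, "B": 0.739, "C": 1.635, "D": 0.020, "E": 0.213, "F": 0.479}
outlet = {"A": 3.379, "B": 0.589, "C": 1.432, "D": 0.168, "E": 0.348, "F": 0.500}
change = {k: outlet[k] - inlet[k] for k in inlet}   # mol/h, positive = formed

# Primary rate r1 (mol of D/h): from D formed or from C consumed
r1 = {"D": change["D"], "C": -change["C"]}


def secondary_rate(basis, r1_value):
    """r2 (mol of F/h) from one component balance and a chosen primary rate."""
    if basis == "F":
        return change["F"]
    if basis == "A":                      # A consumed = r1 + r2/2
        return 2.0 * (-change["A"] - r1_value)
    if basis == "B":                      # B consumed = r1/2 + 3*r2/2
        return (-change["B"] - 0.5 * r1_value) / 1.5
    if basis == "E":                      # E formed = r1 + r2
        return change["E"] - r1_value
    raise ValueError(basis)


def selectivity(basis, r1_value):
    """Percent of converted A that ends up as D: r1 / (r1 + r2/2)."""
    r2 = secondary_rate(basis, r1_value)
    return 100.0 * r1_value / (r1_value + 0.5 * r2)


for primary in ("C", "D"):
    for second in ("A", "B", "E", "F"):
        print(second, primary,
              round(secondary_rate(second, r1[primary]), 3),
              round(selectivity(second, r1[primary]), 1))
```

Each pairing uses the same six flow-rate differences, so the spread in the printed values comes entirely from measurement error in the raw data.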

Reaction Rate Equation

The complete study consisted of two separate statistical analyses. Data reconciliation and gross error detection using the MIMT method were followed by linear regression to correlate the final reconciled data using the linearized version of the following rate equation:

$$r_j = A_j \exp(-E_j/RT)\,\exp(-b_j t)\,\prod_i P_i^{a_{ij}}$$

Taking logarithms gives the form that is linear in the constants:

$$\ln r_j = \ln A_j - \frac{E_j}{RT} - b_j t + \sum_i a_{ij} \ln P_i$$

Index j = 1 represents the primary reaction while j = 2 represents the secondary reaction. r is the reaction rate (mol/h), A the frequency factor (units depend on reaction order), E the activation energy (cal/mol), R the gas constant (1.987 cal/mol K), T the temperature (K), t the time (h), b the deactivation constant (h⁻¹), P_i the partial pressure of component i (psia), and a_ij the reaction order for component i in reaction j. All reactant and product species were considered in the rate equation, i.e., i = A, B, C, D, E, and F. Nine constants (A, E, b, and the six reaction orders) were evaluated by linear least-squares methods using the software package Data Desk. In the initial regression, the full form of the rate equation was used, and best values of all constants were determined. The constant having the lowest statistical significance was then eliminated and the regression repeated. Constants were eliminated one by one until all remaining constants were statistically significant at the 95% confidence level.

On the basis of prior experience with similar reactions in both plant and pilot-plant settings, the activation energy of the primary reaction was expected to be between 14 and 20 kcal/mol while the activation energy for the secondary reaction was expected to be approximately 50% higher.

Testing of Computer-Generated Data

An extensive series of tests using computer-generated data containing no gross errors as well as single and multiple gross errors of known magnitude and location was conducted to validate the computer program, to determine the effect of MIMT parameters, and to compare performance to that reported by Serth et al. (1987) for a different system.
In all computer tests the data sets were generated about a specified mean value which was representative of actual reaction conditions and, when used without adjustment, provided for perfect material balance closure.

Data Containing No Gross Errors. Initially, 100 data sets containing no gross errors were tested to validate the data reconciliation capability. Results for one example test are presented in Table II. Although reconciliation results in relatively small changes in each of the measured values, the changes in reaction rate and selectivity are quite dramatic. The rate of the primary reaction was either 0.333 or 0.394 mol of D/h from the raw data and 0.341 mol of D/h after reconciliation. Using unreconciled data, the rate of the secondary reaction varied from -0.061 to 0.118 mol of F/h, depending upon the calculation method, with the reconciled value being 0.069 mol of F/h. Selectivity, which ranged from 85.0 to 108.4% using unreconciled data, was found to be 90.9% after reconciliation. In the Table II example, the magnitude of the average percent deviation of the 23 computer-generated values from the mean values was 2.735%. Reconciliation reduced the magnitude of the average percent deviation to 2.50%. The overall average standard deviation of the 23 computer-generated values for the 100 data sets was 2.81%. This was reduced to 2.39% after reconciliation.

Table II. Results of Reconciliation of a Typical Computer-Generated Data Set Containing No Gross Error

| index (see Figure 1) | computer-generated value | % deviation from mean | reconciled value | % change from computer-generated value | % deviation from mean (reconciled) |
|---|---|---|---|---|---|
| 1 | 3.4620 | 1.59 | 3.4819 | 0.57 | 1.03 |
| 2 | 0.7346 | -1.89 | 0.7363 | 0.23 | -2.12 |
| 3 | 0.4671 | 0.41 | 0.4637 | -0.73 | 1.13 |
| 4 | 1.5851 | -0.45 | 1.5906 | 0.34 | -0.80 |
| 5 | 1.4769 | -3.57 | 1.4548 | -1.49 | -1.15 |
| 6 | 0.8761 | 0.34 | 0.8737 | -0.27 | -0.07 |
| 7 | 0.1239 | 2.36 | 0.1263 | -1.94 | 0.47 |
| 8 | 0.4145 | 2.01 | 0.4141 | -0.10 | 2.10 |
| 9 | 0.8389 | 0.60 | 0.8384 | -0.06 | 0.66 |
| 10 | 0.0469 | -4.22 | 0.0468 | -0.21 | -4.00 |
| 11 | 0.1142 | -2.88 | 0.1148 | 0.53 | -3.42 |
| 12 | 0.5472 | 0.78 | 0.5458 | -0.26 | 1.03 |
| 13 | 0.0816 | 1.09 | 0.0813 | -0.37 | 1.45 |
| 14 | 0.0912 | 0.65 | 0.0935 | 2.52 | -1.85 |
| 15 | 0.2799 | -2.08 | 0.2794 | -0.18 | -1.90 |
| 16 | 2.0289 | -3.30 | 2.0181 | -0.53 | -2.75 |
| 17 | 0.5906 | 6.08 | 0.6079 | 2.76 | 3.32 |
| 18 | 0.0974 | -7.51 | 0.0991 | 1.75 | -9.38 |
| 19 | 0.3120 | -11.22 | 0.2931 | -6.06 | -4.49 |
| 20 | 0.2522 | 2.63 | 0.2599 | 3.05 | -0.34 |
| 21 | 0.1969 | 1.94 | 0.1956 | -0.66 | 2.59 |
| 22 | 0.6132 | -1.81 | 0.6159 | 0.44 | -2.26 |
| 23 | 0.1899 | 3.56 | 0.1885 | -0.73 | 9.27 |

Primary Reaction Rate (mol of D/h)

| basis | before reconciliation | after reconciliation |
|---|---|---|
| D | 0.333 | 0.341 |
| C | 0.394 | |

Secondary Reaction Rate (mol of F/h)

| basis | before reconciliation | after reconciliation |
|---|---|---|
| F | 0.049 | 0.069 |
| B, D | 0.071 | |
| B, C | 0.050 | |
| E, D | 0.118 | |
| E, C | 0.057 | |
| A, D | 0.061 | |
| A, C | -0.061 | |

Selectivity of A to D (%)

| basis | before reconciliation | after reconciliation |
|---|---|---|
| F, D | 93.1 | 90.9 |
| F, C | 94.1 | |
| B, D | 90.4 | |
| B, C | 94.0 | |
| E, D | 85.0 | |
| E, C | 93.3 | |
| A, D | 91.6 | |
| A, C | 108.4 | |
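The average deviations quoted in the text for the Table II example can be checked directly from the two percent-deviation columns. The snippet below is only an arithmetic check; the lists are transcribed from the table as printed, and the variable names are illustrative.

```python
# Percent deviation from the mean, measurements 1-23, as printed in Table II
generated = [1.59, -1.89, 0.41, -0.45, -3.57, 0.34, 2.36, 2.01, 0.60, -4.22,
             -2.88, 0.78, 1.09, 0.65, -2.08, -3.30, 6.08, -7.51, -11.22, 2.63,
             1.94, -1.81, 3.56]
reconciled = [1.03, -2.12, 1.13, -0.80, -1.15, -0.07, 0.47, 2.10, 0.66, -4.00,
              -3.42, 1.03, 1.45, -1.85, -1.90, -2.75, 3.32, -9.38, -4.49, -0.34,
              2.59, -2.26, 9.27]


def mean_abs(values):
    """Average magnitude of the percent deviations."""
    return sum(abs(v) for v in values) / len(values)


avg_generated = mean_abs(generated)
avg_reconciled = mean_abs(reconciled)
print(round(avg_generated, 2), round(avg_reconciled, 2))
```

The two averages agree with the values quoted in the text to within rounding.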

The MIMT method, combining gross error detection and data reconciliation, was then tested using 100 computer-generated data sets which again contained no gross errors. The purpose was to determine the frequency at which the program identified gross errors when none existed and to evaluate the effect of the parameter α. For α = 0.05, the program incorrectly identified five measurements in the 100 test cases as containing gross errors while the number of incorrect identifications increased to seven with α = 0.10. From a statistical viewpoint, the fraction of incorrect identifications should approach α as the number of simulations approaches infinity. The latter value of α was used in subsequent testing.

Data Containing a Single Gross Error. A single gross error of known magnitude and location was then superimposed upon the computer-generated data containing random error. Due to the virtually unlimited number of combinations of gross error size and location, the testing was limited to three types of errors, those involving single component flow rate, mixture flow rate, and mixture composition. Selected results are summarized in Table III; each entry represents results from 50 or 100 tests.

Table III. Summary of Results Using Computer-Generated Data Containing One Gross Error

| gross error type | location (see Figure 1) | error magnitude (% of mean value) | error magnitude (no. of standard devns) | identificn success rate (%) |
|---|---|---|---|---|
| pure component flow rate | 1 | +25 | 16.7 | 96 |
| pure component flow rate | 1 | -25 | 16.7 | 94 |
| pure component flow rate | 1 | +15 | 10 | 90 |
| pure component flow rate | 1 | +10 | 6.7 | 64 |
| pure component flow rate | 4 | +25 | 16.7 | 96 |
| mixture flow rate | 5 | +25 | 5 | 48 |
| mixture flow rate | 5 | +25 | 8.3 | 76 |
| mixture flow rate | 5 | +25 | 16.7 | 95 |
| composition, prenormalization | 12 | +25 | 10 | 0 |
| composition, postnormalization | 12 | +25 | 10 | 60 |
| composition, postnormalization | 13 | +25 | 10 | 100 |

In most tests the magnitude of the gross error was varied while holding the standard deviation of the random error constant at “best-estimate” values. In other tests the variations were reversed with the magnitude of the gross error held constant while the standard deviation of the random error was varied. When the magnitude of the error was expressed in terms of the number of standard deviations, the success rates in identifying the gross error were comparable.

Success in identifying pure component flow rate errors depended on the magnitude of the error with greater than 90% success achieved when the gross error was greater than 10 times the standard deviation of the mean error. In those cases where the error in measurement 1 was not correctly identified, the program either incorrectly identified a gross error in measurement 4 or failed to identify any gross error, in which case data reconciliation spread the error over the composition measurements.

A +25% error in measurement 5 (mixture flow rate) was correctly identified in only 48% of the test cases. However, this value was only five times the estimated standard deviation of the random error. Decreasing the standard deviation of the random error while holding the size of the gross error constant increased the success rate to levels comparable to those associated with pure component flow rates.

Mixture composition in the experimental system was determined by gas chromatographic analysis with composition normalized to 100%. Two types of gross composition errors are possible. An error in the calibration factor for a single component is spread among all the components of the mixture. Such an error is designated as a prenormalization error. In contrast, a reporting or recording error is limited to a single component and is referred to as a postnormalization error. Both types of errors were considered in the computer simulations. None of the prenormalization errors in measurement 12 were correctly identified. By spreading the error among the four components, the effective error in each component was reduced to the point that each error was considered to be random. However, when the composition error was added after normalization, the program was more successful: 60% and 100% of the postnormalization errors in measurements 12 and 13, respectively, were correctly identified. The greater success in identifying the measurement 13 error is due to the fact that component B is the limiting reactant and its relative change across the reactor is greater.

Table IV. Rate Equation Regression Results: Primary Reaction

| 1: rate param | 2: raw data (prodctn of D) | 3: complete reconciled data set | 4: gross errors identified and raw data rereconciled | 5: reconciled data with data sets containing gross errors eliminated |
|---|---|---|---|---|
| A1 | 11.07 | 10.95 | 10.97 | 15.39 |
| E1 | 12 010 | 11 340 | 11 420 | 15 520 |
| b1 | 0.001 92 | 0.001 59 | 0.001 57 | 0.002 25 |
| aA1 | NS | NS | NS | NS |
| aB1 | 0.544 | 0.451 | 0.481 | 0.590 |
| aC1 | 0.170 | NS | NS | 0.190 |
| aD1 | NS | NS | NS | -0.241 |
| aE1 | NS | NS | NS | NS |
| aF1 | NS | NS | NS | NS |
| correl coeff | 0.915 | 0.875 | 0.869 | 0.972 |
| deg of freedom | 32-6=26 | 32-4=28 | 32-4=28 | 19-6=13 |

NS: not significant at the 95% confidence level.

Data Containing Multiple Gross Errors. Results of tests in which two and then three gross errors were added to the data sets were not encouraging. In all multiple error tests, the magnitude of the error was held at 25%.

The success rate was highly dependent on which measurements contained the gross errors. In the best case, 60% of the errors were correctly identified when postnormalization composition errors were assigned to measurements 9 and 19. With errors in measurements 1 and 10, the error in 1 was correctly identified 100% of the time, but the error in 10 was found only 10% of the time. With errors assigned to measurements 5 and 20, the error in 5 was correctly identified 20% of the time but the error in 20 was never found; instead, measurement 16 was often incorrectly identified as containing a gross error.

Comparison with Serth et al. (1987). The results using computer-generated data sets containing a single gross error are generally consistent with those of Serth et al. (1987). For gross errors in the 8-20 standard deviation range, Serth et al. reported 80% success. In this study, the success rate was greater than 90% when the gross error was greater than 10 standard deviations from the mean, except when the gross error was assigned to a prenormalization composition error. With two to four gross errors ranging in magnitude from 4 to 40 standard deviations, Serth et al. reported an overall 80% success rate. The poorer performance in this study may be attributed to two factors. First, the magnitudes of the gross errors in this study were generally smaller, ranging from 5 to 17 standard deviations. The second factor is the specific nature of the problem considered. The total number of measurement values was 23 in each case. However, the Serth et al. problem contained four components, four nodes, and a single exit stream, compared to seven components, a single node (the reactor), and three exit streams in this study.
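The distinction drawn earlier between prenormalization and postnormalization composition errors can be illustrated numerically. The sketch below uses a hypothetical four-component mixture and applies the same +25% error to the first component in both ways; the composition values and function names are assumptions made for the example.

```python
def normalize(fractions):
    """Rescale a composition so that it sums to 1 (i.e., 100%)."""
    total = sum(fractions)
    return [f / total for f in fractions]


def percent_change(new, old):
    return [100.0 * (n - o) / o for n, o in zip(new, old)]


true = [0.55, 0.08, 0.09, 0.28]           # hypothetical mole fractions

# Prenormalization: calibration error enters before the analysis is normalized
pre = list(true)
pre[0] *= 1.25
pre = normalize(pre)

# Postnormalization: reporting error changes one already-normalized value only
post = list(true)
post[0] *= 1.25

pre_pct = percent_change(pre, true)
post_pct = percent_change(post, true)
print([round(p, 2) for p in pre_pct])
print([round(p, 2) for p in post_pct])
```

In this example normalization cuts the apparent error in the affected component to less than half of its original size and shifts every other component in the opposite direction, which makes a single offending measurement harder to isolate.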

Experimental Data Analysis

In the following sections the results of gross error detection, data reconciliation, and reaction rate equation determination are compared for a number of cases ranging from the use of raw data without gross error detection or reconciliation to complete treatment of the data using the MIMT method. Both reactions were analyzed, and, as might be expected, the results for the primary reaction are more satisfactory than for the secondary reaction.

Primary Reaction. Preliminary examination of the raw data suggested that the rate of formation of D should be used for the primary reaction. Final results showing the regression coefficients are tabulated in column 2 of Table IV. Five variables (A1, E1, b1, aB1, and aC1) satisfied the significance criterion and remained in the final rate equation. With 27 degrees of freedom, the overall correlation coefficient was a respectable 0.915. The activation energy was smaller than expected, but the remaining regression coefficients were reasonable.

The complete set of experimental data was then reconciled using the least-squares technique without any attempt to identify gross errors. Results similar to those presented in Table II were obtained from each run. In general, the reconciled rate of the primary reaction was closer to the experimental rate based upon the production of D, which justified the more or less arbitrary selection of this rate basis in the previous paragraph. Regression of the reconciled data was carried out as before with the least significant parameter removed one by one until all of the remaining parameters were significant at the 95% level. Results are shown in column 3 of Table IV.

Regression of reconciled data produced one fewer statistically significant variable and a somewhat poorer fit of the final rate equation to the reconciled data. Reaction order with respect to C, which was the least significant of the five variables in the regression of raw data, no longer satisfied the significance test. The other four significant variables were the same, and their order of significance, as measured by the t-ratio, was unchanged. The value of each variable changed, and the final equation resulted in a decrease in the overall correlation coefficient from 0.915 to 0.875. The magnitude of the activation energy became even smaller. These results suggest the presence of gross error in some data sets. Since such errors are not normally distributed, reconciliation apparently skewed the final values.

The gross error portion of the MIMT program was then used without reconciliation to identify the tests which contained gross errors and the specific location of the errors. Gross errors were identified in 13 of the 32 experiments. In seven cases the error was associated with measurement 2, the pure component feed rate of reactant B; other measurements identified as being in error were 14 (twice), 1, 3, and 16 (once each).

Since seven of the identified errors were associated with measurement 2, it is useful to examine the experimental aspects of this measurement in more detail. Several problems were known to exist with this mass flow controller. First, a check valve in the feed line was known to stick periodically. Second, laboratory safety procedures required that the component B mass flow controller be operated with only 5 psi pressure drop, instead of the 20 psi normally recommended. Finally, during the succeeding experimental program using this apparatus (studying a different reaction), this flow controller failed completely and had to be replaced. Thus, substantial empirical evidence exists to support the identification of the data sets containing a gross error in measurement 2.

Two approaches were followed after the experiments containing gross errors were identified. In the first, the full capabilities of the MIMT method were tested. Those measurements identified as containing a gross error were transferred to the set of calculated variables and the data were rereconciled. Reconciliation results for an example data set in which a gross error was identified in measurement 2 are shown in Table V.

Table V. Comparison of Raw and Reconciled Data from a Data Set Containing a Gross Error in Measurement 2

Column headings: index (see Figure 1); raw data value; reconciled without error detection (value, % change from raw data); rereconciled after error detection (value, % change from raw data).

3.4990 3.5546 3.4998 1 3.1 0.0 -12.9 0.7400 0.7231 0.6445 -2.3 2 0.4800 0.4815 0.3 0.4781 -0.4 3 1.5780 1.5731 0.7 -0.3 1.5897 4 1.4583 1.4779 0.5 1.3 1.4657 5 0.4 0.8841 0.8873 0.8849 0.7 6 -2.7 0.1159 0.1127 0.1153 -0.5 7 0.4247 0.1 0.4250 0.4249 0.0 8 0.8446 0.8454 0.1 0.0 0.8448 9 0.0447 0.0448 0.0447 10 0.0 0.2 0.1107 0.1098 0.1105 11 0.2 -0.8 0.5601 12 -0.2 -0.8 0.5558 0.5600 0.0894 0.0 5.1 0.0940 0.0894 13 0.0851 1.6 -0.9 14 0.0865 0.0843 -0.2 0.8 0.2643 0.2637 0.2663 15 1.8540 1.8510 1.8670 0.7 -0.2 16 -1.5 0.7470 0.7448 0.7361 17 -0.3 -1.1 0.0561 0.0555 0.0560 18 -0.2 0.1992 0.1969 5.8 1.2 0.2084 19 0.2540 0.2492 0.2520 -0.8 1.9 20 0.4134 0.4151 0.4138 21 0.1 0.4 0.3937 22 -0.2 -0.8 0.3904 0.3929 0.1929 0.1932 0.1945 23 0.2 0.8

| reaction rate | raw data | reconciled without error detection | rereconciled after error detection |
|---|---|---|---|
| primary reactn | 0.185 (D basis) | 0.182 | 0.184 |
| secondary reactn | 0.028 (F basis) | 0.043 | 0.016 |

Raw experimental data are compared to reconciled data without gross error detection and to rereconciled data after the gross error in measurement 2 was identified and removed from the set of measured variables. Without gross error detection, the error in measurement 2 was distributed over a number of the measured variables. In particular, reconciliation resulted in “corrections” of greater than 5% in both measurements 13 and 19. After the -12.9% gross error in measurement 2 was identified, the adjustments to measurements 13 and 19 were 0.0% and 1.2%, respectively. Similarly, the magnitude of the adjustment after gross error identification was smaller in 19 of the remaining 22 measurements. Neither of the reconciliation approaches produced a significant change in the primary reaction rate.
However, the sensitivity of the secondary reaction to small changes in flow rates and compositions is evident from the wide variation in the secondary reaction rate.

The regression results following gross error identification and rereconciliation are summarized in column 4 of Table IV. Only four significant variables remain, and the activation energy is again smaller than expected. Results are quite similar to those achieved following regression of reconciled data without error detection. The same four variables are significant, and their magnitudes are almost the same. Similarly, the overall correlation coefficient is approximately the same.

Figure 2 is a parity plot which compares the measured reaction rate using reconciled data to the calculated rate using the regression coefficients from column 4 of Table IV. While the deviation between measured and calculated rates was less than 10% in 23 of the 32 tests, six of the nine tests in which the deviation exceeded 10% were identified as containing gross errors. This suggests that the reconciliation analysis for these runs did not yield the correct reaction rate.

Figure 2. Parity plot comparing measured and calculated rates of the primary reaction. Measured rate based upon rereconciliation of data after gross error detection. Calculated rate using regression parameters from column 4 of Table IV. [Plot not reproduced. Horizontal axis: measured r(1), mole/hr, 0.000 to 0.350; symbols distinguish data sets containing no gross errors from data sets containing gross errors.]

In the second approach, the 13 experiments identified as containing gross errors were eliminated from the overall data set, and regression was performed using reconciled data from the remaining 19 experiments. Since elimination of a large fraction of a designed data set could cause confounding of the variables, the correlation matrix of the reduced data set was examined. All correlation coefficients between variable pairs were less than 0.8, which is often established as the maximum acceptable value (Barnes and Conley, 1986).

Regression results using reconciled data from the 19 data sets containing no gross errors are found in column 5 of Table IV. Six significant parameters were identified: A1, E1, b1, aB1, aC1, and aD1. The activation energy increased significantly and, for the first time, fell into the expected range. A moderate retarding effect due to product D was indicated. The correlation coefficient increased to 0.972, and the magnitude of the average deviation between measured and calculated reaction rate was 6.2%. In only one of the 19 data sets was the deviation greater than 10%. A parity plot for this analysis is shown in Figure 3.

Figure 3. Parity plot comparing measured and calculated rates of the primary reaction. Data sets containing gross errors eliminated. Measured rate based upon reconciled data. Calculated rate using regression parameters from column 5 of Table IV. [Plot not reproduced. Horizontal axis: measured r(1), mole/hr, 0.000 to 0.350.]

Secondary Reaction.
The rate of the secondary reaction was from 2.5 to 10 times smaller than the primary reaction rate. As a consequence, the scatter in the reaction rate and selectivity calculated by different methods was much larger (see Table I), and correlations for the rate equation were much less satisfactory. At least one negative reaction rate resulted when each of the seven alternates was used to calculate the reaction rate using raw experimental data. In fact, 28 of the 32 data sets produced negative rates when the calculation was based upon components A and C. Reaction rate data based upon the formation of F were selected as the basis for regression of the raw data. Only 30 data sets were included in the analysis since two of the calculated rates based upon F were negative.

Regression results using the raw data are summarized in column 2 of Table VI. Only three kinetics parameters (b2, aA2, and aF2) satisfied the significance criterion. The Arrhenius constants, A2 and E2, were retained in the correlation in order to satisfy kinetics principles even though they did not satisfy the significance test. The correlation coefficient was a poor 0.512, and contrary to expectations, product F produced a small positive effect on the reaction rate. The activation energy of 9670 kcal/mol was considerably lower than expected.

Regression of the reconciled data without gross error detection (column 3 of Table VI) produced significant improvement in the correlation coefficient to 0.766. Although six parameters satisfied the significance test, the model was deemed not acceptable because of positive reaction orders with respect to both products E and F. The activation energy increased significantly but was still below the expected value.

The correlation coefficient decreased to 0.317 when the full capabilities of the MIMT method were invoked. Regression results for this case are listed in column 4 of Table VI. The resultant rate equation was independent of all partial pressures, leaving only time and temperature as significant parameters. The activation energy increased to the level expected, but otherwise the result was unacceptable.

When the 13 data sets which were identified as containing gross errors were eliminated from the regression analysis, the results shown in column 5 of Table VI were obtained. In spite of the relatively high correlation coefficient, the rate equation was unacceptable.
Product E exhibited a strong positive effect on the rate; the deactivation constant b2 was no longer significant, and the activation energy was the smallest value yet attained. An alternate regression approach was followed to obtain the reaction rate parameters listed in column 5a of Table VI. The data set included the 19 tests which contained no gross errors, and the reaction order with respect to product E was arbitrarily set equal to zero. Otherwise, the nonsignificant variables were eliminated one by one as previously described until only significant variables

Table VI. Rate Equation Regression Results: Secondary Reaction (a)

Column key:
1: rate parameter
2: raw data (production of F)
3: reconciled raw data
4: gross errors identified and complete data set rereconciled
5: reconciled data with data sets containing gross errors eliminated
5a: reconciled data with data sets containing gross errors eliminated (aE2 = 0)

1                 2          3          4          5          5a
ln A2             5.129*     9.329      21.52      2.059*     20.81
E2                9670*      13 270     21 050     7840       21 820
b2                0.00144    0.00089    0.00173    NS         0.00117
aA2               0.744      NS         NS         NS         NS
aB2               NS         0.613      NS         0.560      0.664
aC2               NS         NS         NS         NS         NS
aD2               NS         NS         NS         NS         NS
aE2               NS         0.718      NS         1.22       0**
aF2               0.121      0.087      NS         NS         NS
correl coeff      0.512      0.766      0.317      0.861      0.451
deg of freedom    30-5=25    30-6=24    32-3=29    19-4=15    19-4=15

(a) NS: not significant at the 95% confidence level. *: not significant at the 95% confidence level but retained in the rate correlation. **: reaction order with respect to E set equal to 0.
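As a reading aid for Table VI, the parameter names (ln A2, E2, b2, and the orders ai2) suggest a power-law rate expression that becomes linear in its parameters after taking logarithms. The form below is an assumption inferred from the Nomenclature, not an equation quoted from this section; in particular, the exponential deactivation term and its sign are guesses.

```latex
r_2 = A_2 \exp\!\left(-\frac{E_2}{RT}\right) \exp(-b_2 t) \prod_{i} P_i^{a_{i2}}
\quad\Longrightarrow\quad
\ln r_2 = \ln A_2 - \frac{E_2}{R}\,\frac{1}{T} - b_2\, t + \sum_{i} a_{i2} \ln P_i
```

Under this assumed form, each column of Table VI would list the coefficients of one linear regression of ln r2 on 1/T, t, and the ln Pi terms, with an NS entry marking a term that was eliminated from the fit.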

data from all sets, (c) reconciled data with data sets containing gross errors removed, and (d) all data sets after gross error detection and rereconciliation. The fact that the “best” rate equations were obtained in case c again suggests that the method was successful in identifying gross errors but less successful in reconciling the data from the reduced data sets.
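The distinction drawn here between detecting a gross error and reconciling the data can be illustrated with a deliberately small sketch. It is not the MIMT procedure used in the paper: it assumes a single linear balance, independent measurement errors with known standard deviations, and the simpler global (chi-square) test. The function name, flows, and standard deviations are all hypothetical.

```python
# Toy example: reconcile three flow measurements subject to one linear balance.
# All names and numbers are hypothetical, not taken from the pilot plant data.

def reconcile_single_balance(x, sigma, a):
    """Weighted least-squares reconciliation subject to sum(a[i] * x[i]) = 0.

    x     : raw measurements
    sigma : standard deviations of the (assumed independent) random errors
    a     : coefficients of the single linear balance
    Returns the reconciled values and the global test statistic.
    """
    # Imbalance of the raw measurements
    r = sum(ai * xi for ai, xi in zip(a, x))
    # Variance of the imbalance when only random errors are present
    s = sum((ai * si) ** 2 for ai, si in zip(a, sigma))
    # Adjust each measurement in proportion to its variance
    x_hat = [xi - si ** 2 * ai * r / s for xi, si, ai in zip(x, sigma, a)]
    # Global test statistic: chi-square with 1 degree of freedom if no gross error
    gamma = r * r / s
    return x_hat, gamma

CHI2_95_1DOF = 3.841  # 95% point of chi-square with 1 degree of freedom

a = [1.0, -1.0, -1.0]      # balance: feed - product1 - product2 = 0
sigma = [1.0, 0.5, 0.5]    # assumed measurement standard deviations

x_random = [100.8, 60.3, 39.6]   # random errors only
x_gross = [108.0, 60.3, 39.6]    # feed reading carries a gross error

x_hat, gamma_random = reconcile_single_balance(x_random, sigma, a)
_, gamma_gross = reconcile_single_balance(x_gross, sigma, a)

print(x_hat)                         # reconciled flows, which close the balance
print(gamma_random < CHI2_95_1DOF)   # random errors only: not flagged
print(gamma_gross > CHI2_95_1DOF)    # gross error: flagged
```

With only one balance, the test can flag that a gross error is present but cannot say which measurement carries it. Locating the faulty measurement requires redundant balances, which is the situation measurement-test methods such as the MIMT are designed for.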

Nomenclature
Aj = frequency factor for reaction j
aij = reaction order for component i in reaction j
bj = catalyst deactivation constant in reaction j
Ej = activation energy for reaction j
Pi = partial pressure of component i in reactor product

Figure 4. Parity plot comparing measured and calculated rates of the secondary reaction. Data sets containing gross errors eliminated. Measured rate based upon reconciled data. Calculated rate using regression parameters from column 5a of Table VI. [Plot not reproduced; horizontal axis: measured r(2), mole/hr.]

remained. The resultant model was more acceptable in that the activation energy was in the expected range and catalyst deactivation once again became significant. The positive reaction order with respect to reactant B was acceptable. However, the correlation coefficient dropped to 0.451. The parity plot in Figure 4 shows that in four of the 19 cases the deviation exceeded ±30%.
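The parity-plot statistics quoted in this section (average deviation, number of runs outside a band) can be computed along the following lines. This is a minimal sketch: it assumes the deviation is expressed as a percentage of the measured rate, which the text does not state explicitly, and the rates are invented placeholders rather than values read from Figure 4.

```python
def parity_summary(measured, calculated, limit_pct):
    """Summarize agreement between measured and calculated rates.

    Deviation is taken relative to the measured rate (an assumption).
    Returns the per-run deviations in percent, the mean absolute
    deviation, and the number of runs outside +/- limit_pct.
    """
    dev = [100.0 * (c - m) / m for m, c in zip(measured, calculated)]
    mean_abs = sum(abs(d) for d in dev) / len(dev)
    n_outside = sum(1 for d in dev if abs(d) > limit_pct)
    return dev, mean_abs, n_outside

# Hypothetical rates in mole/hr, for illustration only
measured = [0.010, 0.020, 0.040, 0.050]
calculated = [0.011, 0.019, 0.054, 0.050]

dev, mean_abs, n_outside = parity_summary(measured, calculated, limit_pct=30.0)
print(mean_abs, n_outside)
```

The same function with `limit_pct=10.0` would correspond to the 10% band used for the primary reaction.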

Conclusions

The MIMT method of gross error detection and data reconciliation performed well when tested using computer-generated data. Single gross errors involving pure component and mixture flow rates whose magnitudes were at least 10 times the magnitude of the random measurement error were successfully identified in more than 90% of several hundred test cases. The method was less successful in identifying single prenormalization composition errors and multiple gross errors in single data sets.

When applied to experimental kinetics data from a pilot plant reactor, the MIMT method identified potential gross errors in 13 of 32 experiments. Seven of these errors were associated with a single pure component feed rate. The fact that known experimental difficulties existed with the mass flow controller for this feed stream provided empirical evidence of the validity of the gross error identifications. Reaction rate equations were obtained by linear regression using (a) raw data from all sets, (b) reconciled

R = gas constant
rj = rate of reaction j
T = temperature
t = time

Literature Cited

Barnes, D. W.; Conley, J. M. Statistical Evidence in Litigation; Little, Brown and Co.: Boston, 1986.
Cropley, J. B.; Burgess, L. M. Feed Forward Experimental Design for Catalytic Reactor Studies. Presented at the Annual Meeting, American Institute of Chemical Engineers, 1986; Paper 167a.
Mah, R. S. Process Data Reconciliation and Rectification. In Chemical Process Structures and Information Flows; Butterworths Publishers: Boston, 1990; pp 385-466.
Mah, R. S.; Tamhane, A. C. Detection of Gross Errors in Process Data. AIChE J. 1982, 28, 828-830.
Phillips, A. G. Application of Data Reconciliation and Gross Error Detection to a Reaction Rate Modeling Problem. M.S. Thesis, Louisiana State University, 1991.
Serth, R. W.; Heenan, W. A. Gross Error Detection and Data Reconciliation in Steam Metering Systems. AIChE J. 1986, 32, 733-742.
Serth, R. W.; Valero, C. M.; Heenan, W. A. Detection of Gross Errors in Nonlinearly Constrained Data: A Case Study. Chem. Eng. Commun. 1987, 51, 89-104.
Serth, R. W.; Tsang, J. Y. T.; Heenan, W. A. Gross Error Detection and Data Reconciliation in Steady-State Distillation. Presented at the AIChE National Meeting, Denver, 1988.

Received for review January 21, 1993
Revised manuscript received April 20, 1993
Accepted July 27, 1993

Abstract published in Advance ACS Abstracts, October 1, 1993.