Ind. Eng. Chem. Res. 1987,26, 1239-1248
1239
Lumping Strategy. 1. Introductory Techniques and Applications of Cluster Analysis Pamela G. Coxson Department of Mathematics, The Ohio State University, Columbus, Ohio 43210
Kenneth B. BischofPr Department of Chemical Engineering, University of Delaware, Newark, Delaware 19716
A systems-mathematics approach t o a chemical-kinetic lumping analysis is illustrated with simple examples, showing how the classic results of Wei and Kuo can be obtained more directly. In addition, the techniques of cluster analysis are shown to provide a basis for analyzing large quantities of empirical data to provide a rational way to lump them into fewer groups, one of the dilemmas stated by Weekman (1979). Specifically, the Mobil 10-lump catalytic cracking model is further lumped by this technique into groupings that are known to be rational from prior chemical knowledge. It is hoped that the technique will be useful for new systems where extensive’chemical knowledge is not available. 1. Introduction Weekman (1979) has provided a comprehensive overview of lumping from both practical as well as theoretical viewpoints. He has pointed out some of the remaining problem areas. The purpose of this paper is to consider some of these issues. Lumping analysis, as introduced by Wei and Kuo (1969) and Kuo and Wei (19691, seeks to represent large monomolecular reaction systems by low-order, linear differential equations involving lumped pseudospecies. The pseudospecies are formed by associating chemical species having “similar” behavior in a single lump. Wei and Kuo then gave a detailed analysis of lumped-system reaction dynamics. Their “principle of invariant response” indicated that species should be lumped together only if the dynamic behavior of the resulting pseudospecies is independent of the composition of species. In other words, five units of a pseudospecies formed from species A and species B should behave the same in the lumped system if it consisted of five units of species A and none of species B, or two units of species A and three of species B, etc. In this paper, we describe a procedure for selecting the number of lumps and the distribution of species between lumps in a large system. Invariant response is used as a criterion for evaluating particular lumping schemes, while lumping strategies are guided by a statistical analysis of the system’s response data, relative to an initial coarse lumping scheme. Once an appropriate lumping scheme is established, the new lumped-system rate matrix is constructed from the response data. Full theoretical justification and motivation for this approach are given in Coxson and Bischoff ’(1987). In addition to simple examples to illustrate the concepts, the Mobil “10-lump cracking model” of Jacob et al. (1976) will be used extensively as a source of data that are chemically realistic. This was done in lieu of having raw, totally unlumped data available. In other words, we will ask what happens when the 10-lump “pseudodata” are further lumped. 2. Linear Reaction Systems and Proper Lumping A linear reaction system is described by the linear differential equation
-dx(t) - - K x ( t ) x ( t ) is contained in R” (1) dt where x ( t ) is the n-dimensional species composition vector 0888-5885/87/2626-1239$01.50/0
at time t and K is the reaction rate coefficient matrix. Given an initial composition, x (0) = x o ,the composition at time t > 0 is x ( t ) = e K t x g . It is more convenient in the analysis to use a discrete form of eq 1;in fact, this is also the form in which real sampled data are supplied. If we consider the discrete sequence of time, t h = h7, where T is a fixed time interval, then x ( t h f l )= eK(h+l)rx 0 -eKieKhrX 0 or X ( t h + l ) = eKrx(th) (2) In this case, we define x ( h ) = x ( t h ) and G = eKrso that eq 2 translates to the difference equation x(h ,+ 1) = Gx(h) (3) The solution for the initial condition x (0) = x o is x (h)= Ghxg. This discrete representation will be useful later. Example 1. The following three-species reaction was considered by Ozawa (1973) and earlier by Wei and Kuo (1969):
k2kA2 3
A1
A3
I:[
x(f)=
-= dX2 =
( t ) =composition of
x2
“,:” i
+
4x3(t)
-12x2(t) +
6x3(t)
-
10X3(f)
-13x,(f)
3x,(t)
dt
+ 2X2(f)
1 0 X l ( f ) -I-l O X p ( t )
-
A1-A3
[-:: l]
i ( t )= K i ( t )
K =
-122
10 -10
The discrete representation with 0.55700
[
G = eKT= 0.12694 0.31606
T
= 0.05 gives
1
0.08463 0.12643 0.59931 0.18964 0.31606 0.68394
A lumped vector is, in the most general sense, a vector with components which are linear combinations, z ( t ) = Mx(t) of the original species vector x(t). The lumping 0 1987 American Chemical Society
1240 Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987
of similar species described in the Introduction is a special case, for which Wei and Kuo introduced the term “proper lumping”. The composition, z ( t ) = Mx(t), is a proper lumping of x if each component of z is a sum of components of x. Thus, proper lumping is characterized by a lumping matrix, M, of full rank, in which each column is a canonical unit vector, e.g.,
[:I
Rh = [(j(7)])~~:::;;~
having a single non-zero entry of 1. Example 2. A proper lumping scheme is given by z=
[;1 3%
original species
lumped species
-
12
Contrast this with the lumping scheme,
which is not proper. Here the second species, x2, is divided between the two lumped species.
y X1
x1
where c(7)is the quantity of pseudospecies i at time h7 if the initial composition is x(0) = (0,...,O,l,O ,...,0 ) T
The vector, x,consisting of three species, is transformed into a lumped vector with two pseudospecies, as illustrated by the two symbolic reactions
x3
problem, the first model was based on three lumped pseudospecies. This was the most coarse or simplest lumping scheme possible in view of the objective to describe the production of gasoline. Later models were refinements of the three-lump scheme. The approach described below follows this pattern. Given a lumping scheme with m lumped pseudospecies formed from n original species, we define the hth stepresponse matrix,
J i.e., Rh is the hth step response. In example 3 below, we calculate the response values, given the rate matrix, G. These values can also be determined experimentally in the absence of knowledge of G. The j t h column of Rh is composed of the measured amounts of each of the m pseudospecies at the hth sampling instant when the system is started with one unit of pure species j. In particular, note that r: is 1 if species j is assigned to lump i and 0 otherwise, so that & = M, the lumping matrix associated with the given scheme. We can suppress 7 with the understanding that it is fixed and given. The response matrix, R, is formed by catenating the hth step responses. Definition. The response matrix, R, is the matrix [hT, RIT,..., Rn-lTT,where Rh is the hth step-response matrix. All rows of the response matrices, R,, R,+l, etc., are linearly dependent on the rows of R and, therefore, never need to be considered. (See Coxson and Bischoff (1987)). Example 3. Calculation of R for the Three-Species System of Example 1. For each of the three species, one response column representing one “experiment” is calculated by using the formula, z(h) = Mx(h) = MGhx(0). Experiment 1: x(0) = (1 0 O)T. Response
Zl
/&
z(0)=
x2
x3
z2
PONA
-
gasoline C-lump
20)
z ( 0 )=
[3
=
0.56767
[ ] 0.31606
z(2) = [o.4323J
0.68394
-
0.56767
[ ] [ ]
z(1) = 0.31606
z(2) = [o,4323J
Experiment 3: x ( 0 ) = (0 0 l)T. Response z ( 0 )=
[;]
0.31606
z(l) =
0.43233 z(2) =
0.68394
original model
-Paraffins 1 -
0.68394
Experiment 2: x ( 0 ) = (0 1 O)T. Response
Proper lumping is exemplified by the PONA scheme (Weekman, 1979) for a model of the catalytic cracking of gas oil in which lumped species are formed from paraffins, olefins, naphthenes, and aromatics, plus a gasoline lump and a lump of coke and light fuels (the C-lump). An earlier model with just three lumps (gasoil feed, gasoline, C-lump) was found to be inadequate in that the parameters had to be adjusted for different feedstocks (Weekman, 1979). original species
[;I
experiments 1
2
3
0.31606 0.68394
0.31606 0.68394
0.68394 0.3160];;;
0.56767 0.43233
0.56767 0.43233
0.43233 0.56768
Zl
1 2
23
Here we restrict our attention to proper lumping schemes. 3. Invariant Response and the Response Matrix To check invariant response, it is necessary to have a lumping scheme in hand. For the catalytic cracking
R=l
Note that Ro = M, R1 = MG, and R2 = MG2. In general,
Rh = MGh. The response matrix, R,has an important and very simple relationship to the invariant response principle. A lumping scheme satisfies the invariant response require-
Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987 1241 Chart I
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00-122.07 0.00 0.00 -166.20 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -20.49 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -33.29 0.00 0.00 0.00 0.00 22.50 0.00 0.00 0.00 -74.33 0.00 0.00 0.00 0.00 19.00 0.00 0.00 0.00 -22.13 0.00 0.00 0.00 0.00 50.00 5.86 0.00 0.00 0.00 -1.00 84.70 63.00 0.00 23.85 66.15 18.50 0.00 -4.40 7.85 14.87 34.20 14.63 9.44 8.18 3.63 1.00 4.40
K=
ment (and therefore its pseudospecies will follow a linear reaction scheme corresponding to the original true monomolecular kinetics), if and only if the columns of R associated with species belonging to the same lump are identical. If species A and species B are lumped together, then the lumped response to one unit of species A must be the same as the response to one unit of B. (See examples 4 and 5.) In Coxson and Bischoff (1987),we show that it is, in fact, sufficient to check the columns of R1 to establish invariant response. Example 4.
=
[: : 3
1. P,
2.
ri
R=
0.00 0.00 0.00 0.00 0.00
3. SH 4. AH
----
-wI. % naphlhenk molewler,&5O'F+
wt.%aromatic side chains, 650'F+ wt.%caton aloms amow ammatk rings, 650'F+
wt. % paraflinic rmleatles, 430'-650'F wt % naphthenic molecules,430'-650'F wt. % aromatic side chains,430'-650'F
8. AL = wt. % c a m n atoms amoq aromatic rings, 430'-650'F
-1; 10 10 -10 r = 0.05
Figure 1. Mobil 10-lumpkinetic scheme for catalytic cracking.
0.12694 0.59931 0.18964 0.31606 0.31606 0.68394
Invariant response is not satisfied but might be considered sufficiently close. ( c ) A larger perturbation of the system of example 1is
-0.55700 0.08463 0.12642 G=
0.00
wt. X pdraflinic molecules, 650'F+
6. NL
i]
0.00
. i
%
7. SL
K =
0.00
The ten lump species:
5 . PL
(a) The system of example 1 is
0.00-
0.00
-1.ooO00 1.00000 0.00000' o.ooooo 0.o0o00 1.ooooo 0.68394 ,0.68394 0.31606 0.31606 0.31606 0.68394 0.56767 0.56767 0.43233 0.43233 0.43233 0.56767. y y Y
9. G
10. C
-
= Qawlinelump (C5
. 430'F)
C-lump (C, lo C 4 + mke)
K =
[::
2 -22 20 -16-,
r = 0.05
Invariant response is satisfied. (b) A small perturbation of the system of example 1 is
K =
2 -13 :.6~ 10 11 -10.6
r = 0.05 -0.55694 0.08564 0.12584
G=
0.12846 0.57967 0.20082 0.31460 0.33470 0.67334 1.oooO0 1.00000 0.00000' 1.00000 0.68540 0.66530 0.32666 0.31460 0.33470 0.67334 0.56996 0.55368 0.43989 0.43004 0.44632 0.56019. o.oooO0 0.0oooo
chrr
"nearly" equal
not equal
Invariant response is not satisfied. Exact matrix calculus can only differentiate between the exactly lumpable case, i.e., equal response columns, and the nonlumpable cases (e.g., b and c) where the response columns are not identical. Approximation methods will be introduced next to quantify the notion of "nearly equal" responses. Example 5. The Mobil 10-lump scheme is shown in Figure 1. The differential equations for the various reactions were written and solved with the rate constants (in units of lo3 X h-l) from Gross et al. (1976), given in Chart I.
1242 Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987 Chart I1
-
G =
0.4337 0.0000 0.0000 0.0000 0.1166 0.0000 0.0000 0.0000 0.3803 0.0694
Chart I11
0.0000 0.2950 0.0000 0.0000 0.0000 0.0851 0.0000 0.0000 0.5157 0.1042
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1898 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0807 0.2422 0.3085 0.1788
0.8147 0.0000 0.0000 0.0000 0.0527 0.0000 0.1326
0.0000 0.7168 0.0000 0.0000 0.0000 0.1982 0.0849
0.0000 0.0000 0.4755 0.0000 0.0000 0.4554 0.0691
0.0000 0.0000 0.0000 0.8015 0.0000 0.1622 0.0363
0.0000 0.0000 0.0000 0.0000 0.9900 0.0000 0.0100
0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.9570 0.0430
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000
-
-
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000
0.5503 0.3801 0.5127 0.8674 0.7168 0.4755 0.8015 0.9900 0.0000 0.0000 0.3803 0.5157 0.3085 0.0000 0.1982 0.4554 0.1622 0.0000 0.9570 0.0000 0.0694 0.1042 0.1788 0.1326 0.0849 0.0691 0.0363 0.0100 0.0430 1.0000 0.3222 0.1526 0.4017 0.7589 0.5139 0.2261 0.6424 0.9802 0.0000 0.0000 0.5520 0.6844 0.3669 0,0000 0.3318 0.6523 0.2853 0.0000 0.9158 0.0000 0.1258 0.1630 0.2313 0.2411 0.1544 0.2315 0.0724 0.0198 0.0842 1.0000 0.1997 0.6264 0.1740 0.1296 0.6538 0.2166
R =
0.0643 0.7297 0.2061 0.0281 0.7291 0.2428
0.3655 0.3752 0.2593 0.3459 0.3720 0.2820
0.6700 0.3684 0.1075 0.5148 0.9704 0.0000 0.0000 0.0000 0.4194 0.7272 0.3772 0.0000 0.8763 0.0000 0.3300 0.2123 0.1652 0.1080 0.0296 0.1237 1.0000 0.5970 0.2641 0.0511 0.4126 0.9608 0.0000 0.0000 0.0000 0.4743 0.7449 0.4445 0.0000 0.8386 0.0000 0.4030 0.2616 0.2039 0.1429 0.0392 0.1614 1.0000
0.0870 0.0126 0.3317 0.5370 0.1893 0.0243 0.3307 0.9512 0.0000 0.0000 0.6578 0.7110 0.3652 0.0000 0.5063 0.7361 0.4923 0.0000 0.8025 0.0000 0.2552 0.2764 0.3031 0.4630 0.3045 0.2395 0.1770 0.0488 0.1975 1.0000 0.0598 0.6495 0.2907 0.0418 0.6346 0.3236
0.0058 0.6862 0.3080 0.0027 0.6594 0.3379
0.3200 0.3566 0.3233 0.3102 0.3470 0.3428
0.4877 0.0000 0.5123 0.4470 0.0000 0.5530
0.1357 0.5220 0.3423 0.0973 0.5264 0.3763
0.0116 0.7155 0.2729 0.0055 0.6900 0.3045
0.2651 0.5248 0.2102 0.2124 0.5452 0.2424
0.9418 0.0000 0.0582 0.9324 0.0000 0.0676
0.0000 0.0000 0.7680 0.0000 0.2320 1.0000 0.0000 0.0000 0.7349 0.0000 0.2651 1.0000
0.0295 0.0013 0.3018 0.4133 0.0697 0.0026 0.1703 0.9231 0.0000 0.0000 0.6161 0.6322 0.3366 0.0000 0.5230 0.6628 0.5562 0.0000 0.7033 0.0000 0.3544 0.3665 0.3616 0.5867 0.4072 0.3346 0.2736 0.0769 0.2967 1.0000 0.0209 0.0006 0.2946 0.3854 0.0500 0.0012 0.1365 0.9139 0.0000 0.0000 0.5957 0.6056 0.3257 0.0000 0.5143 0.6355 0.5598 0.0000 0.6730 0.0000 0 3834 0.3938 0.3797 0.6146 0.435'7 0.3633 0.3037 0.0861 0.3270 1.0000-
-
The discrete form is computed from G = eKr where T was chosen such that the significant dynamics occurred within 107. The factor of 10 was based on the requirement that the response matrix must contain at least 10 rows or time increments. The result is given in Chart I1 for T = 10-5. Next the differential equations would be solved to generate the pseudodata for the time response of the various species. Of course, in a real experimental situation, the data would be the measured values of the compositions from 10 experiments, each beginning with a given pure species. (Other types of experiments will be discussed later.) From these data, the response matrix is computed for the coarsest lumping scheme that could be of interest-here it is the classical three-lump triangular scheme (Nace et al., 1971) A - G
L
where G and C are the gasoline and coke lumps and A represents a lump of all other species. Then, the response matrix is computed for the lumping matrix M=
1
1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
[
and is given in Chart 111. Each block of three rows is for a given time increment, and the three rows are the coarse lumps. We will use example 5 in the remainder of this paper to illustrate handling of a system which is not exactly lumpable. This example is essentially the 10-lump Mobil model for catalytic cracking (Gross et al., 1976) lumped into three lumps (i.e., the three original (pseudospecies)). Clearly, the three-lump model will behave very poorly, with responses in the first lump ranging from 0.3801 to 0.9900 in R, and from 0.0006 to 0.9139 in &. In the next section, we discuss how to use the information contained in R.
Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987 1243 Table I. cluster pattern
1 1 1 1 1 1 1
2 1 1 1 1 1 1
3 3 3 3 3 1 1
4 4 4 4 4 4 1
5 5 5 3 3 1 1
2 1 1 1 1 1 1
7 7 5 3 3 1 1
8 8 8 8 4 4 1
no. of clusters 7 6 5 4 3 2 1
error increase 0.0190 0.0777 0.1328 0.3081 1.4391 1.4739 7.8138
total error 0.0190 0.0967 0.2295 0.5376 1.9767 3.4506 11.2644
4. Selection of Lumps
If the appropriate columns of the response matrix are identical, then the initial lumping scheme is valid. When the columns are not identical, the variation among the columns can suggest a refinement of the initial scheme. While it might be possible to make some judgement in this matter by inspection of the data, a more reliable approach can be based on cluster analysis, borrowed from statistics. The data sets to be analyzed are the columns of the response matrix corresponding to the species in one lump. Cluster analysis examines sets of data and determines whether they “cluster” into similarity classes; see, for example, Anderberg (1973). Ward’s method of clustering, illustrated below, is an hierarchical approach. I t begins by putting each data set in a cluster by itself. At each step in the procedure, two of the previous clusters are fused, reducing the number of clusters by one. The criterion for which data sets to fuse is minimization of the increment of the total within group error sum of squares caused by the fusion. In other words, the most similar data sets (here, columns of the response matrix), which have the least mean square difference for the time steps considered, are fused. I t can be shown (see: Wishart, 1978, p 114; Anderberg, 1973, p 142) that this increment, AE,, due to the fusion of cluster p and cluster q, can be computed from the equivalent formula
where n, is the number of data sets in cluster r, and dp, is the Euclidean (squared) distance between the means of cluster p and cluster q n
1=1
For any particular number of clusters, Ward’s method can lead to a local, rather than a global, minimum. The routine RELOCATE tests for this and readjusts the clusters appropriately. A more exhaustive search for optimal clusters can be conducted by using the algorithm EUCLID, in which multiple trials are initiated with random selections of starting clusters. No changes in the clustering in the present application were observed, but this does not absolutely guarantee a global minimum, of course. The three species systems of example 4 are really too small to illustrate the use of clustering analysis fully. There, the analysis can only be applied to the two-species lumps and the clustering possibilities are limited to two choices: keep the two species separate (two clusters) or together (one cluster). Ward’s algorithm calculates the error sum of squares, MI2,incurred by fusing the two clusters into one, as 0 for 4a, which was exactly lumpable, 0.00067 for 4b, and 0.02920 for 4c. Example 6. In Table I, Ward‘s method, implemented in the numerical package CLUSTAN (Wishart, 1978), is applied to the eight-species lump of the three-lump model
in example 5. Recall that this was the coarsest possible lumping scheme in the catalytic cracking process. Clearly it would not be reasonable to mix species between these three lumps-e.g., gasoline species with heavier molecules-so we consider the first lump alone. The lists on the left in Table I indicate how the clusters evolve at each step. The error is the additional sum of squares error incurred at that step. In the first step, columns 2 and 6 are fused, forming seven clusters with a 0.0190 error. All other possible pairings lead to a greater error term. An additional error of 0.0777 accumulates when column 1 is fused to the cluster with columns 2 and 6 in the second step. Columns 5 and 7 are fused with an error of 0.1328, creating a cluster pattern with five lumps, and then the number of lumps is reduced to four by fusing column 3 to this last lump. A large relative increase in error occurs in going from four to three clusters, indicating that the lump of eight species might reasonably be refined into four smaller lumps: (xl, x2, x g ] , ( ~ ~ ( 1x 3, , x5, x7), {x8). In this way, a cluster analysis suggests an appropriate number of lumps and the assignment of species to those lumps. The lumping strategy determined in example 6 can be interpreted in terms of the catalytic cracking model it represents. Working back from the single lump of eight species, we see that a substantial increment (7.8138) in the sum of squares is avoided if the aromatics (x4 and are separated from the nonaromatics. A third lump is created by splitting the nonaromatics into naphthenes plus heavy paraffins {xl, x2, x g ] and side chains plus light paraffins (x3, x 5 , x 7 ) , reducing the sum of squares by 1.4739. Further improvement results when the heavy aromatics (x4)are differentiated from the light aromatics, creating four lumps. The next species to be separated are, successively, the heavy side chains (3) from the light side chains (7) plus light paraffins (5); then split the light side chains from the light paraffins and then the heavy paraffins from the naphthenes. These successive groupings make sense chemically, but recall that no chemical heuristics were involved in the computations-the information was somehow contained in the data. Moving up the hierarchy to 5,6,7, or 8 lumps will lead to successive improvement, but that must be weighed against the advantage of having a small number of lumps. This is the tradeoff inherent in lumping systems which are not exactly lumpable. A rational decision could be based on comparison of the clustering “error” with the actual experimental errors. RELOCATE was applied to the clustering results given above, but no improvement was detected. EUCLID, too, produced the same clusters. It is easy to construct examples for which Ward’s method does not find the optimal clusters at some stage. In those cases, RELOCATE is generally adequate to detect the misappropriation. EUCLID is more thorough but correspondingly more costly in computer time, and it did not yield an improvement over Ward’s method in conjunction with RELOCATE in any of the examples we examined. 5. Construction of the Lumped-System Rate Matrix If the initial lumping scheme is exact, as in example 4a, then the discrete time rate matrix, G,,for the lumped system is constructed from the distinct columns of R1. Example 7. From example 4a, M=[‘
‘1
0 0 1
1
0.68394 0.68394 0.31606
R , = [ 0.31606 0.31606 0.68394
1244 Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987
The two-lump rate matrix is 0.68394
(
Gasoline
t
0.31606
1
0.31606 0.68394
That G, can be constructed as above is deduced from the fact that, if
z(0) = (0,...,0,1!0 ,...,oY I
then The six lump pseudo-species:
z(1) = G,z(O) = the ith column of G,
1. P
Since z ( 1 )is the one-step response of the ith lump, it is just the appropriate column of R1, as claimed. When the columns belonging to one lump are close but not identical, so that the lumping scheme given by M is not quite exact, we proceed by forming a smoothed matrix, R, in which the columns desired to be equal in R are each replaced by the average of all columns in that lump. A discrete approximate rate matrix, G,, is then constructed as before from R,. Example 8. If
1'
[
0.68540
M = [ l0 0 1.
R1=
1
0.66530 0.32666
0.31460 0.33470 0.67334
as in example 4h an approximate rate matrix is constructed from R,
1
0.67535 0.67535
0.32666
Rl =
0.32465 0.32465 0.67334
8, =
ro.67535
1
0.32666
1
r-10.5032
10.5682
L10.5032
-10.56821
K, =
-
L0.32465 0.67334J
Note: (1)If the columns are identical, then R, = R, and G, is just G, as computed in example 7. (2) In the case where the variations in the columns arise_ solely from Gaussian zero-mean measurement errors, G, is just the minimum-variance, least-squares estimate of the exact rate matrix (see: Coxson and Bischoff, 1987). I t should be noted that there may be better ways to fit the kinetic parameters-e.g., lump the experimental data by the above and then fit by regression. With these first two cases resolved, we address the problem of finding a rate matrix when the initial lumping scheme is found to be inadequate. The cluster results of the previous section provide a new lumping scheme. By regrouping the response data into the new lumped values, a new response matrix is available, and the rate matrix can be extracted as before. This approach is illustrated in example 9, where the data from example 5 are reformed into six lumps (determined by the clustering in example 6) instead of three. The new first-step response is averaged, and an approximate rate matrix, new G,, results with z now denoting the six pseudospecies composition. Example 9.
r
new M = R, =
I L
0 0 0 0
0 0 0 0
1 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 1 0 0
0 0 1 0
+ NH+N
s n + P + sL.
3. A H 4. A ,
5. G 6. C-lump
Figure 2. Six-lump scheme from cluster analysis that will approximate the complete 10-lump scheme.
lumping matrix (here calculated as (new) M.G, G as in example 5) are given in Chart IV. The new rate matrix implies the lumped reaction structure shown in Figure 2 and also provides values for the rate coefficients. Figure 3 compares the predicted gasoline yield over time (T = for the original 10-species model (labeled 0) and the 6-lump model (labeled L) of this example which approximates it. Since the 10-species model is not exactly lumpable, the lumped-model prediction depends on the composition of the species within each lump. The charge (PH, NH, SH, AH, PL, NL, SL, AL, G, C) consists of heavy and light paraffii, naphthenes, side chains, and aromatics, plus gasoline and the C-lump. Figure 4 provides the same comparison for the predicted C-lump yield. Four basic charge compositions are considered: (a) neutral = (0.16, 0.16,0.16,0.16, 0.16, 0.16,0,0,0,0); (b) paraffinic = (0.3, 0.1, 0.15,0.15,0.2,0.05,0.03,0.02,0,0); (c) aromatic = (0.1, 0.1,0.2, 0.4, 0.05, 0.05, 0.05, 0.05, 0, 0); (d) naphthenic = (0.15, 0.4,0.1,0.08,0.07,0.2,0,0,0,0). The approximation in each case seems adequate. 6. Using Experimental Data from Mixtures
So far, we have relied on the availability of a response matrix, R, which was constructed from experiments beginning with pure species. While it is preferable to measure R directly from such experiments, it is possible to calculate R from experimental data beginning with mixtures of species. A mixture will be represented by its composition vector, b. An experiment beginning with the mixture b , i.e., x ( 0 ) = b , in the system x ( h + 1) = Gx(h), z(h) = Mx(h), produces the lumped response, z (0) = Mb, z (1) = MGb , z ( 2 ) = MG2b, ..., z ( n - 1) = MG"-lb. These responses are assembled into a single long column:
1
1 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0
0 0 0 0
2.
0 0 0 1
I
1
The new first-step response- (RJ, new averaged response (RJ, and new rate matrix (G,) recomputed for the new
Additional experiments starting with different mixtures form additional columns. A set of n mixtures, b l , b2,...,
Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987 1245 Chart IV
R1=
R, =
[ [
0.0000 0.0000 0.0000 0.4755 0.0000 0.2704 O.OOO0 0.7168 0.0000 0.8015 0.0000 0.8147 0.0000 0.0000 0.0000 0.2422 0.0527 0.0000 O.oo00 0.0000 0.3803 0.5157 0.3085 0.0000 0.1982 0.4554 0.1622 0.0694 0.1042 0.1788 0.1326 0.0849 0.0691 0.0363
0.4337 0.1166 0.0000 0.0000
0.3801 0,0000 0.0000 0.0000
0.4298 0.4298 0.0000 0.0000 0.0389 0.0389 0.5963 0.0000 0.0000 0.0000 0.0000 0.8147 0.0000 0.0000 0.0807 0.0527 0.4505 0.4505 0.2230 0.0000 Q.0809 0.0809 0.1000 0.1326
G, =
i
0.4298 0.0389 O.OOO0 0.0000 0.4505 lo809
0.0000 0.0000 0.0000 0.9900 0.0000 0.0100
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
O.oo00
0.8147 0.0807 0.0527 0.2230 0.0000 0.1OOO 0.1326
GASOLINE LUMP PREDICTED VALUES ( 0 1 6 . 0 16.0 16,O 16.0 16 0 16.0.0 0 0 ) NEUTRAL CHARGE 0 = ORIGINAL MODEL (10 species) L = L U M P E D MODEL ( 6 species)
0.0000 0.0000 0.0000 0.9900 0.0000 0.0100
0.0000 0.0000 0.0000 0.0000 0.0000 1,0000
0.0000 0.0000 0.0000 0.0000 0.9900 0.0000 0.0000 0.9570 0,0100 0.0430
0.0000 0.4298 0.0000 0.5963 0.0389 0.5963 0.0000 0.0000 0.0000 0.0807 0.0000 0.0807 0.2230 0.4505 0.2230 0.1000 0.0809 0.1000
0.0000 0.0000 0.5963 0.0000
1
0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.9570 0.0430 1.000
1
0.9570 0.0000 0.0430 1.0000
GASOLINE LUMP PREDICTED VALUES PARAFFINIC CHARGE ( D 3.0 1 .O 15.0 15,O 2.0 0 5 . 0 03 0 02.0 0 ) 0 = ORIGINAL MODEL (10 species) L = LUMPED M O D E L (6 species)
(0
:?
:
T-SECONDS
0 0
0
1
1
1
2
,
3
,
4
,
5
,
6
,
7
,
8
,
9
I
10
T-SECONDS GASOLINE LUMP PREDICTED VALUES AROMATIC CHARGE ( 0 . 1 , O 1 , 0 . 2 , 0 . 4 . 0 . 0 5 , 00 5 . 0 05.0.05.0.0) 0 = ORIGINAL MODEL (10 species) L = L U M P E D MODEL ( 6 species)
GASOLINE LUMP PREDICTED VALUES NAPHTHENIC CHARGE ( 0 1 5 . 0 4 0 1 . 0 0 8 , O 07.0 2 . 0 . 0 . 0 . 0 ) 0 = ORIGINAL MODEL ( 1 0 species) L = L U M P E D MODEL ( 6 specles)
0 0 -
t
K O -
+
3
Figure 3. Comparison of the 6-lump scheme simulations with the full 10-lump model for gasoline yields for (a, top left) neutral feed Cas indicated), (b, top right) paraffinic feed, (c, bottom left) aromatic feed, and (d, bottom right) naphthenic feed.
1246 Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987 C-LUMP PREDICTED VALUES (0 1 6 , O 16.0 16.0 15.0 1 6 . 0 1 6 , 0 , 0 0 , 0 ) NEUTRAL CHARGE o = ORIGINAL MODEL ( 1 0 species) = LUMPED MOOEL (6 specles)
C-LUMP PREDICTED VALUES PARAFFINIC CHARGE ( 0 3 . 0 1 .0 1 5 . 0 1 5 3 2 0 3 5 C 01 0 3 2 2 5 , 0 = ORIGINAL MODEL ( 1 0 s p e c i e s ) i = LUMPED MODEL ( 6 s p e c s e s )
2
0 0 -
1
7-SECOYDS C-LUMP PREDICTED VALUES AROMATIC C H A R G E (0.1 ,0.1,0 2 . 0 4.0.05,O 05.0 0 5 . 0 0 5 . 0 . 0 ) 0 = ORIGINAL MODEL ( 1 0 species) L = L U M P E D MODEL ( 6 species)
NAPHT'iEhiC
C-LUMP PREDICTED VALUES CHARGE (0 1 5 . 0 4 . 0 1 . 0 08.00 7 . 0 2 , 0 . @ . 0 , 0 ) 3 = ORIGINAL MODEL ( 1 0 specNe3) L = -CiMPED MODE; ( 6 s p e c i e s )
CI
Ll 0
Figure 4. Comparison of the 6-lump scheme simulations with the full 10-lump model for C-lump yield for the same charges as in Figure 3.
b,, which are linearly independent, leads to a response matrix RB,formed from the n response columns: MGb1
RB
=
MGbz
MGb,
-
We see that RB = R-B, where R is the pure species response and B is the matrix formed from the n mixtures. Thus, R can be recovered from RB as the equation, R = RBB-'. Once R is determined, the methods discussed above for pure species data can be applied. Example 10. Mixed feedstock response for the 10-lump model of example 5 is given in Chart V. In example 10, the first eight columns of matrix B represent eight different mixed feedstock compositions. Columns 9 and 10 represent gasoline and C-lump compositions, respectively. The ith column of the response
matrix RB contains measurements of an "experiment" beginning with the feedstock composition in the ith column of B. Here, as in our earlier example with pure species feedstocks, the response matrix is simulated by computing the exact response for the given rate matrix G. Thus, the kth block of three rows is computed as MGk-'B. In practice, of course, RB would be the experimental response matrix from the various feeds. The response matrix R of example 5 is recovered by multiplying RB by B-I. Quivalently one could solve the linear system of equations RB = RB for the desired matrix R using Gaussian elimination. For numerical computation reasons, this latter approach is preferable. Several cautions are needed a t this point. Linear independence of the columns of B is required for the existence of B-l. Nearness to dependence (usually measured by a condition number-see e.g., pp 25-28 of Golub and Van Loan (1983))can lead to numerical instability. In such cases, even if B and RB are known exactly, the computation of R may yield a poor result due to machine round-off error. If there are substantial measurement errors in B and RB, the computational difficulties are compounded. These numerical issues will be considered in more detail in a future report. 7. Conclusions
We have addressed two major problems arising in the search for lumped-system representations of large first-
Ind. Eng. Chem. Res., Vol. 26, No. 6, 1987 1247 Chart V
-0.45 B =
0.35 0.15 0.05 0 0 0 0 0 0
-
1.0000 0.0000 0.0000 0.5009 0.3979 0.1011 0.2966 0.5430 0.1604 0.2007 0.5935 0.2058 0.1499 0.6052 0.2449 0.1202 0.5996 0.2802 0.1013 0.5860 0.3127 0.0886 0.5684 0.3430 0.0796 0.5490 0.3713 0.0731 0.5289 0.3981
-
0.3 0.1 0.15 0.15 0.2 0.05 0.03 0.02 0 0
0.1 0.1 0.2 0.4 0 0 0.05 0.15 0 0
0.15 0.4 0.1 0.08 0.07 0.2 0 0 0 0
0.7 0.2 0.07 0.03 0
0 0 0 0 0
0 0 0 0 0.45 0.35 0.15 0.05 0 0
0.65 0.30 0.02 0.03 0 0 0 0 0
0
0 0 0 0 0.65 0.3 0.02 0.03 0 0
0 0 0 0 0 0 0 0 1 0
-
0 0 0 0 0 0 0
c>
r
-
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.0000 0.000u 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000 0.6211 0.7311 0.5005 0.5231 0.2792 0.1594 0.3991 0.3910 0.0997 0.1094 0.1003 0.0859 0.4390 0.6105 0.2915 0.3070 0.3966 0.2113 0.5469 0.5490 0.1644 0.1782 0.1616 0.1441
0.6587 0.5080 0.6543 0.0000 0.0000 0.2729 0.4081 0.2687 0.9570 0.0000 0.0683 0.0839 0.0770 0.0430 1.0000 0.4558 0.2860 0.4441 0.0000 0.0000 0.4204 0.5715 0.4171 0.9158 0.0000 0.1238 0.1425 0.1388 0.0842 1.0000
0.3356 0.5388 0.1931 0.1983 0.3291 0.1765 0.3111 0.0000 0.4487 0.2295 0.5982 0.6107 0.4998 0.6335 0.4983 0.8763 0.2157 0.2317 0.2088 0.1910 0.1710 0.1900 0.1906 0.1237 0.2701 0.4885 0.1417 0.1384 0.2467 0.1175 0.2241 0.0000 0.4703 0.2349 0.6091 0.6295 0.5408 0.6512 0.5407 0.8386 0.2596 0.2766 0.2492 0.2320 0.2125 0.2314 0.2353 0.1614 0.2257 0.4760 0.2983 0.1942 0.4729 0.3329
0.4503 0.1123 0.1027 0.1909 0.2345 0.6022 0.6282 0.5593 0.3151 0.2854 0.2690 0.2498 0.4202 0.0941 0.0800 0.1520 0.2311 0.5872 0.6169 0.5640 0.3487 0.3186 0.3031 0.2840
0.1711 0.3958 0.4645 0.2261 0.3644 0.3782 0.1539 0.3757 0.4530 0.2200 0.3931 0.4043 0.1408 0.3591 0.4396 0.2133 0.4197 0.4276
0.0000 0.0000 1.oooo
0.0831 0.1655 0.0000 0.0000 0.6482 0.5598 0.8025 0.0000 0.2688 0.2748 0.1975 1.0000 0.0616 0.1252 0.0000 0.0000 0.6352 0.5644 0.7680 0.0000 0.3032 0.3103 0.2320 1.0000
0.0820 0.0649 0.1242 0.0476 0.0971 0.5685 0.6004 0.5602 0.6173 0.5601 0.3495 0.3347 0.3157 0.3352 0.3428 0.0736 0.0544 0.1040 0.0380 0.0772 0.5481 0.5813 0.5508 0.5969 0.5499 0.3783 0.3643 0.3452 0.3652 0.3729 0.0674 0.0469 0.0891 0.0312 0.0630 0.5273 0.5609 0.5378 0.5754 0.5362 0.4053 0.3922 0.3731 0.3934 0.4008
order reaction systems. The first is the problem of deciding how to partition the species into lumps. Our approach is summarized below. (1) Determine the coarsest lumping scheme that could possibly describe the essential features of the system. (2) Construct a response matrix from measurements of the coarse lumped outputs resulting from reactions, starting with pure species. (3) Apply clustering analysis to the columns associated with one coarse lump, and use the results to establish a lumping scheme. In section 5, we indicated how to obtain the pure species response matrix, R,for step 2 from data based on experiments, beginning with mixtures of species. Once a lumping scheme is fixed, a rate matrix for the lumped species is determined as follows: (1)The data are reorganized to reflect the discrete experimental responses of the new lumps. (2) A new rate matrix is formed from the averaged columns of the new R1,as described in section 5 and illustrated in example 9. I t is hoped that these methods can provide guidance in
0.0000 0.0000 1.0000
0.0000 0.7349 0.2651 0.0000 0.7033 0.2967
0.0000 0.0000 1.0000 0.0000 0.0000 1.0000
0.0000 0.0000 0.6730 0.0000 0.3270 1.oooo
-
the lumping of large systems.
Acknowledgment This work was supported in part by a grant from the University of Delaware Research Foundation. Discussions and suggestions from Vishwas Gokhale, John H. Schuenemeyer, W. David Smith, and Robert M. Koros contributed to the final product and are gratefully acknowledged.
Nomenclature b , bi = initial composition vectors for mixed species experiments B = n x n matrix with independent columns, bi dpq = Euclidean distance between the means of clusters p and 4
AEPq= increase in error from fusion of clusters p and q G = discrete-time reaction rate matrix Gz= discrete-time lumped-system rate matrix G , = approximation of the discrete-timelumped-system rate matrix K = continuous-time reaction rate matrix M = lumping matrix
Ind. Eng. Chem. Res. 1987,26, 1248-1254
1248
n = number of species in the composition vector n = number of species in cluster p f = (i, j ) entry in the hth step-response matrix dh= hth step-response matrix R E response matrix R, R1,etc. = bar indicates some columns have been averaged x ( t ) = species composition vector at time t x i = zth species in x x ( h ) = composition vector at time th = h.7 xo = initial composition z ( t ) ,z (h) = lumped composition vectors Greek Symbol 7 = fixed sampling-time interval
Literature Cited
Coxson, P. G.; Bischoff, K. B., submitted for publication in Ind. Eng. Chem. Res. 1987. Golub, G. H.; Van Loan, C. F. Matrix Computations; Johns Hopkins University Press: Baltimore, 1983. Gross, B.; Jacob, S. M.; Nace, D. M.; Voltz, S. E. US Patent 3960707, 1976. Jacob, S. M.; Gross, B.; Voltz, S. E.; Weekman, V. W. AIChE J. 1976, 22, 701-713. Kuo, J. C. W.; Wei, J. Ind. Eng. Chem. Fundam. 1969,8,124-133. Nace, D. M.; Voltz, S. E.; Weekman, V. W. Ind. Eng. Chem. Process Des. Deu. 1971, 10, 530-541. Ozawa, Y. Ind. Eng. Chem. Fundam. 1973, 12, 191-196. Weekman, V. W., Jr. AZChE Monogr. Ser. 1979, 75(11), 3-29. Wei, J.; Kuo, J. C. W. Znd. Eng. Chem. Fundam. 1969,8,114-123. Wishart, C. CLUSTAN User Manual, 3rd ed.; University of Edinburgh Edinburgh, Jan 1978; Inter-University/Research Council Report 47. Received for reuiew December 26, 1985 Accepted October 8, 1986
Anderberg, M. R. Cluster Analysis for Applications; Academic: New York, 1973.
Momentum and Heat Transfer in Non-Newtonian Fluids Flowing through Coiled Tubes Yoshinori Kawase and Murray Moo-Young* Department of Chemical Engineering, University of Waterloo, Waterloo, Ontario, Canada N2L 3GI
A simple approach to momentum and heat transfer in inelastic non-Newtonian fluids flowing through coiled tubes is discussed. The proposed approach is based on the integral momentum boundary layer method and is applicable to both laminar and turbulent flows. T h e predicted values of the friction factor and the Nusselt number are compared with experimental data and other previous correlations. Satisfactory agreement is found. Coiled tubes have been widely used as heat exchangers and reactors because of more efficient heat transfer, reduced backmixing, and longer average residence time per unit floor space as compared with straight tubes. Therefore, the Newtonian fluid flow in coiled tubes has been extensively investigated. A flow in a coiled tube develops secondary flow caused by the difference in centrifugal force which is exerted on elements of fluid moving with different axial velocities. The fluid in the central part of the coiled tube is driven toward the outer wall by the centrifugal force and flows inward along the wall by a pressure gradient. The resultant secondary flow appears as twin counterrotating vortexes in the cross section of the tube and increases heat-transfer rates in addition to the rate of momentum transfer. Dean (1927, 1928) showed that a single dynamic similarity parameter, De R e ( d / D ) ' f 2 ,can characterize Newtonian flow in a coiled tube. The use of tubular coiled reactors for continuous fermentation and polymerization reactions has considerable advantages, and most fermentation broths and polymerization reaction mixtures are viscous and non-Newtonian in nature. In coiled tube blood oxygenators, the secondary flow which is relatively gentle can be expected to improve the gas-transfer rate. Despite a variety of practical applications, relatively little attention has been paid to the flow of non-Newtonian fluids in coiled tubes. Particularly, little theoretical work has been published dealing with the flow of inelastic non-Newtonian fluids. The pseudoplasticity flattens the primary velocity profile and enhances the secondary circulation flow close to the tube wall due to the reduction of the apparent viscosity in the high shear rate region. Mashelkar and Devarajan (1976a, 1977) analyzed the laminar and turbulent flows of a power-law fluid in coiled
tubes by using a boundary layer approximation. In spite of the fact that they applied the momentum integral approach, they had to solve the governing equations numerically and present numerical results in a form of a formula suitable for use in engineering design. Later, Shenoy et al. (1980) extended this approach to the turbulent flow of mildly viscoelastic fluids. Although Mashelkar and Devarajan (1976a) stated that their result for laminar Newtonian fluid flows is in excellent agreement with Ito's (1956) solution for Newtonian fluids, as pointed out by Hsu and Patankar (1982), their correlation for n = 1 cannot describe the laminar Newtonian flow data. It should be noted, furthermore, that in their paper (1977) there is confusion in the definition of the friction factor. Recently Hsu and Patankar (1982) solved numerically the governing equations for the momentum in the laminar flow of a power-law fluid. Very little information is available on heat transfer in non-Newtonianfluids flowing through coiled tubes. Oliver and Asghar (1976) measured heat transfer to laminar pseudoplastic liquids through coiled tubes. Numerical solutions for the laminar power-law fluid flow heat transfer in coiled ttibes under the condition of constant wall temperature were presented by Hsu and Patankar (1982). Incidentally, it is to be mentioned that the Nusselt number ratio in their work should be the ratio of their solutions to the corresponding solutions of the modified Graetz problem under the condition of constant wall temperature instead of those under constant wall heat flux. As shown in the measurement of the velocity distributions by Ranade and Ulbrecht (1983), the existing numerical solutions are not in satisfactory agreement with the data of the secondary flow. Moreover, it is conceivable that the results of numerical calculations are difficult to
0888-588518712626-1248$01.50/0 0 1987 American Chemical Society