
Ind. Eng. Chem. Res. 1999, 38, 3027-3035


A New Structural Algorithm for Observability Classification

Ignacio Ponzoni, Mabel C. Sánchez, and Nélida B. Brignole*

Planta Piloto de Ingeniería Química, Universidad Nacional del Sur, CONICET, Camino La Carrindanga km 7-CC 717, 8000 Bahía Blanca, Argentina

A new structural method to classify unmeasured variables for plant instrumentation purposes, i.e., to carry out the observability analysis, is presented in this paper. The technique, called the global strategy with first least-connected node (GS-FLCN), basically consists of making a structural rearrangement of the process's occurrence matrix. GS-FLCN is applicable to strongly nonlinear models, including special features to avoid unsolvable subsystems. The new proposal is compared with the existing structural dominant column block (DCB) approach, both theoretically and through practical examples. The results for two industrial plant sections of small and medium size are put forward. GS-FLCN proved to be much more effective than DCB, thus being highly recommendable.

1. Introduction

Evaluations of the performance, control, and optimization of a chemical plant are based upon knowledge about its current state. The existing conditions are determined by the values of the process variables contained in a model that adequately represents plant operation. Mathematically speaking, a steady-state model of a chemical process plant is made up of a system of nonlinear equations E, which corresponds to mass balances, energy balances, and relationships employed to estimate thermodynamic properties such as densities, enthalpies, and equilibrium constants.

A variety of problems associated with plant monitoring are solved by means of classification algorithms. These algorithms yield four kinds of variables according to feasibility of calculation:1

(1) Redundant variables: measured variables that can also be computed from the balances and the rest of the measured variables.

(2) Nonredundant variables: measured variables that cannot be computed from the balances and the rest of the measured variables.

(3) Observable variables: unmeasured variables that can be evaluated from the available measurements using the balance equations.
(4) Unobservable variables: unmeasured variables that cannot be evaluated from the available measurements using the balance equations.

In turn, E's equations can be classified into three categories:

(1) Assigned equations: those that will be employed to find the value of the observable variables.

(2) Redundant equations: those that have not been assigned and whose variables are either measured or observable.

(3) Unassigned equations: those that have not been assigned and contain at least one unobservable variable.

* Author to whom all correspondence should be addressed. Current address: PLAPIQUI - Complejo CRIBABB, km 7 Camino La Carrindanga, C.C. 717, 8000 Bahía Blanca, Argentina. Phone: 54 291 4861700. Fax: 54 291 4861600. E-mail: [email protected].

In particular, when one aims at reducing the number of sensors, the best classification for unmeasured variables can be defined as the one that determines the greatest number of observable variables for a given set of measurements. It is possible to associate this definition with an economic objective: the fewer the measurements, the lower are both investments and operating costs. Investments are reduced because no unnecessary instruments are required; operating costs are lower because fewer instruments have to be monitored. To achieve this goal, it is essential to use a methodology for the classification of variables that exhibits the following features:

1. Effectiveness: the capacity to determine the maximum number of observable variables from a given set of measurements.

2. Efficiency: the potential to perform the classification in reasonably low run times for large-size industrial problems.

In this work, an effective strategy for unmeasured-variable classification of strongly nonlinear systems that overcomes deficiencies of existing structural techniques is presented. Section 2 contains a critical literature review of classification algorithms and points out the need for improvements in observability techniques. Our proposal is presented in section 3 and compared with its predecessor in sections 4 and 5. Section 4 is devoted to a theoretical comparison, whereas section 5 deals with several examples of industrial interest. The main conclusions are summarized in section 6. Finally, we have included an appendix with the most important algorithms.

2. Classification Methodologies

The classification of process variables basically comprises two main stages: the observability analysis and the determination of the redundant measurements. During the last few decades, two main research lines have been developed to carry out these tasks.
In the first one, called the topology-oriented approach, the variables are classified by analyzing the cycles and cutsets which appear in the undirected graph underlying the process topology. The second line, known as the equation-oriented approach, makes use of different matrixes

10.1021/ie980747m CCC: $18.00 © 1999 American Chemical Society Published on Web 06/26/1999


associated with the system of equations employed to model the process.

2.1. Topology-Oriented Approach. These methods employ the undirected graph G underlying the digraph that represents the process topology, whose nodes and edges correspond to process units and streams, respectively. This graph also contains an additional node, called the environmental node, which represents the surroundings. All feeds leave this node, and all product streams are its inputs.

Vaclavek2 used this representation. For the classification of redundancies, he joins all pairs of nodes connected by means of a stream that contains at least one unmeasured variable. The resulting graph G′ only involves redundant measurements. As to observability, he eliminates all of the fully known streams in G, i.e., those streams whose variables are all measured, thus obtaining another reduced graph G′′. Finally, he searches for all cycles in G′′. The flow rates that do not belong to any of those cycles are observable.

The above-mentioned technique is adequate when dealing with linear relationships, such as the global mass balance equations. When more complex formulations are incorporated, like component mass balances or chemical reaction terms, the corresponding system of equations becomes nonlinear. If the latter is the case, it is advisable to carry out the observability analysis using specific tools that carefully account for the nonlinearities.

In 1976, Vaclavek3 extended his own linear method to take into account bilinear equations, but the methodology only considered streams where either all or none of the mass fractions were known. In contrast, Kretsovalis and Mah4 introduced another algorithm without assumptions about composition measurements. It includes the balances around clusters of nodes, with each cluster being made up of a set of process units. They also incorporated energy balances, considering a univocal relation between the temperature of a stream and its specific enthalpy.
Nevertheless, the method is not directly applicable to chemical reactions or splitters. Both Meyer et al.5 and Kretsovalis and Mah6,7 proposed modifications to overcome this problem. Although these techniques have proved to be very efficient, they were devised for linear and bilinear models, thus being less rigorous for those pieces of equipment whose functionalities are strongly nonlinear, such as reactors and flashes.

2.2. Equation-Oriented Approach. Along this line, two categories can be distinguished: nonstructural and structural techniques. The nonstructural techniques are based on calculations made using model coefficients. Crowe,8 for example, builds a projection matrix from the incidence matrix, and Madron9 employs the matrix of coefficients associated with the linear representation of the process. On the other hand, the structural techniques (Romagnoli and Stephanopoulos10 and Joris and Kalitventzeff11) consist of rearranging the process's occurrence matrix appropriately.

Crowe et al.8 presented an approach for the reconciliation of linear models using a method of matrix projections. In 1986, the technique was extended to enable the treatment of bilinear cases.12 On the basis of this philosophy, Crowe later proposed an algorithm for observability and redundancy analysis,13 where he reduces the constraints by means of projection matrixes. The resulting matrix is partitioned into three categories according to the occurrence and location of measurements. As to observability in particular, he defines and proves four lemmas

Figure 1. Occurrence matrix resulting from Romagnoli and Stephanopoulos’ rearrangement.

that give either sufficient or necessary and sufficient conditions for unobservability. After initializing all unmeasured quantities as observable, he applies these lemmas to single out the unobservable variables. The procedure allows an arbitrary location of measurements, chemical reactions, flow splitters, and pure energy flows. Bilinear energy balances are included in the formulation, assuming there is a one-to-one correspondence between temperature and energy per unit mass. Crowe's approach is quite complete as far as linear or bilinear terms are concerned. Nevertheless, it has never been extended to systems with strongly nonlinear equations.

Madron9 builds the augmented matrix of the linearized system of equations so that the coefficients of the unmeasured variables appear in the first columns. Then, he processes the data by applying Gauss-Jordan elimination in two stages. At the beginning, he only takes into account the first columns when pivoting, because he is concerned with the classification of unmeasured variables. The second step deals with the classification of redundancies. In principle, the whole procedure is only applicable to linear systems. This constitutes a serious drawback because most process plant models are nonlinear in nature. Although there are several ways of linearizing equations, such as Taylor expansions, generation of linear correlations, or changes of variables, all of these techniques bring about further complications that deteriorate the quality of the final results. This is one of the main reasons why the development of nonlinear methodologies is a topic of present-day concern.

In general, the nonstructural equation-oriented techniques, as well as the topology-oriented methods, have been specifically designed to model linear and bilinear relationships.
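For linear models, the observability notion that these equation-oriented methods exploit can be stated numerically: writing the linearized balances as A_u x_u = c, with c gathering the measured terms, an unmeasured variable is observable if and only if its coordinate is constant over the solution set, i.e., if and only if it vanishes on the null space of A_u. A minimal NumPy sketch, with a hypothetical three-flow example (the function name and the data are ours, not from the paper):

```python
import numpy as np

def classify_unmeasured(A_u, tol=1e-8):
    """Observability test for the linearized balances A_u x_u = c.

    x_j is observable iff it takes the same value on every solution,
    i.e. iff coordinate j vanishes on null(A_u).
    """
    _, s, vt = np.linalg.svd(A_u)
    rank = int(np.sum(s > tol))
    null_basis = vt[rank:]               # rows form a basis of null(A_u)
    return np.all(np.abs(null_basis) < tol, axis=0)

# Hypothetical example with unmeasured flows (F2, F3, F4):
#   F2 = F1       (F1 measured)  ->  row [ 1, 0, 0]
#   F3 + F4 = F2                 ->  row [-1, 1, 1]
A_u = np.array([[1.0, 0.0, 0.0],
                [-1.0, 1.0, 1.0]])
print(classify_unmeasured(A_u))  # only F2 is observable
```

Here the null space is spanned by (0, 1, -1): F3 and F4 can trade flow without violating the balances, so only F2 is pinned down by the measurements.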
Though the procedures are efficient, there is always some inherent loss of rigor for those items of equipment that include highly nonlinear functionalities, such as flashes and reactors. In this respect, the structural algorithms provide a better alternative because they are more independent of the mathematical model's degree of nonlinearity. In this approach, nonstructural aspects such as singularities and implicit functions can be taken into account by introducing numerical verifications.

In particular, Romagnoli and Stephanopoulos10 proposed a technique that is based on Stadtherr et al.'s algorithm for precedence-order determination.14 This method builds an occurrence matrix whose first columns correspond to the unmeasured variables. Then, it performs a partitioning to obtain a categorization of the unmeasured variables. The technique leads to the rearrangement shown in Figure 1, where the shaded blocks are those that contain nonzero elements. The procedure locates the unmeasured variables in the first um columns and classifies them into observable and


unobservable. One equation is assigned to each observable variable, thus resulting in a pattern where a = o. The algorithm yields a block lower-triangular square submatrix of order a, whose diagonal blocks are also square and can be solved sequentially. Each block, called an assignment subset, must undergo an allowability test before being accepted, in order to avoid numerically unsolvable subsets. In this respect, some constraints are generated automatically, while others are defined by the process engineer. The set of redundant equations r comprises some that depend on the observable variables, while the rest are solely functions of the measurements. Finally, the last ua rows are related to the unobservable variables.

Later, Joris and Kalitventzeff11 proposed a completely different structural rearrangement, without partitioning the observable variables into blocks. A serious drawback of this method is that it may fail when recycles occur because some loops lead to numerical singularities that cannot be detected by the basic structural analysis.

In 1992, an improved version of Romagnoli and Stephanopoulos' methodology for the classification of unmeasured variables was included in Sánchez et al.'s package for plant data reconciliation.15 In particular, the implementation of the observability algorithm is called the dominant column block (DCB). Romagnoli and Stephanopoulos' original method works under the assumption that all stream compositions are either fully known or entirely unknown. DCB does not impose this requirement, thus being more flexible than the previous proposal. It is also important to remark that the partitioning to a block lower-triangular form makes it easier to detect numerical singularities because they only appear in the square diagonal blocks inside the a × o submatrix. These blocks are usually small, so the allowability test becomes computationally cheap. In this respect, DCB constitutes an improvement over Joris and Kalitventzeff's method.
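The sequential solution of the block lower-triangular a × o submatrix can be illustrated on a linear toy system; the matrix and block sizes below are hypothetical, and in the real setting each diagonal block is a small (possibly nonlinear) subsystem solved for its assignment subset:

```python
import numpy as np

def solve_block_lower_triangular(A, b, block_sizes):
    """Solve A x = b when A is block lower-triangular with square
    diagonal blocks, one assignment subset at a time."""
    x = np.zeros(len(b))
    start = 0
    for size in block_sizes:
        end = start + size
        # move the already-computed variables to the right-hand side
        rhs = b[start:end] - A[start:end, :start] @ x[:start]
        x[start:end] = np.linalg.solve(A[start:end, start:end], rhs)
        start = end
    return x

# One 1 x 1 block followed by one 2 x 2 block.
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([2.0, 6.0, 5.0])
x = solve_block_lower_triangular(A, b, [1, 2])
print(np.allclose(A @ x, b))  # True
```

Because each diagonal block is small, a per-block solvability (allowability) check is cheap, which is the point made above about the a × o partitioning.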
Although DCB is remarkably efficient with respect to execution times, we have detected failures with regard to robustness. In this context, the term robustness refers to the algorithm's capacity to detect the maximum number of blocks of minimum size. We found that DCB, in particular, is unable to locate some blocks of size 3 or greater, thus implying a potential loss of effectiveness because block omission may lead to a classification where some observable variables are wrongly categorized as unobservable. In this paper, we propose a new observability algorithm, called the global strategy with first least-connected node (GS-FLCN), which is extremely robust and therefore better than DCB.

3. The Philosophy of GS-FLCN

In general terms, GS-FLCN follows the same lines of thought as DCB. Both methodologies are based on two central ideas:

1. Finding n columns whose deletion would lead to n empty rows. Those columns and rows constitute a subset.

2. Performing an incremental search of the removable subsets by size. This means locating all of the 1 × 1 subsets first, then all of the 2 × 2 subsets, and so on.

3.1. Global Strategy. The following algorithm shows how GS-FLCN works.

1. Construction of the Occurrence Matrix M
2. Forward Triangularization
3. Set n = 2
   3.1. Call Construction of the Occurrence n-Submatrix (n, S)
   3.2. Call Subroutine 2 (S) (location of 2 × 2 removable subsets)
   If an allowable subset is found then
      go back to step 2
   else
      go to step 4
   end if
4. Set n = 3
   4.1. Call Construction of the Occurrence n-Submatrix (n, S)
   4.2. Call Modified Algorithm (n, S) (location of 3 × 3 removable subsets)
   If an allowable subset is detected then
      go back to step 2
   else
      go to step 5
   end if
5. Set n = 4
   5.1. Call Construction of the Occurrence n-Submatrix (n, S)
   5.2. Apply step 1 of Subroutine 2
   5.3. Call FLCN Algorithm (n, S) (location of n × n removable subsets for n ≥ 4)
   If an allowable subset is detected then
      go back to step 2
   else
      If n > maximum size of subset then
         stop and return the classification
      else
         set n = n + 1 and go to step 5.1
      end if
   end if

First of all, the Occurrence Matrix M is built. Its rows and columns represent the model equations and the unmeasured variables, respectively. M has a nonzero element M(i,j) if and only if the jth variable appears in the ith equation.

Then, Forward Triangularization is performed to detect all of the 1 × 1 removable subsets. The procedure looks for rows containing only one nonzero element. This means that those equations contain only one unknown, for which they can be solved. After each removal, the search must be repeated until no more equations with a single unknown are left.

The Construction of the Occurrence n-Submatrix consists of building convenient submatrixes S from M. The procedure involves two steps:

1. Choose all rows in M that contain at most n nonzeros, together with the columns in which those elements appear.

2. Delete the columns with only one nonzero. The removal might lead to rows with only one nonzero. Those rows are also eliminated. The procedure is repeated until all rows and columns contain at least two elements.
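The Forward Triangularization step above can be sketched in a few lines. Representing M as a mapping from equation rows to sets of variable columns is our implementation choice, not part of the paper:

```python
def forward_triangularization(M):
    """Repeatedly remove rows with a single nonzero (1 x 1 removable subsets).

    M is a dict-of-sets occurrence matrix: M[row] = set of variable columns.
    Returns the (row, column) assignments in the order found, plus the
    remaining occurrence matrix.
    """
    M = {r: set(cols) for r, cols in M.items()}
    assigned = []
    changed = True
    while changed:
        changed = False
        for r, cols in list(M.items()):
            if len(cols) == 1:
                c = next(iter(cols))
                assigned.append((r, c))
                del M[r]
                for other in M.values():
                    other.discard(c)  # deleting the column may expose new singletons
                changed = True
    return assigned, M

# Hypothetical chain: eq 0 contains x0 only; removing x0 leaves eq 1 with x1
# only, and so on, so the whole system triangularizes.
M = {0: {0}, 1: {0, 1}, 2: {1, 2}}
print(forward_triangularization(M)[0])  # [(0, 0), (1, 1), (2, 2)]
```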
By using S instead of M, the search is performed only over those columns that might take part in the n × n subsets. In this way, the number of explored combinations is drastically reduced, without loss of potential subsets. A justification of this statement can be found in Stadtherr's paper.14 The procedures Subroutine 2, Modified Algorithm, and FLCN Algorithm aim at processing S in order to find all subsets of sizes 2, 3, and n ≥ 4, respectively. It is important to remark that the allowable assignment subsets are removed from both S and M as soon as they are detected. A more detailed description of each subroutine is given in the appendix.

4. DCB vs GS-FLCN: Theoretical Comparison

The basic difference between DCB and GS-FLCN lies in the number of combinations explored and the way in which these combinations are built when trying to find assignment subsets of size greater than 2. DCB employs Subroutine N, which is a natural extension of Subroutine 2, whereas GS-FLCN makes use of two different routines. For the detection of 3 × 3 blocks, the authors designed a more robust variation of Subroutine N, called Modified Algorithm. This procedure has optimal performance for this size only. Therefore, the FLCN, which is a completely different approach, is suggested for sizes greater than 3.

The subroutine called FLCN explores the undirected graph G(STS), where S is the submatrix obtained from M in step 5.1 of the global strategy. G's nodes correspond to the system's variables. The edges indicate which variables have rows in common. In other words, an edge joins nodes (ni, nj) if and only if there is at least one row in S that contains xi and xj, where xi and xj are associated with ni and nj, respectively. The method was named after the heuristic, called FLCN, that guides the depth-first search procedure employed to build the combinations of nodes. This heuristic gives priority to the appearance of the least-connected nodes on the way, in order to build as many combinations as possible.

It is interesting to note that although both the topology-oriented algorithms and GS-FLCN involve the exploration of graphs, the techniques are conceptually different. The former make use of the graph underlying the process topology, whereas GS-FLCN employs another graph, the one that represents the structural relationship among the rows and columns of the occurrence matrix.

FLCN and DCB can be contrasted in terms of S.
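The graph G(STS) described above can be built directly from the row supports of S. In the following sketch, S is given as a list of column-index sets (an assumed representation), and an edge is recorded whenever two columns share a row:

```python
from itertools import combinations

def column_adjacency_graph(S):
    """Build the undirected graph G(STS): one node per column of the
    occurrence submatrix S, with an edge (ni, nj) iff some row of S
    contains nonzeros in both column i and column j."""
    edges = set()
    for row_cols in S:                    # each row is a set of column indices
        for i, j in combinations(sorted(row_cols), 2):
            edges.add((i, j))
    # adjacency lists, convenient for a depth-first FLCN-style search
    adj = {}
    for i, j in edges:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj

# Hypothetical submatrix: rows {0,1}, {1,2}, {0,2} yield a triangle on 3 nodes.
adj = column_adjacency_graph([{0, 1}, {1, 2}, {0, 2}])
print(sorted(adj[0]))  # [1, 2]
```

The node degrees in this graph are what the FLCN heuristic inspects when it prefers the least-connected node.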
When looking for subsets of size n, DCB picks out a column c1 among those which have the maximum number of nonzero elements. Then, it builds combinations using n - 1 other columns c2, c3, ..., cn, each of which can be associated with at least two rows, ip1 and ip2, so that

S(ip1, c1) = S(ip1, cp) = 1
S(ip2, c1) = S(ip2, cp) = 1

∀ p = 2, ..., n

If it is impossible to find sets of columns that satisfy this requirement, column c1 is eliminated and the process is repeated. In this way, DCB omits exploring many potential assignment subsets. Imposing the presence of a base column c1, strongly associated with the rest because it acts as a center around which the combinations are built, restricts the possibility of detecting very sparse blocks whose variables (columns) are less connected.

In contrast, FLCN builds combinations of n columns c1, c2, ..., cn so that for each cp, 1 < p < n, there are at least two rows, ip1 and ip2, that verify

S(ip1, cp-1) = S(ip1, cp) = 1
S(ip2, cp) = S(ip2, cp+1) = 1

∀ p = 2, ..., n - 1

Figure 2. Subset from Pissanetzky's example detected by GS-FLCN.

Figure 3. Process topology (Joris and Kalitventzeff11).

Table 1. Model Variables (Joris and Kalitventzeff11)

 1  a S1      7  T S5′   13  U R1   19  T S3
 2  b S1      8  T S6    14  a S6   20  Fr S2
 3  c S1      9  T S6′   15  b S6   21  Fr S3
 4  T S1     10  a S5    16  c S6   22  T S2′
 5  T S4     11  b S5    17  U R2   23  T S3′
 6  T S5     12  c S5    18  T S2

with the boundary condition

S(i11, c1) = S(i11, cn) = 1

This methodology is more robust. A greater number of combinations can be built because no base column is required. This makes it possible to detect more assignment subsets. By way of illustration, let us consider the block in Figure 2, taken from one of Pissanetzky's examples.16 Although this 7 × 7 subset is detected by GS-FLCN, DCB fails to locate it because none of its seven columns can be taken as the base column. For instance, if column 1 is chosen as the base, columns 3 and 5-7 do not share any rows with it. Therefore, they are not taken into account as possible members of the assignment subset. The same problem arises for any other base column.

Besides, DCB's overall policy is not altogether appropriate if the main goal is to get the maximum number of minimum-order blocks. After having built a submatrix, DCB's Subroutine 2 or Subroutine N looks for all of the possible 2 × 2 or n × n (n ≥ 3) subsets, respectively, without going backward to lower levels after having found a subset. This is not advisable with regard to robustness because the removal of each subset frequently causes further decoupling of smaller assignment subsets, which remain undetected when using DCB. In contrast, GS-FLCN always yields irreducible blocks, thus ensuring subsets of minimum size.

5. DCB vs GS-FLCN: Performance in Industrial Applications

Both strategies were tested on a variety of problems in order to assess their performance. In this section two

Table 2. Assignment Subsets Found by GS-FLCN for Case Study I (see Joris and Kalitventzeff11). Rows: the R1 and R2 mass balances for components a, b, and c; the R1 and R2 energy balances; the SP energy balances 1 and 2; the SP mass balance; and the HTX1, HTX2, and MX energy balances. Columns: the model variables 1-23 of Table 1. The detected assignment subsets are shadowed.
Figure 4. Ethane plant’s flowsheet.

examples that clearly reveal GS-FLCN's advantages are put forward.

5.1. Case Study I: Academic Example. This problem, presented by Joris and Kalitventzeff,11 corresponds to a plant section that basically consists of two reactors R1 and R2, two heat exchangers HX1 and HX2, one separator SP, and one mixer MX. Figure 3 shows a schematic representation of the process. Table 1 lists the process variables. There are 3 compounds (a, b, c) and 10 streams (1, 2, 2′, 3, 3′, 4, 5, 5′, 6, 6′). The letter T stands for temperature, Fr means flow rate, and U represents the heat-transfer coefficients.

In this case, DCB's assignment could only identify two 1 × 1 subsets, linking eq 9 with variable T S2 and eq 10 with variable T S3. In contrast, GS-FLCN succeeded in finding all of the assignment subsets reported by Joris and Kalitventzeff. The results yielded by GS-FLCN are shown in Table 2, where the detected subsets have been shadowed. It is interesting to note that the 4 × 4 blocks are very sparse. Because none of their columns is strongly connected with the rest, DCB could not detect them.

5.2. Case Study II: An Industrial Ethane Plant. This example concerns the classification of unmeasured variables for a plant that separates ethane from natural

Table 3. List of Process Units

equipment                      quantity
air coolers                        16
isothermal pumps                    8
distillation columns                6
partial condensers                  1
compressors                        10
dividers                            9
general units                       2
heat exchangers                    10
mixers with enthalpy changes        7
partial reboilers                   4
splitters                           2
isothermal splitters                6
turbines                            3
valves                              3
gas. Figure 4 shows a schematic representation of its flowsheet. The process can be divided into three main sections: gas compression and dehydration, cryogenic separation, and fractionating. The analysis involves 87 units interconnected by 185 streams containing 12 compounds. A list of process units is shown in Table 3.

The size of the occurrence matrix is directly linked to the level of detail considered in the model. In this example, the set of equations includes mass and energy balances as well as thermodynamic relations for densities, enthalpies, and equilibrium constants. Measurements of flow rates, compositions, temperatures, and pressures are included in the formulation. The resulting model contains 1830 equations and 1425 unmeasured variables. Because of the magnitude and complexity of this model, the corresponding system of equations and occurrence matrix, as well as some constraints for the allowability test, were generated automatically by using the graphic interface developed by Vazquez et al.17

Table 4. Results for the Ethane Plant

size of blocks     DCB     GS-FLCN
 1                 715       897
 2                   4         7
 3                   5         5
 4                   1         1
 5                   -         1
 6                   -         3
 7                   -         2
19                   -         2
pav               52.07%    70.53%

Table 4 summarizes the number of assignment subsets obtained using each algorithm. The first column indicates the size of the detected blocks. The last row refers to the percentage of assigned variables (pav), calculated with respect to all of the unmeasured variables considered.

At this stage, it is important to point out that the results reported here were obtained with GS-FLCN accelerated by means of "branching factors". GS-FLCN's combinatorial exploration can be viewed as a tree walk along the edges of G(STS). The acceleration technique that we devised consists of pruning some branches of the tree in order to reach deeper levels more quickly. The deeper the level, the greater the order of the potential assignment subsets. As a result, big subsets appear sooner. This becomes extremely useful for increasing the efficiency of the strategy when applied to real large-size problems.

GS-FLCN definitely exhibits better performance with regard to robustness because it manages to detect a greater number of assignment subsets. The appearance of more subsets also leads to an increase in the number of observable variables, thus constituting a significant improvement in regards to effectiveness. This is clear from a comparison in terms of the pav. Table 4 reveals that GS-FLCN exhibits better performance than DCB because the former has significantly higher pav values, with more blocks of smaller size.
Another interesting aspect is the fact that, although both strategies apply the same routines for the detection of 1 × 1 and 2 × 2 subsets, GS-FLCN finds 25% more blocks of these sizes. The reason for this increment is that whenever a new subset is detected, GS-FLCN restarts the search from size 1 because the isolation of a subset frequently leads to further decoupling into other subsets of lower size.

It is also important to remark that the assignment subsets found by both strategies coincide at the beginning, though the order in which some blocks appear differs. Then, at a certain point, GS-FLCN discovers a 5 × 5 block that DCB is unable to detect. The subset is shown in Figure 5, where it can be noticed that there is no base column c1 that satisfies the requirements imposed by DCB. This is the first major difference between the results.

Figure 5. Assignment subset of size 5.

Later, two blocks of order 7 appear, which would never have been located by DCB for the same reason. Both have the same pattern because each of them corresponds to a set of mass balances for a different compound, namely, propane or carbon dioxide, around the same process units. The plant section, shown in Figure 6, is basically a heat integration scheme. The cycle detected by the algorithm, comprising streams S1-S9, mainly involves the hot streams of a series of heat exchangers that evaporate liquid withdrawn from the demethanizer at different points, which is recycled back into the column at a lower level. One of the assignment subsets is presented in Figure 7. Its pattern is almost bidiagonal, with a nonzero element at the bottom left corner, whose presence makes decoupling into smaller subsets impossible.

GS-FLCN always detects structurally nonsingular blocks, but the subsystem of equations involved in each assignment subset also needs to be numerically solvable because the variables classified as observable are going to be calculated through those equations. The GS-FLCN strategy presented in this paper is inherently structural. Therefore, we had to design an allowability test on the basis of global observability concepts, i.e., without giving specific values to the process variables, which may be either unavailable or inaccurate.

The classification procedure is applied iteratively. After each iteration, the engineer analyzes the results in order to check solvability and determines whether any subsets should be banned by means of a conceptual and/or symbolic analysis. In many cases, visual inspection is enough. Once an unallowable subset has been detected, the engineer introduces this information to the program as an additional constraint and starts a new iteration. From this point onward, whenever a prospective assignment subset appears, the program automatically checks that it is not included in the list of constraints. Moreover, careful observation of the matrix's structure may help the designer find a convenient location for a measurement.
Let us suppose, for instance, that a numerically singular block with the structure shown in Figure 7 has occurred. (This is hypothetical because the block given in the example is solvable.) If the first column were removed by measuring xS2, the remaining matrix would decouple totally into 1 × 1 assignment subsets after application of forward triangularization. So, the addition of only one measurement would suffice to make all of the remaining variables observable.

Another aspect that should be taken into account is whether the system will be easy to solve. Systems of linearly independent mass balance equations, such as those involved in the 7 × 7 subset shown in Figure 7, can be solved easily. However, when the set of equations involves implicit functions and further treatment would become extremely difficult, it may be convenient to add a few measurements to simplify further calculations. In these cases, observing the structure of the assignment subset yielded by GS-FLCN is helpful to decide on the best location for the additional measurements.
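Both observations, DCB's failure on the cyclic pattern and the decoupling effect of one added measurement, can be checked mechanically. The sketch below (helper names are ours, variables renumbered 0-6) uses a Figure 7 style pattern in which row i couples variables i and i + 1 modulo 7, so every pair of adjacent columns shares exactly one row:

```python
def rows_shared(S, a, b):
    """Rows of the occurrence pattern S containing both columns a and b."""
    return [r for r, cols in enumerate(S) if a in cols and b in cols]

def has_base_column(S, ncols):
    """DCB needs one base column sharing >= 2 rows with every other column."""
    return any(all(len(rows_shared(S, c1, c)) >= 2
                   for c in range(ncols) if c != c1)
               for c1 in range(ncols))

def forward_triangularize(rows):
    """Peel off rows with a single remaining unknown (1 x 1 subsets)."""
    rows = [set(r) for r in rows]
    order = []
    progress = True
    while progress:
        progress = False
        for r in rows:
            if len(r) == 1:
                (c,) = r
                order.append(c)
                for other in rows:
                    other.discard(c)
                progress = True
    return order

# Cyclic 7 x 7 block: row i couples variables i and i + 1 (mod 7).
S = [{i, (i + 1) % 7} for i in range(7)]
print(has_base_column(S, 7))  # False: no column can anchor the block for DCB

# Measuring variable 0 deletes its column; the rest decouples into 1 x 1 subsets.
print(forward_triangularize([cols - {0} for cols in S]))  # [1, 2, 3, 4, 5, 6]
```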


Figure 6. Set of heat exchangers in the cryogenic area.

Figure 7. Assignment subset for cycle S1-S9 in Figure 6.

6. Conclusions

A new algorithm for observability classification in process plant instrumentation is presented in this paper. It is called the global strategy with first least-connected node (GS-FLCN). The proposed method classifies the unmeasured variables by means of a structural rearrangement of the corresponding occurrence matrix. The permutation consists of an incremental search for assignment subsets, which are associated with the observable variables. One of the main features of this technique is its capacity to deal with highly nonlinear systems. Besides, the incorporation of allowability tests makes it possible to take into account general constraints that go beyond a mere structural analysis.

To assess the performance of this technique, a comparative study between GS-FLCN and DCB, comprising both theoretical aspects and industrial examples, was carried out. GS-FLCN proved to be much more robust and effective than its predecessor, yielding a greater number of both assignment subsets and observable variables. Because of its combinatorial nature, the methodology becomes computationally expensive in terms of run times for problems that contain large assignment subsets, which are costly to detect. Nevertheless, this drawback can be overcome successfully by using branching factors.

The structural decomposition resulting from this classification technique leads to a configuration with the minimum number of instruments and makes it easier to visualize the most convenient places to locate or remove measurements, thus contributing to significant savings in both investment and operating costs. Since the underlying problem in this application is the development of a matrix-reordering methodology, the strategy may be useful for a wide variety of applications outside the field of instrumentation design. Assessing the scope of the GS-FLCN procedure involves placing it in each context, as well as carrying out specific analyses of its advantages and limitations in comparison with existing techniques for the same purpose. This constitutes an interesting topic for future research.

Nomenclature

E: model's system of nonlinear equations
um: unmeasured variables in E
m: measured variables in E
o: observable variables in E
uo: unobservable variables in E
a: assigned equations in E
ua: unassigned equations in E
r: redundant equations in E
M: occurrence submatrix from E
S: occurrence submatrix in M used for subset detection
G(STS): undirected graph used for subset detection
pav: percentage of assigned variables
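The occurrence matrices M and R listed in the nomenclature are the data structures that the Appendix algorithms manipulate. As a hedged sketch (the equations are represented simply as sets of illustrative variable names, a representation the paper does not prescribe), the construction of M could look like:

```python
# Each equation is modeled as the set of unmeasured variables it contains.
# The variable names (F1, F2, T1, h1) are purely illustrative.
equations = [
    {"F1", "F2"},          # e.g., a mass balance
    {"F2", "T1", "h1"},    # e.g., an energy balance
    {"T1", "h1"},          # e.g., an enthalpy correlation
]

# Fix a column order for the unmeasured variables.
variables = sorted({v for eq in equations for v in eq})

# M[i][j] = 1 if the j-th unmeasured variable appears in the i-th equation.
M = [[1 if v in eq else 0 for v in variables] for eq in equations]

print(variables)   # ['F1', 'F2', 'T1', 'h1']
print(M)           # [[1, 1, 0, 0], [0, 1, 1, 1], [0, 0, 1, 1]]
```

The rearranged matrix R of step 2 of the first Appendix algorithm would start empty and be filled as removable subsets are detected.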

Appendix: Algorithms

1. Construction of the Occurrence Matrix M
(1) Make up an occurrence matrix M for the system of equations to be solved by filling position (i, j) with a 1 if the jth unmeasured variable appears in the ith equation. Otherwise, place a 0.
(2) Define a rearranged occurrence matrix R that will contain the removable subsets found on completion of the work.

2. Forward Triangularization
(1) Find a row in the occurrence matrix M with only one nonzero. This entry represents a removable 1 × 1 subset.
(2) If this subset is allowed, then remove it by deleting the row and column in which it occurs, and place it in the first free row and column in the reordered matrix R.
(3) Repeat steps 1 and 2 until no more entries can be added to R.

3.1. Construction of the Occurrence n-Submatrix S
(1) Choose all rows in the occurrence matrix M that contain at most n nonzeros, together with the columns in which those elements appear, and use them to build a submatrix S.
(2) Delete from S all of the columns with only one nonzero. The removal might lead to rows with only one nonzero. Eliminate those rows too. Repeat the procedure until all of the rows and columns contain at least two elements.

3.2. Subroutine 2: Location of 2 × 2 Subsets
(1) Find a column in submatrix S which contains only one nonzero. Delete this column, together with the row in which the entry appears. Delete all columns without entries. Repeat this step until no columns with fewer than two entries remain in S.
(2) Find the column of S with the greatest number of entries. Delete it, together with all of the rows in which it has entries. Look for a column whose number of entries has been reduced by two. Form a removable 2 × 2 subset using this column along with the deleted column and the two rows containing common entries.
(3) If this subset is allowed, then
        delete the second column that had been found;
        place these rows and columns in the first two free rows and columns of the reordered matrix R.
    Go to step 4.
(4) If an allowable subset has been found or there are fewer than two rows in S, then stop; else repeat steps 2-4.

4.2. Modified Algorithm: Location of n × n Subsets
(1) Apply step 1 of Subroutine 2.
(2) Find the column of S with the greatest number of entries and delete it. Delete the rows in which that column has entries. If there are n - 1 columns whose number of entries has been reduced by one or more, and if n rows are empty after those columns have been removed, delete those n - 1 columns temporarily. The n deleted columns and the n empty rows form a removable n × n subset.
(3) If this subset is allowed, then
        place these rows and columns in the first n free rows and columns of the reordered matrix R;
        eliminate the columns that had been temporarily deleted in step 2;
    else
        recover the columns that had been temporarily deleted in step 2.
    Go to step 4.
(4) If an allowable subset has been found or there are fewer than n rows in S, then stop; else repeat steps 2-4.

5.2. First Least-Connected Node Algorithm (FLCN)

Parameters:
S: submatrix found in the previous step
n: size of the desired subset
node: current node
stamps: logical array that indicates whether a node has been visited
success: logical variable valued as true if an n × n removable set has been located and false otherwise
R: reordered matrix

Main internal data:
rows: array containing the row numbers of the detected subset
columns: array containing the column numbers of the detected subset
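The parameters above drive the recursive pseudocode that follows. As a minimal, hedged sketch (assuming a list-of-lists 0/1 matrix, a caller-supplied allowability predicate, and explicit unstamping on backtracking, none of which the paper specifies), the search might be coded in Python as:

```python
# Two columns are adjacent in G(STS) when they share a nonzero row.
def adjacent(S, a, b):
    """Columns a and b are adjacent if some row has entries in both."""
    return any(row[a] and row[b] for row in S)

def degree(S, j, stamps):
    """Connectivity of column j among the still-unstamped columns."""
    return sum(1 for k in range(len(S[0]))
               if k != j and not stamps[k] and adjacent(S, j, k))

def flcn(S, n, stamps, path, is_allowable):
    """Extend `path` (chosen columns) by n more nodes. Returns the
    (rows, columns) of a removable subset, or None on failure."""
    ncols = len(S[0])
    if n == 0:
        # Rows that become empty once the chosen columns are deleted.
        empty = [i for i, row in enumerate(S)
                 if all(not row[j] or j in path for j in range(ncols))]
        if len(empty) == len(path) and is_allowable(empty, path):
            return empty, list(path)
        return None
    node = path[-1]
    # Unstamped neighbours, least-connected first, lowest label on ties.
    cands = sorted((j for j in range(ncols)
                    if not stamps[j] and adjacent(S, node, j)),
                   key=lambda j: (degree(S, j, stamps), j))
    for j in cands:
        stamps[j] = True
        found = flcn(S, n - 1, stamps, path + [j], is_allowable)
        if found:
            return found
        stamps[j] = False   # backtrack
    return None

def find_subset(S, n, is_allowable=lambda rows, cols: True):
    """Try every column of S as the starting node of the search."""
    for start in range(len(S[0])):
        stamps = [False] * len(S[0])
        stamps[start] = True
        found = flcn(S, n - 1, stamps, [start], is_allowable)
        if found:
            return found
    return None

# Columns 0 and 1 form a 2 x 2 removable subset with rows 0 and 1.
S = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]
print(find_subset(S, 2))   # ([0, 1], [0, 1])
```

The predicate `is_allowable` stands in for the allowability tests mentioned in the Conclusions; a real implementation would encode the general (nonstructural) constraints there.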

FLCN(S, n, node, stamps, success, R)
If n = 0 then
    If there are n empty rows after having deleted the n columns associated with the n nodes in the path then
        (an n × n removable subset made up of the n empty rows and the n columns in the path has been found)
        If IsAllowable(subset) then
            Assign the n empty rows to the n × n subset
            Add this subset to the first free rows and columns in the reordered matrix R
            success = true
        else
            success = false
        end if
    else
        success = false
    end if
else
    finish = false
    While (not finish) and (not success) do
        [*] Choose an unstamped node adjacent to node. If there are several adjacent nodes, pick out the least-connected one. If there are several least-connected nodes, select the one with the lowest label.
        If there is no such node then
            finish = true
            success = false
        else
            node = node chosen in [*]
            stamps(node) = true
            FLCN(S, n - 1, node, stamps, success, R)
            If success then
                Define the column associated with node as the nth column of the n × n set
            end if
        end if
    end while
end if

Literature Cited

(1) Romagnoli, J. A.; Sánchez, M. C. Data Processing and Reconciliation for Chemical Process Operations; Academic Press: New York, 1999; to be published.

(2) Vaclavek, V. Studies on System Engineering. III: Optimal Choice of the Balance Measurements in Complicated Chemical Systems. Chem. Eng. Sci. 1969, 24, 947-955.
(3) Vaclavek, V.; Loucka, M. Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plants. Chem. Eng. Sci. 1976, 31, 1199-1205.
(4) Kretsovalis, A.; Mah, R. S. H. Observability and Redundancy Classification in Multicomponent Process Networks. AIChE J. 1987, 33, 70-82.
(5) Meyer, M.; Koehret, B.; Enjalbert, M. Data Reconciliation on Multicomponent Network Process. Comput. Chem. Eng. 1993, 17, 807-817.
(6) Kretsovalis, A.; Mah, R. S. H. Observability and Redundancy Classification in Generalized Process Networks - I. Theorems. Comput. Chem. Eng. 1988, 12, 671-687.
(7) Kretsovalis, A.; Mah, R. S. H. Observability and Redundancy Classification in Generalized Process Networks - II. Algorithms. Comput. Chem. Eng. 1988, 12, 689-703.
(8) Crowe, C. M.; García Campos, Y. A.; Hrymak, A. Reconciliation of Process Flow Rates by Matrix Projection. Part I: Linear Case. AIChE J. 1983, 29, 881-888.
(9) Madron, F. Process Plant Performance. Measurement and Data Processing for Optimization and Retrofits; Ellis Horwood Ltd.: Chichester, England, 1992.
(10) Romagnoli, J. A.; Stephanopoulos, G. On the Rectification of Measurement Errors for Complex Chemical Plants. Chem. Eng. Sci. 1980, 35, 1067-1081.
(11) Joris, P.; Kalitventzeff, B. Process Measurement Analysis and Validation. Proceedings XVIII Congress on The Use of Computers in Chemical Engineering, CEF'87, Giardini Naxos, Italy, 26-30 April 1987; pp 41-46.
(12) Crowe, C. M. Reconciliation of Process Flow Rates by Matrix Projection. Part II: The Nonlinear Case. AIChE J. 1986, 32, 616-623.
(13) Crowe, C. M. Observability and Redundancy of Process Data for Steady State Reconciliation. Chem. Eng. Sci. 1989, 44, 2909-2917.
(14) Stadtherr, M. A.; Gifford, W. A.; Scriven, L. E. Efficient Solution of Sparse Sets of Design Equations. Chem. Eng. Sci. 1974, 29, 1025-1034.
(15) Sánchez, M. C.; Bandoni, A. J.; Romagnoli, J. A. PLADAT: A Package for Process Variable Classification and Plant Data Reconciliation. Comput. Chem. Eng. 1992, S499-S506.
(16) Pissanetzky, S. Sparse Matrix Technology; Academic Press: London, 1984.
(17) Vazquez, G. E.; Ponzoni, I.; Sánchez, M. C.; Brignole, N. B. ModGen: a Model Generator for Instrumentation Analysis. Industrial Application using New Observability Techniques. AIChE Annual Meeting, Miami, Nov 1998; Paper 228ac.

Received for review November 30, 1998
Revised manuscript received April 14, 1999
Accepted April 20, 1999

IE980747M