identification of nonlinear systems - ACS Publications

G. H. HARRIS

L. LAPIDUS

THE IDENTIFICATION OF The theories of Wiener and Bose are used t o c o n s t r u c t a general method f o r identifying a nonlinear Dhysical process f r o m t s input-output records.

4n example is given in which 3

continuous stirred-tank

r e a c t o r system with two-level inputs is characterized and simulated using a digital computer

ne of the more important problems which confronts 0 the chemical engineer is the identification of the dynamic characteristics of nonlinear systems. In its most general interpretation, knowledge of the process dynamics implies cognizance of the functional relations between the inputs and outputs of a system. Such an identification, in which the input-output relations are determined, has relevance to such applications as control system design, optimization, and many others. Whatever the precise reason, however, for wanting to identify the process, what is ultimately desired is a mathematical representation, or model, of the system. Having obtained by some means such a mathematical model of the actual physical process, the investigator can then simulate its response to any input. Because the presence of the physical system is no longer required, the investigator is free to conduct endless “experiments” upon the model by varying the input conditions at will. Often this is preferable to working with the real process itself. The problem of identifying nonlinear systems may be formulated from two broad viewpoints, the first of which is incorporated in the second. In the first case, the mathematical structure of the model is given but ita parameten are not, and in the second case, no a ptimi information whatsoever is specified regarding the struc66

INDUSTRIAL A N D ENGINEERING CHEMISTRY

ture of the model. In the present paper, attention will be focused on the second of these viewpoints subject to the fundamental premise that an experimental basis is mandatory if initially the process characteristics are truly unknown. Although systems of chemical engineering interest are almost universally nonlinear in nature, there have been few previously reported attempts to characterize the dynamics of these processes through the application of a general theory of nonlinear systems. Attempts have been made to characterize nonlinear system using linear techniques (7, 27, Z6), but the results have been discouraging. This reluctance to try to identify nonlinear processes is readily understood because a general theory of nonlinear systems (Wiener’s theory) was not available prior to about 1950, and it has not been widely publicized since then. Furthermore, the original presentation of Wiener’s theory (70, 53) was particularly obscure to the nonmathematidan and left unanswwered many important questions regarding the implementation of the theory. The basic Wiener theory of nonlinear system characterization and synthesis is presented here in its entirety. Because this important work is little known outside the immediate areas of information and communication theory, chemical engineers are virtually unaware of it and subsequent theories of nonlinear systems. Wiener’s theory provides the necessary background for an understanding of the general nonlinear identification problem, and it forms the basis for an extension to a workable algorithm. This paper will outline the basic features of the nonlinear identification problem as characterized by Wiener’s theory and its subsequent development into a useful algorithm. Details of this latter point will be illustrated by a computer identification of a continuous stirred-tank reactor system. The development will pertain specifically to time-invariant nonlinear systems or to those systems whose process parameters vary slowly over experimental observation intervals.

GENERAL IDENTIFICATION PROBLEM Relevance to Engineering

The problem of identifying unknown or incompletely specified processes cannot be separated from research in the general areas of prediction, filtering, and estimation theory, which are encompassed in the disciplines of information and communication engineering (5, 75, 23-25, 28, 29, 58). These fields are unfamiliar to the chemical engineer and, because of their inherent proba-

NONLINEAR SYSTEMS bilistic character, applications to chemical engineering might seem remote. This, however, is not the case. The identification problem, that of determining the dynamic characteristics of an unknown process by experimental means, is usually first encountered in the domain of automatic control theory. It is of primary concern in the general field of adaptive control systems (76, 30, 33, 47, 43-45, 49, 57). Owing to the fundamental nature of adaptive control, process dynamics must be measured frequently during normal operation of the system. Once the process dynamics are known, the controller characteristics can be automatically adjusted to satisfy some prescribed performance criterion. If process characteristics vary with time, identification will have to be conducted many times. Until conventional or feedback control systems are designed, it is also essential that one has an accurate knowledge of the relevant process parameters. However, in this case there are no limitations imposed on the test inputs used to probe the system. MnHion of Ihm Problem Before the identification problem is stated in more explicit terms, it will be useful to discuss what is meant by an unknown process-Le., a process whose dynamic characteristics are unknown. Surber (50) p i n t s out that there are three possible interpretations:

-The system structural configuration is known but all or some of the model parameters are-unknown. This is the simplest (and a frequently occurring) situation. For example, the system is known to be represented accurately by an nth order (where n is known), ordinary, nonlinear, time-invariant differential equation but the (n 1) constant coefficients are unknown. -Neither the system configuration nor all the model parameters are completely known. One has, therefore, some idea of the physical processes occurring within the system. -There is no available information about the process dynamics that can be used to formulate a mathematical model of the system. The system can be considered a “black box” about which nothing is known except the number of inputs and outputs.

+

Clearly, the last case is rarely, if ever, encountered in practice; however, it does provide the most completely general background for the development of a theory of system identification. Any theory advanced from the black box viewpoint, while being completely universal in scope, will suffer from its lack of concern for the physical processes involved. The system is

viewed as an input-output device with little regard for its internal physical structure. Hence, it becomes d&cult to incorporate within the framework of the theory any apriori knowledge about the system. Eykhoff (77) distinguishes two classes of problems, which depend on the initial and the desired knowledge of the process: (a) “Identification-the

determination of a toplOgy

of the process, considering it as a celebrated black box, and (b) Parameter estimation-the determination of the parameter values of the process, assuming the topology to be known.” Solution of problem a, of course, must always be followed by solotion of problem b. EykhoPs classification is essentially the same as Surber’s. I n (77) only problem b is considered in detail, and mainly for linear systems. However, the paper does present a coherent picture of process-parameter estimation and introduces a philosophy of a generalized model, which can be extended to nonlinear systems. Zadeh (57) has formulated the problem of identifying a black box as follows: “Given (1) a black box, B, whose input-output relationship is not known a frimi; (2) the input space (domain) of B-Le., the class of all time functions on which operation with B is defined; (3) a class of black boxes, R, which on the basis of a Priori information about B is known to contain B. Determine, by observing the response of B to various inputs a member of r which is equivalent to B in the sense that its responses to all time functions in the input space of B are identical with those of B.” Zadeh’s formulation, as he indicates, is somewhat presumptuous in the mere knowledge of the class u containing B. In order that the objective in Zadeh’s last statement can be attained, the class R must necessarily be very narrow. A more satisfying expression of the problem would be to retain Zadeh’s first two conditions and add (50): Determine, by observing the responsc of B to various inputs, “a set of parameters which are sufficient to enable the computation of a satisfactory approximation to the present and future values of the output variable from: -The present and future values of the input signal (or signals, some of which might be random disturbances), and -The initial state of the process, or the past history of the input signal which determines its initial state.” V O L 5 9 NO. b J U N E 1 9 6 7

67

At the present time a completely general theory along the foregoing lines seems only remotely possible; however, the objectives have been clearly stated. Classes of Systems

Two major classifications within the identification problem concern whether the system is time invariant or time varying and whether the system is linear or nonlinear. The first point really concerns whether the system can be considered to be time invariant during the interval of measurement or identification period. The time required to complete the identification of a given unknown system is an important aspect of any identification procedure and is strongly influenced by the extent to which the system is time varying. The identification of time-invariant systems is enhanced by long operating records, not only as a result of prolonged model coefficient averaging, but also because of the effect of input-output noise smoothing. Such is not the case, however, for time-varying systems. Frequently, for such systems, the identification model is time varying in the sense that it consists of a time sequence of time-invariant models, each model persisting until the next system identification is completed. Because a sequence of time-invariant models is often used to approximate a time-varying system, it is desirable to base the system identifications on observation periods as short as possible, consistent with the decreased effect of noise smoothing. As Kerr and Surber (32) have observed, “for a given rate of (process) parameter variation, a given noise level, and a given type of control signal variation, there should exist an optimum record length if an approximate time-invariant model is used to represent the system for a definite time into the futurei.e., until the results of the next computation of parameters are available.” For a time-invariant system there will be no optimum identification time because there is no conflict of requirements. However, there will exist a minimum identification time to yield values of the model coefficients which meet specified reliability criteria. Kerr and Surber refer to this as the “sufficient test signal problem.” Developments in Theory of Nonlinear Systems

The theory of linear system identification is well established, for the principle of superposition is applicable (23-25). A linear process ir characterized by its response to any known aperiodic signal, such as a step function. Thus, oncc this response is known, in theory the system’s response to any other input can be determined. I n the identification of nonlinear systems, there is no principle comparable to that of superposition upon which to develop a general theory. Aside from linearization techniques, there are several approximate methods for identifying processes within certain restricted classcs of nonlinear systems. It will be assumed henceforth that the system is time invariant unless specified to the 68

INDUSTRIAL AND ENGINEERING C H E M I S T R Y

contrary. If the system structural configuration is known a priori-e.g., the form of the governing nonlinear differential equation is known but certain model parameters are unknown-the identification may not be more difficult than for a linear system (4, 77, 37). This is, of course, the simplest case. Other limited but often practical techniques can be found in many references (13, 76, 25, 34, 36, 44). I n recent years many investigators have been engaged in developing a general theory of nonlinear system characterization and synthesis. Ft’iener is usually credited with first applying the concept of functional representation to the study of nonlinear systems. I n 1942, he used a series of functionals to describe the output of a nonlinear electrical circuit with a white gaussian input (54). The concept of expanding the input-output relationship of a nonlinear system in a functional power series is originally expounded by Volterra (52) in the latter part of the nineteenth century. A functional is a function whose argument is a function and whose value is a number. That is to say, a functional is a function of a function. FVhereas a function associates a value, J ( x ) , with each value of the independent variable, x , within some domain of the independent variable, a functional associates a valuc, F [ x ( t ) ] ,to every function, x ( t ) , which is in some domain of definition in function space. A functional can be considered as the mathematical equivalent of a system wherein the output a t a particular time, t , depends on the previous input. The functional relationship between the input, x ( t ) , and the output, y ( t ) , can be expressed as

Equation 1 states that the value of the output at a particular time t , y ( t ) , is dependent only upon that part of the input signal occurring prior to time, t. A familiar example of a functional is the impulse response formulation of a linear, time-invariant system :

where h ( t ) is the impulse response function. Thus, the present output of the system, whose value is a number, is a functional of the input. The continuous functional power series of Volterra, which represents a general nonlinear, time-invariant system, is y ( t ) = ho

+f

$2 h l ( 7 M

-

T1)dTl

+

f f h 2 ( 7 1 , 7 2 ) X ( t - r l ) x ( t - 7a)dTldTz + f f f~h 3 ( 7 1 , T Z , T 8 ) X ( t - 7 1 ) X ( t - 7 2 ) X ( t Q

Q

Q

73)dTldT2d87 f

...

(3)

where ,rl is the domain of integration on which x ( t ) is defined. The second term in the series is the ordinary convolution integral of linear system analysis. An excellent presentation of the use of functionals in the analysis of nonlinear systems is given by Barrett (6).

Wiener was the first to make use of functional representation for the analysis of nonlinear circuits. He introduced the concept of functionals that are orthogonal with respect to Brownian motion (white gaussian noise) to characterize nonlinear systems. Under Wiener’s guidance his ideas have been extended by several writers, notably Singleton (48), Bose (8-70), Brilliant (72), and George (20). The works of Zadeh (59) and Barrett (6) have also been influenced by that of Wiener. T o a large extent, these investigators have been concerned with applications to communication theory and stochastic processes in general. The work of Bose is particularly interesting and important. By introducing the concept of gate functions, he overcame some of the difficulties of Wiener’s method by partitioning the function space of the past of the input into nonoverlapping cells. This permitted an orthogonal expansion of the system output as a function of the input regardless of the nature of the input process. It is Bose’s extension of Wiener’s theory that leads to a practical solution of the characterization of nonlinear systems with two-level inputs (22, 46, 47). McFee (42) has extended the approach of Singleton (48) and Zadeh (59) to the characterization of nonlinear systems by a series of impulses. The technique is basically an analytical procedure. Flake (78), also employing a functional series approach, has developed a different analytical technique for solving nonlinear differential equations. The technique permits the general solution of a differential equation with nonzero initial corditions. More recently, Flake (19) has extended his method to time-varying systems. Katzenelson and Gould (31) have presented a systematic analytical approach to the synthesis of continuous nonlinear systems and have attempted to apply their theoretical results to various problems in control, communication, and characterization of nonlinear systems. This work, like that of their predecessors, begins from the concept of functional representation of a system. However, unlike the vast majority of previous investigators, they attempt experimentally to apply the theory to a real, physical problem: the characterization of the servomechanism associated with the pupil of the human eye. This ambitious project does not meet with complete success, but interesting results are, nevertheless, obtained. The application of their method to real systems, while difficult and complex, seems feasible. Lubbock (39) has formulated a statistical theory for the identification of a restricted class of nonlinear systems called multipath memory systems. This class includes those nonlinear systems that may be reduced to linear, parallel-path systems. The identification procedure is simple when applied to zero-memory systems; however, characterization of multipath systems with memory involves the solution of a set of simultaneous integral equations and requires some knowledge of the input statistics. Lubbock and Barker (40) have refined this approach by employing orthogonal functions of input functionals, thereby obviating the necessity of measuring certain statistical properties of the system

input and output and avoiding often intractable mathematical optimization problems. They obtained excellent results for the characterization of a heat exchanger with essentially linear process dynamics. Mention should also be made of the work of Wolf and others to apply white-noise probing to the identification of nonlinear, physical systems. K u (35) provides a general introduction to the area of nonlinear systems with random inputs. He advocates combining the WienerBose approach and the Ku-Wolf approach to study the problem of nonlinear network analysis and synthesis. Wolf (55) discusses the proposed method in some detail, and he and Dietz (56) present a general theory of whitenoise probing of nonlinear systems. I n this work it is assumed that the structural configuration of the system is known a priori but the model parameters are not. Wolf and Dietz imply that the use of their method under certain conditions may lead to the determination of the system configuration as well. Although some specific techniques exist for characterizing certain nonlinear systems, a general theory of identification would be of tremendous value. I t appears that the framework for such a theory has been developed, if not indeed the theory itself. Unfortunately, there are essentially no experimental results to report on the identification of real systems. I t will now be seen how Wiener’s theory naturally leads to the discovery of a n important feature of nonlinear systems.

WIENER’S T H E O R Y FOR CHA RA C T ER I 2A T ION AN D SY NTH ES I S OF NONLINEAR SYSTEMS Description of the Method

Wiener’s theory (70, 53) provides a systematic, analytical approach to the characterization and synthesis of nonlinear systems. The theory is the basis for the experimental analysis of a broad class of nonlinear systems, being applicable to nearly any physically realizable, time-variant, nonlinear system. Included in this class are linear systems and zero-memory (variously called time-independent, no-storage, or noninertial) systems. Unlike other techniques which are limited to a narrow class of nonlinear systems-e.g., the describing-function approach (44)-Wiener’s method is general in scope, treating systems from the black box viewpoint. I n its simplest conceptual form, Wiener’s method consists of exposing an unknown nonlinear system to a random noise input and then measuring certain averages of products of its output with nonlinear functions of its input. These averages, the characterizing coefficients of the nonlinear system, can be used to synthesize a model of the nonlinear system under test. Basic Formulation

Wiener’s theory is limited to stable systems with finite memories. These systems possess the necessary property that the present output is influenced to an arbitrarily VOL 5 9

NO. 6 J U N E 1 9 6 7

69

small extent by that portion of the past of the input beyond some arbitrarily large but finite time. This means, generally, that in such systems the influence of the past input on the present output attenuates with time. I n addition, attention is restricted to nonlinear systems operating on continuous, real-valued functions of time that are integrable square on the semi-infinite interval ( ,O), thereby generating continuous, real time functions as outputs. All physical systems with finite physical inputs, both in amplitude and duration, are included in this classification. Wiener’s major contribution to nonlinear system analysis was the use of random noise as an input probe. He postulated that the response to white gaussian noise (a gaussian amplitude distribution and a flat or white power density spectrum) can serve to identify nonlinear systems. With white gaussian noise (referred to by Wiener as Brownian motion or shot noise) as the test signal, there is a finite, nonzero probability that any given time function will be approximated arbitrarily closely over any finite time interval by some sample of this noise. Because white noise possesses a flat frequency spectrum over all frequencies, the system is effectively being exposed to every physical input. Two systems which respond identically to white gaussian noise are said to be equivalent and will have the same response to any given input. Fortunately, white gaussian noise, while not physically realizable, can be approximated by the output of a shot-noise generator (14). The shot effect, output of a shot-noise generator is, then, the actual physical input postulated as the universal probe for nonlinear systems. For a general nonlinear system with memory, the present output is a function of the past and present values of its input. The system may be considered to perform a mapping of the function space of the past input onto a line that corresponds to the amplitude of the present output; this is indicated as

The nonlinear transformation F maps the past of the input, x i t ) , for t in ( - a,O), into the present output y(0). Here the present time is denoted t = 0, and the more remote past becomes increasingly negative. This transformation is an expression of the functional relationship between the input and the output, and is alternatively given by Equation 1,

y ( t ) = Flt;x(7), 7

5

tl

(1)

where now t is the present time. F is a functional because its present value (a number) depends on the past of the input. The mapping F can be performed in two successive operations. First, the past and present of the input are characterized by a set of coefficients { e n ) , [x(t),

t

I 01 -5 b n )

where the i f , ) are an ordered set of real numbers obtained from mapping L. Second, the {e,) are trans70

I N D U S T R I A L A N D E N G I N E E R I N G CHEMISTRY

formed into the present value of the output by mapping N : {cn)

ZY(O)

Characterizing the Past of a Time Function

The set of coefficients { e n ) characterizes the past of the time function x ( t ) if x ( t ) can be reconstructed from the If the time function x ( t ) is continuous, realvalued, and integrable square over the interval (a, b)-i.e.,

IC,).

then it can be expanded in a complete set of orthonormal functions { p, (t>] : n=O

The coefficients of the functional expansion determine almost everywhere and form a spectrum by which the analyzed function can be identified. The coefficients { c n ] for certain series expansions can be realized by rather simple analog circuits known as ladder networks. When the input x ( t ) is applied to the input terminal of the ladder circuit, the output from each of the various sections represents at each instant of time the corresponding orthogonal expansion coefficient. Wiener selected a Laguerre function expansion of the past of the input because it satisfies certain mathematical requirements and simplifies the engineering problem of generating the characterizing coefficients from analog instruments. The expansion of the past of the input in Laguerre functions, { p,(t) ) , is x(t)

m

x(-t)

=

C cnan(t); n=O

t 2 0

(4)

where the sense of time has been reversed, the present time being t = 0. The { c n ) are the desired Laguerre coefficients characterizing the past of the input. The set of Laguerre functions, { q , ( t ) ) , is complete in the interval, ( 0 ,),~ and constitutes an orthonormal system. The nth Laguerre function is defined by

fort>O,n=0,1,2,.,.; a>O =

0

fortO

(5)

where a is the time-scale factor. To simplify matters, a will be taken as unity for the present. Because the Laguerre functions form an orthonormal system : pm(t)pn(t)dt = 0

for m f n

= 1

form=n

Applying this to Equation 4, one obtains cn

sow

x(-t)pn(t)dt

(6)

I

I

As previously indicated, these coefficients are readily generated as the outputs of a simple ladder circuit known as a Laguerre network. When x(t) is applied to the input terminals of the Laguerre network (Figure I), the Laplace transform of the signal at the output of the (k 1)th section is

+

where L is the Laplace transform operator. The Laplace transform of the kth Laguerre function is

which is also the transmittance between the input and the first (k 1) sections in tandem. Inversion of Equation 8 with the aid of the convolution theorem

+

gives

- T)dr

r.(t) = j”:(r)va(:

Suhtituting t‘ = t r.(t) =

-

(10)

changes Equation 10 to

T

l

x(f

- t’)q&’)dt‘

(11)

If the instant of measurement of r. is taken as f = 0, then r,(o) =

l

x(-t’)+oo,(t‘)dt’

(12)

This expression is identical with the definition of the

kth Laguerre coefficient in the expansion of the past

of&)

given in Equation 7 ; thedore,

r,(O)

= c.

(13)

+

I n general, the output of the (k 1)th section of the Laguerre network at any time t is equal to the kth Laguerre d c i e n t of the past of the input x ( t ) up to time t. Huggins (27)has pointed out an important phenomenon which has not been mentioned thus far and is

inherent in all methods of signal analysis. The application of x ( t ) to the input of the Laguerre network, Figure 1, or to any orthogonal network, gives the spectrum of x ( - t ) , the minor image of x(f) in the t = 0 line. This reversal has actually been shown in the preceding development (Equations 10 through 12). I n many applications thii result would not be desirable, the spectrum of x(t) being sought. A simple and practical solution to this difficultyis to record x ( f ) in some memory device-e.g., magnetic tape-and then to reverse the sense of time by reversing the direction of playback of the stored signal. Fortunately, these difficulties are obviated in the present situation because the spectrum of x ( - t ) is as useful as that of x ( f ) . The application of x(t) to the input terminals of the Laguem network directly yields the spectrum of x ( - t ) , the past of the system input. Although an infinite number of Laguem coefficients are required to completely characterize the past of the system input, in practice only a finite number can be evaluated. Because the Laguerre functions decay in almost exponential fashion with inmasing argument, any finite s u m of these functions evaluated at a suitably large value of the argument will be negligibly small. This erroneously implies that in the finite Laguerre function expansion of the past of the input there exists some finite time in the past beyond which the input all but vanishes. Thus, Wiener’s method is restricted to systems with arbitrarily large but finite memories. Unstable systems and limit-cyde oscillatora are not in this classification because once these systems are excited, the output will exist for all future time independent of the input. The Laguerre coefficients of the past of any integrable square input are identically equal to thc outputs of the Laguerre network. Each coefficient, c.(t), for a white gaussian input is a sample function of a.continuousparameter, stationary random process and has the following three statistical properties: -Each c, has a gaussian distribution. -The coefficients (c.) , evaluated at the same time, are statistically independent random variables. -The coefficients (cJ all have the same variance. The validity of these assertions has been demonstrated (74 22). It has also been shown that iffor convenience V O L 5 9 NO. 6 JUNE 1 9 6 7

71

The parameters characterizing t h e nonlinear system can be evaluated experimentally by subjecting i t t o stationary white gaussian noise

the coefficients are assigned a unit variance, then the (c.) are standardized random variables-i.e., they each have a mean value of zero and a variance of unity. Because the Laguerre coefficients for a white gaussian input are statistically independent, standardized, gaussian random variables, they have the following joint probability distribution: p(c0,

c1,

. . ., CJ

The multivariable Hermite function expansion of the Laguerre coefficients of the past of the time function x ( t ) is.

= (2*)--n/Z exp

exp

[ - i-

(coz

+ clz + . . . + }]c):

(19)

+

=

n

i-0

( 2 ~ ) - ' /exp(-c:/2) ~

(14)

for n coefficients evaluated at the same time. This result will be of great value in the later development of the theory.

The past and present of a time function having been characterized by a set of Laguerre coefficients, these coefficients are now to be transformed by a suitable mapping into the present value of the system output. A functional expansion for the system output y ( t ) ,

-

. . ., cd

-90

I(-.-

(15)

is required, where the { c.) are the Laguerre coefficients of the past of the input x ( t ) at time 1. The (cn) are real numben in the interval (- m,+ m). An expansion F using Hermite functions is chosen because they are orthogonal with respect to a white gaussian random process and form a complete orthonormal set over the interval (--,+-). The nth normalized Hermite polynomial is defined as

d"

= ( - ~ ) ~ ~ - R + I ~ z ( ~ ~ ~ ! ~ / z ) - I / z ~ ' ~' e-~'s' dZ*

n = O , 1 , 2 ,..., b # O

(16)

and b is a scale factor, chosen for the present to be unity. The corresponding H e p i t e functions are ~*(z= )

C-QC'/Z

(17)

V"(2)

-

The Hermite functions constitute an orthonormal system over the infinite interval ( - m , ) ; hence

+

J---~ ( z ) ~ ( z ) d=z o = 72

for m

z

n

1 form = n

Va) = ?,(CO)?,(CJ . . . ?&.) and

Output Functional Expanaion in Laeuura Coefficients

y ( t ) = l i F [ ( c , , ) ] = lim F(co, CI,

There are (n 1) summations indicated in Equation 19, one for each Laguerre coefficient. This is the desired general input-output relation for a time-invariant, nonlinear system. This equation can be simplified in appearance by the substitutions

(18)

INDUSTRIAL AND ENGiNEERiNG CHEMiSTRY

Am =

'%,j,...,k

to yield

The index of summation a extends over all permutations of the indices i, j , . , . ., k-Le., over all permutations of the integen 0, 1, 2, . . . to infinity. The infinite set of Hermite expansion coefficients (A,) characterizes, or serves to identify, the unknown nonlinear system because the system response to any physical input can be evaluated through Equation 20 once the {A,] are known. A procedure whereby the parameters (A,) characterizing a given nonlinear system can be evaluated experimentally will now be developed. In the experimental method the unknown system is subjected to a stationary white gaussian signal to expose it effectively to all inputs. It will be shown that the particular functional expansion in Equations 19 and 20 is especially suited to this stochastic input. I n any practical application of Wiener's theory, only a finite, number of Laguerre coefficients can be generated and for each coefficient only a finite number of Hermite functions can be constructed. Suppose in Equation 19 the number of Laguerre coefficients is limited to (n 1) and each Hermite function index can take on only (j 1) values, ranging from 0 top. Equation 20 now becomes a finite sum,

+

+

where j ( t ) is the finite approximation to y(t). For any bounded, continuous input to the system-i.e., any physical input, Equation 21 yields a bounded continuous output j ( t ); hence, the aforementioned restriction to real physical systems becomes necessary. Because the Hermite functions form a complete orthonormal set on the interval (, 03 ), j ( t ) can be made to approximate y ( t ) within any desired degree of accuracy by increasing the number of Laguerre coefficients and the number of Hermite functions in the expansion. The index a ranges over the members of the set T , where r includes all permutations of the integers 0, 1, 2, . . . , p taken (n 1) at a time. There are, therefore, ( p 1)"+ Hermite coefficients to be evaluated in the finite Hermite function expansion. At this point there are several ways to proceed to obtain an explicit expression for the { A , ] suitable for experimental determination. Perhaps the most direct way is to multiply both sides of Equation 21 by V(P),take the time average, invoke the eigodic principle, and finally use the orthogonality of the Hermite functions to deduce Equation 28. It is more enlightening, however, to proceed by choosing the { A , ) so that the finite sum of Hermite functions best approximates the true system output y ( t ) with respect to some - error criterion. The weighted mean-square error, 82, in the time domain furnishes a convenient measure of the error between y ( t ) and j ( t ) :

+

+

-

+

J 2T

T

ld7

J O m

+

the ergodicity of the gaussian random processes is assured. By the ergodic theorem, the time average on the right-hand side of Equation 24 can be replaced by the corresponding ensemble average over the (n 1) dimensional space of the Laguerre coefficients; this operation yields

+

y

o

) = Jm

...

Jw -m

Jm - m

- m

[V ( P >r"I exp v=o

A , V ( C X ) ~ (~C1O.,,. ., C, a€r

>I

(-c:/2

dcodcl . . . dc,

x (25)

Using the distribution p ( c 0 , c1, . . ., c,) of Equation 14 in Equation 25 and interchanging the order of integration and summation, one obtains

m: J

V(a)V(P)

r"I

exp(-c,2)dcodcl

...

v=o

&] (26)

The right-hand side of Equation 26 can be separated into a product of integrals, each of which involves but one Laguerre coefficient :

n

e2 = lim T--,~

which satisfy the condition

-Tv=O n

exP(C,2/2)[Y(t) -

..

exp( -cO2)dco.

A , v ( ]2dt ~ ) (22)

II exp( -c,"/2) aEs

U=O

n

The weighting function fl exp(c,2/2) is chosen solely for Y

-0

the mathematical simplification it produces in the minimization process to follow. The weighted meansquare error is now minimized with respect to the coefficients { A , ) . For the particular coefficient A,,

exp(-cC,2/2)

aEs

1

A,V(a) dt

(23)

For the error to be a minimum, Equation 23 must be equal to zero, in which case

S_mm

vk(c,)Tx'(Gn)

exp(-cn2)dcn]

(27)

where i, j , . . ., k are the components of a and i', j ' , . . ., k' are the components of /3. By the orthogonality property of Hermite functions, the product of integrals in Equation 27 vanishes unless i = i', j = j ' , . . . , k = k'--i.e., unless a = &-in which case the product has the value unity. Equation 27 simplifies to

which is the desired expression for A , suitable for experimental evaluation. If this result is substituted into Equation 22, the minimum weighted mean-square error for a white gaussian input and a given number of Laguerre coefficients and Hermite functions is obtained, emin2 = y2(t)

fi exp(c,2/2) - ( 2 ~ ) - ~ /Aa2 ~ 20 v

=o

(29)

afr

~

Since the outputs of the Laguerre network are zeromean, stationary gaussian random processes with continuous autocorrelation functions :

emin2 can be reduced to any preassigned value by increasing the number of Laguerre coefficients and Hermite functions, for it tends in the limit to zero. Owing to the orthogonality of the model, the { A , ) values are determined independently and have the important property of finality-Le., improving the characterization by using (n 1) Hermite expansion coefficientswill not necessitate re-evaluating the previously determined n coefficients.

+

VOL. 5 9

NO. 6 J U N E 1 9 6 7

73

Figure 2. Chorclctniurtimr of a d i n c a r s y s h acamfiw

to WinMlJ'r nuthd

I

I unlrnam nonliir

ExpWimenbl M.lhod8

The parameters, (A,), which characterize, or identify, an unknown nonlinear system can be experimentally determined through Equation 28. The details are shown schematically in Figure 2. The characterizing coefficients having been evaluated, the nonlinear model is synthesized according to Equation 21. The block diagram for the synthesis procedure is shown in Figure 3. In the experimental procedure of Figure 2, the presence of the unknown nonlinear system is obviated, provided recordings are available of an ensemble member of the white gaussian input x(t) and the corresponding sptem output y o ) . The recordings of xrt) and y ( t ) are fed simultaneously into the Laguerre network and the temporal averager, respectively. The ensemble members of x(t) and y ( t ) having been previously recorded, the entire set of characterizing coefficients { A , ] can be determined consecutively. The serial evaluation of the coefficients from this datively small amount of data is possible because the original data [recordings of x ( t ) and y ( t ) ] can be reused as often as n e w . If the input-output data were not recorded, the serial determination of the {A,) would require that the system be operated for an excessive and perhaps unacceptable length of time. A consideration of the nature and complexity of the

computing machinery necessary to implement the characterization and synthesis techniques is of direct interest. Because all quantities with the exception of the {A,} in Figures 2 and 3 are continuously varying time functions, an analog-computer realization of the bloek diagrams seems imperative. In the following analysis it is assumed that n Laguerre coefficients {c,) andp Hermite functions { #,(cI)) per Laguerre coefficient are used in the approximation of the output of the nonlinear system. The Laguerre network, then, will consist basically of n integrators as shown in Figure 4. The outputs of the Laguerre network rk(t) satisfy the following set of linear differential equations: dt

drk dt

= x(t)

-

It-1

r,(t),

k = 1, 2,

. .., n - 1

.-0

ro(0) = Tl(0) =

. .. = T,l(O)

= 0

(31)

This set of differential equations is the basis of the depicted analog-computex representation. I t will be recalled that one reason for selecting the Laguem function expansion of the past of the input was the

J y S h Omodiing

lo Winwr'snahod

exponential tuc4tion generator


1 + -r* 2

= x(t)

with initial conditions

Figure 3. of a nanliwar

74

1

- + 2ro

dro

simplicity by which the Laguerre coefficients could be generated from analog equipment. The Hermite polynomial generator consists of np piecewise linear function generators to yield the np polynomials { q,(ct)} and p” multipliers to form the { V(m)} from the Hermite polynomials. p” integrators are required for the time-weighing process to provide the p“ characterizing coefficients { A a ) . This much equipment is required to perform the characterization of a given nonlinear system so that the { },c may be computed simultaneously in the least amount of time. If the time of identification is of no concern, a considerable reduction of equipment is possible. If each Laguerre coefficient is evaluated serially, then only n function generators, one multiplier, and one integrator are used. Synthesis, however, will necessitate, in addition to np

function generators and p” multipliers to form the { v(m)),an exponential function generator consisting of mafiyoperational units (there are several different ways to construct this generator), p” amplifiers for introducing the {A,}, one summer, and one multiplier. Because of the excessive demands on hardware, the characterization and synthesis of nonlinear systems entirely by analog means is beyond the scope of present day analog facilities. Unfortunately, purely digital methods are not encouraging either. Aside from problems arising from the discretization of a continuous process, there is still the sheer size of the set of characterizing coefficients {A,) to contend with. Although the digital computer can accommodate a significantly greater number of coefficients { A a ) than the analog machine can, for any system of practical interest the set {A,) may be’expected to exceed considerably the storage capacity of the largest current digital computer. Certain operations are, nevertheless, most effectively performed on a digital computer. All function generation can best be done on the digital computer, as can the entire synthesis procedure exclusive of generating the Laguerre coefficients. The integration involved in generating the Laguerre coefficients is most efficiently done on an analog computer, as is the time averaging. In fact, the rapid and accurate generation of the Laguerre coefficients is the principal deterrent to the use of a digital computer. Because certain aspects of the computation are best performed on different computers, the availability of a hybrid analog-digital machine might, aside for problems concerning the plethora of Laguerre coefficients, render the solution feasible. Jbservalionr and Comments

Although the application of the Wiener theory does lead to the unique determination of the transmission characteristics, or coefficients {A,], of a given system, the synthesis of this system is by no means unique. Many systems with different physical configurations can exhibit the same transmission characteristics. Earlier it was stated that if two systems respond identically to white gaussian noise, they will have the same response to any common input; hence, they are equivalent. The weighted mean-square error of Equation 22 provides a measure of the degree to which the synthesized system approximates the given system, the error tending to zero as the number of Laguerre coefficients and Hermite functions increases. The fact that this error vanishes for a particular set (infinite) of coefficients {A,) does not necessarily mean that the two systems have the same response to white gaussian noise. Therefore, although the two systems have the same transmission characteristics, they do not in general have the same physical configuration or structure. Practical application of Wiener’s method is handicapped by the vast number of characterizing coefficients to be evaluated and the sophisticated computer facility needed to analyze the nonlinear system. To quote Bose, “It is safe to say that, at present, the Wiener theory is of greater theoretical than practical interest.” VOL 5 9

NO. 6 JUNE 1 9 6 7

75

..

ifficult t o incorporate into Wiener’s method any a priori information concerning the system

There are two major theoretical contributions to nonlinear system analysis inherent in the Wiener theory. First, stationary white gaussian noise is the most general probe for time-invariant nonlinear systems; the response to this noise is sufficient to permit the complete characterization of the system. Second, any nonlinear system, of the class to which this theory is applicable, is equivalent to a linear system with multiple outputs cascaded with a nonlinear, zero-memory system. The Hermite polynomial generator and accompanying exponential generator perform a nonlinear transformation of the present outputs of the linear Laguerre network, which characterize the past of the input, to yield the model output. Unfortunately, the above equivalence property also applies to linear and nearly linear systems. These systems form an imwrtant subclass of time-invariant nonlinear systems. If the Wiener representation of Figure 3 is to effect a linear transformation on the input, the combined result of the Hermite and exponential function generators must be linear. This means that the highly nonlinear effect of the exponential function generator must be canceled by the sum of the Hermite polynomials. A large number of Hermite polynomials and their related coefficients { A m ] will generally be required to produce this cancellation. Therefore, the Wiener theory is not well suited to linear systems because a highly complex model must be synthesized to simulate a relatively simple linear system. This example points out the inherent weakness of most universal methods of analysis: their inability to recognize and treat accordingly simple, straightforward situations. For this reason it is difficult to incorporate into Wiener’s method any a priori information concerning the system. The knowledge that an unknown system belongs to the important class of nonlinear, zero-memory systems is, however, of direct value. Because the system output is a function of the instantaneous value of the input, there is no need to use the Laguerre network to characterize the past of the input. The Laguerre coefficients {on] may be replaced by the single variable x ( t ) , the input to the system, and Equation 19 becomes P %t) =

i-0

a o t d x ) exp(-$/2)

(32)

while Equation 28 now reads at = (2+@y(tht(x) ’6

(33)


Although the above formulation for ?(t) appears to be fairly simple, a large number of Hermite functions will in general be needed to simulate the system output. For example, suppose that the unknown, nonlinear, zeromemory system is in fact a square-law device with transfer characteristic Y(0 =

XV)

Because the general test pr& is stationary white gaussian noise, the coefficients { a t ) can be evaluated by replacing the time averagey(t)Vi(x) by an ensemble average because of the ergodicity of the input. Thus, ~

ai

=

(2r)1/2

lo..

x*qi(x)p(x)dx

The probability distributionp(x) is gaussian; hence at =

J1-9

exp(-3/2)qt(x)dx

Performing the integration, one obtains

=o

for i odd

The square-law device can now be synthesized according to Equation 32. The Hermite function expansion ofj(t), however, cannot be expected to converge rapidly, prompting Zadeh (57) to remark that even a simple operation such as “squaring the input has a complex representation in terms of Laguerre-Herniite expansions.” There remain three fundamental decisions to be made before the Wiener theory can be implemented (not necessarily in this order) : the choice of scale factors in the Laguerre and Hermite functions, the choice of the number of Laguerre coefficients and Hermite functions, and the choice of the functions to characterize the past of the system input (they need not be Laguerre functions). The time-scale factor of the Laguerre functions and the scale factor in the argument of the Hermite functions have been taken as unity in the preceding development to simplify the presentation. By a judicious choice of these scale factors, it should be possible to realize a more efficient characterization and synthesis (in the sense of least mean-square error with a prescribed number of Hermite expansion coefficients) than would result from the use of unit scale factors. Unfortunately there is no

simple procedure for determining the optimum values of these scale factors. The optimum values will depend upon the number of Laguerre coefficients and Hermite functions taken to represent the system. For a given unknown system the optimum scale factors associated with n coefficients { A o l ]will not necessarily be the optimum scale factors when ( n 1) coefficients are used. The choice of the number of Laguerre functions to characterize the past of the system input and the number of Hermite functions associated with each Laguerre coefficient is at the discretion of the investigator; little help can be offered to make this decision a rational one. Of course, once this choice is made and the sets of coefficients { c n ] and { A , ) have been evaluated, increasing the number of members in each set to improve the accuracy of the identification will not necessitate reevaluation of the previously determined quantities. This property is a direct result of the final character of the coefficients. Examination of the Laguerre functions reveals that the high order functions weight the remote past more strongly than do the low-order functions. For equally large values of their arguments, the absolute value of the high-order Laguerre functions exceeds that of the low-order functions. Because a strong weighting of the remote past of the input is not desirable, small changes in the amplitude of high-order Laguerre coefficients should be assigned less significance than small changes in low-order coefficients. This can be accomplished by choosing fewer Hermite functions in the expansion for high-order Laguerre coefficients than for loworder ones. The last decision open to the investigator is the choice of the functions to characterize the past of the system input. Laguerre functions were originally selected because they form a complete orthonormal set on the interval (0,a), they decay with increasing argument, and the Laguerre spectrum of a given time function may be readily obtained from analog equipment. There are, however, many other sets of functions which satisfy these three requirements. For example, a set of orthogonalized exponentials can be constructed from the component functions {exp(-nnt)], n = 1, 2, . . ., which meet the above requirements. Huggins (27) has investigated the representation of signals using damped, orthogonalized exponentials, which include the Laguerre functions. H e presents a particularly attractive method for constructing the set of orthonormal functions on the semi-infinite interval ( 0 , a ) from an arbitrarily prescribed set of exponential components. I t is evident, then, that there are many complete, orthonormal sets of functions suitable for characterizing a signal. In any practical application only a relatively small number of characterizing coefficients ( c % ) can be handled. The requirement of completeness is, therefore, unnecessary, and the number of sets of suitable functions for characterizing the input is limitless. Of course, it is still desirable that the functions decay into the past and can be easily generated on physical instruments.

+

Bore’s Extension of Wiener’s Theory

Bose (8-70) was concerned with the application of Wiener’s theory to the design of optimum nonlinear filters, or systems. The filter problem is closely related to the identification problem in that an optimum mathematical model is to be synthesized from input-output data. Specifically, it is desired to determine that system, of a class of systems, which operates on the past of a given input time function to yield an output best approximating a given desired output with respect to some error criterion. Wiener’s method determines the optimum nonlinear system with respect to a weighted meansquare error criterion when the given input function is white gaussian noise. In general, however, the given filter input will not be white gaussian noise, and Wiener’s method is no longer applicable. Bose overcame this difficulty by introducing a series expansion (gate-function expansion) for the output of a nonlinear system in which the terms of the series are mutually orthogonal in time. This condition of orthogonality is valid regardless of the input time function. The orthogonal expansion is achieved by partitioning the input space into nonoverlapping cells and letting each term in the expansion represent the system output for a particular cell in the input space. At any instant, the past of the input resides in a particular cell in the input space, and only the term in the series assigned to that cell is nonzero. Thus, all the terms are mutually orthogonal in time regardless of the input function. I n the next section, the concept of partitioning the input space into nonoverlapping cells, thereby achieving an orthogonal expansion for the system output, is applied to the identification of nonlinear, continuous stirredtank reactor systems with two-level inputs. The results show the applicability of the method and should be of direct interest to chemical engineers.

NONLINEAR SYSTEMS W I T H TWO-LEVEL INPUTS Theory of Identification Model

As previously mentioned, the work of Bose provides the framework for the identification of nonlinear systems with two-level inputs. The necessity of restricting a general theory, such as that of Bose, to a narrower class of systems arises because of the extreme complexity inherent in any general approach. Nevertheless, systems having inputs which switch between two states are of considerable interest in their own right (2, 7, 38). Because the theory of the identification model has been presented in detail elsewhere (22, 46, 47), only a brief description will be given here for completeness. The model is restricted to the class of nonlinear, time-invariant processes with switched two-level inputs and finite memories. The input, x ( t ) , to the system is assumed to switch between two levels denoted by amplitudes of -1 and + l as shown in Figure 5. If the input to a real system switches instead between levels A and B, where A < B , VOL. 5 9

NO. 6 J U N E 1 9 6 7

77

F i p n 5. T w o - h l input

a linear transformation can be used to change these levels to -1 and +1, respectively. Because the present value of the output of a general nonlinear system is determined by the past history of its input, it is natural to consider first the characterization of this time function. This characterization is most easily accomplished by sampling the input at n uniform time intervals T and constructing the finite time sequence x(t)

i[x(O),

x(T), x(2 T), . . ., x((n 1x0,

where w(6) is a nonnegative weighting function. Substitution of Equation 35 into Equation 36 gives

- 1) T ) ] =

XI,

. . .,x.-11

(34)

where (n - 1) = T J T . T,, an estimate of the effective duration of the system's memory, is frequently c k n to be the system settling time, T,. The settling time is the time required for the system response to a test signal to come within a specified proximity of the system's final value. The two test signals most hquently used are the unit impulse and the unit step function. Equation 34 shows that as T + &Le., the sampling h q u e n c y approaches infinity and the number of samples n + --the representation of x(t) improves. The n sampled values of x ( t ) characterizing the input past are conveniently stored in a digital shift register tapped at n points. At any given time each tap can exist at only one level so that at each instant one and only one input state (set of n tap levels) can exist in the memory device. The input space consisting of all possible input configurations, of which there are 2", has thus been partitioned into nonoverlapping cells. If an expansion is made of the system output as a function of the individual tap levels, a model orthogonal in time will result. This means that each term in the expansion represents the system output for one particular input configuration-Le., there is one term for each input state. An orthogonal expansion is required so that each term can bc evaluated independently of all other terms. The orthogonal expansion is given in Equation 35, where j ( t ) is the model output.

3(0 =

c A, *(a) a

(35)

A, is a constant assigned to the 8th configuration of tap levels, and a is an index of summation which ranges over the 2" cells of the input space. The function @(8) takes on a value of either 1 or 0, depending upon whqther the &thconfiguration is present or not, respectively. At any instant in time, the past of the input resides in 78

a particular cell in the input configuration space, and only the coefficient assigned to that cell is nonzero. Suppax this is the 8th cell. Then *(p) = 1 and 9 = A,. The remaining (2" - 1) W s are zero. Because the model output is given by a finite sequence of numben, j ( t ) is a "staircase" approximation to the actual system output. As the number of tap in the input memory is increased (the input is sampled more frequently), the input characterkition becomes more accurate, and as a direct result the orthogonal expansion can be expected to approximate more closely the actual system output. The coefficients (A,] are determined by minimizing the weighted mean-square ermr between the actual system output, y(t), and the orthogonal model output, )(f). This error is given by


When this expression is minimized with respect to the 8th coefficient, the result is

This is the desired expression relating the orthogonal coefficients to certain experimentally measured time averages of the system's input and output. The minimum mean-square ermr of the identification process is realized when the coefficients (A,) are determined according to Equation 38 and can be shown to be given by the following expression:

Inpul Switching Constraint

The number of coefficients (A,) to be evaluated for a single-input, single-output system is 2". For a system having p inputs with n, tap on the ith input memory and q outputs, the number of coefficients is P

N = 2nq,

M =

8-1

n,

(40)

Because this number may readily reach astronomical proportions, henceforth only single-input systems with the following switching time constraint will be considered: 1 T , Z 2- T , (41) This states that the time between consecutive input switches, T,*, is not less than half the effective duration of the system's memory, Tm-i.e., no more than two switches can occur during a time interval T,. This is not an unreasonable restriction; in most chemical process control systems there exists a minimum practical

.

switching time. Note, however, that the former assumption of 2” distinct input states implies that the input can switch at least as fast as T J n . The number of allowable input configurations satisfying the minimum switching time constraint is readily determined for an input memory with uniformly spaced taps (42)

where the brackets denote truncation to the nearest integer less than or equal to the expression within the brackets. An input memory with a tapered tap arrangement (nonuniform distribution) can be expected to produce a smaller mean-square error term than one with uniform spacing. This statement is valid for systems in which the influence of the past input on the present output attenuates with time. An input memory whose distribution of taps generally diminishes with distance down the memory (in the direction of the more distant past) will be more efficient than one with uniform tap spacing. However, the optimum distributjon, in the sense of realizing the minimum mean-square error of n taps along the memory, is not readily determined. Wiener’s Theory and Present Identification Model

I t may seem to the reader that the foregoing development of the identification model bears little connection to the earlier Wiener-Bose theory. However, this is not the case. The present model is merely a first approximation to the completely general and basic theory of nonlinear systems which, owing to its inherent complexity, has limited practical utility at the present time. But, as pointed out earlier, the application of hybrid analog-digital computers to Wiener’s theory may actually lead to useful and attainable results. Wiener’s theory calls for the introduction of white gaussian noise into the nonlinear system to be identified regardless of future system operation. Only in this way can a truly accurate representation be obtained. By employing two-level input switching during identification, Wiener’s theory is partially short-circuited and a practical solution is obtained to the restricted problem. The next step in generalizing the model would be to permit three-level input switching, then four-level switching, and so on until the input eventually approached continuity. A point would be reached in this process where a sampled data representation of the

George H. Harris received his P h B . at Princeton University and is now with Arthur D. Little, Inc., Cambridge, Mass. Leon Lapidus is Professor of Chemical Engineering at Princeton. The authors acknowledge jnancial su@port from tfie Food Machinery Carp. and Lever Brothers Co. They also acknowledge assistance by the National Science Foundation under Grant NSF-GK-460. The computer facilities used are supported in part by the N S F under Grant NSF-GP-579. AUTHORS

input would be impractical, and one would turn to an orthogonal expansion (such as that of Laguerre) of the input past. The input now would be white gaussian noise as prescribed by Wiener, rather than merely randomly switched values. The orthogonal expansion for the system output could continue to be a gate-function type, as used in the present identification model, or, alternately, become a Hermite function expansion. At this stage, the simple model given above will have been gradually transformed into the completely general Wiener-Bose model. Example: Identification of a Nonlinear Reactor System

The foregoing theory has been applied to the identification and synthesis of nonlinear chemical reactor systems with two-level inputs (22). The process dynamics of these systems have been simulated on a digital computer. Following simulation, a system was identified via computer by evaluating the model coefficients { A , } . With these coefficients, a model of the system could be constructed and its output, T ( t ) , generated and then compared with the actual computed process output, y ( t ) . The performance of the identification procedure is controlled. by four decisions on the part of the investigator: -Number of taps along the input memory -Spacing of taps along the input memory -Estimate, T,, of the effective duration of the system’s memory -Identification time The first three factors, in essence, define the digital input memory. The fourth factor is dictated by the investigator’s freedom to manipulate the two-level input and the criterion of constancy of the model coefficients. Several input memories and two types of input switching were investigated. First, each systeni was subjected to a randomly switched two-level signal over which the investigator had no control. In the second case, an input was so generated that the identification time was reduced to a minimum. A continuous stirred-tank reactor (CSTR) system in which an irreversible, first-order, exothermic chemical reaction A + B takes place was simulated by computer and subsequently identified. The reactor is cooled by water flowing through a jacket which surrounds it. The process model has been the subject of considerable attention in the chemical engineering literature and is adequately described elsewhere ( 3 ) . The behavior of the CSTR is governed by the following dynamic material and energy balances : dA F - = - (A0 - A ) - k A dt V

VOL. 5 9

NO. 6

(43)

JUNE 1967

79

Figure 6. Tmpnnhlra simulation of time-umying C S T R systm

k = ko exp ( - E / R T )

k is given by the (45)

The dependent variables A(t) and T(t)are to be controlled through the on-off manipulation of the coolant flow rate, Fa. This choice of control\ariable produces a slowly time-varying process. For a particular set of physical parameten and input conditions (22), this CSTR system exhibits a number of different steady-state or equilibrium points. In particular, with the input level at x = -1 corresponding to no cooling (Fc = 0), there are three steady states; and with x = +1 comespnding to maximum cooling (F. = max), there are also three (different) steady states. The estimate of the system's memory used by the model, T,, was taken as 460 seconds. This is the time required for the temperature response of the CSTR when the cooling water is shut off to rise from its lower steady state to within 1% of its upper steady state. This is defined to be the system settling time. Other settling times can be based on a falling temperature response (300 seconds) or, alternatively, on concentration measurements (220 and 380 seconds). Figures 6 and 7 show a typical model simulation of the CSTR system. The two outputs of the model, temperature and concentration, are independent of one another and dependent only upon the past history of the input as stored in the model's memory. The model employed 80

'ylh

31 tapnad taps, T , = 460 rccondr, nonrmdom inpA

31 tofined taps, T, = 4fX sacondr, nmuanabm input

where the reaction rate constant Arrhenius e x p m i o n

Figure 7 . Concentration simulation of time-varying CSTR


in this simulation had an input memory with 31 t a p in a tapered arrangement. The placement of taps along the memory can be ascertained from the pattern of steps which forms the model output. At time t = 0 the system, very near its steady state, is subjected to a step change in the input h m x = +1 tox = -1, a n d a t t = 288.13 seconds the input reverts to x = +l. In the first 50 seconds, the model output steps 16 times, thereby achieving accurate temperature and concentration situations. The mean absolute error in the temperature simulation is 0.84% of the maximum range over which the temperature can vary, and the corresponding mean absolute error in the concentration simulation is 0.87%. The effect of varying the number of taps on the accuracy of model simulations is most interesting. By use of randomly and nonrandomly switched inputs to models with uniformly spaced taps, the mean absolute error in the temperature decreased from a value of 2' K. with 10 taps to about 0.35' K. with 51 taps; in the same sense the concentration mean absolute error decreased by a factor of five for the same increase in number of taps. The nonrandom input was switched to achieve in each case an identification in minimum time. Additional features of the identification model have been investigated k d will be briefly mentioned here. Various tap distributions have been tried and certain tapered spacings are superior to the uniform distribution. Several estimates of T , have been used and the existence of an optimum T , has been demonstrated; however,

.

there seems to be no straightforward method to determine this optimum value. A change in the estimate of T , will alter the accuracy of the temperature and concentration simulations in different, and perhaps opposite, ways. A satisfactory criterion of identification appears to be merely the occurrence of all allowable input configurations. Finally, a system may be identified in minimum time with a high degree of accuracy, provided a large number (more than 20) of taps is employed in the memory. NOMENCLATURE

A A0 A,

= concentration of A in reactor and effluent

a

b

= =

c

=

cc

= = =

c,

E -

=

-

e2

-

=

e,in2

=

F

=

FC

=

h(t) = (-AH) =

K k

= = = = =

concentration of A in reactor influent az,j, ...k, = olth orthogonal expansion coefficient of model output time-scale factor in Laguerre functions time-scale factor in Hermite functions heat capacity of reactor contents heat capacity of coolant nth Laguerre coefficient of the past of the input activation energy of reaction weighted mean-square error minimum weighted mean-square error general nonlinear functional; Hermite function expansion; reactor influent and effluent flow rate coolant flow rate functional expansion term heat of reaction ( > 0 if exothermic)

UP

pcco

reaction rate constant frequency factor in reaction rate expression k0 L[ ] Laplace transform operator N general nonlinear transformation; total number of allowable input configurations n = number of taps along input memory P = jcint probability distribution of the Laguerre coefficients; number cf system inputs c7 = number of system outputs R = gas constant Rz2(7) = time autocmrelatioii function of x ( t ) Tic = response of Laguerre network through ( k 1 ) sections S = Laplace transform variable T = time averaging variable; sampling interval ; temperature within reactor and of effluent TC = inlet coolant temperature = estimate of effective duration of system’s memory T, T, = system settling time = temperature of reactor influent To T,, = switching time interval t = time U = product cf heat transfer coefficient and heat transfer area V = reactor volume 1 ) normalized V(a) = V ~ ( C O ) ~ ~ ( C I. ) .. V ~ ( C , ) = product of ( n Hermite polynomials ~ ( t ) = non-negative weighting function x(t) = system input as function of time y(t) = system output as function of time $(t) = model output 2 = arbitrary Hermite function variable

+

+

Greek Letters =

expansion indices

= nth normalized Hermite polynomial = =

P PO

7

9

No0

= = = = =

summation index set of all permutations of the integers 0, 1, 2, . . ., p taken ( n 1) at a time density of reactor contents density of coolant time variable power spectral density orthogonal function in model output expansion of Equation 35

+

an

= nth Laguerre function

iLn

= =

62

nth Hermite function domain of integration

Superscript -- vinculum, denotes time average of quantity beneath it B I BL IOG RAPHY (1) Angus, R., Lapidus, L., A.I.Ch.E. J.9, 810 (1963). (2) Aris, R., T h e Ofitimnl Design o j Chemical Reactors, Academic Pres5, New York, 1961. (3) Aris, R., Amundson, N. R., Chem. Eng. Sci.7, 121 (1958). (4) Aris,R., Chien, H., Ray, W. H., AutornnticaZ, 41, 59 (1964); 3, 53 (1965). ( 5 ) Balakrishnan, A . V., I R E Trans. Inform. Theory 9, 237 (1963). (6) Barrett, J. F., J . Electron. Control 15, 567 (1963). (7) Blakemore, N., Aris, R., Chem. E n g . Sci. 17, 591 (1962). (8) Bose, A. G., 1956 I R E A‘ati. Conu. Record, Part 4, pp. 21-30. ( 9 ) Bose, A. G., I R E Trans. Circuit Theory 6 , Special Suppl., pp, 30-40 (1959). (10) Bose, A. G., “A Theory of Nonlinear Systems” Tech. Rept. 309, Research Laboratory of Electronics, Mass. Inst. Tech., Camdridge, Mass., 1956. (11) Box, G. E. P., Chanmuqam, J.. IND.ENC.CIIEM. FUNDAMENTALS 1 , 2 (1963). (12) Brilliant, M. B., “Theory of the Analysis of Sonlinear Systems,” Tech. Rept. 345, Research Laboratory of Elcctronics, Mass, Inst. Tcch., Cambridge, Mass., 1958. (13) Cosgriff, R . L., “Nonlinear Control Systems,” McGrnw-Hill, New York, 1958 (14) DavcnpFrt, W. B., Jr., Root, W. L., “An Introduction to the Theory of Random Signals and Noise,” McGraw-Hill, New York, 1958. (15) Davidson, E. J., ZEEE Trans. Auto. Control 11, 93 (1966). (16) Donalson, D. D., Kishi, F. H., “Modern Control Systems Theory,” C. T. Leondes, ed., chap. 6, McGraw-Hill, Kew York, 1966. (17) Eykhoff, P., I R E Trans. Auto. Control 8 , 347 (1963). (18) Flake, R. H., Z’mns. AIEEPart 11, 81, 330 (1962). (19) Flake, R . H., “Volterra Series Representation of Time-Varyinn hTonlinear Systems,” Preprint of Paper No. XIII-5, Fourth Joint Automatic Cantrol Confrrence, p . 375 (abstract), Minneapolis, Minn., 1963. (20) George, D. A., “Continuous Nonlinear Systems” Tech. Rept. 355, Research Laboratory of Electronics, Mass. Inst. Tech., Camhiidge, Mass., 1959. (21) Goodman, N. R., Katz, S., Kramer, R. H., Kuo, M . T., Technometrics 3, 245 (1961). G. H., “The Identification of Nonlinear Svstems TVirh Two-Level (22) Haw:, Inputs, Ph.D. dissertation, Princeton LJniversity, 1965.’ (23) Ho, B. L., Kalman, R . E., “Effective Construction of Linear State-Variable Modelsfrom Input-Output Data,” J.Control ( S I A M ) ,to be published. (24) Ho, Y. C., J . M a t h . Analysis and Afifilicotion 6, 152 (1963). (25) Ho, Y . C., Lee, R . C . K., Injorm. Control 8, 93 (1965). (26) Homan, C. J., Tierncy, J. W., Chem. E n g . Sci. 12, 153 (1960). (27) Huggins, W. H., I R E Trans. Citcuit Theory 3, 210 (1956). (28) Kalman, R. E., SIAM J . 13, 520 (1965). (29) Kalman, R. E., Bucy, R. S., Trans. A S M E (.I. Baric Eng.) 83D, 95 (1961). (30) Karush, W . , Dear, R. E., J . Math. Psychology 3, 1 9 (1966). (31) Katzenelson, J., Could, L. A., Injorrn. Control 5 , 108 (1962); 7, 117 (1964). (32) Kerr, R . B., Surber, W. H., Jr., I R E Trans. Auto. Control 6 , 173 (1961). (33) Kopp, R., Orford, R., AZAA J. 1, 2300 (1963). (34) Ku, Y. H., “Analybis and Control of Nonlinear Systems,” Ronald Press, New York, 1958. (35) Ku, Y. H., I R E Trans. Circuit Theor)) 7, 479 (1?60). (36). Ku.. Y. H.. J . Frankiin Inst. 271., 108 11961). . ~, (37) Lapidus, L., Peterson, T., A.I.Ch.E. J . 11, 891 (1965). (38) LaSalle, J. P., “The Time Optimal Control Problem I ’ Contributions to t h e Theory of Nonlinear Oscillotions, S . Lefschetz, ed., 5 , pp. 1-24,’Princeton University Press, Princeton, N. J., 1960. (39) Lubbock, J. K., “Automatic and Remote Control,” Proceedinqs I.F.A.C., Moscow, U.S.S.R., 1960), Z,pp.761-769, Butterworths, London, 1961. (40) Lubbock, J. K., Barker, H. A,, “A Solution of the Identification Problem,” Preprint of Paper No. VIII-I, Fourth Joint Automatic Control Conference, pp. 191-199, Minneapolis, Minn., 1963. (41) Luenbcrger, D. G., IEEE Trans. M i l i t a r y Electronicr 8, 74 (1964). (42) McFee, R., Trans. AIEE, Part 11, 80, 189 (1961). (43) Magill, D. T., IEEE Trans. Auto. Control 10, 434 (1965). (44). Mishkin, E., Braun, L., Jr., eds., “Adaptive Control Systems,” McGrawHill, New York, 1961. (45) Mowery, V. O., IEEE Trans. Auto. Control 10, 399 (1965). (46) Roy, R. J., DeRusso, P. M., Ibid., 7, 93 (1962). (47) Roy, R., Miller, R. W., DeRusso, P. M . ,IEEE Tranr. Appl. Itid. 83, 173 (1964). (48) Singleton, H. E., “Theory of Nonlinear Transducers, Tech. Rept. 160, Research Laboratory of Electronics, Mass. Inst. of Tech. Cambridge, Mass., 1950. (49) Sliepcevich, C. M., Bishop, K. A., Puckett, T. H., “Techniques for Identification of Linear and Linear Time-Varying Processes,” 1962, Joint Automatic Control Conference. (50) Surber, W. H . , “Survey of Adaptive Control Systems,” Part 11, Principles of Adaptive Control Systems: T h e “Identification” Problem, ARAP Rept. 24, Aeronautical Research Associates of Princeton, Inc., Princeton, N. J., 1960. (51) Tou, J. T., Intern. J. Control 2, 21 (1965). (52) Volterra, V., “Theory of Functionals and of Integral and Integro-Differential Equations,” Blackie and Sons, London, 1930, and Dover Publications, New York, 1959. (53) Wiener, N., “Nonlinear Problems in Random Theory ” Technology Press of the Mass. Inst. of Tech., Cambridge, Mass., and Wiley, K& York, 1?58. (54) Wiener, N., “Response of a Nonlinear Device t o Noise I’ Rept. 129, Radiation Laboratory, Mass. Inst. of Tech., Cambridge, Mass., 1942: (55) Wolf, A . A., Trans. AIEE, Part 11, SO, 289 (1961). (56) Wolf, A. A,, Dietz, J. H., J . Franklin Inst. 274, 369 (1962). (57) Zadeh, L. A,, I R E Trans. Circuit Theory 3, 277 (1956). (58) Zadeh, L. A,, I R E Trans. Inform. Theory 7, 139 (1961). ( 5 9 ) Zadeh, L. A., J . Fmnklin Inst. 255, 387 (1953).

.

VOL. 5 9

NO. 6

JUNE 1967

81

identification of nonlinear systems - ACS Publications

Recommend Documents